AI Training and Fine-Tuning Data License – General Use

AI TRAINING AND FINE-TUNING DATA LICENSE – GENERAL USE, VERSION 1.0

For internal AI training, evaluation, and research; not for commercial deployment or monetisation.

Platform-Specific License Notice: This License is issued by Opendatabay (www.opendatabay.com) to support both Data Suppliers and Data Buyers. Its use outside the Opendatabay platform, or without Opendatabay’s knowledge and authorization, is strictly prohibited. This License applies to Data Products listed on Opendatabay or provided via the platform, including samples, previews, or full datasets.

1. Definitions

"Data Product": The collection of bundled and packaged data being licensed, which may include text, audio, images, video, raw or structured data, as described in the accompanying documentation, together with any provenance, transparency, or compliance information provided by the Licensor to support AI training, evaluation, and model documentation (e.g., model cards or safety assessments).

"Licensee": The individual or entity receiving rights to use the Data Product under this License.

"Licensor / Data Provider": The individual or entity that collected, prepared, and owns the Data Product and is granting this License.

"AI / ML / LLM": Artificial Intelligence, Machine Learning, and Large Language Models, including neural networks or other model architectures trained or fine-tuned using the Data Product.

"Models": Machine learning models created, trained, or fine-tuned using the Data Product.

"Derivative Works": Outputs of Models, including model weights, embeddings, fine-tuned versions, or other representations derived from the Data Product.

"Outputs": Model predictions, generated text, audio, embeddings, or other results produced using the Data Product.

"Permitted Uses": Activities expressly allowed under this License, including Training, Fine-Tuning, Evaluation, and Research purposes. Commercial deployment or monetisation is not permitted under this General Use license.


2. License Grant and Scope (Permitted Uses)

The Licensor grants the Licensee a non-exclusive, worldwide, perpetual right to use this Data Product solely for internal research, evaluation, and educational purposes. Commercial deployment or monetisation is not permitted under this license.

The Licensee may:

  • Train or fine-tune models, including LLMs, AI agents, AI applications, and systems, for internal use only.

  • Create and use model outputs, embeddings, weights, or other Derivative Works strictly for non-commercial purposes.

  • Evaluate models and publish research or benchmarks without redistributing the Data Product.

The Licensor represents that, to the best of its knowledge:

  • it has the rights and authority necessary to license the Data Product for these permitted uses;

  • the Data Product was collected and provided in compliance with applicable data protection and related laws (including the GDPR, the EU AI Act, and similar regimes);

  • the Data Product does not intentionally include personal data (PII) or special-category sensitive data;

  • the Licensor is not aware of any malware, viruses, or malicious code in the Data Product;

  • the Data Product has not been intentionally altered to include 'poisoned' samples, trigger phrases, or adversarial inputs designed to compromise model alignment or security.


3. Restrictions (Prohibited Uses)

This license covers internal research, evaluation, and educational use only.

Licensee may not:

  • Use the Data Product for any commercial purposes, monetisation, or deployment in AI services.

  • Resell, redistribute, or sublicense the Data Product.

  • Share Derivative Works if doing so exposes, reconstructs, or reveals the underlying Data Product.

  • Remove, obscure, or alter any digital watermarks, cryptographic signatures, or provenance metadata embedded within the Data Product.

  • Use the Data Product in ways that violate applicable laws or regulations.


4. Outputs, Models, and Derivatives (Rights and Limits)

Licensee may create Models using the Data Product for internal purposes only. Licensee is granted a non-exclusive, worldwide, and perpetual right to:

  • Use Model outputs (predictions, text, audio, embeddings, images, or other results) for internal research or educational purposes.

  • Create and use Derivative Works such as fine-tuned weights, embeddings, or model variants for internal purposes.

  • Evaluate models and publish benchmarks or research findings without redistributing the underlying Data Product.

Restrictions:

  • Redistribution or sharing of the original Data Product is prohibited.

  • Any sharing of Derivative Works must not expose or reconstruct the underlying Data Product.

  • Licensee bears responsibility for ethical use and compliance with applicable laws for internal research.


5. Redistribution and Sublicensing

  • Licensee may not redistribute or sublicense the Data Product itself.

  • Licensee may not share Models or Derivative Works outside its organization if doing so exposes the Data Product.

  • Sharing for research or educational purposes is permitted only within the Licensee’s organization.

Examples:

  • Allowed: Training a model internally for research, then publishing research results without sharing the Data Product.

  • Not Allowed: Providing a commercial AI service or selling the model to a third party.


6. Compliance (Law, Privacy, Security, Responsible Use)

  • Licensee must comply with all applicable laws, including the GDPR and copyright and anti-discrimination laws.

  • Licensee must maintain reasonable technical and organizational measures to protect the Data Product.

  • Licensee may not use the Data Product for high-risk AI systems in production or commercial applications.


7. Data Retention, Deletion, and Audit

  • Retention: Licensee may keep copies only as needed for internal research or evaluation.

  • Deletion: Upon termination or request by the Licensor, Licensee must delete or irreversibly anonymize all copies of the Data Product.

  • Models & Derivatives: May be retained for internal use provided the Data Product itself cannot be reconstructed.

  • Audit: Licensor may request evidence of compliance no more than once per year with 30 days’ notice.


8. Intellectual Property, Attribution, and Trademarks

  • Ownership of the Data Product remains with the Licensor.

  • Licensee owns Models and Derivative Works for internal use, but not the underlying Data Product.

  • Attribution does not imply endorsement; neither party may suggest endorsement by the other without written consent.

  • Third-party rights remain the responsibility of the Licensee.


9. Warranty Disclaimer and Liability Limits

  • Licensor warrants only that it has the right to license the Data Product.

  • The Data Product is provided “as is”, with no guarantee of results, performance, or suitability.

  • Licensor's total liability is limited to the greater of the amounts paid by the Licensee for the Data Product or any fixed cap stated in the purchase agreement.

  • Licensor is not liable for indirect, incidental, or consequential damages.


10. Term, Termination, and General Terms

Term & Termination

  • This License is effective from the date of grant until terminated.

  • Either party may terminate for material breach.

Effect of Termination

  • All rights to use the Data Product end, and the deletion and anonymization rules in Section 7 apply.

  • Licensee may continue using Models internally if the Data Product cannot be reconstructed.

Governing Law & Jurisdiction

  • This License is governed by the laws of England and Wales; disputes are to be resolved in the courts of London.

Notices & Assignment

  • Notices must be in writing. Licensee may not assign rights without Licensor consent.

Last updated