Ensuring the lawfulness of the data processing - Defining a legal basis

07 June 2024

An organisation that wishes to build a training dataset containing personal data and then use it to develop an AI system must ensure that the processing is lawful. The CNIL helps you determine your obligations based on your responsibility and the means of collecting or reusing the data.

This content is a courtesy translation of the original publication in French. In the event of any inconsistencies between the French version and this English translation, please note that the French version shall prevail.


In all cases, the controller must define a legal basis and, depending on how the data is collected or reused, carry out certain additional verifications.

There are several ways to build a training dataset, which can be used cumulatively:

  • data is collected directly from individuals;
  • data is collected from open sources on the Internet for this purpose;
  • data is reused after having initially been collected for another purpose, either by the controller itself (e.g. in the context of providing a service to its users) or by another controller. Such reuse requires additional precautions.







Define a legal basis

The legal basis of consent

The legal basis of legitimate interest

The legal basis of the task carried out in the public interest

The legal basis of the contract

The legal basis of the legal obligation