Artificial intelligence: CNIL unveils its first answers for innovative and privacy-friendly AI

16 October 2023

By submitting for public consultation its first how-to sheets on the creation of datasets for the development of artificial intelligence systems, the CNIL responds to industry stakeholders and shows that the General Data Protection Regulation (GDPR) supports an innovative and responsible approach.

The CNIL is mobilising for innovative AI that respects people

The development of artificial intelligence brings great technological opportunities in all areas of the economy and society: for instance, in health, for public services or business productivity. The CNIL wishes to support innovative actors and guarantee the protection of individual freedoms.

Training AI systems consumes a great deal of data, in particular personal data, the use of which is regulated to protect the privacy of individuals. The use of such algorithms may, in some cases, affect the rights of individuals, for example by facilitating the creation of false information, by multiplying fully automated decision-making processes or by allowing new forms of monitoring and surveillance of individuals.

Faced with these new challenges, the CNIL promotes responsible innovation that explores the latest artificial intelligence technologies while protecting people. In January 2023 it set up a dedicated department, which is now operational, and in the spring it launched an action plan to clarify the rules and support innovation in this field. Two programmes dedicated to artificial intelligence (AI) have been launched to support French actors: a sandbox for three projects using AI for the benefit of public services, and an enhanced support programme for three innovative mid-size companies (“scale-ups”), including one specialised in the provision of AI datasets and models.

The CNIL wants to bring legal certainty to artificial intelligence actors

The CNIL met with the main French players in the AI field, whether companies, laboratories or public authorities. All expressed a strong need for legal certainty. This summer, the CNIL also launched a call for contributions on the creation of datasets to inform its thinking.

Following these steps, and to clarify the applicable rules, the CNIL today published a first set of guidelines for the use of AI in compliance with personal data regulation. Further guidelines will follow, complementing this first set on other issues in the AI sector.

Participate in the public consultation

GDPR offers an innovative and protective framework for AI

The exchanges in recent months have raised concerns: according to some, the principles of purpose limitation, data minimisation, storage limitation and restricted reuse resulting from the GDPR would hinder or even prevent research, innovations or applications in the field of artificial intelligence.
The CNIL responds to these objections, confirming the compatibility of AI research and development with the GDPR, provided that it does not cross certain red lines and respects certain conditions.

The purpose limitation principle applies, in an adapted way, to general-purpose AI (GPAI) systems

The purpose limitation principle requires that personal data be used only for a specific goal defined in advance. Regarding AI, the CNIL accepts that an operator may be unable to define all future applications of an algorithm at the training stage, provided that the type of system and its main possible functionalities have been well defined.

The principle of data minimisation does not prevent the use of large datasets

According to the CNIL, the principle of data minimisation does not prevent the training of algorithms on very large datasets. However, the data used must, in principle, have been selected to optimise the training of the algorithm while avoiding the use of unnecessary personal data. In any case, certain precautions to ensure data security are essential.

The retention period of training data may be long if justified

The principle of storage limitation does not prevent long retention periods for training datasets, which require significant scientific and financial investment and sometimes become standards widely used by the community.

Re-use of datasets is possible in many cases

Finally, the CNIL considers that datasets, in particular data publicly available on the Internet, can be re-used to train AI systems, provided that the data has not been collected in a manifestly unlawful manner and that the purpose of re-use is compatible with the initial collection. In this regard, the CNIL considers that the GDPR provisions on research and innovation provide a regime suited to innovative AI actors who use third-party data.

The development of AI systems is compatible with the protection of privacy. Moreover, taking this requirement into account will enable the emergence of ethical devices, tools and applications that are faithful to European values. It is on this condition that citizens will trust these technologies.
