LAION is a non-profit organization that provides datasets, that is, structured collections of data that can come in various formats (text, numbers, images, videos, etc.) and structures (tables, graphs, trees, etc.). These datasets are used to train generative AI models such as Stable Diffusion and Midjourney, thereby advancing research in machine learning. Christoph Schuhmann, its founder, has collected links to more than 5.8 billion images from various sources, freely accessible online on LAION's website: https://laion.ai/.
In this case, LAION incorporated a photograph by Robert Kneschke into LAION-5B, one of its training datasets composed of links to images and associated textual descriptions. The photograph, in low resolution and watermarked, was sourced from the Bigstockphoto website, where Kneschke was commercially licensing it. The website’s terms of use explicitly prohibited the exploitation of images by 'automated programs'.
Sued by Robert Kneschke for copyright infringement before the Regional Court of Hamburg, LAION invoked in its defense, among other arguments, the data mining exceptions provided by Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market (hereinafter 'DSM Directive'). These exceptions have been transposed into Sections 60d and 44b of the German Copyright Act (UrhG) as well as Article L. 122-5-3 of the French Intellectual Property Code.
Data mining is crucial for training artificial intelligence models, such as those used by LAION. It is defined as the automated computational analysis of information in digital form (text, sounds, images, data), enabling the processing of large quantities of information.
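The DSM Directive permits data mining in two scenarios: when it is carried out for the purposes of scientific research by research organizations and cultural heritage institutions (Article 3), and when it is carried out for any other purpose, provided that the rightholder has not expressly reserved the use of the work in an appropriate manner, such as machine-readable means for content made publicly available online (Article 4). In both cases, the works must have been lawfully accessed.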
In a decision dated September 27, 2024, the Regional Court of Hamburg delivered the first ruling in Europe regarding the two data mining exceptions invoked by LAION. The court first addressed the data mining exception for scientific research, transposed into Section 60d of the UrhG. It recognized that LAION's creation of the dataset qualified as scientific research, even though it was not yet associated with an immediate gain in knowledge, after determining that the dataset had been made freely available to the scientific community. The court also ruled that LAION did not pursue commercial objectives by making the dataset freely available, even if it could be reused by commercial entities. Consequently, as LAION met the requirements for the scientific research exception, the court concluded that LAION had not infringed Robert Kneschke's copyright.
Although the court was not required to rule on the general data mining exception transposed into Section 44b of the UrhG, given the scientific nature of the use, it nonetheless took the opportunity to offer interpretative guidance on the conditions of its application. The court indicated that the opt-out expressed in Bigstockphoto's terms of use, although written in natural language, met the machine-readability requirement set forth by the DSM Directive. Since this reservation of rights was valid, the reproduction of Robert Kneschke's photograph in LAION's dataset would not have been covered by this exception.
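To make the machine-readability question more concrete, the following minimal Python sketch contrasts two ways a crawler might detect a reservation of rights. It is an illustration under stated assumptions, not a description of what LAION did or of what the court required: the robots.txt content, the crawler name `ExampleTDMBot`, the phrase list and both helper functions are hypothetical, and neither the DSM Directive nor the ruling prescribes robots.txt as the means of opting out.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "ExampleTDMBot" is an invented crawler name.
ROBOTS_TXT = """\
User-agent: ExampleTDMBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical phrase list for spotting a reservation written in natural language.
RESERVATION_PHRASES = ("automated programs",)


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return parser.can_fetch(user_agent, url)


def terms_mention_reservation(terms_text: str) -> bool:
    """Crude keyword check on terms of use; real terms would need far more care."""
    lowered = terms_text.lower()
    return any(phrase in lowered for phrase in RESERVATION_PHRASES)


if __name__ == "__main__":
    url = "https://example.com/images/photo.jpg"
    print(may_fetch(ROBOTS_TXT, "ExampleTDMBot", url))
    print(may_fetch(ROBOTS_TXT, "OtherBot", url))
    print(terms_mention_reservation(
        "You may not use automated programs to access this site."
    ))
```

The first helper relies on a structured convention that standard tooling can parse directly, whereas the second has to interpret free text. That difference is what makes the court's acceptance of a natural-language reservation notable.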
This decision is particularly significant as it is the first to rule on the application of the data mining exceptions provided by the DSM Directive. Notably, it takes a favorable position on whether the creation of datasets for AI training purposes falls under the scientific research data mining exception, even when commercial entities may indirectly benefit. It also emphasizes that the evolution of AI technologies must be considered when interpreting rights reservations, paving the way for future debates on the intersection of data mining, AI, and copyright law.
In conclusion, this German ruling reminds us that the application of data mining exceptions in the context of AI is often a complex and case-specific exercise.