AI Aspects Related to the Implementation of the DSM Directive in Norwegian Law
In November 2023 the Ministry of Culture and Equality sent its proposal for amendments to the Copyright Act ("CA")[1] (in Norwegian "åndsverkloven") for consultation. The proposal includes implementation of the Digital Single Market Directive (EU) 2019/790 ("DSM") in Norwegian law.
New technology enables automated, computer-assisted analysis of information in digital form, such as text, sound, images or data. To analyse information, the information must first be gathered. The UK Government has described text and data mining ("TDM") as "the process of deriving information from machine-read material. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns"[2]. As we will see below, the DSM uses a somewhat different definition of TDM, which is likely to include machine learning and the training of algorithms[3].
In this article we comment on the DSM and the accompanying proposed amendments to the CA that are relevant for TDM employed by artificial intelligence ("AI"). For simplicity we do not discuss the extent to which the proposed new CA sections should be interpreted to differentiate between TDM and training/machine learning, as it seems likely that these provisions are meant to include such training/learning. We will also highlight a few interpretational issues to be aware of.
AI and the CA
AI has been described as "the ability of a machine to display human-like capabilities such as reasoning, learning, planning and creativity"[4].
AI performs actions based on interpretation and processing of structured or unstructured data, for the purpose of achieving a given goal. AI systems are used for several purposes and in many contexts. A few examples where AI is used are online shopping and advertising, image processing, language processing, chat bots, autonomous cars, medical diagnostics, food distribution and identification of persons.
Examples of generative AI[5] include OpenAI's ChatGPT, Google's Gemini and Microsoft's Copilot. Each of these applications provides a user interface allowing communication with the application. The AIs' replies are based on the large amounts of text/data their models have been trained on. Other examples of generative AI are AI systems that create images, such as Stable Diffusion, Midjourney and Imagen.
As AI models must be trained on large amounts of data, the text and data mining, training and subsequent use of AI raise several copyright issues, such as:
- Does the collection of information (mining) infringe copyrights?
- Does the analysis/training of the collected text/data infringe copyrights?
- Does the output from the AI infringe copyrights?
- Can the output generated by the AI be copyright protected?
For databases copied/used in AI development, similar questions arise, cf. the sui generis protection of databases in the CA § 24.
The implementation of the DSM in Norwegian law will contribute to answering some of the above questions, but several remain open.
The Ministry's proposal
The Ministry's proposal was made to implement the DSM and the Web and Forwarding Directive (EU) 2019/789[6] in Norwegian law. The deadline for interested parties to comment on the proposals was 15 March 2024. Interest in the proposal was significant, and the responses were to some extent critical. Eighty organisations/parties submitted comments.
For example, the Norwegian Writer’s Association stated in its response that[7]:
"In light of the […] absence of a clear definition of text and data mining as well as a grossly incorrect premise that there are effective methods for reservation against data extraction, [the Norwegian Writer's Association] believes the investigation in chapter 3 of the consultation note is a breach of the minimum requirements of the investigation instructions for such investigations.
[The Norwegian Writer's Association] also does not find that the proposals in chapter 3 strengthen the rights of creators and other artists, especially in the face of new technology and cross-border services […]. On the contrary, the proposed implementation of art. 3 and 4 appears so intrusive, while the rights holders' options to opt-out are minimal, that there is reason for raising questions about whether the proposal infringes the three-step test, cf. e.g. The Berne Convention art. 9 (2)."[8] (Our office translation.)
On the other hand, the Norwegian Directorate of Digitalisation was positive[9]:
"We […] would like to express our general support for the changes. Digdir is particularly positive about the introduction of the text and data extraction exception in Norwegian law." (Our office translation.)
It remains to be seen to what extent the comments lead to material amendments to the Ministry's proposals. The freedom to choose specific national regulations is limited, in particular due to the Berne Convention and Norway's EEA-obligations. Within such limits, the Ministry must seek to balance conflicting interests:
- Strict rules for TDM in Norway may hamper AI development in Norway and lead investors and developers to other countries with a less strict regime. In such a case Norway’s competitiveness will be harmed.
Vs.
- The rightholders' need for protection of their investments and efforts in developing the protected information, and for reasonable consideration for third parties' use of the same[10].
A consideration to take into account in this regard is that the DSM and the proposed CA amendments do not include any "fair use" doctrine, as found in the USA[11]. If the "fair use" doctrine allows for TDM without consent from the rightholders, then the opt-out rules of the new CA may lead to the USA being preferred for such activities. Whether or not the "fair use" doctrine allows for such use is, however, debated. This is illustrated by a lawsuit filed by The New York Times in late 2023 against OpenAI and Microsoft, claiming that ChatGPT and Copilot were trained without permission on copyrighted material scraped from The New York Times' extensive article archive[12].
Present CA with regulations
The present CA with regulations contains limited exceptions to the rightholders' exclusive right to produce copies of their copyrighted works. These exceptions are in essence limited to the production of temporary copies with no economic value in themselves and do not include production of copies of software and databases. (Certain specific regulations apply for special purposes such as archival, research and education.)
Proposed amendments
The DSM Article 2 No. 2 stipulates that "text and data mining" means:
"any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations"
To facilitate TDM, the DSM contains mandatory exceptions/restrictions on copyright, and on the sui generis right for databases, which shall make it possible to perform TDM without rights clearance for the protected works/databases that are subject to TDM.
Certain exemptions shall apply for TDM for the purposes of scientific research pursuant to the DSM Art. 3. A broader exception is provided by Art. 4, set out in more detail below, which applies to anyone making use of TDM. If an exemption applies, then the TDM can be done without rights clearance and payment to the rightholders.
For reproductions and extractions made for TDM to be lawful under Art. 4, the works and other subject matter must be "lawfully accessible", and such use must not have been expressly reserved by the rightholders in an appropriate manner. Rightholders cannot, by explicit reservation, prohibit use under Art. 3 for scientific research purposes.
The Ministry's proposal solely addresses the question of whether it will be legal to perform TDM and train AI models on text and data to become "intelligent" without infringing the rights to the works on which TDM is performed / the AI models are trained. Accordingly, the proposed amendments do not contribute to the discussion on whether output generated by AI may itself be copyrighted. Nor do the proposals provide exceptions to the rightholders' rights where the output from the AI models in itself constitutes an infringement. The Ministry further emphasises that the proposed limitations do not imply that copyrighted material may be made public.
The Ministry proposes to implement the definition of TDM and the mandatory rules on producing copies for TDM in three new sections of the CA, § 50 d, e and f.
§ 50 d includes the definition of TDM, which the Ministry emphasises shall be understood in the same way as the DSM Art 2 (2).
§ 50 e includes provisions for TDM that anyone can carry out (DSM Art. 4). Note that it is not a requirement that the object of the TDM is in digital form. Preparatory actions required to make TDM possible, such as digitisation of analogue information (for example old paper books), are encompassed.
§ 50 f includes provisions for research institutions and cultural heritage institutions (DSM Art 3).
For the new provisions to apply, the purpose must solely be TDM. Copies may not be used for a different purpose or stored longer than the new provisions allow.
A question raised by several of the respondents is whether the Ministry's interpretation of the DSM's requirement for "lawful access" in DSM Art. 3 and "lawfully accessible" in DSM Art. 4 is correct. The Ministry states that it cannot see reasons for interpreting the requirements differently, even though there is a slight linguistic difference. Accordingly, the Ministry concludes that protected information that is published on the Internet, but not lawfully, is not encompassed by Art. 4's TDM exception. However, an objection to such an interpretation is that AI developers making use of the text and data mining exception in CA § 50 e (DSM Art. 4) have no possibility in practice to assess whether information published on the Internet has been lawfully published. It is further argued that the Ministry's interpretation is contrary to the purpose of the DSM, as it will in practice hamper TDM.
Another objection to the Ministry's proposal is that it does not clarify what is meant by the Ministry's statement that works from "illegal sources cannot be used for" TDM. (Our office translation.) According to the objection, the Ministry's statement could be understood as prohibiting TDM on published material which in itself is not "lawful", e.g. due to defamatory content, incitement to terrorism etc.
Furthermore, some of the respondents argue that the proposed CA does not give sufficient guidance as to how the rightholders shall opt out in an appropriate way. Contrary to the DSM, the proposed CA § 50 e does not include any examples of how the rightholders can "appropriately" reserve against such use as Art. 4 allows for. Pursuant to Art. 4, an appropriate manner of making such a reservation, in the case of content made publicly available online, would be by machine-readable means. Several of the respondents have thus recommended including this example in CA § 50 e, both to ensure that the provision is interpreted in accordance with the DSM and for educational purposes.
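By way of illustration only, the sketch below shows how a TDM crawler could check one possible machine-readable reservation before collecting a page. It assumes, purely hypothetically, that the rightholder expresses the reservation in a robots.txt file. Neither the proposed CA § 50 e nor this article names a specific format, and whether a robots.txt entry would qualify as an "appropriate" reservation is a legal question the code does not answer. The crawler name ExampleTdmBot, the example.com URLs and the function name tdm_reserved are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content published by a rightholder.
# "ExampleTdmBot" is an invented crawler name used for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleTdmBot
Disallow: /

User-agent: *
Allow: /
"""


def tdm_reserved(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules disallow this crawler for this URL.

    Assumption: a "Disallow" rule is treated here as a reservation against
    collection. This is an illustrative convention, not a legal test.
    """
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    for agent in ("ExampleTdmBot", "OtherBot"):
        status = "reserved" if tdm_reserved(ROBOTS_TXT, agent, page) else "not reserved"
        print(f"{agent}: {status}")
```

With the sample file, the named crawler is treated as subject to a reservation for every path, while a crawler that is not named falls under the general rule and is allowed. A real implementation would also have to consider other possible forms of reservation, such as metadata or website terms and conditions.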
It should be noted that the Ministry does not propose to distinguish between photographs that are protected as such and photographs that are protected as works. Both are thus subject to the proposed new CA TDM provisions.
It should also be noted that the EU AI Act[13] obliges all providers of general-purpose AI models[14] to comply with the documentation obligations in Art. 53 (1) and Annex XI of the Act. Providers of general-purpose AI models must therefore, among other things, establish and maintain a copyright policy to identify and comply with the opt-out reservations of the DSM Art. 4 (3) and "draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model"[15]. The AI Act must accordingly be expected to contribute to the effectiveness of the DSM's limitations on TDM in Norway as well, once it is implemented at some point in the future.
[1] The proposal also includes proposals for amendments to other Norwegian legislation.
[2] See: https://webarchive.nationalarchives.gov.uk/ukgwa/20140603093549/http://www.ipo.gov.uk/ipreview-doc-t.pdf
[3] Professor Inger Ørstavik at the University of Oslo raises the question of whether the DSM allows for training algorithms / machine learning or is limited to TDM, and concludes that it likely does allow for such training. However, note that the answer to the question is debated by the market players. See her publication section 4 and conclusion in section 5: https://www.duo.uio.no/bitstream/handle/10852/98788/20210218%2BOrstavik%2BII%2Bclean%2Bversion.pdf?sequence=4&isAllowed=y
[4] See: https://www.europarl.europa.eu/topics/en/article/20200827STO85804/what-is-artificial-intelligence-and-how-is-it-used
[5] In an article published by IBM generative AI is described as: "Generative AI, sometimes called gen AI, is artificial intelligence (AI) that can create original content—such as text, images, video, audio or software code—in response to a user’s prompt or request." See: https://www.ibm.com/topics/generative-ai.
[6] DIRECTIVE (EU) 2019/789 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 17 April 2019 laying down rules on the exercise of copyright and related rights applicable to certain online transmissions of broadcasting organisations and retransmissions of television and radio programmes, and amending Council Directive 93/83/EEC
[7] See: https://www.regjeringen.no/contentassets/b995e355a68043d19d6b990824ae1d59/den-norske-forfatterforening.pdf?uid=Den_norske_Forfatterforening
[8] For further discussions on the three-step test see Professor Rosati's publication on: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4629528
[9] See: https://www.regjeringen.no/no/dokumenter/horing-om-endringer-i-andsverkloven-digitalmarkedsdirektivet-m.v/id3013710/?uid=f42289b0-dfc8-4d5c-86ed-4777a1bd060e
[10] For further details see Professor Inger Ørstavik op cit. in section 5 where it is argued that "To allow individual authors to oppose use of their works to train AI when those works are included in a collection or database, cannot be explained by the economic incentive system of copyright".
[11] See: https://www.copyright.gov/fair-use/
[12] See: https://www.cmswire.com/digital-experience/ai-copyright-infringement-quandary-generative-ai-on-trial/ with further references and https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf, in particular section 160 forward.
[13] The AI Act is expected to enter into force within the next few months, with the general-purpose AI rules being applicable 12 months thereafter, cf. the AI Act Art. 113 (b).
[14] "general-purpose AI model" is defined in the AI Act Art. 3 (63)
[15] Note that certain exceptions for AI models released under a free and open-source license apply, cf. the AI Act Art. 53 (2).
This article is intended to be a general summary of the law and does not constitute legal advice. You should consult with counsel to determine applicable legal requirements in a specific situation.