http://ipkitten.blogspot.com/2023/02/uk-government-axes-plans-to-broaden.html


As explained in the Impact Assessment accompanying the Commission’s Proposal for what the Parliament and the Council would eventually adopt as the DSM Directive [Katposts here], text and data mining (TDM) is a term commonly used to describe the automated processing (“machine reading”) of large volumes of text and data to uncover new knowledge or insights.

The overall use (and usefulness) of TDM lies in the possibility to analyze large corpora of text and data such as scientific publications or research datasets. Although classical TDM and machine learning have different utility, both use the same key algorithms to discover patterns in data. TDM plays a significant role in the advancement of Artificial Intelligence applications too [see further here].
While TDM may be performed in different ways, the key value of predictive TDM processes thus lies in facilitating the treatment, recombination, and extraction of further knowledge from large amounts of data and text, allowing the identification of patterns and associations between seemingly unrelated pieces of information.
It is thus clear that TDM does matter – but what should its relationship with copyright and related rights be?
As discussed at greater length elsewhere (here), TDM is an example of an area in which legislative intervention has been broadly justified by reference to the need to free up certain copyright-covered spaces to facilitate research and increase innovation and competitiveness.
It should be noted at the outset that, on the one hand, some commentators hold the view that TDM would not even be covered by copyright law. On the other hand, the debate around TDM has not developed in a context devoid of licensing practices, at least in Europe. Especially in the aftermath of a 2013 stakeholder-led dialogue, Licences for Europe, scientific, technical, and medical publishers included TDM for non-commercial purposes in their subscription licences for academic institutions and developed common infrastructures to facilitate access to the content to be mined. That said, contractual conditions and policies were found to differ, leading to uncertainty and, as a result, to transaction costs.

Policy and legislative discussion on the regulation of TDM outside of Europe

In some countries, existing systems of exceptions and limitations (E&L), including fair use under §107 of the US Copyright Act, have been deemed likely to accommodate certain unlicensed TDM activities, although recent and – at the time of writing – pending litigation will require a more substantive assessment as to whether that is in fact the case and to what extent. A class action (Andersen and Others v Stability AI Ltd and Others, Case 3:23-cv-00201, filed 13 January 2023) was in fact recently filed before the US District Court for the Northern District of California, alleging infringement of copyright in the development and functioning of AI image generator Stable Diffusion [see also IPKat here].
In other legal systems, specific E&L relating to content to which lawful access has been secured have been adopted instead. This has been the case, for example, in:
  • Japan, which – since 2011 – has had in force an E&L (originally introduced in 2009) specifically allowing TDM;
  • Some EU Member States individually at first and then through action at the EU level (see further below);
  • Singapore, which has an E&L (Section 244 of the Copyright Act) that does not discriminate between commercial and non-commercial TDM, does not pose restrictions in terms of beneficiaries, and allows the making of copies of works and recordings of protected performances for the purpose of computational data analysis or preparatory activities thereto.
The introduction of a specific E&L for TDM has also featured in Hong Kong copyright reform discourse [see IPKat here]. However, the most recent governmental position is that, given the diversity of views expressed by concerned stakeholders, “rushing into incorporating these issues in the amendment bill” is not recommended.

The European approach(es)

During its tenure as an EU Member State, the UK was the first to rely on the EU copyright acquis as it existed in 2014 – specifically: the research exception in Article 5(3)(a) of the InfoSoc Directive – to legislate and adopt an express defence, which cannot be overridden by contract and is not restricted to any particular beneficiary, allowing text and data analysis for non-commercial research (Section 29A CDPA). It is evident that the eventual scope of Section 29A was shaped by the possibilities and constraints under Article 5(3)(a) of the InfoSoc Directive.

The 2011 Hargreaves Review, from which inter alia that reform stemmed, expressly noted how “the law can block valuable new technologies like text and data mining, simply because those technologies were not imagined when the law was formed” and all this whilst the resulting activities would “not prejudice the central objective of copyright, namely the provision of incentives to creators”.

Further to the UK initiative, other EU Member States (France, Estonia, Germany, and Ireland) also considered legislating or legislated in the field of TDM.
In 2019, however, two new mandatory EU-wide exceptions for TDM were adopted as part of the DSM Directive [I analyze and discuss them in detail in my Commentary here].
It was within the Council – that is, the institution in which EU Member States are represented in the EU law-making process – that the idea of introducing a further E&L (besides the one now found in Article 3), without restrictions in terms of beneficiaries and purposes of the TDM (now Article 4), initially emerged.
The rationale of EU intervention in relation to unlicensed TDM is explained in the preamble to the DSM Directive. Recital 8 acknowledges, on the one hand, the value and potential of TDM but, on the other hand, notes the restrictions that copyright and related rights pose to carrying out TDM activities without a licence. Further to the latter, recital 10 highlights the insufficiency of the existing framework, due to both the optional nature of exceptions and limitations to copyright and related rights for scientific research purposes and the limitations of licensing agreements. As such, the intervention of the EU legislature would serve to remedy the legal uncertainties surrounding TDM activities (recital 11) through the introduction of a mandatory, non-compensated exception for the benefit of research organizations and cultural heritage institutions (Article 3 of the DSM Directive) and a mandatory exception or limitation, the one in Article 4, without any particular restrictions in terms of beneficiaries.
In any event, Article 25 and recital 5 of the DSM Directive expressly allow EU Member States to adopt or maintain broader provisions, compatible with the E&L provided for in the Database Directive and the InfoSoc Directive, including exceptions and limitations allowing TDM pursuant to Article 6(2)(b) of the former and Article 5(3)(a) of the latter.

Broadening the UK E&L?

Following the UK’s departure from the EU, and given that the UK did not transpose the DSM Directive into its own law, a debate emerged as to whether the scope of the 2014 E&L should be broadened.
In mid-2022, the UK Intellectual Property Office (IPO) announced that the Government would consider broadening the scope for unlicensed TDM activities and introducing a new E&L that would allow TDM for any purpose (including commercial TDM), subject to a requirement of lawful access to the relevant copyright works and other protected subject-matter.
The latest news, however, is that such a reform will not go ahead.
Indeed, yesterday the UK Minister for Science, Research and Innovation confirmed that any such plans have now been axed:
“Although the Government need to be on the front foot in anticipating the regulatory framework and getting it right, the proposals have clearly elicited a response that we did not hear when they were being drafted. We have taken the responses seriously […] [W]e do not want to proceed with the original proposals. We will engage seriously, cross-party and with the industry, through the IPO, to ensure that we can, when needed, frame proposals that will command the support required.”
Is this the end of the road? Most likely not. Besides ongoing litigation in multiple jurisdictions, the discussion of the relationship between copyright and content mining will continue – if not increase in intensity – for the foreseeable future. Stay tuned then!

Content reproduced from The IPKat as permitted under the Creative Commons Licence (UK).