Study  |  07/28/2020

Identifying and Measuring Artificial Intelligence – Making the Impossible Possible

Researchers of the Institute and the OECD have published a new study on how to identify and measure AI-related developments in science, algorithms and technologies. Using information from scientific publications, open source software (OSS) and patents, they find a marked increase in AI-related developments over recent years. The growing role of China in the AI space emerges throughout.

Artificial Intelligence (AI) is a term commonly used to describe machines performing human-like cognitive functions (e.g., learning, understanding, reasoning, and interacting). AI is expected to have far-ranging economic repercussions, as it has the potential to revolutionize production, to influence the behavior of economic actors and to transform economies and societies.

The vast potential of AI, now considered a general purpose technology, has led OECD countries and G20 economies to agree on key principles aimed at fostering the development of ethical and trustworthy AI. The practical implementation of such principles nevertheless requires a common understanding of what AI is and what it comprises, in terms of both scientific and technological developments and possible applications.

Addressing the challenges inherent in delineating the boundaries of such a complex subject matter, the study proposes an operational definition of AI, based on the identification and measurement of AI-related developments in science, algorithms and technologies. The analysis draws on information contained in scientific publications, open source software and patents.

Approach of the study

The three-pronged approach of the study relies on an array of established bibliometric and patent-based methods, and is complemented by an experimental machine learning (ML) approach implemented on purposely collected open source software data:

  • The identification of the science behind AI developments builds on a bibliometric two-step approach, whereby a first set of AI-relevant keywords is extracted from scientific publications classified as AI in Elsevier’s Scopus® database. This set is then augmented and refined using text mining techniques and expert validation.
  • As AI is ultimately implemented in the form of algorithms, the authors use information about open source software commits (i.e., contributions) posted on GitHub (an online hosting platform) to track AI-related software developments and applications. These data are combined with information from papers presented at key AI conferences to identify “core” AI repositories. Machine learning techniques trained on this core set are then applied to the whole set of software contributions on GitHub to identify all AI-related repositories.
  • Information contained in patent data serves to identify and map AI-related inventions and new technological developments embedding AI-related components. Text mining techniques are used to search patent abstracts and to find patent documents that refer to AI-related papers.
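
The three strands above share a common pattern: start from a trusted seed set, derive signals from it, and apply those signals to a much larger corpus. The following is a minimal sketch of that pattern only, not the authors' method. It replaces the study's text mining, expert validation and machine learning with a simple document-frequency ratio, and the function names, thresholds and sample texts are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(seed_docs, background_docs, min_ratio=2.0, min_count=2):
    """Keep terms that appear in clearly more seed documents than background ones.

    Assumption: a plain document-frequency ratio stands in for the
    text mining and expert validation described in the study.
    """
    seed_df = Counter(t for d in seed_docs for t in set(tokenize(d)))
    back_df = Counter(t for d in background_docs for t in set(tokenize(d)))
    keywords = set()
    for term, count in seed_df.items():
        if count < min_count:
            continue  # too rare in the seed set to be a reliable signal
        seed_rate = count / len(seed_docs)
        # Add-one smoothing avoids division by zero for unseen terms.
        back_rate = (back_df[term] + 1) / (len(background_docs) + 1)
        if seed_rate / back_rate >= min_ratio:
            keywords.add(term)
    return keywords


def tag_documents(docs, keywords, min_hits=2):
    """Flag a document as AI-related if it contains at least min_hits keywords."""
    return [len(keywords & set(tokenize(d))) >= min_hits for d in docs]


# Hypothetical seed set (documents already labelled as AI) and background set.
seed_docs = [
    "neural network training for image recognition",
    "deep neural network for speech recognition",
    "reinforcement learning with a neural policy network",
]
background_docs = [
    "crop yield analysis for wheat fields",
    "network of trade routes in medieval Europe",
    "clinical trial for a new vaccine",
    "annual report for the steel industry",
]

keywords = extract_keywords(seed_docs, background_docs)
flags = tag_documents(
    [
        "a neural network for face recognition",
        "road network maintenance report",
        "recognition of prior learning in schools",
    ],
    keywords,
)
```

Requiring at least two keyword hits is one simple way to keep generic terms such as "network" from flagging unrelated documents on their own; a real pipeline would need far more careful validation.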

Selected findings of the study

  • The authors find an acceleration in the number of AI publications in the early 2000s, followed by steady growth of 10% a year on average until 2015 and a renewed acceleration to 23% a year since then. The share of AI-related publications in all publications rose to over 2.2% in 2018.
  • 28% of the world's AI-related papers published in 2016-18 belong to authors with affiliations in China. The shares of AI publications originating from the EU28, the United States and Japan have decreased compared to the levels observed ten years earlier.
  • Since 2014, the number of open-source software repositories related to AI has grown about three times as much as the rest of open-source software.
  • There is a marked increase in the proportion of AI-related inventions in the total number of inventions after 2015. This ratio averaged more than 2.3% in 2017.
  • “Neural networks” and “image processing” are the most frequent terms appearing in the abstracts of AI-related patents.
  • In AI-related patents, the contribution of China-based inventors has multiplied more than sixfold since the mid-2000s, reaching nearly 13% in the mid-2010s.
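
To put the publication growth rates quoted above in perspective, the short sketch below compounds an average annual rate over several years and computes the implied doubling time. The rates of 10% and 23% come from the findings; the three-year window (2015 to 2018) and the function names are assumptions made purely for illustration.

```python
import math


def cumulative_growth(annual_rate, years):
    """Factor by which a quantity grows at a constant annual rate."""
    return (1 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed to double at a constant annual rate."""
    return math.log(2) / math.log(1 + annual_rate)


# Assumed window: three years of growth at 23% a year (2015 to 2018).
factor_fast = cumulative_growth(0.23, 3)

# Doubling times implied by the two average rates reported in the study.
years_to_double_slow = doubling_time(0.10)
years_to_double_fast = doubling_time(0.23)

print(f"Growth factor over 3 years at 23%: {factor_fast:.2f}")
print(f"Doubling time at 10% a year: {years_to_double_slow:.1f} years")
print(f"Doubling time at 23% a year: {years_to_double_fast:.1f} years")
```

Under these assumptions, output that took a little over seven years to double at 10% a year doubles in well under four years at 23% a year.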

For more facts and detailed information, see the publication:

Stefano Baruffaldi, Brigitte van Beuzekom, Hélène Dernis, Dietmar Harhoff, Nandan Rao, David Rosenfeld, Mariagrazia Squicciarini (2020).
Identifying and Measuring Developments in Artificial Intelligence: Making the Impossible Possible.
OECD Science, Technology and Industry Working Papers No. 2020/05.

Stefano Baruffaldi is Affiliated Research Fellow in the Innovation and Entrepreneurship Research department and Assistant Professor at the University of Bath.

Dietmar Harhoff is director at the Max Planck Institute for Innovation and Competition.