RDP 2023-10: Adoption of Emerging Digital General-purpose Technologies: Determinants and Effects

2. Data

2.1 Adoption data

Measuring adoption directly is challenging. Various sources of data have been explored in the literature, including surveys (McMillan et al 2022; Calvino and Fontanelli 2023), job advertisements (Alekseeva et al 2021), patents, and firms' websites (Calvino et al 2022). Each of these sources has advantages and disadvantages. For example, surveys and websites can be representative of the broader business population, but often have limited longitudinal firm information. Hiring and patent data may have more longitudinal information, but potentially only capture a narrow slice of the adoption process.

We take an alternative approach by using the text of annual reports and earnings calls of listed firms, obtained via Refinitiv. This provides a potentially broad measure of adoption with good longitudinal information, though the sample will necessarily be skewed towards larger firms. Similar approaches have been applied in the literature to measure other metrics such as firms' climate change exposures (Sautner et al 2023) and firms' debt covenant exposures (Nguyen 2022).

We look for mentions of words related to each of the technologies of interest in the text. The list of words we use is taken from Bloom et al (2021), who identify word pairings, or ‘bigrams’, that appear to capture technological advancements that have influenced businesses based on patents and US firms' earnings calls. They then group these bigrams into technology clusters. While they identify several clusters, we focus on two for this paper: AI/ML and cloud computing. See Appendix A for a list of the bigrams contained in each technology cluster.

We identify usage of one of these technologies based on the inclusion of one of these bigrams in the text, and identify adoption as the first reference to the technology. Rather than simply taking any mention of the terms as evidence of technology use, we apply several extra filters to ensure that we are genuinely identifying adoption or use of the technology. First, we remove pages with three or more proper nouns, which we identify using the in-built part-of-speech tagging functions in Python's Natural Language Toolkit. This allows us to remove Board member biography pages, which often have references to a director's experience.[1] Second, we require the annual report or earnings call to have at least two references to the technology. Having applied these filters, we then spot check 10 reports or calls each year and find no false positives, that is, no cases where references do not appear to reflect use. As a further robustness check, we also consider the results focusing only on annual reports and removing earnings calls, where there may be more references to ‘hot topics’. Doing so does not change the results.
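The filtering steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the two bigrams are placeholders rather than the Appendix A lists, each year is assumed to have a single document supplied as a list of page strings, and `count_proper_nouns_naive` is a crude hypothetical stand-in for a part-of-speech tagger (it simply counts capitalised words that do not start a sentence).

```python
import re
from typing import Callable, Dict, Iterable, List, Optional

# Placeholder bigrams for illustration only; see Appendix A for the actual lists.
AI_ML_BIGRAMS = ("machine learning", "artificial intelligence")


def count_proper_nouns_naive(page: str) -> int:
    """Hypothetical stand-in for a POS tagger: count capitalised words
    that are not the first word of a sentence."""
    count = 0
    sentence_start = True
    for token in re.findall(r"[A-Za-z]+|[.!?]", page):
        if token in ".!?":
            sentence_start = True
            continue
        if token[0].isupper() and not sentence_start:
            count += 1
        sentence_start = False
    return count


def count_mentions(
    pages: Iterable[str],
    bigrams: Iterable[str],
    proper_noun_counter: Callable[[str], int] = count_proper_nouns_naive,
    max_proper_nouns: int = 2,
) -> int:
    """Count bigram mentions, skipping pages with three or more proper nouns."""
    bigrams = list(bigrams)
    total = 0
    for page in pages:
        if proper_noun_counter(page) > max_proper_nouns:
            continue  # likely a biography page, so drop it
        text = page.lower()
        for bigram in bigrams:
            total += len(re.findall(r"\b" + re.escape(bigram) + r"\b", text))
    return total


def first_adoption_year(
    docs_by_year: Dict[int, List[str]],
    bigrams: Iterable[str],
    min_mentions: int = 2,
) -> Optional[int]:
    """Return the first year whose document has at least `min_mentions` references."""
    bigrams = list(bigrams)
    for year in sorted(docs_by_year):
        if count_mentions(docs_by_year[year], bigrams) >= min_mentions:
            return year
    return None
```

In practice the naive counter would be replaced by a function that counts proper-noun tags from a part-of-speech tagger; passing it in as `proper_noun_counter` keeps the rest of the sketch unchanged.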

While we are confident that we have removed all false positives, we cannot rule out the possibility of false negatives where firms are using the technology but not reporting it, or not reporting it in a way that our method identifies. Assuming these false negatives occur randomly, and we are not consistently missing certain types of adoption or certain types of firms' adoption, the false negatives will tend to be ‘noise’ biasing our analysis towards finding no results. That said, we would expect our approach to mainly pick up substantive implementations of the technology, rather than, for example, simply shifting to a cloud version of word processing software. As such, our results should be interpreted as relating to substantive implementations of these technologies.
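To illustrate why random false negatives bias comparisons towards zero, the sketch below computes the expected gap in some outcome between flagged adopters and all other firms when a share of true adopters is missed at random and no non-adopter is wrongly flagged. The function name and all numbers are hypothetical and chosen purely for illustration.

```python
def observed_gap(
    n_adopters: float,
    n_non_adopters: float,
    mean_adopters: float,
    mean_non_adopters: float,
    miss_rate: float,
) -> float:
    """Expected difference in mean outcome between flagged adopters and the
    unflagged group, when `miss_rate` of true adopters go undetected at random."""
    missed = miss_rate * n_adopters
    # The unflagged group mixes true non-adopters with the missed adopters.
    unflagged_mean = (
        n_non_adopters * mean_non_adopters + missed * mean_adopters
    ) / (n_non_adopters + missed)
    # Flagged firms are a random subset of adopters, so their mean is unchanged.
    return mean_adopters - unflagged_mean


true_gap = observed_gap(100, 900, 10.0, 0.0, miss_rate=0.0)
noisy_gap = observed_gap(100, 900, 10.0, 0.0, miss_rate=0.5)
```

Under these assumptions the measured gap shrinks by the factor N0 / (N0 + miss_rate × N1), where N0 and N1 are the numbers of true non-adopters and adopters, so it is always smaller in magnitude than the true gap but keeps the same sign.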

2.2 Firm-level financials

To examine the characteristics and performance of adopting firms, we merge our adoption data with financial details sourced from Morningstar. Table 1 displays the median values of various financial statistics for firms outside of the IT sector (which we abstract from for most of the paper) in 2022 as an example, comparing those adopting in that year to those who are yet to adopt.[2]

Table 1: Descriptive Statistics of Firm Characteristics
By whether a firm adopted GPT in 2022, median
                          Adopters in 2022    Firms yet to adopt in 2022
Return on assets (%)      0.8                 −3.9
Revenue ($m)              108.3               1.6
Assets ($m)               413.9               42.9
Cash ratio (%)            0.1                 4.2
Debt ($m)                 54.0                0.7
Gearing ratio             0.4                 0.1
Number of observations    70                  928

Notes: Firms in the IT sector are excluded, as are firms that have adopted prior to 2022. Return on assets is measured as EBITDA/assets*100, cash ratio is measured as cash/assets*100 and gearing ratio is measured as debt/equity*100.

Sources: Authors' calculations; Morningstar; Refinitiv.
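The ratio definitions in the notes to Table 1 can be written out as a short sketch. The function and argument names are hypothetical, and the inputs are assumed to be in the same currency units with non-zero assets and equity.

```python
from typing import Dict


def financial_ratios(
    ebitda: float, assets: float, cash: float, debt: float, equity: float
) -> Dict[str, float]:
    """Compute the three ratios as defined in the table notes."""
    return {
        "return_on_assets": ebitda / assets * 100,  # EBITDA/assets*100
        "cash_ratio": cash / assets * 100,          # cash/assets*100
        "gearing_ratio": debt / equity * 100,       # debt/equity*100
    }


# Hypothetical firm, values in $m.
example = financial_ratios(ebitda=8.0, assets=400.0, cash=20.0, debt=50.0, equity=125.0)
```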

Generally, adopters are larger in terms of both revenue and assets. However, they have lower levels of liquid assets. They also hold a greater volume of debt, with a higher gearing ratio. Importantly, adopters appear considerably more profitable than non-adopters, as evident from their much higher return on assets.

2.3 Board of Directors characteristics

To consider the role of the Board of Directors, we use information from S&P Capital IQ. Specifically, this database has information and biographies on the past and present Board members at Australian listed companies. Using a snapshot of the Board members as at March 2023, we derive variables for each individual, covering: age; gender; whether they have a Master of Business Administration (MBA) degree; whether their study was in a STEM-related field[3]; whether they were previously a Board member or a ‘key professional’ in the IT industry; and whether they have experience with GPT, as indicated by reference to any of the technology-related words in their biography. For each firm, we create metrics capturing the share of Board members with certain characteristics, based on the positions held as at March 2023.[4]
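The firm-level board metrics can be sketched as a simple aggregation over director records. The `Director` fields and the `board_metrics` function below are hypothetical names for illustration; the sketch computes, for each firm, the share of directors with each characteristic and an indicator for whether at least one director has it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Union

FLAGS = ("female", "mba", "stem", "it_experience", "gpt_experience")


@dataclass
class Director:
    firm: str  # firm identifier, e.g. a ticker (assumed)
    female: bool = False
    mba: bool = False
    stem: bool = False
    it_experience: bool = False
    gpt_experience: bool = False


def board_metrics(
    directors: Iterable[Director],
) -> Dict[str, Dict[str, Union[float, bool]]]:
    """Aggregate director-level flags into firm-level shares and indicators."""
    by_firm: Dict[str, List[Director]] = defaultdict(list)
    for director in directors:
        by_firm[director.firm].append(director)

    metrics: Dict[str, Dict[str, Union[float, bool]]] = {}
    for firm, members in by_firm.items():
        metrics[firm] = {}
        for flag in FLAGS:
            n_with_flag = sum(bool(getattr(m, flag)) for m in members)
            metrics[firm][f"share_{flag}"] = n_with_flag / len(members)
            metrics[firm][f"any_{flag}"] = n_with_flag > 0
    return metrics
```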

Table 2 provides a snapshot of the data, splitting the sample into those firms adopting technologies in 2022 and those that had not adopted to that point, again focusing on non-IT firms. Adopting firms' boards are more likely to include a female member, a member with an MBA or STEM degree, and a member with previous experience in the IT industry. Adopters are also more likely to have a director with experience in emerging digital GPT, based on the relevant keywords appearing in their biography.[5] As discussed below, given the nature of the board data, we cannot say definitively whether the companies had Board members with these skills at the time of adoption, or whether they joined subsequently to facilitate effective adoption.

Table 2: Descriptive Statistics of Whether at Least One Member of the Board of Directors Has a Particular Characteristic
By whether a firm adopted in 2022
                                 Adopters in 2022    Firms yet to adopt in 2022
Female (%)                       49                  36
STEM degree (%)                  39                  28
MBA degree (%)                   42                  25
Experience with GPT (%)          25                  10
Experience in IT industry (%)    9                   2
Age (average, years)             60                  60
Number of observations           67                  787

Notes: Based on March 2023 snapshot of Board members. Firms in the IT sector are excluded, as are firms that have adopted prior to 2022. The number of observations that have information about age of Board members is smaller (41 adopters and 423 non-adopters).

Sources: Authors' calculations; Morningstar; Refinitiv; S&P Capital IQ.

2.4 Hiring data

As a final dataset, we also incorporate information on firms' job advertisements and whether they mention GPT, which we take as an indicator of trying to bring in those skills. We use the database constructed by Bahar and Lane (2022), which uses Lightcast job ad data from 2012 to 2020 to create indicators identifying whether firms are advertising for GPT-related skills. We match these to our database using a combination of ASX tickers and company names, along with fuzzy matching. Similar data were used by Bloom et al (2021), as well as several papers looking at AI use (e.g. Alekseeva et al 2021; Calvino et al 2022; Babina et al 2023, 2024).
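A matching step of this kind might look like the sketch below, which tries an exact ticker match first and falls back to fuzzy matching on normalised company names. The suffix list, the 0.9 similarity threshold and the use of Python's `difflib.SequenceMatcher` are all assumptions for illustration; the text does not specify the matching procedure in this detail.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, Optional

# Common corporate suffixes to ignore when comparing names (assumed list).
SUFFIXES = {"ltd", "limited", "pty", "group", "holdings"}


def normalise(name: str) -> str:
    """Lower-case, strip punctuation and drop corporate suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def match_firm(
    ad_name: str,
    ad_ticker: Optional[str],
    listed: Dict[str, str],
    threshold: float = 0.9,
) -> Optional[str]:
    """Return the ticker of the best match in `listed` (ticker -> name), or None."""
    # Step 1: exact ticker match where a ticker is available.
    if ad_ticker and ad_ticker.upper() in listed:
        return ad_ticker.upper()

    # Step 2: fuzzy match on normalised company names.
    target = normalise(ad_name)
    best_ticker, best_score = None, 0.0
    for ticker, name in listed.items():
        score = SequenceMatcher(None, target, normalise(name)).ratio()
        if score > best_score:
            best_ticker, best_score = ticker, score
    return best_ticker if best_score >= threshold else None
```

A high threshold reduces the risk of matching job ads to the wrong firm, at the cost of leaving more ads unmatched; borderline matches would usually warrant a manual check.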

Footnotes

[1] Initial analysis of the unfiltered data indicated that a very large share of false positives came from these biographies. We also use other cut-offs, such as the 90th percentile of the number of proper nouns within each page in the document. Results are similar. Our approach may lead to some false negatives, for example if Amazon Web Services is identified as a proper noun. That said, most of the results are robust to including these pages so this does not appear to be a major issue.

[2] As such, those firms that previously adopted are not included. We take this approach to think about the characteristics of firms when they adopt, rather than after adoption, given adoption could be associated with firm growth. That said, the patterns are similar if we take the simpler approach of just looking at firms that ever adopted, versus those that have not (Table B1).

[3] A STEM degree is a degree in the fields of science, technology, engineering or mathematics.

[4] Ideally, we would create a panel dataset capturing those holding a position in each year. While there are some data on past employment, data on the years at other listed companies are not readily available. We experimented with constructing a panel using names in company reports and work histories in S&P Capital IQ, but the match rate was too low.

[5] One concern with this metric is that references to the technology might reflect the adoption of technology by the firm in question. However, spot checks of the data suggest this is not the case, and that references tend to relate to past work or educational background.