The Academic Emulator
Overview
This is the website of the Academic Emulator, an online experiment in generating new or ‘emulated’ academic writing using OpenAI’s GPT family of Large Language Models (LLMs). It has been developed by Dipanjan Saha, Michael Mair and Phillip Brooker with support from a Research Development Fund award from the School of Law and Social Justice at the University of Liverpool. The Academic Emulator is intended to stimulate discussion about Generative AI and its potential uses and limits in academic work practices.
Background
In June 2018, the Artificial Intelligence (AI) lab OpenAI, relatively unknown at the time, made an announcement. Combining two machine learning approaches – “transformers” and “unsupervised pre-training” – it had developed a natural language processing (NLP) algorithm it called GPT, or Generative Pre-trained Transformer (OpenAI 2018).
Less than a year later, in February 2019, OpenAI announced GPT-2, the successor to GPT: a much more powerful algorithm, with many more parameters, capable of being trained on far larger datasets and of producing text that OpenAI suggested was often practically indistinguishable from that of human writers. So good was GPT-2, OpenAI claimed, and so great the risk of its misuse, that it could not release the source code to the public because it was simply “too dangerous” (OpenAI 2019a), a claim that brought the lab global public attention, if not notoriety (see, e.g., Hern 2019). By November 2019, OpenAI had retracted that assessment, releasing the model in full (OpenAI 2019b), and it was quickly taken up and experimented with by a newly receptive audience primed by the intrigue of the February press release (see, e.g., Vincent 2019).
In July 2020, following a partnership with Microsoft and a billion-dollar investment in late 2019, OpenAI unveiled GPT-3, the successor to GPT-2, built with even more parameters and trained on even larger datasets (OpenAI 2020). Eager media around the world were invited to let it write op-ed pieces for them, or mixed pieces in which readers were asked to guess which parts of the text were machine-produced and which human-produced (on the latter see, e.g., Bieber 2020). The Guardian’s GPT-3-generated op-ed, ‘A robot wrote this entire article. Are you scared yet, human?’ (Guardian 2020a), and the explanatory article which accompanied it, ‘How to edit writing by a robot: a step-by-step guide’ (Guardian 2020b), garnered perhaps the most public attention, but there were other episodes besides. One US computer scientist used GPT-3 to produce an entirely AI-generated blog under a fake name to see if anyone would notice, with one post ending up as the most-read item on the website Hacker News before the subterfuge was announced (Hao 2020, Porr 2020), while another, in Black Mirror-esque fashion, used past text messages to train a GPT-3-powered chatbot to respond like his dead fiancée as a way of managing his loss (Fagone 2021).
Despite applications such as these being dismissed as misleading theatre in many wide-ranging and penetrating critiques of the technology’s capacities (e.g., Dickson 2020, Holloway 2020, Macaulay 2020, Marcus & Davis 2020), commentators were nonetheless impressed by GPT-3, and programmers around the world began working on replicating, fine-tuning and extending it, including, importantly, for languages other than English (see, e.g., Knight 2021) and for programming languages, a development that led to the launch of Codex (OpenAI 2021). The launch of ChatGPT in November 2022 (OpenAI 2022) and subsequent iterations of OpenAI’s LLMs (e.g., OpenAI 2023) have catapulted the technology into international public consciousness, with thousands of articles, many written by the LLMs themselves, now in circulation on its uses, abuses, potentials, limitations and dangers.
The Academic Emulator
In September 2022, when we received funding for a new project using the GPT family of LLMs, we did not foresee the Generative AI explosion the launch of ChatGPT would initiate. However, we did think these technologies would have some interesting applications. Machine-learning-based AI technologies like the GPT family and its equivalents are presented as writing support tools. Trained on a corpus of someone’s writing, they are said to be able to emulate that writing and generate new text on new topics in that writer’s ‘voice’ and style. The potential applications in literary and academic writing are only beginning to be explored. Mario Klingemann, for example, emulated Jerome K. Jerome to elicit his possible thoughts on Twitter decades after his death (Klingemann 2020), while Max Braun (2022) emulated Wittgenstein’s Tractatus to generate new propositions in its style a century after it was written (see https://wittgenstein.app).
Literary and academic experiments in AI-assisted ‘digital necromancy’, the raising of the dead via Generative AI, particularly piqued our curiosity, and we wanted to find out (a) whether we could use the GPT LLMs to emulate past academics’ work convincingly and (b) whether that emulated text would have any value for contemporary academics. We did not and do not think that an algorithm can substitute for scholarly research and writing – in fact, we can reasonably expect most results to be nonsensical (see, e.g., again, Guardian 2020b, Marcus & Davis 2020). However, exploring what Generative AI can and cannot do will give us a greater sense of just how these technologies may (and may not) fit into academic practice in future.
Academic work represents an interesting domain for necromantic experiments. Although they do not ordinarily cast their work in these terms, academics are instinctive and habitual necromancers. Not only are volumes, lives and scholarly communities devoted to working out what long-departed figure x did or did not really say, or may or may not have had to say, about question y in relation to topic z; academics also conjure the dead using a wide, flexible and open-ended set of methods and technologies – archives, annotations, notes, analysis, dialogue – which they upgrade as new possibilities emerge. Nor is this solely about famous figures – what would Einstein say about climate change, or Marx? Academics themselves pass away and, when they do, their work is frequently built upon, allowing them to remain active figures in their fields posthumously. Many become more influential once deceased.
Against that background, we wanted to take cutting-edge AI technologies and assess their usefulness in a key area of contemporary academic work – writing – in an extension of our existing research in this area, which applies ethnomethodological thinking to algorithmic technologies through close practical engagement with them. In setting up our necromantic experiment, we decided to work through a test case with ChatGPT, using the final sections of Max Weber’s The Protestant Ethic and the Spirit of Capitalism, where Weber hints at the limits and hubris of a fossil fuel era, in order to evoke digital Weber’s algorithmically generated perspective on the climate crisis.
While the emulator will offer more possibilities, our initial results are outlined below:
The sociologist Max Weber had this to say about our cultural order: “This order is now bound to the technical and economic conditions of machine production which today determine the lives of all individuals who are born into this mechanism, not only those directly concerned with economic acquisition, with irresistible force. Perhaps it will so determine them until the last ton of fossilized coal is burnt. In Baxter’s view the care for external goods should only lie on the shoulders of the saint ‘like a light cloak, which can be thrown aside at any moment’. … No one knows who will live in this cage in the future, or whether at the end of this tremendous development entirely new prophets will arise, or there will be a great rebirth of old ideas and ideals or, if neither, mechanized petrification embellished with a sort of convulsive self-importance. For of the last stage of this cultural development, it might well be truly said: ‘Specialists without spirit, sensualists without heart; this nullity imagines that it has obtained a level of civilization never before achieved’.” Can you compose an epigraph about the climate crisis in the style of Max Weber?
In the crucible of urgency, the climate crisis beckons as both consequence and challenge. The insatiable engines of prosperity have conjured forth tempests of change, where the balance of nature finds itself strained against the forge of human enterprise. The flames of progress, once a beacon of hope, now cast their uncertain glow upon a world in flux. As the winds of fate shift, the choices of today lay the cornerstone upon which the edifice of tomorrow shall stand, whether as a monument to wisdom or a relic of folly.
Can you improve that epigraph?
In the relentless forge of time, the climate crisis stands as both reckoning and opportunity. The fires of industry, once heralded as mankind's triumph, now cast an uncertain glow upon a planet in tumult. As the tides of destiny ebb and flow, our choices today shape the foundation upon which the future shall be inscribed—a testament to our stewardship or an epitaph of shortsightedness.
ChatGPT’s offerings are not particularly profound, though it does capture Weber’s voice quite well. Before dismissing such outputs, however, we would note that they might serve as spurs to further thought and as ways of creatively linking past matters of concern with present ones. That may be the best use to which this technology can be put, whether in emulating the dead or the living.
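For readers curious how an exchange like the one above can be scripted rather than typed into the chat interface, the sketch below assembles a style-emulation prompt of the kind we used with Weber’s text. It is a minimal illustration under our own assumptions: the helper function, the prompt wording and the model name in the comment are ours, not part of the Academic Emulator’s actual codebase.

```python
# A minimal sketch of prompt-based style emulation: seed the model with
# an excerpt from the author's corpus, then ask it to write in that voice.
# The function and prompt wording are illustrative assumptions.

def build_emulation_prompt(source_excerpt: str, author: str, task: str) -> list:
    """Assemble a chat payload that quotes a corpus excerpt before
    asking the model to produce new text in the author's style."""
    return [
        {"role": "system",
         "content": f"You emulate the written style of {author}."},
        {"role": "user",
         "content": f"The sociologist {author} had this to say: "
                    f"\"{source_excerpt}\"\n\n{task}"},
    ]

messages = build_emulation_prompt(
    source_excerpt="This order is now bound to the technical and economic "
                   "conditions of machine production ...",
    author="Max Weber",
    task="Can you compose an epigraph about the climate crisis "
         "in the style of Max Weber?",
)

# With the openai client library installed and an API key configured,
# the payload could then be sent along these lines (model name is an
# illustrative choice):
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(model="gpt-4", messages=messages)
```

The point of separating prompt assembly from the API call is that the same excerpt-plus-task template can be reused with any author’s corpus, making iterative experiments like the ‘improve that epigraph’ follow-up easy to script.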