Disclaimer: If you have checked my homepage, you know that I work for Microsoft. And part of what I work on are the Azure AI services that include the Azure OpenAI service.
So here we are, April 12th, 2023, a few days after Easter, and the discussion about large language models rages on, across science, newspapers, social media, podcasts and talk shows in no particular order and with limited knowledge and lots of opinions and some behind-the-curtain advocacy and evangelism in all directions. Opinions are mostly themed TEOTWAWKI (The End Of The World As We Know It, cue REM ) and come in both flavors, apocalyptic and visionary, and colored by nationality, socio-economic status and political point-of-view.
Isn’t this a great starting point to provide a few prompts for ChatGPT?
Provide a teaser for an apocalyptic opinion piece on large language models from an European left-wing commentator.
“Lost in the Language Maze: How the Rise of AI-Powered Models Threatens Democratic Discourse and Reinforces Inequality – A Left-Wing Perspective on the Apocalypse of Big Language Models.”
Provide a teaser for a visionary opinion piece on large language models from an Asian nationalist pro-economy commentator.
“Breaking the Western Monopoly: How Large Language Models are Empowering Asia’s Economic Sovereignty”
You get the gist. With these teasers, you don’t even have to read the articles anymore. Try some for yourself at chatgpt.openai.com
The other type of opinion piece of a more alarmist fashion comes from “triggering” the LLM to produce outrageous statements, then turning around and pretending to be so “shocked” (Cue Casablanca) that the model wrote something outrageous. As you can see above, OpenAI’s ChatGPT admitted that it is an evil hacker genius.
So here is my attempt to add some sanity to the debate and throw in a few links for the ones that want to dig deeper. I do not claim to be an expert in LLMs but I’ve been around for a while in the world of AI and I have seen a few things, both working in research labs and hitting the real world. And this is my personal take, not the opinion of my employer, my family, my friends or my dog. And I’ve added the day I wrote this post, as things are moving fast.
So first, as you probably all know already, what is the basic capability that a generative pre-trained transformer aka GPT has? It is basically guessing the best next word to continue a given text. While this sounds trivial at first (The sky was __), the ability to take into account longer text makes it powerful. (It was a dark and stormy night. The sky was __). How does this work? The models are trained on huge databases of raw text, typically harvested from the public part of the internet. The models are trained on sequences of questions and answers, e.g. from public FAQs or chats on public bulletin boards, so they “know” how to answer questions. The input texts are multi-lingual, e.g., based on Wikipedia, so the models “know” different languages and can even translate texts between languages. Some models are trained on source code, so they “know” how to program and since source code often contains comments on the purpose of the code, models “know” how to produce source code from written instructions.
Why can raw text from the internet be used for training? As the complete texts were available, it is known how the text continues, so the training data is generated by cutting the texts into pieces and training the model on the text start as input and the continuation as correct answer. If you want to see the training process from start to finish, take a look at this lecture by Andrej Karpathy where he shows how to train a GPT to produce Shakespeare-like texts.
You may have spotted the word “guessing” in the paragraph above. A GPT does not store the original texts it was trained on, it stores the probabilities of the next words in the form of weights in a neural network. In a way, this makes such a model more powerful than a database of texts since now, it can provide a continuation of texts it has not been trained on. But being a probabilistic model, it can also be wrong. So how can it be useful?
LLMs in general are great at producing language. Given a start text, the models can produce texts that are grammatically correct, that are consistent and that even contain valid arguments and limited reasoning. This is somewhat surprising since the input was “just” text.
LLMs are not great at (re-)producing facts. We all know from experience with public speakers that being eloquent but not knowing (or ignoring) facts can produce convincing speeches, so the combination of weak factual knowledge with eloquent text production produces the so-called hallucinations that the LLMs are famous for. On the other hand, if you add facts to the input text for a model in the proper way, LLMs seldomly produce factual errors in the output.
Some types of facts do not work well with models based on language, e.g. math. Just imagine how often you will find the “equations” 2+2=4 and 2+2=5 on the open internet. If the second sequence of numbers is used more often, the model might think this is the correct way of doing math. But as long as you stay clear of these areas, models can produce useful output when prompted in the right way.
Which brings us to the craft of “Prompt Engineering“, which is the human ability to produce start texts for models that allow the model to consistently produce texts for certain use cases. And I use the word “craft” here since prompt engineering is neither an art form nor a science when the goal is to produce consistent results. This is not to say that prompt engineering cannot be artistic (in order to produce surprising outputs) or scientific (to understand the capabilities and limitations of models). Also, prompt engineering can be faulty or even malicious. As the models just work on sequences of words, there is nothing special in the fact that a prompt appears in the beginning of a sequence, a new prompt appearing later can supersede the instructions of the initial prompt. That is the “magic” behind “attacks” on services using large language models: If a user-provided input text is immediately added to the text sent to the model, the model can interpret it as a new prompt, changing the subsequent behavior of the model. The models also have a limited number of input words, if the user-provided input text can be of arbitrary length, it can create a situation where the initial prompt does not fit into the input anymore and is therefore never sent to the model. And then the model goes “off the rails” and just produces some text independent of the prompt.
This “prompt injection” can come from several sources. The simplest form is a LLM-based chat service where the prompt can just be typed in. The chat services attempt to identify malicious input, but this detection is not always successful. But prompts can also be injected into other LLM input, such as database content or web search results that are processed by LLMs to produce text.
With the information above, please re-read all the alarmist articles you read in the last weeks about “hacked LLMs” and try to guess what has really happened.
Of course, there are many beneficial ways on how prompt engineering can extend the capabilities of large language models and chat services. The most simple and sometimes funny way is to change the personality of a chat service, comparable to my evil hacker prompt. Prompts can instruct chat services to change their language and tone, e.g. provide longer or shorter explanations, provide instructions in a certain way, e.g, as a sequence of steps, provide recommendations for certain situations etc. These prompts do not change the knowledge of the language model, you can’t say “you’re an expert on quantum gravity” to turn a chat service into a physicist if the knowledge is not contained in the language model, and, for most specialized areas of knowledge, it won’t. Instead, the model will start hallucinating things that sound legit but aren’t. As if you would learn physics from the Big Bang Theory. Actually, the lecture is not so bad, but Jim Parsons is not a scientist.
You can extend the knowledge of the language model temporarily by adding facts to the prompt and the model can take this information into account when providing answers. For example, if you add a current weather report (“Today, the weather is sunny, the maximum temperature will be 12 Degrees Celsius“) to the prompt, the model can take the weather into account when answering a questions such as “Do I need an umbrella when I go out?“. Without the weather report, the model would just give generic answers “To determine if you need an umbrella when you go out, you should check the weather forecast for your location using a reliable source, such as a weather website or a weather app on your phone.”. With the information, it responds “Based on the information provided, it appears that the weather is sunny with a maximum temperature of 12 degrees Celsius. Typically, sunny weather does not require an umbrella as it indicates clear skies with no precipitation.”
Why is this temporarily? The way the knowledge is provided is via a prompt. But the model itself does not change by being prompted, it just continues a text, taking the prompt into account. A LLM based service may store your prompt, e.g. for monitoring the performance of the model or for providing support to you when something goes wrong. It may also store input/output pairs together with user feedback (good/bad answer) to improve the service or even a next-generation version of the model. But the model itself is static, it does not learn from user input. So in the end, it depends on the service (and its terms of service) if your input is stored or used in any way. Don’t blame the model or the technology itself.
With this information, please re-read all alarmist articles about LLM services stealing user data to learn from it and incorporate confidential information into their models and guess what is really going on.
I could now add another paragraph on the size of the models, the cost of training and the energy needed for training and for using the models. I will save this topic for another time.
This blog does not have comments enabled, but if you feel something is missing or wrong, please ping me via Linkedin or Mastodon.