People may disagree on whether ChatGPT is truly intelligent. Still, most will agree that ChatGPT displays many capabilities we would consider intelligent if a human displayed them. Most of my web queries now start with a ChatGPT-like system; it has become my go-to place for Internet searches. Like everyone else, I have marveled at ChatGPT’s ability to recall millions of facts instantaneously, construct grammatically correct and meaningful sentences, put sentences in the proper order, keep track of context, and much more. In many of my examples, ChatGPT seemed to perform logical reasoning, as if it knew how to make deductions. It also seemed to understand the meanings of the words it processed. Yet ChatGPT’s core competency is described as predicting the most likely next words in a sequence, given the earlier part of that sequence. All of us have wondered whether ChatGPT’s capabilities can emanate solely from this ability to predict the next word. We want to know, for example, whether ChatGPT knows the meanings of the words it processes.
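For readers who like to see the idea in code, here is a toy sketch of next-word prediction. It is emphatically not how ChatGPT works internally (ChatGPT relies on large neural networks trained on enormous text collections); the tiny corpus and the predict_next helper below are invented purely to illustrate what it means to predict likely continuations from counts of observed word sequences.

```python
from collections import defaultdict, Counter

# A tiny made-up corpus; real models are trained on vast amounts of text.
corpus = [
    "the chef baked a chocolate cake",
    "the chef added sugar to the cake",
    "the chef baked a chocolate cookie",
]

# Count, for each word, which words follow it (a simple bigram model).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def predict_next(word, k=3):
    """Return the k most likely words to follow `word`, with probabilities."""
    counts = next_word_counts[word]
    total = sum(counts.values())
    return [(w, round(c / total, 2)) for w, c in counts.most_common(k)]

print(predict_next("chef"))       # [('baked', 0.67), ('added', 0.33)]
print(predict_next("chocolate"))  # [('cake', 0.5), ('cookie', 0.5)]
```

Even this crude counter “knows” that ‘baked’ is a likelier continuation of ‘chef’ than ‘added’, without knowing what a chef is.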
A little incident I was part of more than two decades ago showed me something about whether search engines understood the meanings of the words they processed. In the late 1990s, search engines were the cutting-edge technology, and whether they understood the meanings of words was a matter of inquiry in those days. The incident I am about to describe made it clear to me that, given the core search engine algorithm, search engines did not have to understand the meanings of words to find relevant documents. The search engine algorithm performs basic set-theoretic operations, such as counting the number and frequencies of the words common to the documents and the queries. The incident illustrated that, given how we use human languages, an algorithm employing these simple operations was bound to find documents relevant to the queries, whether or not it knew the meanings of the words. The incident also explained the reason behind this inevitability.
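As a rough sketch of that core operation, and not of any particular engine’s actual ranking algorithm, the following toy code scores each document by counting how many of the query’s words occur in it; the documents, query, and scoring are all invented for illustration.

```python
# A minimal sketch of overlap-based retrieval: score each document by how
# often the query's words occur in it.  Real engines add refinements such
# as TF-IDF weighting and link analysis, but the core is still counting
# shared words; no notion of meaning is involved.

documents = {
    "doc1": "the chef baked a chocolate cake with sugar and flour",
    "doc2": "the mechanic repaired the engine of the old car",
    "doc3": "a pastry chef decorates a cake with sugar icing",
}

def score(query, document):
    query_words = set(query.lower().split())
    # Count how many words of the document belong to the query's word set.
    return sum(1 for w in document.lower().split() if w in query_words)

query = "chef baked cake sugar"
ranking = sorted(documents, key=lambda d: score(query, documents[d]), reverse=True)
for doc_id in ranking:
    print(doc_id, score(query, documents[doc_id]))
# doc1 4, doc3 3, doc2 0
```

The top-ranked document is the one about baking a cake, even though nothing in the code represents what ‘cake’ or ‘chef’ means.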
The argument I used two decades ago to explain the behavior of search engines extends to generative language models used by ChatGPT-like systems. However, this article’s purpose is not to claim that search engines and language models are not intelligent. I only aim to explore the source of their intelligence and how to enhance it.
In the late 1990s, search engines were in their infancy, and people were slowly getting used to them. At that time, my daughter was in elementary school. While driving around, my daughter and I often made up games to keep ourselves entertained. One such afternoon, I turned a small exercise from my own elementary-school days into a game for us. I gave my daughter an English word and asked her to give me back a sentence that used that word. Ours was a good-faith game; we were not trying to trick each other. Nothing surprising happened at first: my daughter did the task well, and the game soon got boring. Hoping to make it slightly more challenging, I gave her two words to knit into one sentence. But soon, the game became dull again. So I tried the three-word version of the game. And then four. When I upped the game to four words, I noticed something that surprised me, a mini eureka moment: with each additional word, the game was getting easier for my daughter, not harder. More importantly, the sentences she made from my words were nearly the same as the ones in my mind when I gave her the words. I was familiar with search engine technology and with how natural languages work. Still, the fact that the game got easier with more words, and that my daughter’s sentences became nearly the same as mine, caught me off guard.
The purpose of writing in natural language is to capture known situations in words. We can say that the meanings described by collections of words are nothing but the scenes or scenarios we see or sense around us. A picture is said to be worth a thousand words, and we use thousands of words to describe the images our minds observe and store. All humans on the planet collectively know billions of such scenes. The natural-language descriptions that people create are like catalogs or index entries to the scenes they have observed. The scenes of individual interest stay in people’s memories, diaries, and letters. The scenes of common interest become parts of books and articles. We find them using search engines.
My elementary-school-age daughter and I probably shared knowledge of a few thousand scenes between us. Any four words I picked from my memory identified one of those shared scenes. For my daughter, constructing a sentence from the words I gave her was nothing but a simple index lookup: she found the shared scene and then used her knowledge of grammar to build a sentence around it. Arguably, any person, not just my daughter, would write nearly the same sentence given those words, provided they also knew the shared scene.
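The index-lookup intuition can be made concrete with a toy inverted index. The scenes and their word lists below are invented; the point is only that each additional clue word intersects the candidate set and quickly narrows it down to a single shared scene.

```python
# Toy "shared scenes", each described by a handful of words (invented).
scenes = {
    "birthday party": {"chef", "bake", "cake", "sugar", "candles", "chocolate"},
    "breakfast":      {"chef", "eggs", "toast", "sugar", "coffee"},
    "camping trip":   {"tent", "fire", "lake", "marshmallow", "sugar"},
}

# Build an inverted index: word -> set of scenes whose description uses it.
index = {}
for scene, words in scenes.items():
    for word in words:
        index.setdefault(word, set()).add(scene)

# Each additional clue word intersects the candidate set, shrinking it.
candidates = set(scenes)
for word in ["sugar", "chef", "bake", "cake"]:
    candidates &= index.get(word, set())
    print(word, "->", candidates)
# sugar -> all three scenes
# chef  -> {'birthday party', 'breakfast'}
# bake  -> {'birthday party'}
# cake  -> {'birthday party'}
```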
This phenomenon happens on a global scale in the world of search engines. When authors write documents, they describe scenes of human interest using people’s shared concepts and vocabularies. When search engine users type queries using those shared vocabularies, the query words act like index entries into the documents written about those scenes. Mostly, there is no other way to describe the shared scenes except with those words. The set of words in a query freezes the set of target documents; the exact meaning of the words is abstracted away and plays no further role. The search engine neither knows nor needs to know the meaning. This holds as long as the authors and the searchers share the same language, vocabulary, meanings, and scenes, and agree not to trick each other, just as my daughter and I did.
While the incident above helped me understand a few things about search engines, I did not learn as much from it at the time as I should have. I was not visionary enough to foresee that the same argument could lay the foundation for the language models that would arrive two decades later. I failed to notice that my daughter did not merely retrieve the scenes shared between her and me. She also acted like ChatGPT. She picked the correct grammatical structures to form her sentences and filled in additional details by adding new words beyond the four I gave her. For example, if I gave her the words bake, cake, chef, and sugar, she added ‘chocolate’ to the sentence she returned to me. At the time, I failed to notice that she was likelier to add the word ‘chocolate’ to her sentences than ‘Funfetti’, a lesser-known variety of cake. Humans use probabilistic models in their reasoning. I failed to conjecture that computer models could do the same if they knew many probabilities relating many words. Computer models could generate some (though not all) sentences if they knew enough probabilities and some grammar.
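That probabilistic filling-in can also be sketched in code. The probabilities below are made up, not measured from any corpus; the sketch merely shows how a model that knows co-occurrence probabilities and a bit of grammar (here, a fixed sentence template) would produce a ‘chocolate’ sentence far more often than a ‘Funfetti’ one.

```python
import random

# Invented probabilities of detail words appearing alongside the clue words
# {chef, bake, cake, sugar}.  A real model would estimate these from huge
# amounts of text; the numbers here are made up for illustration.
detail_given_clues = {
    "chocolate": 0.55,
    "vanilla":   0.30,
    "carrot":    0.12,
    "Funfetti":  0.03,
}

def add_detail(probabilities):
    """Sample one detail word in proportion to its probability."""
    words = list(probabilities)
    weights = list(probabilities.values())
    return random.choices(words, weights=weights, k=1)[0]

# A grammar template plus a sampled detail word yields a plausible sentence.
detail = add_detail(detail_given_clues)
print(f"The chef baked a {detail} cake with plenty of sugar.")
# Prints a 'chocolate' sentence about 55% of the time, a 'Funfetti' one only 3%.
```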
The examples and arguments here do not imply that meaning is not essential for human beings or computer systems. Far from it. They only illustrate that some algorithms can behave as if they know the meanings of words and paragraphs without actually knowing them.
We are now led back to our original question: does ChatGPT know the meanings of the words it processes? Do LLMs discover meanings? Human learning uses probabilities: we know what is likely to occur with what else, and what is not. This kind of knowledge constantly guides us through the world we live in. Let us grant that probabilities constitute one component of meaning. If so, we must acknowledge that large language models have discovered that component of meaning. Meaning, in the context of cognition, is a far more complex subject than our casual conversations imply. That discussion, for now, is beyond the scope of this article.