Large Language Models and the Quest for Artificial General Intelligence

“Tracking progress is getting increasingly hard, because progress is accelerating. This progress is unlocking things critical to economic and national security – and if you don’t skim [papers] each day, you will miss important trends that your rivals will notice and exploit.” Jack Clark, Co-founder of Anthropic, former Policy Director at OpenAI

Nowadays, I read and watch anything I can find related to large language models and, of course, ChatGPT. For me, the improvements in the model are fascinating. I use it daily for my work: it corrects my code, converts one programming language to another, and is amazing for content creation. Before ChatGPT, I knew about chatbots, of course, but we all know they were at a different level, and I did not follow the literature on large language models. For this reason, I did not see this revolution coming. I call the new large language models a “revolution” because I believe they will change how we do our jobs and create new ones. With these kinds of technologies, we always need to consider the hype factor, so I am trying to stay cool about this, but it is so hard for my inner nerd. So today, I allow myself to be a dreamer and think about the distant future of large language models.

Let’s start with the basics: what is artificial intelligence (AI)?

It is a field of computer science dedicated to creating systems capable of performing tasks that typically require human intelligence. These tasks include learning from experience, recognizing patterns, understanding natural language, making decisions, and more. Almost everything built under the name of artificial intelligence today is Narrow AI: a system designed for one task or a limited set of tasks. But there is more. In science fiction movies, we see examples of General AI and Super Intelligent AI. General AI can understand, learn, and apply knowledge across a wide range of tasks at a level equal to or beyond that of a human being. Super Intelligent AI is a hypothetical future form of AI that does not just mimic or understand human intelligence but significantly surpasses it, potentially leading to rapid, unprecedented advancements.

Some people think that we might enter the General AI era with the advancements of large language models. When I read and listened to their ideas, I thought, OK, but how would I know if we had entered that era? Several tests have been proposed to evaluate the intelligence level of models. Below, I will explain a few of them.

  • Winograd Schema Challenge: This is a test of machine intelligence proposed by Hector Levesque, a computer scientist. The challenge consists of a multiple-choice questionnaire based on Winograd schemas, which are sentences with ambiguous pronouns. The correct interpretation of the pronoun depends on understanding the sentence, which requires common sense reasoning. For example, “The trophy would not fit in the suitcase because it was too big.” What does “it” refer to in this sentence? Answering correctly requires real-world knowledge: a trophy goes inside a suitcase, so if something is too big to fit, it must be the trophy.
  • Coffee Test: This test is commonly attributed to Apple co-founder Steve Wozniak. An AI agent is tasked with entering an average American home and figuring out how to make coffee, including identifying the coffee machine, determining what the buttons do, finding the coffee in the pantry, etc. It tests the AI’s ability to navigate a complex, unfamiliar environment and complete a multi-step task it has not been specifically programmed to do, thereby demonstrating generalized intelligence.
  • The Robot College Student Test: Commonly attributed to AI researcher Ben Goertzel. In this test, a robot would need to enroll in a university, attend classes, take exams, and graduate just like a human student.
  • The Employment Test: In this scenario, an AI would be capable of performing any job as well as a human can. 
  • The Immigration Test: This hypothetical test, imagined by AI researcher Roman Yampolskiy, entails an AI successfully impersonating a human to pass a border patrol interview, thus testing its ability to understand and emulate human behavior and norms.
  • The Turing Test, Extended: In the original Turing Test, one of the most famous AI tests, a human judge holds a text conversation with an AI. The AI passes if the judge cannot distinguish it from a human. An extended version of the test would involve the AI interpreting and generating speech, facial expressions, body language, and other human communicative signals, not just text-based chat.
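To make the first test in the list more concrete, here is a minimal Python sketch of how a Winograd schema could be stored and scored as a multiple-choice question. This is only an illustration: the names WinogradSchema, first_candidate_baseline, and accuracy are my own hypothetical choices, not part of any official benchmark code, and the second sentence (“too small”) is the usual twin of the trophy example, added here as an assumption to show how one changed word flips the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WinogradSchema:
    """One multiple-choice item: a sentence, its ambiguous pronoun, two candidates."""
    sentence: str
    pronoun: str
    candidates: tuple
    answer: str


# Twin sentences: changing a single word flips the correct referent.
SCHEMAS = [
    WinogradSchema(
        sentence="The trophy would not fit in the suitcase because it was too big.",
        pronoun="it",
        candidates=("the trophy", "the suitcase"),
        answer="the trophy",
    ),
    WinogradSchema(
        sentence="The trophy would not fit in the suitcase because it was too small.",
        pronoun="it",
        candidates=("the trophy", "the suitcase"),
        answer="the suitcase",
    ),
]


def first_candidate_baseline(schema):
    """A deliberately naive resolver: always pick the first noun mentioned."""
    return schema.candidates[0]


def accuracy(resolver, schemas):
    """Fraction of schemas for which the resolver picks the right referent."""
    correct = sum(resolver(s) == s.answer for s in schemas)
    return correct / len(schemas)


if __name__ == "__main__":
    print(accuracy(first_candidate_baseline, SCHEMAS))
```

The naive resolver gets exactly one of the two twin sentences right. That is the point of pairing them: a system that guesses by word position, without knowing how trophies and suitcases relate, cannot do better than chance.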

These tests are all good, but what happens next? What is the endgame here? This is where we encounter another term: the singularity. The singularity is a theoretical point in the future when technological growth, particularly in artificial intelligence, becomes uncontrollable and irreversible, leading to unforeseeable changes to human civilization. The idea rests on the premise that future AI will be able to improve its own design and capabilities, leading to a rapid, exponential increase in intelligence, a phenomenon also referred to as an “intelligence explosion.” As a result, this super-intelligent AI would far surpass all human intelligence.

Thankfully, we are not there yet. While large language models like ChatGPT are impressive and have significantly advanced the field of natural language processing, they are still far from the kind of general intelligence or self-improvement ability that is often discussed in scenarios leading to a technological singularity. A large language model such as GPT-3 or GPT-4 generates responses based on patterns learned during training on a vast corpus of text data. It does not truly understand the text as humans do, nor can it learn or think independently or improve its own algorithms.
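To get a feel for what “generating responses based on patterns learned during training” means, here is a toy Python sketch. It is not how GPT-3 or GPT-4 work: those are large neural networks, while this is a simple bigram counter that I am using as a stand-in. The function names train_bigrams and generate and the tiny corpus are hypothetical, made up for this illustration.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """'Training': record which word follows which in the corpus."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Generate up to `length` words by sampling a word seen after the previous one."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = [start]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # no pattern learned for this word, so stop
            break
        out.append(rng.choice(options))
    return " ".join(out)


# A made-up miniature corpus (assumption for illustration only).
CORPUS = "the trophy is big . the suitcase is small . the trophy is gold ."
model = train_bigrams(CORPUS)
sample = generate(model, "the", 6)

if __name__ == "__main__":
    print(sample)
```

Every pair of neighboring words in the output already appeared somewhere in the corpus. The program can produce a sentence like “the suitcase is gold” without any idea of what a suitcase is, which is a very small-scale version of the gap between producing fluent text and understanding it.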

It is also important to note that AI models, including language models, are tools created and managed by humans. They do not have desires, goals, or the ability to make decisions outside of their programmed responses. Therefore, the concept of AI developing the ability to self-improve or to set its own objectives independently of human intervention (critical attributes for a technological singularity) remains firmly in the realm of speculation and theoretical discussion rather than current reality.

Even though we are still at the Narrow AI level, it is better to be safe than sorry. The development and use of AI technologies raise important questions about ethics, privacy, security, employment, and societal change, all of which are active areas of discussion and research. We will see together what lawmakers decide and what the social effects of large language models turn out to be.