Latest AI Model Shows Signs of ‘Human-Level’ Intelligence: Microsoft Research

By Naveen Athrappully

OpenAI’s artificial intelligence system GPT-4 has shown “sparks of artificial general intelligence (AGI),” displaying abilities across a wide range of knowledge domains with performance that is almost “human-level,” according to a paper by Microsoft Research.

An AGI would be able to understand the world as human beings do and have a similar capacity to learn how to carry out various tasks. An early version of GPT-4 tested by Microsoft researchers showed “more general intelligence than previous AI models,” according to the March 22 paper. “Beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology, and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4’s performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT.”

“Given the breadth and depth of GPT-4’s capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.”

GPT-4 was found to be able to code at a “very high level,” both in writing code from instructions and in understanding existing code. Though the AI is “not perfect in coding yet,” since it can generate semantically incorrect or syntactically invalid code, GPT-4 is able to improve the code by responding to human feedback.

When it came to math, GPT-4 was able to express mathematical concepts and solve related problems. The AI was able to apply quantitative reasoning when faced with problems that required mathematical thinking and model building. However, GPT-4 is still “quite far from the level of experts” as it does not have the ability to conduct mathematical research, the study admitted.

A key aspect of intelligence is interactivity, the ability of an intelligent entity to communicate with and respond to feedback from other entities, environments, and tools. “Interactivity requires an agent to comprehend complex ideas, learn quickly, and learn from experience, and thus it is closely tied to our definition of intelligence,” the paper said.

GPT-4 was able to use tools like APIs or search engines to overcome some of its limitations. It employed tools with “minimal instruction” and generated output “appropriately.” In one search result containing potentially conflicting information, the AI was still able to infer the right answer.

“GPT-4 is capable of both identifying and using external tools on its own in order to improve its performance. It is able to reason about which tools it needs, effectively parse the output of these tools and respond appropriately (i.e., interact with them appropriately), all without any specialized training or fine-tuning,” the study stated.

Integrative Abilities

GPT-4 demonstrated the ability to combine skills and concepts from several knowledge domains fluidly, showcasing “an impressive comprehension of complex ideas.”

For example, GPT-4 was able to combine art and programming by producing code that generated random images in the style of the painter Kandinsky. Then, in the literary style of Shakespeare, it produced a proof that there are infinitely many prime numbers.
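
The classical argument that the primes never run out, usually credited to Euclid, is short enough to check by computer. The sketch below is only an illustration of that argument and is not the proof or code GPT-4 produced in the paper; the function names are hypothetical. Given any finite list of primes, it multiplies them together, adds one, and returns a prime factor of the result, which cannot be any prime already on the list.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found, so n itself is prime


def new_prime_outside(primes: list[int]) -> int:
    """Euclid's step: given a finite list of primes, return a prime not in it.

    The product of the list plus one leaves remainder 1 when divided by any
    prime on the list, so every prime factor of it must be a new prime.
    """
    candidate = prod(primes) + 1
    return smallest_prime_factor(candidate)


if __name__ == "__main__":
    known = [2, 3, 5, 7, 11, 13]
    p = new_prime_outside(known)
    # 2*3*5*7*11*13 + 1 = 30031 = 59 * 509
    print(p, p not in known)  # prints: 59 True
```

Because the step works for every finite list, no finite list can contain all the primes.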

The AI combined knowledge of history and physics by writing a letter supporting an electron as a potential U.S. presidential candidate. The letter was written in the voice of Mahatma Gandhi and addressed to his wife.

GPT-4 also coded a program that accepts an individual’s age, sex, weight, height, and blood test results to indicate whether the person is at higher risk of diabetes.
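
A program of the kind described might look something like the following. This is a minimal sketch and not the code GPT-4 wrote; the article does not give the rules that program applied. The field names, the choice of two blood tests (fasting glucose and HbA1c), and the decision rule are all assumptions made for illustration. The cutoffs follow commonly published screening thresholds (fasting glucose of 100 mg/dL, HbA1c of 5.7 percent, BMI of 25, age 45) and are not a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int                # years
    sex: str                # accepted as input; this simple sketch does not use it
    weight_kg: float
    height_m: float
    fasting_glucose: float  # mg/dL (assumed blood test)
    hba1c: float            # percent (assumed blood test)


def bmi(p: Patient) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return p.weight_kg / p.height_m ** 2


def risk_flags(p: Patient) -> list[str]:
    """Collect the illustrative risk factors that apply to this patient."""
    flags = []
    if bmi(p) >= 25:
        flags.append("BMI of 25 or above")
    if p.age >= 45:
        flags.append("age 45 or above")
    if p.fasting_glucose >= 100:
        flags.append("fasting glucose of 100 mg/dL or above")
    if p.hba1c >= 5.7:
        flags.append("HbA1c of 5.7 percent or above")
    return flags


def higher_risk(p: Patient) -> bool:
    """Hypothetical rule: an elevated blood test, or both non-blood factors."""
    blood = p.fasting_glucose >= 100 or p.hba1c >= 5.7
    lifestyle = bmi(p) >= 25 and p.age >= 45
    return blood or lifestyle


if __name__ == "__main__":
    patient = Patient(age=52, sex="F", weight_kg=82, height_m=1.65,
                      fasting_glucose=108, hba1c=5.9)
    print(higher_risk(patient))
    for flag in risk_flags(patient):
        print("-", flag)
```

Returning the individual flags alongside the yes-or-no answer lets a reader see why a person was marked as higher risk.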

“These examples suggest that GPT-4 has not only learned some general principles and patterns of different domains and styles but can also synthesize them in creative and novel ways,” the study said.

Theory of Mind

Researchers conducted multiple tests to evaluate the “Theory of Mind” capabilities of GPT-4. In psychology, Theory of Mind refers to the ability of an entity to ascribe mental states to individuals and to use those states to explain and predict the actions of those individuals.

In the field of artificial intelligence, an AI equipped with Theory of Mind capabilities will be better able to understand the humans it interacts with.

In one test, GPT-4 was found to infer the mental states of multiple characters as well as discern misunderstandings and miscommunication.

GPT-4 outperformed two other AI models in “both basic and realistic scenarios that require reasoning about the mental states of others, and in proposing actions for cooperation towards common goals in social situations.”

“Our findings suggest that GPT-4 has a very advanced level of theory of mind. While ChatGPT also does well on the basic tests, it seems that GPT-4 has more nuance and is able to reason better about multiple actors, and how various actions might impact their mental states, especially on more realistic scenarios,” the study said.

Hyped Up?

Despite detailing the impressive abilities of GPT-4 and claiming that it could be viewed as an early version of AGI, the researchers admit in the paper that their approach to testing GPT-4 was “somewhat subjective and informal, and that it may not satisfy the rigorous standards of scientific evaluation.”

In a March 23 tweet, Michael Timothy Bennett, an AGI researcher who works at the Australian National University, said that “there’s a reason they’re not submitting their findings to something like the AGI conference.”

“The authors point out in the paper that GPT-4 does not satisfy what I would consider the only compelling notion of generally intelligent; the ability to generalize from limited information,” he said. “GPT-4 just doesn’t satisfy that or any other definition of AGI I would consider meaningful.”

“I am particularly suspicious in this case because the claim comes bundled with commercial interests. Wouldn’t you agree that it is at least plausible that Microsoft is using this preprint to promote its products and keep anyone from paying much attention to the competition? That perhaps the paper is entirely motivated by that end?”

Human Advancement or Technological Competition?

Many experts have lately been sounding alarms about the rapid progress of AI. Geoffrey Hinton, the computer scientist called the “Godfather of AI,” recently left his prestigious position at Google to speak out against the technology, something he said he could not do while working for the company.

“It is hard to see how you can prevent the bad actors from using it for bad things,” Hinton told The New York Times in an interview.

Hinton warned that competition between big tech companies over AI advancements could spiral out of control, resulting in widespread damage. As AI develops, he said, its unprecedented ability to create content such as images and text will reach a point where an average individual will not be able to distinguish “what is true anymore.”

Looking further ahead, Hinton warned that AI could replace humans in many tasks and could be used to create fully autonomous weapons.

“The idea that this stuff could actually get smarter than people—a few people believed that,” Hinton said. “But most people thought it was way off. And I thought it was way off. I thought it was 30 to 50 years or even longer away. Obviously, I no longer think that.”

Meanwhile, Google’s parent company Alphabet is combining two of its AI research units—DeepMind and Google Brain—in a bid to “significantly accelerate” the company’s progress in the field.

“Combining all this talent into one focused team, backed by the computational resources of Google, will significantly accelerate our progress in AI,” said Alphabet CEO Sundar Pichai.

In February, Alphabet launched its Bard AI chatbot to compete with OpenAI’s ChatGPT. OpenAI, which is funded by Microsoft, also powers the Bing search engine.