On Language And Intelligence

A revolution is taking place, but we do not yet seem to realize it.
Paradigm-shifting technologies often produce an abrupt transition when they get adopted. However, that transition is not easy to recognize early on: the effects of an exponential trend appear linear at the beginning, so the explosive force of the transition that occurs a little later takes many by surprise.
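To make the remark about exponential trends concrete, here is a minimal Python sketch. It compares exp(t/tau) with its first-order approximation 1 + t/tau. The growth time scale `tau` and the sample times are arbitrary values chosen for illustration; they do not describe any real technology.

```python
import math

def exponential(t, tau):
    """Exponential trend with characteristic time tau (an illustrative parameter)."""
    return math.exp(t / tau)

def linear_approx(t, tau):
    """First-order Taylor expansion of exp(t / tau) around t = 0."""
    return 1.0 + t / tau

def relative_gap(t, tau):
    """Fraction of the exponential's value that the straight line fails to capture."""
    exact = exponential(t, tau)
    return (exact - linear_approx(t, tau)) / exact

if __name__ == "__main__":
    tau = 1.0  # arbitrary time scale, assumed for this sketch
    for t in (0.1, 0.5, 1.0, 3.0, 5.0):
        print(f"t = {t:3.1f}  exp = {exponential(t, tau):8.2f}  "
              f"line = {linear_approx(t, tau):5.2f}  "
              f"gap = {relative_gap(t, tau):6.1%}")
```

For times much shorter than `tau` the two curves are practically indistinguishable; a few time scales later the straight line captures only a small fraction of the exponential.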

Let us look at the state of development of large language models. This is a relatively new technology that is powered by recent advances in machine learning - in particular, the capability we have acquired to train very large neural networks tasked with producing meaningful text in answer to arbitrarily complex questions. The networks that perform this task today have billions or trillions of parameters, whose values are learned by processing huge datasets of text mined from the internet. Training these models takes enormous amounts of computing power, and correspondingly large amounts of money (of the order of 10 million dollars per training run, and up).

The scaling up of the size of these large language models, in particular OpenAI's GPT-3 and GPT-4, has led to what looks like a phase transition in their performance. Yet we have grown accustomed to treating artificial intelligence developments with contempt: every time something new comes up which in the past used to be considered a distant, hard-to-achieve target, we react with a shrug of the shoulders. Self-driving cars? Just a dumb neural network trained with lots of images. Speech recognition? Nothing but a mathematical transformation of sound into time series. Computers beating humans at chess and Go? Only an effect of CPU scaling. We continue to raise the bar, and keep claiming that artificial intelligence is "something else", which is yet to come. But is it?

I am not a true expert in artificial intelligence - I am a physicist, for goodness' sake! But I do work with complex machine learning systems, and I have been an observer of the field for several decades now. So I feel entitled to tell you what I think about the matter. What I see is that the sensation produced by the recently released ChatGPT models lies mostly in the wealth of applications that these tools have, and in their game-changing effect on our society; but we should look further into it.

The potential dangers of unrestrained, uncontrolled use of the new technology are a real concern, which has prompted the open letter by the Future of Life Institute arguing for a six-month pause on the development of these models. It does seem a reasonable course of action to wait before developing even more powerful language models, and to use the time to assess the situation and create a system of checks and balances that prevents damage to crucial elements of our civilization: in particular, the exploitation of these AI technologies might result in the manipulation and reshaping of public opinion, for the purpose of gaining political control. But there are also other potential threats.

If you have never had a conversation with ChatGPT I suggest that you try it out for yourself. The system is capable not only of correctly interpreting quite complex questions, but also of producing text and answers of very high quality. After a while, it feels like you are really talking to a sentient being. Now, we must be careful here - of course, we cannot call "sentient" a computer program that puts together words according to mathematical recipes, can we? And by the way, it is not difficult to get ChatGPT to produce false statements, or completely made-up references. But the same can happen when we talk with other humans!

I have started to use ChatGPT as a companion in my studies, a better, smarter, faster, more powerful version of Google. Yesterday I tested it by formalizing in seven lines of text a problem that would probably have taken twenty minutes to explain precisely to a colleague - those seven lines of text were quite thick with math, written as you would write math in an email ("Consider a likelihood ratio of Poisson measurements, R= L_1(Poisson(N_i|mu_i,1)) / L_0(Poisson(N_i|mu_i,0)) where i runs on a set of observed counts ...."). Well, ChatGPT not only provided me with a correct answer to my question, but it also used the same kind of language in its answer; and when I asked it to produce code that performed the operations leading to the solution of my problem, it did so flawlessly. Of course you have to be careful when using these outputs: there is absolutely no guarantee that the programs or the answers will be correct. But you cannot say that about the answer of a colleague, either!
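For readers unfamiliar with the quantity mentioned in that prompt, here is a minimal Python sketch of a Poisson likelihood ratio. It is not the problem I posed, which is not spelled out here, nor the code ChatGPT produced: the function names and the example counts and expected means in the usage note are hypothetical, chosen only to show the kind of calculation involved. For independent counts N_i with expected means mu_i,1 and mu_i,0 under the two hypotheses, the logarithm of R reduces to the sum over i of N_i ln(mu_i,1 / mu_i,0) - (mu_i,1 - mu_i,0), because the factorial terms cancel.

```python
import math

def poisson_log_pmf(n, mu):
    """Log of the Poisson probability of observing n counts given mean mu > 0."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)

def log_likelihood_ratio(counts, mu_alt, mu_null):
    """ln R for independent Poisson counts under two hypotheses.

    counts  : observed counts N_i
    mu_alt  : expected means mu_i,1 (numerator hypothesis)
    mu_null : expected means mu_i,0 (denominator hypothesis)
    All three arguments are illustrative inputs assumed for this sketch.
    """
    if not (len(counts) == len(mu_alt) == len(mu_null)):
        raise ValueError("all inputs must have the same length")
    return sum(
        poisson_log_pmf(n, m1) - poisson_log_pmf(n, m0)
        for n, m1, m0 in zip(counts, mu_alt, mu_null)
    )
```

As a usage example with made-up numbers, `log_likelihood_ratio([12, 7, 9], [12.0, 9.0, 10.0], [10.0, 8.0, 8.0])` returns a positive value when the counts favor the numerator hypothesis and a negative one when they favor the denominator.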

Intelligence is a very hard concept to define: there is a huge literature on what it is, what its components are, and how we can quantify or recognize it. I won't get into that matter, but I want to observe that one of the ways we typically assess an individual as an intelligent person is by hearing him or her talk. The capability to produce complex language and elaborate abstract concepts is undoubtedly a mark of intelligent beings. And when we are hit by a stroke, maybe a small hemorrhage in our brain, we may temporarily lose our ability to speak or to put together meaningful sentences.

Further, consider Alzheimer's disease: people who are hit progressively harder by that impairing condition gradually lose their ability to put together correct sentences. I lost my mother that way six years ago, and I remember observing that the gradual deterioration of her intelligence went hand in hand with the loss of her capability to speak. The two things are inextricably linked: we appear to put together intelligent thought by processing text in our brain, even if we do not speak.

Because of the above, I believe that we must acknowledge that these large language models possess distinct traits of intelligence. It does not matter much to me if they put together their flawless answers by mathematical operations between large matrices of weights and biases: what matters is the result of those operations, and the fact that it is hard to distinguish from, if not superior to, what a human mind can produce.

Of course, large language models are static systems: once they are trained - as I said, with considerable effort and expense, not to mention CO2 impact - they do not "learn" any further by interacting with their users. They also do not have any means of acquiring information and processing it through sensory inputs. These limitations make these systems quite different from what we have always considered could be the capabilities of a true "artificial general intelligence". Indeed, a world-class expert on the matter like Yann LeCun insists, on Twitter and in other venues, that on the way to AGI large language models are an "off-ramp", and he is of course right: these instruments will never "come alive" and become independent. They will be limited to one task: producing text in response to a prompt. Not real intelligence, not really. And yet...

Yet I cannot help thinking that we have to rethink what we call "intelligence" in the light of the capabilities of these systems. If they match our speech and writing skills, they have to be credited with reasoning. The reasoning they perform is different from the reasoning that takes place in our brains to some extent, but not overly so after all: we also reason by using weights and biases that are encoded in our neurons. So we are not that different from large language models, at least in how we produce language.

The jury is still out on whether humanity will benefit from the empowerment provided by ChatGPT and its successors and exploit it for good causes - because I am convinced that there will be even more powerful models in our near future - or whether it will succumb to this new technology. But for sure these are interesting times!
