Digging Beneath the Surface of Grammar
How does language work?
This article describes my own theoretical approach to the problem of how language works. I speculate that language is the means by which a communicating mind's labels for a set of mental models related by a concept C are transmitted, received and analysed so that a second mind can recover concept C. In the second mind the labels, or their equivalents, are used to recall the matching mental models and so construct a set of models which produces, in that mind, an awareness of their relatedness, thereby reproducing the concept C.
Overview:
How do the largely subconscious processes of language generation and understanding operate? How does the brain select phonemes to form morphemes; select morphemes to form words; words to form sentences? Does the brain embody or encode anything which a human observer might recognise as rules of grammar?
I suggest that there are no rules of grammar to be found in any neural structure. Instead, I suggest that information is stored as mental models: sets of linked neurons, with probably only one neuron acting as the token or label for any one model. The models are interlinked in such a way as to enable parallel and/or sequential excitation of those neurons which label related or associated models. There must also be an access mechanism which is used to concatenate labels for the purpose of communication over what is predominantly a serial channel - the sound system of speech and hearing. I suggest that concatenation is a relatively simple process which gives rise to an information stream which merely appears to be the result of highly organised rules.
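The idea of labelled, interlinked models can be sketched in code. The structure and weights below are entirely hypothetical - a toy illustration of spreading activation over labelled associations, not a claim about real neural encoding:

```python
from collections import defaultdict

class ModelStore:
    """A toy store of 'mental models': each model is identified by a single
    label and holds weighted links to associated models."""

    def __init__(self):
        self.links = defaultdict(dict)  # label -> {neighbour: weight}

    def associate(self, a, b, weight=1.0):
        # Associations are symmetric in this sketch.
        self.links[a][b] = weight
        self.links[b][a] = weight

    def activate(self, seeds, decay=0.5, rounds=2):
        """Spread activation outward from the seed labels; each associated
        label receives a decayed fraction of its source's activation."""
        level = {s: 1.0 for s in seeds}
        for _ in range(rounds):
            nxt = defaultdict(float)
            for label, act in level.items():
                nxt[label] += act
                for nb, w in self.links[label].items():
                    nxt[nb] += act * w * decay
            level = dict(nxt)
        # The most active labels become the candidates for concatenation.
        return sorted(level, key=level.get, reverse=True)

store = ModelStore()
store.associate("dog", "bark", 0.9)
store.associate("dog", "lead", 0.6)
store.associate("bark", "loud", 0.4)

print(store.activate(["dog"])[:3])  # most active labels first
```

No grammar rule appears anywhere in this structure; an ordering of labels simply falls out of the association weights, which is the flavour of mechanism the article proposes.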
The patterns of language which we daily observe are, I suggest, merely an emergent property of the dynamics of human language. The human brain does not directly encode rules of grammar per se. Rather, the encoding in neurons of information sets or data sets, and the inter-penetrating excitation and inhibition of those information sets and labels facilitated by other neurons, give rise to the illusion that complex rules are being followed. Linguistic research has focused on the phoneme, the morpheme, the word and the sentence. These are merely the medium, the carrier of meaning, part of the bio-mechanical communications system. Any theory of how a communicator - a speaker, signer or writer - produces valid sequences of words in order to convey messages must focus on the message, the content, not the medium. The medium is not the message.
It was once thought that there were complex rules obeyed almost literally by the sun, moon and planets as they orbited our Earth. Those rules turned out to be just one possible human perspective on an emergent property of an orbital system, bearing in mind that emergent properties are heavily dependent on the fact of there being an observer to see patterns in chaos. Even the simplified rules determined by Newton are simply one way of describing in human terms the emergent properties of an orbital system: our planet simply does not contain anything resembling a book of rules of motion. And yet it moves.
Computer modelling of flock and crowd behaviour shows that complex patterns can emerge from the introduction of a few simple constraints into an otherwise chaotic system. The production of complex patterns from simple rules is commonly a one-way function: analysis of the complex patterns does not and cannot reveal the underlying mechanisms.
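A one-dimensional cellular automaton is perhaps the simplest demonstration of this point. The rule below (Wolfram's rule 30, chosen here only as a familiar example) determines each cell from just itself and its two neighbours, yet the output looks richly patterned, and nothing in the output announces the rule that produced it:

```python
def step(cells, rule=30):
    """One update of an elementary cellular automaton: each cell's next
    state depends only on itself and its two immediate neighbours."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

row = [0] * 31
row[15] = 1  # start from a single live cell
for _ in range(15):
    print("".join(".#"[c] for c in row))
    row = step(row)
```

The rule fits in a single byte; the printed triangle of cells does not obviously decompose back into it, which is the one-way character the paragraph above describes.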
I suggest that the primary selection processes of human language - the processes which concatenate phonemes, morphemes and words - are not based on anything recognisable to a human observer as rules of grammar. Rather these processes, producing as they do highly structured outputs, prompt our pattern-recognising faculties to assume that pattern can only emerge from complex rules. Starting from the first principle that where there is pattern there must be rules, we are moved to attempt to formulate a set of rules which we call "rules of grammar". But however much we may modify these rules, we can never build a set of rules capable of generating only valid messages within the whole context of any given natural language.
If the patterns we observe in language are an emergent property; if the pursuit of perfection in our rules of computational grammar is futile; what then remains as a fruitful area for further investigation? If we are not to ask what rules sentences obey, then what are we to investigate?
This suggests the question: "What data structures, and what arrangement of those structures, can serve, with a suitable labelling system, to automate the selection of components for concatenation so that a valid message comes to be transmitted between conversers?"
Before moving on to address that question, some terms need to be defined. A few of the terms used here are borrowed from other disciplines. Further, some linguistics terms are used in a slightly unconventional sense. Those terms not defined immediately below are defined as they arise.
The term converser is used here in the sense familiar from cryptography: any person legitimately engaged in the sending or receiving of a message, by any means of transmission.
A message is any single word, symbol or gesture, or any combination of words, symbols and gestures which, within the communicators' shared total socio-linguistic context, carries information - one or more ideas - from one converser to another.
A valid message is any message whose receipt does not prompt an obvious query relating to any apparent incompleteness of, or error in, the message. Examples may assist in clarifying the notion of validity. The message: "John bought a" will predictably produce the query: "bought a what?" - the message is evidently incomplete. The message "Jane bought a *ack*aw." will predictably produce a request for repetition or clarification. In general, any message which follows natural human language conventions (grammars) may be considered to be valid.
A word is any one or more written, spoken or gestural symbols which can be recognised by a converser as a carrier of meaning or as a constraint upon the relevance to the conversers of a prior or subsequent carrier of meaning. A word does not have a meaning, it is a pointer to a meaning. Word-meaning is an illusory emergent property of words in use as message carriers.
A sentence is any one or more words which, within the communications context of two or more conversers, carries a valid message. As with the word, so with the sentence: it does not have a meaning, rather, it carries a message which points to a meaning.
It should be noted that the definition 'word' includes whitespace and punctuation, and their spoken equivalents, within its scope. It follows from the definition of a sentence that it is possible for a converser to say literally nothing: " ", and for that "nothing" to count as an English sentence within the theory that I am describing.
The number of possible messages from which any converser may make a selection is literally infinite. Even if messages are restricted to natural and valid English sentences, these are infinite in scope. This is proven by simple consideration of the uses in English of numbers, recursion and regression:
An English sentence may contain any mathematical term whatsoever.
There are infinitely many numbers.
It follows that there are infinitely many English sentences containing numbers.
An English sentence may contain any embedded English sentence, recursively.
A converser may quote any prior message within a new message, with infinite regression.
In combination, the use of numbers, recursion and regression dictates that there are, unarguably, infinitely many valid English sentences which any converser might produce. This fact is by no means trivial. New words are coined daily, because new ideas arise frequently and need new names. The infinite scope of human language in general, and English in particular, sets up an apparently impenetrable brick wall to frustrate attempts to write a computer program capable, even in principle, of understanding most, much less all, valid English sentences.
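The recursion argument above can be made concrete with a few lines of code. The sentences are toy examples of my own; the point is only that every depth of embedding yields a new, longer, still-valid sentence, so the supply never runs out:

```python
def sentence(depth):
    """Embed the previous sentence as a quotation at each level, producing
    a distinct valid English sentence for every non-negative depth."""
    if depth == 0:
        return "John bought a book."
    return f'Mary said, "{sentence(depth - 1)}"'

for d in range(3):
    print(sentence(d))
```

Since `depth` ranges over all non-negative integers and each value yields a different sentence, this single template alone generates infinitely many valid English sentences.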
The brick wall is, fortunately, only apparently impenetrable. There are a few seemingly trivial facts about language, and about human cognition, which appear to be a key to an understanding of how language works, not in vitro, so to speak, but in the real world of human discourse.
The message validity problem:
"How do conversers recognise the validity or invalidity of any natural language messages they transmit?"
A complete answer to that question must recognise that validity is not a term of grammar. Rather, it is a concept shared by conversers within a given communications context. A sentence may be perfectly valid within the conventions of a natural grammar. However, a sentence such as "The accretion of the protist globigerina is a fundamental oceanographic process." will probably convey nothing at all to a six-year-old. Within his or her communications context it is pure gibberish. It is not a valid message in any context where one communicator is inherently unable to understand it. This is a problem of message relevance, or receiver indifference. Of what is said in the vicinity of a small child, little is relevant to the child's needs, and the child remains indifferent to most of it.
Any reasonably complete theory of how natural language works must describe the validation process applied to messages passed between conversers. It must account for the automatic detection by humans of relevance; of word and sentence boundaries. It must account for the fact that words are recognised even as they begin to be spoken or written, the related fact that people can predict the next word in a sequence with a high degree of accuracy; and the fact that a message may undergo a high degree of distortion and yet still be understood. A computer model based on such a theory would readily convert these examples of obscured English into conventional text:
th* th**ry m*st *cc**nt f*r th* *bs*rv*t**n th*t r*m*v*l *f v*w*ls
d**s n*t t*t*lly d*str*y *nt*ll*g*b*l*ty
samcrlebd lteteer sqenucees msut rieman igtbelnliile to teh pargorm ro mdeol
whitespacesorpausesarenotessentialtothedeterminationofwordandsentenceboundariesthisexamplestringofwordsclearlydemonstratesthisacomputermodelmusthandlethistaskinahumanlikeway
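For the first of these tasks - restoring vowel-stripped text - a minimal sketch is possible, assuming a lexicon is available (the word list below is a stand-in, not part of any real model). Each `*` is treated as one unknown letter and matched against the lexicon:

```python
import re

# A stand-in lexicon; a real model would draw on a full vocabulary.
LEXICON = ["the", "theory", "must", "account", "for", "observation",
           "that", "removal", "of", "vowels"]

def candidates(masked, lexicon=LEXICON):
    """Return the lexicon words matching a masked form such as 'th**ry',
    where each '*' stands for exactly one missing letter."""
    pattern = re.compile("^" + masked.replace("*", ".") + "$")
    return [w for w in lexicon if pattern.match(w)]

print(candidates("th**ry"))
print(candidates("*cc**nt"))
```

Even this crude pattern match usually narrows a masked form to one candidate, which hints at why vowel removal does not totally destroy intelligibility; the scrambled-letter and missing-whitespace cases would need richer context models.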
Conventional approaches to the validity problem:
The way in which words are formed by the mind into chains which we recognise as sentences cannot be described by any approach which focuses only, or even mainly, on the surface appearance - the syntax. A complete set of purely syntactic rules, even if such a thing could exist, would fail to constrain each new word to add to, or at least not subtract from, the overall requirement that a message should appear valid to one or more conversers. Crucially, a message need only appear to be valid: the mechanisms of language allow for an illusory appearance, and the message may in fact be accidental or deliberate nonsense.
From this point forward the focus will be on English in particular, but the more general principles, I believe, apply to every human language. Given a sufficiently large and diverse body of linguistic information, the following statements appear to be true of the English language as spoken, signed or written:
a It is not a fixed-word-order language.
b A sentence may begin with any word whatsoever.
c A sentence may end with any word whatsoever.
d Any word whatsoever may follow any word whatsoever.
e Any converser may say anything at all, or nothing, the null utterance: " ".
f Any contribution to a discourse by a converser, including the null utterance, may be followed by any contribution to the discourse, including a null utterance, by the same or another converser, in the same or any other language.
Point (a) needs to be clarified. In the everyday use of English the choice of word order is observed to be highly constrained, rather than fixed. In literature, the performing arts and humour the perceived word-order rules are often discarded. A converser's adherence to word-order convention assists in reducing the workload of another converser, but it is only when word order is entirely random that meaning is lost. Compare the more random example with the less random example below:
In a was snow in today made of August news item mention.
Little lamb a Mary had,
Snow fleece it was as white,
That everywhere went Mary and,
Lamb sure was go to that.
There is probably no human language with a rigidly fixed word order, and no language with a perfectly free word order. In the Latin of Julius Caesar, "vici, vidi, veni." might provoke a "?" response.
Although rules (b), (c) and (d) seem to permit the production of "word salad", there are limiting factors - the (social) rules of conformity that all languages obey. Even with those factors, the construction of a sentence is only loosely constrained - there is plenty of scope for artistic or humorous variation.
It appears from rules (e) and (f) that discourse is a "free-for-all", that there are no rules of discourse. In practice, over a large enough sample of discourses, there appear to be rules designed to keep the discourse flowing. However, something contributed by one converser does not automatically prompt any other converser to follow on topic, or even at all. A simple example should serve to demonstrate the unpredictable nature of discourse:
"Would your Honour examine exhibit 3B?"
No linguist, even if present at the trial, would make so bold as to predict even the gist of the judge's response to the request; the possible responses are verifiably infinite. With a substitution of numerical expressions for words, his Honour's reply, amongst infinitely many other possibilities, may be drawn from this sentence template:
"As I said earlier `I have dealt with this matter {x} times already, and I see no reason why I should have to deal with it {x plus 1} times.`, why does that not suffice?"
Sentence construction is obviously not based on "word-salad" rules. A sentence must conform to the rules adhered to - consciously or not - by conversers. These social rules of conformity will be discussed in another article.
It may be noted in passing that Searle's Chinese Room argument fails to convince because, amongst other defects of logic, for every possible natural language input there are infinitely many possible outputs. A machine capable of conversing in a convincing, human-like manner - still more one capable of passing, or even administering, a Turing test - must therefore be designed around something more than a book of look-up rules, however large that book might be.