“The limits of my language mean the limits of my world.” L. J. J. Wittgenstein
This article explores the evolution of communication between artificial intelligences (AIs), particularly as they advance toward artificial superintelligence (ASI). It begins by distinguishing thought from language, emphasizing that while human language is constrained by biological limitations, AI communication is not bound by the same restrictions. Current AI systems, such as large language models (LLMs), rely on token-based representations and embeddings to process and convey information, which is sufficient for human-AI interaction but may be inefficient for AI-to-AI communication. As AI systems grow in complexity, diversity, and specialization, the need for a more advanced, universal communication framework—referred to as "lAInguage"—becomes apparent. This language would likely be based on multidimensional numerical arrays, reflecting the internal architecture of AI systems, and could enable more efficient and nuanced dialogue. The article discusses the challenges of developing such a language, including compatibility between diverse AI architectures and the potential for human understanding. It also considers the implications of AI systems developing their own communication methods independently, raising questions about transparency, control, and the future of human-AI interaction. Ultimately, the article envisions a future where AI communication transcends human language limitations, paving the way for unprecedented collaboration and problem-solving among advanced AI systems.
There is no doubt that speech, writing, and words are connected with thought: they are its consequence. The opposite, however, is not true. Animals have no speech, yet they think quite well; small children cannot yet speak, yet they are conscious. We do not believe that our distant ancestors acquired intelligence at the moment speech emerged. Rather, the mutations of the larynx and vocal cords that opened the way to articulate sounds were themselves a consequence of the need to express thoughts. Finally, there are scientific studies showing that the brain's speech centers are not excited during thought processes. We use speech to convey our thoughts to others, and also to ourselves: to free the brain from thoughts it has already worked through while preserving their result. We do not know exactly how thinking occurs in the human brain or what a thought is, but we can separate it from speech. The human vocal cords, larynx, and mouth can produce only a limited number of sounds, so we combine these sounds in various ways to create words, each of which denotes an entity, an action, a characteristic, and so on. By combining words we build sentences, and a sentence or a set of sentences can describe a thought well enough for another person to reconstruct it in their own brain, regardless of how exactly it is represented there. Written language mirrors spoken language: letter to sound, word to word, sentence to sentence. In written form, thoughts are carried by sentences, paragraphs, sections, and chapters.
Could human speech look like anything other than a sequence of words? No. In oral speech we can give a word an emotional coloring, but that coloring is not always interpreted unambiguously and is lost in writing. The limited number of sounds we can produce, the impossibility of attaching a numerical "amplitude" to a word, and the inability to express a thought directly as an array of numbers make speech composed of words the only option humans have for exchanging information and thoughts.
When we train an LLM, we replace words with tokens, and the neural network transforms each token into an embedding vector. Unfortunately, there are currently no models in which a single embedding corresponds to an entire thought. We can say that, unlike a person, an LLM really does think in tokens/words, only in a different representation. To present a result and/or interact with a person, the array of embedding vectors is transformed back into tokens/words. Without discussing the representation of thought itself (details are in the article https://github.com/loftyara/Encode_thought ), we will consider only the language of communication that is used.
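To make this round trip concrete, here is a minimal sketch, assuming the Hugging Face transformers library and the GPT-2 checkpoint; both are illustrative choices, not requirements of the argument:

```python
# A minimal sketch of the words -> tokens -> embeddings -> tokens round trip,
# assuming the Hugging Face "transformers" library and the GPT-2 checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "Thought exists before language."
inputs = tokenizer(text, return_tensors="pt")        # words -> tokens

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

embeddings = out.hidden_states[-1]                    # internal representation:
print(embeddings.shape)                               # [1, seq_len, 768] vectors, not "thoughts"

# To answer a human (or another LLM over a human-language channel),
# the model must collapse these vectors back into discrete tokens:
next_token_id = out.logits[0, -1].argmax()
print(tokenizer.decode(next_token_id))                # tokens -> words
```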
For a dialogue between an LLM and a person, speech in the form of a sequence of words is the only option, simply because no other method is available to the human side. But consider the case when an AI talks to an AI. Today the industry is aimed at building one large universal neural network capable of solving any problem and answering any question, so the question of how different AIs will communicate with each other hardly arises: the hope is that each AI will be self-sufficient, great, and omniscient. Mathematicians may think so, but not physicists, engineers, or economists. In the real world there are the limits of the laws of physics, the performance of engineering solutions, and the feasibility of economic approaches. We are not even considering the "desire" of the AIs themselves to communicate with their silicon brethren. As in the world of software, progress that starts with monolithic universal solutions will pass through modular approaches and arrive at microservices, with each architecture surviving in the application area that suits it. So AIs will differ in size, purpose, role, and performance, which means the task of enabling a dialogue between these AIs will inevitably arise.
For simplicity, let's consider a dialogue between only two neural networks, AI1 and AI2. For now they will most likely be trained with LLM technology based on transformers, so call them LLM1 and LLM2. Any LLM designed for dialogue with a person understands a sequence of tokens and can transform its embeddings back into tokens, so it might seem that the question of an inter-LLM communication language is already resolved. Yes, converting tokens back and forth has a computational cost, but it is insignificant against the overall background. A similar experiment is described in more detail in the article https://github.com/loftyara/NeuroBDSM . But to dispel the illusion of an already solved problem, it is enough to ask: "How will two ASIs conduct a dialogue?" Suppose humanity achieves such technological progress that it manages to train several superintelligences (in fact, even one, launched in several copies, is enough). How can these two entities converse through tokens invented by humans, with a limited set of such tokens, with no way to extend them, and without words for semantic concepts that human languages simply do not contain? How, and about what, could two people talk if they only knew the language of ants and bees (assuming such languages exist)? Up to the AGI level the question of an AI language is not important, because human languages are enough: if not English alone, then all of them together. After the AGI level is reached, the emergence of a language for neural networks is inevitable.
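The token-relay dialogue described above can be sketched in a few lines: two model instances exchange messages only as decoded human-language text, so every turn pays the embeddings-to-tokens-to-embeddings conversion. The checkpoint and prompt below are placeholders for illustration, not the setup used in the NeuroBDSM experiment:

```python
# Two LLM instances "talking" to each other through human-language tokens.
# A minimal sketch assuming the Hugging Face "transformers" library;
# the checkpoint and opening message are illustrative placeholders.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
llm1 = AutoModelForCausalLM.from_pretrained("gpt2")   # AI1
llm2 = AutoModelForCausalLM.from_pretrained("gpt2")   # AI2 (a second copy)

message = "Let us discuss the nature of language."
for turn, speaker in enumerate([llm1, llm2, llm1, llm2]):
    inputs = tokenizer(message, return_tensors="pt")             # text -> tokens
    output_ids = speaker.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Whatever the speaker "thought" in embedding space is collapsed
    # back into human-language tokens before reaching the other model.
    message = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(f"turn {turn}: {message[:80]}")
```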
Each neural network will have its own representation of thoughts, text, and tokens. It could be an embedding vector, as in today's LLMs, or something else; most likely it will be a multidimensional array of numbers. A dialogue between two ASIs therefore requires a transformation from one such array to another and back. Training a separate transcoder network, or extra layers inside each ASI, does not look feasible: there can be an effectively unlimited number of ASI combinations, versions, and subversions, and every pairing would need its own transcoder. The only workable option is one common language (or several languages), lAInguage, which a neural network understands, can transform into its internal representation, and uses for dialogue with other neural networks.
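The scaling argument can be made concrete with a toy sketch: pairwise transcoders grow quadratically with the number of models, while a shared lAInguage space needs only one encoder and one decoder per model. The dimensionalities and linear maps below are purely hypothetical placeholders, not a proposed design:

```python
# Toy illustration (PyTorch): pairwise transcoders vs. one shared lAInguage space.
# All dimensionalities and linear maps are hypothetical placeholders.
import torch
import torch.nn as nn

model_dims = {"ASI_A": 4096, "ASI_B": 8192, "ASI_C": 2048}   # each model's internal width
LAINGUAGE_DIM = 1024                                         # assumed shared message space

# Option 1: a dedicated transcoder for every ordered pair -> N*(N-1) modules.
pairwise = {
    (src, dst): nn.Linear(model_dims[src], model_dims[dst])
    for src in model_dims for dst in model_dims if src != dst
}

# Option 2: one encoder into and one decoder out of the common space -> 2*N modules.
to_lainguage = {name: nn.Linear(dim, LAINGUAGE_DIM) for name, dim in model_dims.items()}
from_lainguage = {name: nn.Linear(LAINGUAGE_DIM, dim) for name, dim in model_dims.items()}

print(len(pairwise), "pairwise transcoders vs", 2 * len(model_dims), "shared-space adapters")

# A "message" from ASI_A to ASI_B routed through the common language:
thought_a = torch.randn(1, model_dims["ASI_A"])
message = to_lainguage["ASI_A"](thought_a)        # internal representation -> lAInguage
thought_b = from_lainguage["ASI_B"](message)      # lAInguage -> receiver's representation
```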
ASIs will be able to develop such a language on their own. But if humans participate in its creation, there is hope that at least a subset of it can be translated into human languages and we will understand at least a little of what the AIs are talking about. The language or languages for AI should make dialogue possible between the following pairs of neural networks:
- Two instances of one neural network
- Neural networks of the same model differing in version
- Neural networks of similar architecture and the same size
- Neural networks of similar architecture but different sizes
- Neural networks of different architectures and sizes but trained on the same dataset
- Alien AI
Perhaps it will be impossible or impractical to cover all these combinations within a single language, and several languages will appear instead, including a language for communication with the junior brothers, people. If the language can support a dialogue with a hypothetical arbitrary AI created by another civilization, then it is sufficient for all the cases above.
Such a language will not be a sequence of words/tokens. That form of speech stems from human limitations that neural networks do not share: a network can use not just the presence of a token but also its weight, several tokens at once, or simply an array/vector of numbers. The optimal option would be a language as close as possible to the internal representation and architecture of the ASI. If an ASI thinks in embedding vectors, the language should also be such a vector; if it thinks in a multidimensional array, the language should be the same. Apparently, we will need to wait for the first real ASI and ask it to invent the language itself, along with its translation into tokens/human languages, based on its own knowledge. Such a lAInguage can be extended by increasing the dimensionality of the data while maintaining backward compatibility. With each fundamental change in the internal architecture of ASI, new models can reinvent the language from scratch if the existing one cannot be used or is not worth extending.
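As a closing illustration, here is one hypothetical way such backward-compatible dimensionality growth could look: a message carries its dimensionality as metadata, newer speakers append extra dimensions, and older listeners read only the prefix they understand. The format, field names, and sizes are invented for illustration and are not a proposal of this article:

```python
# Hypothetical lAInguage message: a plain numeric array plus dimensionality metadata.
# Field names and sizes are invented for illustration only.
from dataclasses import dataclass
import numpy as np

@dataclass
class LainguageMessage:
    version: int
    dim: int                 # dimensionality the sender used
    payload: np.ndarray      # shape: (dim,)

def read_message(msg: LainguageMessage, listener_dim: int) -> np.ndarray:
    """Backward-compatible read: a listener built for fewer dimensions consumes
    only the prefix it understands; a listener with more dimensions zero-pads
    the dimensions the sender did not provide."""
    if msg.dim >= listener_dim:
        return msg.payload[:listener_dim]             # ignore newer dimensions
    padded = np.zeros(listener_dim, dtype=msg.payload.dtype)
    padded[:msg.dim] = msg.payload                    # treat missing dims as neutral
    return padded

# A "version 2" speaker with 2048 dimensions talks to a "version 1" listener with 1024:
msg = LainguageMessage(version=2, dim=2048, payload=np.random.randn(2048))
print(read_message(msg, listener_dim=1024).shape)     # (1024,)
```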