Botminds just announced that its AI platform can now read and understand Arabic. That’s excellent news for Botminds, a firm we will cover in this month’s published research. But the fact that this release is such a big deal illustrates an element of the AI world that many don’t like to spotlight: significant advances in language-based AI ignore most of the world’s languages and, by implication, most of the world’s population. In fairness, there are over 7,000 living languages in the world, and it’s unrealistic to expect all of them to be supported. But the English-first trend is a problematic one.
Many AI platforms claim to support a range of languages, but that claim obscures an important fact. ChatGPT, for example, can converse in 95 languages, yet it is effectively monocultural because it was trained predominantly on English-language text. The same bias applies to many similar platforms.
Actually, it’s not just language that is heavily skewed. AI for facial recognition has a long history of doing a poor job of identifying people who are not Caucasian. Unsurprisingly, these systems were trained largely on white faces. It all gets a little more worrisome when we consider that AI systems trained on one human language or demographic group also learn the gender and racial biases embedded in that data. Or, to put it less controversially, the AI learns the values of the English-speaking world and remains ignorant of other cultures. You might be able to converse with it in your regional language, but you are still talking with a machine that sees the world from an English-speaking perspective and knows nothing about your cultural preferences and idiosyncrasies.
That’s a long way of saying that specialized, highly curated large language models (LLMs) have a distinct advantage over massive, generalized ones. That’s why we champion them at Deep Analysis.
Languages other than English will likely always take a backseat in AI, because building for English first is the quickest way to profit. But this approach will accumulate baggage that may prove impossible to unpack.
It’s great to see firms like Botminds recognizing and supporting an ever-wider range of human languages. We hope to see more and more follow suit over time, but let’s not fool ourselves that this is a soon-to-be-solved problem. It has many layers, and our industry doesn’t seem to be in a rush to address them.