If you say, “Alexa, faint o’r gloch yw hi?” the smart speaker will not understand that you are asking for the time of day. That’s because Welsh is not one of the eight languages currently supported by Amazon’s Alexa-enabled devices. Gareth Morlais, a Welsh language and digital media specialist for the Welsh government, has argued for years that this language gap is disturbing. In a 2017 presentation, Morlais noted that the Welsh language, then ranked 172nd in the world by number of speakers, was not supported by Alexa, Twitter, or Google’s search interface. At the time, Alexa only spoke and understood two languages: English and German. “The technology actually tells you which language your family can speak at home, which is a horror story,” Morlais said. “What we need to do here is try to shape the technology so that it speaks the same language that we want to speak.”
Although Alexa still does not speak or understand Welsh, the Celtic language’s presence in tech has increased dramatically within a short period. Google announced in February that it had expanded its offerings in Docs, Sheets, Slides, and Drive to include Welsh. And Google Translate—infamous since 2009 for its Scymraeg, or scummy Welsh—has, according to the BBC, recently taken a great leap forward in terms of the accuracy and quality of its Welsh translations. Morlais and others attribute this in part to the fact that there are now more than 100,000 articles on the Welsh version of Wikipedia, known as Wicipedia.
Like other language editions, Wicipedia is a separate website with its own content, not simply a translation of English Wikipedia, a distinction that matters for both users and big tech companies. Back in 2017, Morlais observed, “There appears to be an indication that there is a link between the languages with the most Wikipedia articles or pages and the languages that are supported by the digital giants.” Google Translate and other technologies use artificial neural networks to learn from example, training themselves with language data from rich internet sources like Welsh Wikipedia.
The Welsh community is not alone in using wiki-technology to promote its language. This year’s Celtic Knot conference in Cornwall, England, included several indigenous languages with their own Wikipedia editions. The original idea, as the name suggests, was to focus on Celtic languages, including Irish, Breton, Scottish Gaelic, Welsh, and Cornish (which was declared extinct merely a decade ago), as well as Scots. But as word got out about a Wikipedia minority language conference, others began to join, representing, for example, the Sámi language spoken in parts of Norway, Finland, Sweden, and Russia; the Berber family of languages spoken in Northern Africa; and the Basque and Catalan communities. (In his 2017 presentation, Morlais noted that Catalan was one of the few minority languages supported by Google search, an accomplishment he linked to the fact that Catalan already had more than 500,000 articles on its language edition of Wikipedia.)
Read more: Slate