Google Translate is adding 110 new languages to its translation capabilities. The expansion, announced by senior software engineer Isaac Caswell, covers languages spoken by more than 614 million people worldwide. The update nearly doubles the number of supported languages, from 133 to 243.
Leveraging AI for Language Expansion
Google’s PaLM 2 large language model plays a central role in this expansion. It helped Google Translate learn new languages more efficiently, particularly those closely related to languages it already supports. Examples include Awadhi and Marwari, which are close to Hindi, and French creoles such as Seychellois Creole and Mauritian Creole.
Caswell highlighted that this update represents Google’s most extensive expansion of African languages to date, with about a quarter of the new additions coming from the continent. New African languages include Fon, Kikongo, Luo, Ga, Swati, Venda, and Wolof.
Addressing Long-Requested Languages
Cantonese, long one of the most requested languages, is now part of Google Translate. Because written Cantonese often overlaps with Mandarin, it was difficult to find the necessary data and train the models.
Highlighted New Languages
- Afar: A language spoken in Djibouti, Eritrea, and Ethiopia that received significant contributions from the volunteer community.
- Cantonese: Overlaps with Mandarin in writing, making it a complex addition.
- Manx: The Celtic language of the Isle of Man, which nearly went extinct when its last native speaker died in 1974 and has since been revived.
- NKo: A standardized form of the West African Manding languages with a unique alphabet.
- Punjabi (Shahmukhi): The variety of Punjabi written in the Perso-Arabic (Shahmukhi) script, and the most spoken language in Pakistan.
- Tamazight (Amazigh): A Berber language spoken across North Africa, supported in both Latin and Tifinagh scripts.
- Tok Pisin: An English-based creole and the lingua franca of Papua New Guinea.
Future Goals
This significant update is a step toward Google’s ambitious goal of supporting 1,000 languages through AI. The new languages represent diverse linguistic communities, from major world languages with over 100 million speakers to languages spoken by small Indigenous communities. Some of these languages have almost no native speakers but are part of active revitalization efforts.
Adding new languages involves careful consideration of dialects and spelling standards, since many languages have no single standard form. Google prioritizes the most commonly used variety of each language. For instance, the Romani model produces text closest to Southern Vlax Romani but mixes in elements from other dialects such as Northern Vlax and Balkan Romani.
Getting Started
Users can start exploring these new languages on the Google Translate website or in the Google Translate app for Android and iOS. Visit the Help Center for more details on the newly supported languages.