Meta unveils AI model enabling “seamless” multilingual communication


Meta Platforms, the parent company of Facebook, unveiled on Tuesday an AI model that can translate and transcribe speech in numerous languages. The technology could enable real-time communication across language barriers, a prospect sure to excite sci-fi fans.

According to an official company blog post, the newly introduced “SeamlessM4T” AI model combines technologies to facilitate translations between text and speech across nearly 100 languages and can perform complete speech-to-speech translations for 35 languages.

Mark Zuckerberg, the CEO of Meta, envisions these tools playing a crucial role in helping users from different parts of the world interact within the metaverse. He also asserts that embracing an open AI ecosystem is a strategic advantage for Meta: in his view, the company stands to gain more by harnessing the collective efforts of the community to build user-centric tools for its social platforms than by charging for access to its models.

In a bid to encourage widespread use, Meta has made the SeamlessM4T model accessible to the public for non-commercial purposes, continuing the company’s trend of releasing mostly free AI models throughout the current year.

AI has generated new legal questions

However, Meta, like other industry players, faces legal questions about the data used to train its models. In July, comedian Sarah Silverman and two other authors filed copyright infringement lawsuits against both Meta and OpenAI, alleging that the companies used their books as training material without authorization.

According to a research paper by Meta’s researchers, the SeamlessM4T model’s audio training data was derived from an extensive pool of “raw audio originating from a publicly available repository of crawled web data.” The paper does not identify the repository.

As for the text data, the paper states that it came from datasets created last year, which drew content from Wikipedia and related online sources.

Meta’s release of the SeamlessM4T model marks a significant step toward seamless cross-lingual communication. However, the company, like its competitors, will need to resolve the legal questions surrounding its use of training data if it hopes to achieve long-term success.