
No Language Left Behind

Driving inclusion through the power of AI translation

Watch the video

About No Language Left Behind

No Language Left Behind (NLLB) is a first-of-its-kind, AI breakthrough project that open-sources models capable of delivering evaluated, high-quality translations directly between 200 languages—including low-resource languages like Asturian, Luganda, Urdu and more. It aims to give people the opportunity to access and share web content in their native language, and communicate with anyone, anywhere, regardless of their language preferences.

AI RESEARCH FOR REAL-WORLD APPLICATION

Improving translations on Facebook and Instagram

We’re committed to bringing people together. That’s why we’re using modeling techniques and learnings from our NLLB research to improve translations of low-resource languages on Facebook and Instagram. By applying these techniques and learnings to our production translation systems, people will be able to make more authentic, more meaningful connections in their preferred or native languages. In the future, we hope to extend our learnings from NLLB to more Meta apps.

REAL-WORLD APPLICATION

Building for an inclusive metaverse

A translated metaverse: bringing people together on a global scale

As we build for the metaverse, integrating real-time AR/VR text translation in hundreds of languages is a priority. Our aim is to set a new standard of inclusion—where someday everyone can have access to virtual-world content, devices and experiences, with the ability to communicate with anyone, in any language in the metaverse. Over time, this will bring people together on a global scale.

REAL-WORLD APPLICATION

Translating Wikipedia for everyone

Helping volunteer editors make information available in more languages

The technology behind the NLLB-200 model, now available through the Wikimedia Foundation’s Content Translation tool, is supporting Wikipedia editors as they translate information into their native and preferred languages. Wikipedia editors are using the technology to more efficiently translate and edit articles originating in other under-represented languages, such as Luganda and Icelandic. This helps to make more knowledge available in more languages for Wikipedia readers around the world. The open-source NLLB-200 model will also help researchers and interested Wikipedia editor communities build on our work.

Experience the Tech

Stories Told Through Translation:

books from around the world translated into hundreds of languages


Experience the power of AI translation with Stories Told Through Translation, our demo that uses the latest AI advancements from the No Language Left Behind project. This demo translates books from their languages of origin, such as Indonesian, Somali and Burmese, into more languages for readers, with hundreds available in the coming months. Through this initiative, NLLB-200 will be the first-ever AI model able to translate literature at this scale.
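For readers who want to try the released model directly, here is a minimal sketch of one way to run NLLB-200 through the Hugging Face transformers library; the distilled 600M checkpoint name is the published open-source variant, while the example sentence and the Indonesian-to-English direction are illustrative choices.

```python
# A minimal sketch, assuming the open-source NLLB-200 checkpoint on
# Hugging Face and the `transformers` + `torch` packages are installed.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="ind_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Apa yang bisa kulakukan?", return_tensors="pt")
# NLLB decides the output language from the first decoder token, so the
# target language code is forced as that token.
output_ids = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_new_tokens=40,
)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
```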

The Rose Village

By Su Nyein Chan

A farmer lives in a village that only grows red roses. What will happen when he plants strange seeds from a box he finds in his basement?

Read Story
The Elephant in My House

By Prum Kunthearo

When a baby elephant runs into their house, Botom is jealous of how much attention it gets. Can Botom get rid of the elephant, or will she become friends with the lovable creature as well?

Read Story
What Could I Become?

By Nabila Adani

A girl is inspired by a school assignment to think about what she wants to be when she grows up. What will her dreams inspire her to become?

Read Story
Samad in the Forest

By Mohammed Umar

Samad loved animals. His dream was to spend a whole day in a forest and sleep in the treehouse. Follow Samad as he embarks on this adventure, making wonderful friends and amazing discoveries. Going into a forest has never been so much fun.

Read Story
The Prince and the Tiger

By Wulan Mulya Pratiwi

The prince is lost in the forest. A tiger is tracking him. What will he do?

Read Story

The Tech

Machine translation explained

How does the open-source NLLB model directly translate 200 languages?

Stage 1: Automatic dataset construction

Training data is collected containing sentences in the input language and desired output language.
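As a rough illustration of this stage, the sketch below pairs sentences from two monolingual corpora by nearest-neighbour search over multilingual sentence embeddings. The embeddings are assumed to come from an encoder such as LASER, and the similarity threshold is an arbitrary placeholder.

```python
# A toy sketch of automatic dataset construction: keep the nearest
# cross-lingual neighbour of each source sentence as a candidate
# translation pair, if the match is strong enough.
import numpy as np

def mine_pairs(src_vecs: np.ndarray, tgt_vecs: np.ndarray, threshold: float = 0.8):
    # Normalize rows so dot products are cosine similarities.
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T                        # (n_src, n_tgt) similarities
    best = sims.argmax(axis=1)                # nearest target per source
    return [(i, int(j), float(sims[i, j]))
            for i, j in enumerate(best) if sims[i, j] >= threshold]

# Random vectors stand in for real sentence embeddings.
pairs = mine_pairs(np.random.randn(5, 16), np.random.randn(8, 16))
```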

Stage 2: Training

After creating aligned training data for thousands of training directions, this data is fed into our model training pipeline. These models are made up of two parts: the encoder, which converts the input sentence into an internal vector representation; and the decoder, which takes this internal vector representation and accurately generates the output sentence. By training on millions of example translations, models learn to generate more accurate translations.
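To make the encoder/decoder split concrete, here is a minimal PyTorch sketch of such a model; the vocabulary size, dimensions and random batch are placeholders and do not reflect NLLB's actual architecture or scale.

```python
# A minimal encoder-decoder sketch mirroring the two parts the text
# describes; all sizes are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyTranslator(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Encoder maps source tokens to internal vectors; decoder attends
        # to them while producing target-token logits.
        hidden = self.transformer(self.embed(src_ids), self.embed(tgt_ids))
        return self.out(hidden)

model = TinyTranslator()
src = torch.randint(0, 1000, (2, 7))      # fake source batch
tgt = torch.randint(0, 1000, (2, 5))      # fake target batch
logits = model(src, tgt)
loss = F.cross_entropy(logits.reshape(-1, 1000), tgt.reshape(-1))
loss.backward()                            # an optimizer step would follow
```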

Stage 3: Evaluation

Finally, we evaluate our model against a human-translated set of sentence translations to confirm that we are satisfied with the translation quality. This includes detecting and filtering out profanity and other offensive content through the use of toxicity lists we build for all supported languages. The result is a well-trained model that can directly translate a language.
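The toxicity-filtering part of this stage can be pictured with the toy filter below; the word-list entries are placeholders standing in for the curated per-language lists described above.

```python
# A toy sketch of toxicity filtering: drop candidate translations that hit
# a per-language offensive-term list (entries here are placeholders).
toxicity_list = {"badword1", "badword2"}

def is_clean(sentence: str) -> bool:
    return not (set(sentence.lower().split()) & toxicity_list)

candidates = ["a harmless translation", "this one contains badword1"]
kept = [s for s in candidates if is_clean(s)]
print(kept)   # ['a harmless translation']
```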


The Innovations

The science behind the breakthrough

Most of today’s machine translation (MT) models work for mid- to high-resource languages—leaving most low-resource languages behind. AI at Meta researchers are addressing this issue with three significant AI innovations.

Automatic dataset construction for low-resource languages

The context

MT is a supervised learning task, which means the model needs data to learn from. Example translations from open-source data collections are often used. Our solution is to automatically construct translation pairs by pairing sentences in different collections of monolingual documents.

The challenge

The LASER models used for this dataset creation process primarily support mid- to high-resource languages, making it impossible to produce accurate translation pairs for low-resource languages.

The innovation

We solved this by investing in a teacher-student training procedure, making it possible to 1) extend LASER’s language coverage to 200 languages, and 2) produce a massive amount of data, even for low-resource languages.
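A hedged sketch of the idea: a frozen teacher encoder embeds one side of a sentence pair, and the student is trained so its embedding of the other side lands nearby. The cosine objective and random stand-in vectors below are simplifications, not LASER's exact setup.

```python
# Teacher-student sketch: minimize the cosine distance between the frozen
# teacher's embedding of a sentence and the student's embedding of its
# counterpart in a new language. Random vectors stand in for encoder outputs.
import torch
import torch.nn.functional as F

def distillation_loss(student_vec, teacher_vec):
    return 1.0 - F.cosine_similarity(student_vec, teacher_vec, dim=-1).mean()

teacher_vec = torch.randn(4, 512)                        # frozen teacher
student_vec = torch.randn(4, 512, requires_grad=True)    # trainable student
loss = distillation_loss(student_vec, teacher_vec)
loss.backward()   # gradients flow only into the student
```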

Modeling 200 languages

The context

Multilingual MT systems improve on bilingual systems thanks to their ability to "transfer" learning from language pairs with plenty of training data to other languages with fewer training resources.

The challenge

Jointly training hundreds of language pairs together has its disadvantages, as the same model must represent increasingly large numbers of languages with the same number of parameters. This is an issue when the dataset sizes are imbalanced, as it can cause overfitting.

The innovation

We’ve developed a Sparse Mixture-of-Experts model that has a shared and specialized capacity, so low-resource languages without much data can be automatically routed to the shared capacity. When combined with better regularization systems, this avoids overfitting. Further, we used self-supervised learning and large-scale data augmentation through multiple types of back translation.
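The routing idea can be sketched as follows: a learned gate picks one expert per token, so capacity is conditional and languages with little data can share pathways. Top-1 routing and the dimensions below are illustrative simplifications of NLLB's actual MoE layers.

```python
# A minimal Sparse Mixture-of-Experts layer sketch: a learned gate routes
# each token to its top-1 expert; scaling by the gate score keeps routing
# differentiable. Sizes are illustrative.
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, d_model=128, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.gate(x).softmax(dim=-1)
        top_score, top_idx = scores.max(dim=-1)  # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                out[mask] = expert(x[mask]) * top_score[mask].unsqueeze(-1)
        return out

layer = MoELayer()
print(layer(torch.randn(10, 128)).shape)         # torch.Size([10, 128])
```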

Evaluating translation quality

The context

To know if a translation produced by our model meets our quality standards, we must evaluate it.

The challenge

Machine translation models are typically evaluated by comparing machine-translated sentences with human translations. For many languages, however, reliable translation data is not available, so accurate evaluations are not possible.

The innovation

We doubled the coverage of FLORES, a human-translated evaluation benchmark, which now covers 200 languages. Through automatic metrics and human evaluation support, we’re able to extensively quantify the quality of our translations.
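With a human-translated benchmark in hand, scoring is mechanical; the snippet below uses the sacrebleu package on invented sentences to show the shape of such an evaluation.

```python
# Scoring system output against human references with automatic metrics;
# the sentences are made up, sacrebleu's corpus_bleu/corpus_chrf are real.
import sacrebleu

sys_out = ["The farmer planted the strange seeds."]
refs = [["The farmer planted strange seeds."]]   # one reference stream

print("BLEU:", sacrebleu.corpus_bleu(sys_out, refs).score)
print("chrF:", sacrebleu.corpus_chrf(sys_out, refs).score)
```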
Learn more about the science behind NLLB by reading our whitepaper and blog, and by downloading the model to help us take this project further.

The Journey

Research milestones

AI at Meta has been advancing Machine Translation technology while successfully overcoming numerous industry challenges along the way—from the unavailability of data for low-resource languages to translation quality and accuracy. Our journey continues, as we drive inclusion through the power of AI translation.

See model milestones by # of languages released

< 50 languages

50-99 languages

100 languages

LASER (Language-agnostic sentence representations)

2018

The first successful exploration of massively multilingual sentence representations shared publicly with the NLP community. The encoder creates embeddings to automatically pair up sentences sharing the same meaning in 50 languages.

Data Encoders

WMT-19

2019

FB AI models outperformed all other models at WMT 2019, using large-scale sampled back-translation, noisy channel modeling and data cleaning techniques to help build a strong system.

Model

Flores V1

2019

A benchmarking dataset for MT between English and low-resource languages, introducing a fair and rigorous evaluation process, starting with 2 languages.

Evaluation Dataset

WikiMatrix

2019

The largest extraction of parallel sentences across multiple languages: Bitext extraction of 135 million Wikipedia sentences in 1,620 language pairs for building better translation models.

Data Construction

M2M-100

2020

The first single multilingual machine translation model to directly translate between any pair of 100 languages without relying on English data. Trained on 2,200 language directions, 10x more than previous multilingual models.

Model

CCMatrix

2020

The largest dataset of high-quality, web-based bitexts for building better translation models that work with more languages, especially low-resource languages: 4.5 billion parallel sentences in 576 language pairs.

Data Construction

LASER 2

2020

Creates embeddings to automatically pair up sentences sharing the same meaning in 100 languages.

Data Encoders

WMT-21

2021

For the first time, a single multilingual model outperformed the best specially trained bilingual models across 10 out of 14 language pairs to win WMT 2021, providing the best translations for both low- and high-resource languages.

Model

FLORES-101

2021

FLORES-101 is the first-of-its-kind, many-to-many evaluation data set covering 101 languages, enabling researchers to rapidly test and improve upon multilingual translation models like M2M-100.

Evaluation Dataset

NLLB-200

2022

The NLLB model translates 200 languages.

Model

FLORES-200

2022

Expansion of the FLORES evaluation data set, now covering 200 languages.

Evaluation Dataset

NLLB-Data-200

2022

Constructed and released training data for 200 languages.

Data Construction

LASER 3

2022

Creates embeddings to automatically pair up sentences sharing the same meaning in 200 languages.

Data Encoders


From Assamese, Balinese and Estonian…to Icelandic, Igbo and more. 200 languages and counting…

Have a look at the full list of languages our NLLB-200 model supports—with 150 low-resource languages included. More will be added to this list as we, and our community, continue on this journey of inclusiveness through AI translation.

Full list of supported languages

Acehnese (Latin script)

Arabic (Iraqi/Mesopotamian)

Arabic (Yemen)

Arabic (Tunisia)

Afrikaans

Arabic (Jordan)

Akan

Amharic

Arabic (Lebanon)

Arabic (Modern Standard Arabic)

Arabic (Saudi Arabia)

Arabic (Morocco)

Arabic (Egypt)

Assamese

Asturian

Awadhi

Aymara

Crimean Tatar

Welsh

Danish

German

French

Friulian

Fulfulde

Dinka (Rek)

Dyula

Dzongkha

Greek

English

Esperanto

Estonian

Basque

Ewe

Faroese

Iranian Persian

Icelandic

Italian

Javanese

Japanese

Kabyle

Kachin | Jinghpo

Kamba

Kannada

Kashmiri (Arabic script)

Kashmiri (Devanagari script)

Georgian

Kanuri (Arabic script)

Kanuri (Latin script)

Kazakh

Kabiye

Thai

Khmer

Kikuyu

South Azerbaijani

North Azerbaijani

Bashkir

Bambara

Balinese

Belarusian

Bemba

Bengali

Bhojpuri

Banjar (Latin script)

Tibetan

Bosnian

Buginese

Bulgarian

Catalan

Cebuano

Czech

Chokwe

Central Kurdish

Fijian

Finnish

Fon

Scottish Gaelic

Irish

Galician

Guarani

Gujarati

Haitian Creole

Hausa

Hebrew

Hindi

Chhattisgarhi

Croatian

Hungarian

Armenian

Igbo

Ilocano

Indonesian

Kinyarwanda

Kyrgyz

Kimbundu

Kongo

Korean

Kurdish (Kurmanji)

Lao

Latvian (Standard)

Ligurian

Limburgish

Lingala

Lithuanian

Lombard

Latgalian

Luxembourgish

Luba-Kasai

Ganda

Dholuo

Mizo


200 languages translated by the NLLB-200 model, 2x the coverage of our previous model

Our final model achieves a 44% BLEU improvement over the previous state-of-the-art model

75 languages previously unsupported by commercial translation systems

18 billion parallel sentences, 2.5x more training data than our previous M2M-100 model

The largest open-source machine translation model at 54B parameters, 5x more than our previous M2M-100 model

40,000 translation directions supported by a single model, more than 4x the capability of the previous benchmark

The research advancements from NLLB support more than 25 billion translations served every day on Facebook News Feed, Instagram, and our other platforms
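For scale: an any-to-any model over 200 languages covers 200 × 199 = 39,800 ordered source-target directions, roughly 4x the 100 × 99 = 9,900 directions of a 100-language model like M2M-100, consistent with the figures above.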


Learn More

Let's take No Language Left Behind further, together.

There’s more to learn about NLLB, and even more to accomplish with it. Read our whitepaper and blog for details, and download the model to help us take this project further. While we’ve reached 200 languages, we’ve only just begun. Join us, and build with us, as we continue on this important journey of translation and inclusion.
