Philipp Koehn (Part 2) - How Neural Networks have Transformed Machine Translation
This is Part 2 of our conversation with Professor Philipp Koehn of Johns Hopkins University. Professor Koehn is one of the world’s leading experts in the field of Machine Translation & NLP.
In this episode we delve into commercial applications of machine translation, survey the open source tools available, and look at what to expect from the field in the future.
Episode Summary:
- Typical datasets used for training models
- The role of infrastructure and technology in Machine Translation
- How academic research in Machine Translation has made its way into industry applications
- Overview of the open source tools available for Machine Translation
- The future of Machine Translation, and whether it can pass a Turing test
Resources:
Philipp Koehn's latest book - Neural Machine Translation - Amazon link:
https://www.amazon.com/Neural-Machine-Translation-Philipp-Koehn/dp/1108497322
Omniscien Technologies - Leading Enterprise Provider of machine translation services:
Open Source tools:
- Fairseq https://fairseq.readthedocs.io/en/latest/
- Marian https://marian-nmt.github.io/
- OpenNMT https://opennmt.net/
- Sockeye https://awslabs.github.io/sockeye/
Translated texts (parallel data) for training:
- OPUS http://opus.nlpl.eu/
- Paracrawl https://paracrawl.eu/
Two papers mentioned regarding the excessive use of computing power to train NLP models:
- GPT-3 https://arxiv.org/abs/2005.14165
- RoBERTa https://arxiv.org/abs/1907.11692