Wednesday, 26 July 2017
Tuesday, 16 May 2017 22:16

Facebook: A novel approach to neural machine translation

Language translation is important to Facebook's mission of making the world more open and connected, enabling everyone to consume posts or videos in their preferred language — all at the highest possible accuracy and speed.

Today, the Facebook Artificial Intelligence Research (FAIR) team published research results using a novel convolutional neural network (CNN) approach for language translation that achieves state-of-the-art accuracy at nine times the speed of recurrent neural systems [1]. Additionally, the FAIR sequence modeling toolkit (fairseq) source code and the trained systems are available under an open source license on GitHub so that other researchers can build custom models for translation, text summarization, and other tasks.

Why convolutional neural networks?

Originally developed by Yann LeCun decades ago, CNNs have been very successful in several machine learning fields, such as image processing. However, recurrent neural networks (RNNs) are the incumbent technology for text applications and have been the top choice for language translation because of their high accuracy.

Though RNNs have historically outperformed CNNs at language translation tasks, their design has an inherent limitation, which can be understood by looking at how they process information. Computers translate text by reading a sentence in one language and predicting a sequence of words in another language with the same meaning. RNNs operate in a strict left-to-right or right-to-left order, one word at a time. This is a less natural fit to the highly parallel GPU hardware that powers modern machine learning. The computation cannot be fully parallelized, because each word must wait until the network is done with the previous word. In comparison, CNNs can compute all elements simultaneously, taking full advantage of GPU parallelism. They therefore are computationally more efficient. Another advantage of CNNs is that information is processed hierarchically, which makes it easier to capture complex relationships in the data.
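
The contrast above can be sketched in a few lines of NumPy. This is a toy illustration under assumed shapes and random weights, not the actual fairseq code: the RNN hidden states must be computed in a sequential loop because each depends on the previous one, while the convolutional states are one windowed operation over all positions at once.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                      # sequence length, hidden size
x = rng.normal(size=(T, d))      # embedded input words

# RNN: each hidden state depends on the previous one -> sequential loop
W, U = rng.normal(size=(d, d)), rng.normal(size=(d, d))
h = np.zeros(d)
rnn_states = []
for t in range(T):               # cannot be parallelized across t
    h = np.tanh(x[t] @ W + h @ U)
    rnn_states.append(h)
rnn_states = np.array(rnn_states)

# CNN: every output position depends only on a fixed local window,
# so all positions can be computed simultaneously
k = 3                                         # kernel width
K = rng.normal(size=(k, d, d))
pad = np.vstack([np.zeros((k - 1, d)), x])    # left-pad (causal)
windows = np.stack([pad[t:t + k] for t in range(T)])   # (T, k, d)
conv_states = np.tanh(np.einsum('tkd,kde->te', windows, K))

print(rnn_states.shape, conv_states.shape)   # both (6, 4)
```

On a GPU, the `einsum` over all windows is a single batched operation, which is the source of the efficiency gain described above.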

In previous research, CNNs applied to translation had not outperformed RNNs. Nevertheless, because of their architectural potential, FAIR began research that led to a translation model design demonstrating strong CNN performance on translation. The greater computational efficiency of CNNs has the potential to scale translation and cover more of the world's 6,500 languages.

State-of-the-art results at record speed

Our results demonstrate a new state of the art compared with RNNs [2] on widely used public benchmark data sets provided by the Conference on Machine Translation (WMT). When the CNN and the best RNN of similar size are trained in the same way, the CNN outperforms it by 1.5 BLEU (a widely used metric for judging the accuracy of machine translation) on the WMT 2014 English-French task. On WMT 2014 English-German, the improvement is 0.5 BLEU, and on WMT 2016 English-Romanian, we improve by 1.8 BLEU.
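
BLEU scores a candidate translation by its n-gram overlap with a reference, discounted by a brevity penalty. A simplified single-reference sketch is below; real evaluations use smoothed, corpus-level BLEU, so this toy version is for intuition only.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of n-gram
    precisions times a brevity penalty (single reference, no smoothing)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        overlap = sum((cand & ref).values())   # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        if overlap == 0:
            return 0.0                         # unsmoothed: any zero kills it
        log_precisions.append(math.log(overlap / total))
    # penalize candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(log_precisions) / max_n)

hyp = "the cat sat on the mat".split()
ref = "the cat sat on the mat".split()
print(round(bleu(hyp, ref), 2))   # 1.0 for a perfect match
```

A gain of 1.5 BLEU on a standard test set, as reported above, is a substantial margin by the conventions of this metric.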

One consideration with neural machine translation for practical applications is how long it takes to get a translation once we show the system a sentence. The FAIR CNN model is computationally very efficient and is nine times faster than strong RNN systems. Much research has focused on speeding up neural networks through quantizing weights or distillation, to name a few methods, and those can be equally applied to the CNN model to increase speed even more, suggesting significant future potential.

Better translation with multi-hop attention and gating

A distinguishing component of our architecture is multi-hop attention. An attention mechanism is similar to the way a person would break down a sentence when translating it: Instead of looking at the sentence only once and then writing down the full translation without looking back, the network takes repeated “glimpses” at the sentence to choose which words it will translate next, much like a human occasionally looks back at specific keywords when writing down a translation [3]. Multi-hop attention is an enhanced version of this mechanism, which allows the network to make multiple such glimpses to produce better translations. These glimpses also depend on each other. For example, the first glimpse could focus on a verb and the second glimpse on the associated auxiliary verb.
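
The idea of dependent glimpses can be sketched with plain dot-product attention in NumPy. This is a toy illustration under assumed shapes; the actual model in [1] uses learned projections and convolutional blocks at every decoder layer.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
S, d = 5, 8                            # source length, model dimension
encoder_out = rng.normal(size=(S, d))  # one vector per source (French) word

# Decoder state for the word being predicted, refined over two layers;
# each layer takes its own "glimpse" (attention pass) at the source.
state = rng.normal(size=d)
for layer in range(2):                     # two hops of attention
    scores = encoder_out @ state           # relevance of each source word
    weights = softmax(scores)              # attention distribution
    glimpse = weights @ encoder_out        # weighted summary of the source
    state = state + glimpse                # second hop conditions on the first

print(state.shape)   # (8,)
```

Because the second hop's scores are computed from a state that already contains the first glimpse, the two attention passes depend on each other, as in the verb/auxiliary-verb example above.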

In the figure below, we show how the system reads a French phrase (encoding) and then outputs an English translation (decoding). We first run the encoder to create a vector for each French word using a CNN, and the computation is done simultaneously. Next, the decoder CNN produces English words, one at a time. At every step, the attention glimpses the French sentence to decide which words are most relevant to predict the next English word in the translation. There are two so-called layers in the decoder, and the animation illustrates how the attention is done once for each layer. The strength of the green lines indicates how much the network focuses on each French word. When the network is being trained, the translation is always available, and the computation for the English words also can be done simultaneously.

Another aspect of our system is gating, which controls the information flow in the neural network. In every neural network, information flows through so-called hidden units. Our gating mechanism controls exactly which information should be passed on to the next unit so that a good translation can be produced. For example, when predicting the next word, the network takes into account the translation it has produced so far. Gating allows it to zoom in on a particular aspect of the translation or to get a broader picture — all depending on what the network deems appropriate in the current context.
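
Concretely, the convolutional blocks in [1] implement this gating with gated linear units (GLUs): half of the channels produced by a convolution act as sigmoid gates on the other half. A minimal NumPy sketch:

```python
import numpy as np

def glu(x):
    """Gated linear unit: the second half of the channels gates the first.
    Output has half as many channels as the input."""
    a, b = np.split(x, 2, axis=-1)
    return a * (1.0 / (1.0 + np.exp(-b)))   # a * sigmoid(b)

rng = np.random.default_rng(2)
hidden = rng.normal(size=(6, 16))   # 6 positions, 16 channels from a conv
out = glu(hidden)
print(out.shape)   # (6, 8) -- gating halves the channel count
```

Each sigmoid value lies between 0 and 1, so the gate decides per channel how much of the candidate information passes on to the next block.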

Future developments

This approach is an alternative architecture for machine translation that opens up new possibilities for other text processing tasks. For example, multi-hop attention in dialogue systems allows neural networks to focus on distinct parts of the conversation, such as two separate facts, and to tie them together in order to better respond to complex questions.


[1] Convolutional Sequence to Sequence Learning. Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin. arXiv, 2017

[2] Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. Technical Report, 2016.

[3] Neural Machine Translation by Jointly Learning to Align and Translate. Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. International Conference on Learning Representations, 2015.

Source: Facebook
