
"The main factor that drives the model's accuracy is the bidirectional LSTM encoder, to create the position-sensitive features. The authors demonstrate this by swapping the attention mechanism out for average pooling. With average pooling, the model still outperforms the previous state-of-the-art on all benchmarks. However, the attention mechanism improves performance further on all evaluations. I find this especially interesting. The implications are quite general — there are after all plenty of situations where you want to reduce a matrix to a vector for further prediction, without reference to any particular external context.