![The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/elmo-word-embedding.png)
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time.
![The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/transformer-ber-ulmfit-elmo.png)
Catherine Yeo (she/her) on Twitter: "This trend started with ELMo (Embeddings from Language Models) in 2018 by Matthew Peters, @MarkNeumannnn, @MohitIyyer, @nlpmattg, et al. Outside NLP, @elmo likes surprises, pizza, and bubble
![The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/openai-input%20transformations.png)
![The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/bert-transfer-learning.png)
![Beyond Word Embeddings Part 2. A primer in the neural nlp model… | by Aaron (Ari) Bornstein | Towards Data Science](https://miro.medium.com/max/1160/1*euk-3hzyi9nJvTdWFmfrqQ.png)
Beyond Word Embeddings Part 2. A primer in the neural nlp model… | by Aaron (Ari) Bornstein | Towards Data Science
![The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time.](https://jalammar.github.io/images/bert-tasks.png)
![Beyond Word Embeddings Part 2. A primer in the neural nlp model… | by Aaron (Ari) Bornstein | Towards Data Science](https://miro.medium.com/max/1234/1*8WhXg3oXUC4s-m7F2ePLEA.png)