The Transformer is a powerful architecture that can be difficult to understand. There are many great explanations on the web, each approaching the subject in a different way. Here I link to the explanations I liked and note who I believe the target audience is for each one.

The goal is to provide a collection of links for you to choose from, but reading all of them is still worthwhile: engaging with the concept from different perspectives helps cement your knowledge.

  • Transformers from scratch
    • Target audience: people with Machine Learning background
    • Comment/opinion: An all-around outstanding explanation that includes clear code & excellent illustrations. A personal favorite.

  • The transformer … “explained”?
    • Target audience: people with general Computer Science background
    • Comment/opinion: Excellent overview, motivation, and intuition. No pictures. No math. Short.

  • Formal Algorithms for Transformers
    • Target audience: mathematically-minded ML people
    • Comment/opinion: Very nice & clear formalism. No pictures.

  • The Illustrated Transformer
    • Target audience: ML people who know what an embedding is
    • Comment/opinion: Great illustrations. The explanation of self-attention was not intuitively clear to me.

  • Transformer - Illustration and code
    • Target audience: ML people who know what an embedding is & find reading code helpful
    • Description by the author: “This notebook combines the excellent illustration of the transformer by Jay Alammar and the code annotation by harvardnlp lab.”
    • Comment/opinion: Reading the code was very helpful for me. Math not rendered nicely. Should be read after The Illustrated Transformer.

  • The Annotated Transformer
    • Target audience: ML people who know what an embedding is, find reading code helpful, and are interested in full details of the original Transformer paper
    • Comment/opinion: A rearranged version of the paper interleaved with the code. Extensive. Math rendered nicely. Illustrations are OK.

That’s it! If you have suggestions on what else to include, add a comment!