An in-depth review of some of the most famous policy gradient algorithms, starting from the vanilla policy gradient, then discussing actor-critic algorithms, and finally arriving at PPO. Reference implementations are shown and step-by-step improvements are discussed.
A blog post with a step-by-step implementation of the Transformer model, with even more annotations. Specific design choices are discussed and hidden implementation details are highlighted. At the end, an example of training on the Multi30k machine translation dataset is shown.
A brief history of one of the most famous CNN architectures and how it was further improved. The evolution of the residual block is discussed, and a procedure for designing the full residual network is given.