An Introduction to Transformers and Large Language Models

Jasper Hoffmann

Neurorobotics Lab, University of Freiburg

Tuesday, March 14, 2023, 11:00

Room 02-012, Georges-Köhler Allee 102, Freiburg 79110, Germany

Recent large language models such as Generative Pre-trained Transformer 3 (GPT-3) and ChatGPT show great potential for generating poetry, source code, and even essays. In this talk, we take a quick dive into large language models for people with no prior experience in transformers or language modelling. Starting from natural language generation and transformers, we then discuss the pre-training tasks of GPT-3 as well as the reinforcement learning methods used to train ChatGPT. Finally, we conclude by exploring some of the limitations and potential of GPT models.