Large Language Models - Bachelor's Seminar (SS 2024)

Summary

Large Language Models (such as GPT-2, GPT-3, GPT-4, RoBERTa, and T5) and intelligent chatbots (such as ChatGPT, Bard, and Claude) are a very timely topic.

Contents:

N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models

Learning Goals:

Participants will first learn the basics of n-gram language models, neural language modeling, RNNs, and Transformers. In the second half of the seminar, participants will present an application of a modern large language model, intelligent chatbot, or similar system. The class involves a large amount of reading on both basic and advanced topics.

Instructors

Alexander Fraser
email: contact101@alexanderfraser.de

TUM


Dr. Marion Di Marco

TUM


Schedule

Tuesdays: 16:15-17:45, D.2.12

Read the materials *before* class!
Date Topic Materials
April 16th Introduction to the class (AF) and Language for LLMs (MDM) orientation.pdf
Linguistic Background for LLMs
April 23rd Dan Jurafsky and James H. Martin (2023). Speech and Language Processing (3rd ed. draft), Chapter 3, N-gram Language Models article lecture
April 30th Y Bengio, R Ducharme, P Vincent, C Jauvin (2003). A neural probabilistic language model. Journal of Machine Learning Research 3, 1137-1155 pdf
May 7th Noah A. Smith (2020). Contextual Word Representations: A Contextual Introduction. arXiv. pdf
May 14th Lena Voita. NLP Course: Neural Language Models, Sequence to Sequence (seq2seq) and Attention. Web Tutorials: neural language models, seq2seq and attention
May 28th Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin (2017). Attention Is All You Need. NIPS paper
June 4th Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT paper
June 11th Project: structured prompting project-paper
project-data-english
project-data-hindi
project-data-telugu
June 18th Long Ouyang, Jeff Wu, et al. (2022). Training language models to follow instructions with human feedback. arXiv. paper (InstructGPT)
June 25th Project
July 2nd Lecture on Linguistic Information in Multilingual Language Models (MDM) pdf
July 9th Project Presentation