Large Language Models - Bachelor's Seminar (SS 2024)

Summary

Large Language Models (such as GPT-2, GPT-3, GPT-4, RoBERTa, and T5) and intelligent chatbots (such as ChatGPT, Bard, and Claude) are a very timely topic.

Contents:

N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models

Learning Goals:

Participants will first learn the basics of n-gram language models, neural language modeling, RNNs, and Transformers. In the second half of the seminar, participants will present an application of a modern large language model, intelligent chatbot, or similar system. The class involves a large amount of reading on both basic and advanced topics.

Instructors

Alexander Fraser
email: contact101@alexanderfraser.de

TUM


Dr. Marion Di Marco

TUM


Schedule

Tuesdays: 16:15-17:45, D.2.12

Read the materials *before* class!
Date Topic Materials
April 16th Introduction to the class (AF) and Language for LLMs (MDM) orientation.pdf
Linguistic Background for LLMs
April 23rd Dan Jurafsky and James H. Martin (2023). Speech and Language Processing (3rd ed. draft), Chapter 3, N-gram Language Models article lecture
April 30th Y Bengio, R Ducharme, P Vincent, C Jauvin (2003). A neural probabilistic language model. Journal of Machine Learning Research 3, 1137-1155 pdf
May 7th Noah A. Smith (2020). Contextual Word Representations: A Contextual Introduction. arXiv. pdf
May 14th Lena Voita. NLP Course: Neural Language Models, Sequence to Sequence (seq2seq) and Attention. Web Tutorials: neural language models, seq2seq and attention
May 28th Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin (2017). Attention Is All You Need. NIPS paper
June 4th Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT paper
June 11th Project: structured prompting project-paper
project-data-english
project-data-hindi
project-data-telugu
June 18th Long Ouyang, Jeff Wu, et al. (2022). Training language models to follow instructions with human feedback. arXiv. paper (InstructGPT)
June 25th Project
July 2nd Lecture on Linguistic Information in Multilingual Language Models (MDM) pdf
July 9th Project Presentation