Large Language Models (such as GPT-2, GPT-3, GPT-4, RoBERTa, T5) and Intelligent Chatbots (such as ChatGPT, Bard and Claude) are a very timely topic.
N-gram language models, neural language modeling, word2vec, RNNs, Transformers, BERT, RLHF, ChatGPT, multilingual alignment, prompting, transfer learning, domain adaptation, linguistic knowledge in large language models
Learning Goals:
The participants will first learn the basics of n-gram language models, neural language modeling, RNNs and Transformers. In the second half of the seminar, participants will present an application of a modern large language model, intelligent chatbot or similar system. This class will involve a large amount of reading on both the basics and advanced topics.
Alexander Fraser
Dr. Marion Di Marco
Tuesdays: 16:15-17:45, D.2.12
Read the materials *before* class!
Date | Topic | Materials |
April 16th | Introduction to the class (AF) and Language for LLMs (MDM) | orientation.pdf Linguistic Background for LLMs |
April 23rd | Dan Jurafsky and James H. Martin (2023). Speech and Language Processing (3rd ed. draft), Chapter 3, N-gram Language Models | article lecture |
April 30th | Y Bengio, R Ducharme, P Vincent, C Jauvin (2003). A neural probabilistic language model. Journal of Machine Learning Research 3, 1137-1155 | |
May 7th | Noah A. Smith (2020). Contextual Word Representations: A Contextual Introduction. arXiv. | |
May 14th | Lena Voita. NLP Course: Neural Language Models, Sequence to Sequence (seq2seq) and Attention. Web Tutorial | neural language models seq2seq and attention |
May 28th | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin (2017). Attention Is All You Need. NIPS | paper |
June 4th | Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT | paper |
June 11th | Project: structured prompting | project-paper project-data-english project-data-hindi project-data-telugu |
June 18th | Long Ouyang, Jeff Wu, et al. (2022). Training language models to follow instructions with human feedback. arXiv. | paper (InstructGPT) |
June 25th | Project | |
July 2nd | Lecture on Linguistic Information in Multilingual Language Models (MDM) | |
July 9th | Project Presentation | |