- Level: Intermediate
- Duration: 13 hours
- Course by University of Glasgow
About
Large language models such as GPT-3.5, which powers ChatGPT, are changing how humans interact with computers and how computers process text. This course introduces the fundamental ideas of natural language processing and language modelling that underpin these large language models. We will explore the basics of how language models work and the specifics of how newer neural approaches are built, and we will examine the key innovations that have enabled Transformer-based large language models to become dominant across a range of language tasks. Finally, we will examine the challenges of applying these models in practice, including the ethical issues involved in their construction and use.
Through hands-on labs, we will learn about the building blocks of Transformers and apply them to generate new text. These Python exercises step you through working with a smaller language model, showing how it can be evaluated and applied to a range of problems. Regular practice quizzes will help reinforce the knowledge and prepare you for the graded assessments.
Modules
N-Gram Language Models
- 3 Videos
- 1 Lab
- 1 Assignment
1 Assignment
- Lesson Quiz
1 Lab
- Building Your Own N-Gram Language Model
3 Videos
- Course and Instructor Introductions
- What is a Language Model?
- N-gram Language Models
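The lab in this module has you build your own n-gram model. As a taste of what that involves (this is an illustrative sketch, not the course's lab code), a bigram model can be trained by counting adjacent word pairs and normalising the counts into conditional probabilities:

```python
from collections import Counter, defaultdict

def train_bigram_lm(corpus):
    """Count bigrams and estimate P(word | previous word) by maximum likelihood."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    # Normalise each row of counts into a conditional probability distribution
    return {
        prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
        for prev, nxt in counts.items()
    }

corpus = ["the cat sat", "the dog sat", "the cat ran"]
lm = train_bigram_lm(corpus)
print(lm["the"]["cat"])  # 2 of the 3 words following "the" are "cat", so 2/3
```

The `<s>` and `</s>` markers let the model also learn which words start and end sentences.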
Evaluating Language Models
- 1 Video
- 1 Lab
- 1 Assignment
1 Assignment
- Lesson Quiz
1 Lab
- Calculating Perplexity from an N-Gram Language Model
1 Video
- Evaluating Language Models
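The standard intrinsic evaluation covered here is perplexity: the exponential of the average negative log-probability the model assigns to each token, so lower is better. A minimal sketch for a dict-based bigram model (the toy probabilities below are invented for illustration):

```python
import math

def perplexity(lm, sentence):
    """Perplexity = exp of the average negative log-probability per token."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = lm.get(prev, {}).get(word, 1e-10)  # tiny floor for unseen bigrams
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))

# Toy model: every continuation has probability 0.5,
# so the perplexity should come out to exactly 2.
lm = {"<s>": {"the": 0.5}, "the": {"cat": 0.5}, "cat": {"</s>": 0.5}}
print(perplexity(lm, "the cat"))  # 2.0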
Text Generation with Language Models
- 2 Videos
- 1 Reading
- 1 Lab
- 2 Assignments
2 Assignments
- Lesson Quiz
- Module 1 Quiz
1 Lab
- Building a Simple Text Generator
2 Videos
- Generating Text
- Summary of Module
1 Reading
- Optional Reading
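A simple text generator of the kind built in this module's lab samples one token at a time from the model's next-word distribution until an end-of-sentence marker appears. A hedged sketch, assuming a dict-based bigram model like the one above (the tiny model here is made up for the demo):

```python
import random

def generate(lm, max_len=20, seed=0):
    """Sample a sentence token-by-token from a bigram model."""
    rng = random.Random(seed)
    tokens, prev = [], "<s>"
    while len(tokens) < max_len:
        nxt = lm[prev]
        words = list(nxt)
        word = rng.choices(words, weights=[nxt[w] for w in words])[0]
        if word == "</s>":
            break  # the model chose to end the sentence
        tokens.append(word)
        prev = word
    return " ".join(tokens)

lm = {"<s>": {"the": 1.0}, "the": {"cat": 0.5, "dog": 0.5},
      "cat": {"sat": 1.0}, "dog": {"sat": 1.0}, "sat": {"</s>": 1.0}}
print(generate(lm))  # either "the cat sat" or "the dog sat"
```

Fixing the random seed makes the sampling reproducible, which is handy when debugging a generator.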
Tokenization
- 3 Videos
- 1 Lab
- 1 Assignment
1 Assignment
- Lesson Quiz
1 Lab
- Tokenization
3 Videos
- Introduction to Week
- Representing text with numerical vectors
- Words, tokens or sub-tokens
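The "words, tokens or sub-tokens" distinction is easiest to see in code. This is a toy greedy longest-match segmenter with a hand-picked vocabulary, standing in for the subword algorithms (such as BPE or WordPiece) that the lectures cover; it is an illustration, not the course's tokenizer:

```python
def subword_tokenize(word, vocab):
    """Greedily split a word into the longest vocabulary pieces, left to right."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character: emit it on its own
            i += 1
    return tokens

vocab = {"un", "break", "able", "bre", "ak"}
print(subword_tokenize("unbreakable", vocab))  # ['un', 'break', 'able']
```

Subword schemes like this let a model handle words it never saw in training by composing them from known pieces.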
Neural language models & Transformers
- 3 Videos
- 2 Labs
- 1 Assignment
1 Assignment
- Lesson Quiz
2 Labs
- Language Modelling with Transformers
- Calculating perplexity
3 Videos
- Neural language models
- The Transformer architecture
- How are Transformers trained?
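The core operation inside the Transformer architecture is scaled dot-product attention: each query scores every key, the scores are softmax-normalised, and the output is the resulting weighted average of the values. A plain-Python sketch of the single-head case (library implementations vectorise this over batches and heads):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# One query that matches the first key far better than the second,
# so the output is pulled toward the first value vector.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[10.0, 0.0], [0.0, 10.0]]
print(attention(q, k, v))
```

Because the weights sum to 1, the output is always a convex combination of the value vectors, tilted toward whichever keys the query matches.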
GPT
- 4 Videos
- 1 Reading
- 1 Lab
- 2 Assignments
2 Assignments
- Lesson Quiz
- Module 2 Quiz
1 Lab
- Language Generation with Transformers
4 Videos
- GPT and BERT
- Using GPT
- The ethical challenges of large language models
- Wrap-up of the module
1 Reading
- On the Dangers of Stochastic Parrots
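When generating text with a GPT-style model, the choice of decoding strategy matters as much as the model itself. A common knob is temperature, which rescales the logits before sampling; this sketch (an illustration of the general idea, not the lab's code, with made-up logits) contrasts greedy decoding with temperature sampling:

```python
import math
import random

def sample_next(logits, temperature=1.0, rng=None):
    """Pick the next token index from raw logits.
    Temperature 0 = greedy argmax; higher temperatures flatten the distribution."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max before exp for numerical stability
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    return (rng or random).choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.1]
print(sample_next(logits, temperature=0))  # 0: greedy always picks the largest logit
```

Low temperatures make output predictable and repetitive; high temperatures make it diverse but error-prone.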
Hallucinations in LLMs
- 2 Videos
- 1 Reading
- 1 Assignment
1 Assignment
- Lesson Quiz
2 Videos
- Introduction to the Module
- Hallucinations in LLMs
1 Reading
- Optional: ChatGPT Lawyer details
LLM use cases: Chatbots, Academic uses
- 2 Videos
- 5 Readings
- 1 Lab
- 1 Assignment
1 Assignment
- Lesson Quiz
1 Lab
- Basic Chatbot
2 Videos
- Chatbots, RLHF, and ChatGPT
- Academic conduct
5 Readings
- Required: OpenAI blog post about ChatGPT and RLHF
- Required: Research about AI detectors and non-native speakers
- Optional: Ouyang et al. (2022) paper about InstructGPT
- Optional: News stories about the humans in RLHF
- Optional: Russell Group and University of Glasgow Principles on AI in Education
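Before RLHF-trained systems like ChatGPT, many chatbots were simple keyword-matching programs, and a basic chatbot lab often starts from that baseline. A minimal rule-based sketch (purely illustrative; the rules and replies here are invented):

```python
def respond(message, rules, default="Sorry, I don't understand."):
    """Return the reply for the first rule whose keyword appears
    in the lower-cased user message."""
    text = message.lower()
    for keyword, reply in rules:
        if keyword in text:
            return reply
    return default

rules = [
    ("hello", "Hi there! How can I help?"),
    ("perplexity", "Perplexity measures how surprised a model is by text."),
    ("bye", "Goodbye!"),
]
print(respond("Hello, bot!", rules))  # Hi there! How can I help?
```

The gap between this kind of brittle pattern matching and a model fine-tuned with RLHF is exactly what the lectures in this module explore.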
Risks real and imagined
- 3 Videos
- 6 Readings
- 2 Assignments
2 Assignments
- Lesson Quiz
- Module 3 Quiz
3 Videos
- Creativity and copyright
- Regulation and risks
- Course Recap
6 Readings
- Required: Scientific American article on AI risks
- Optional: AI-generated e-books and journalism
- Optional: Generative AI, copyright, and Hollywood strikes
- Optional: Open letters
- Optional: Information on draft legislation
- Optional: One Hundred Year Study on Artificial Intelligence
Instructors
Mary Ellen Foster
Sean MacAvaney
Jake Lever