Class Topics (Winter 2023)
Below is a list of class topics and relevant materials to use as a reading resource. Each topic comes with related readings: lecture notes, blog posts, papers, and other resources.
We broadly organize class topics under three areas: (1) Fundamentals, (2) Survey of Existing FMs and their Applications, and (3) Societal Considerations & Impact. The list of readings and materials below is not exhaustive; we’ll update this page as the quarter progresses, and we encourage you to dig deeper into the topics that interest you. Note that the topics are not listed in the order in which they’ll be covered in class.
Special thanks to the original content creators, including the authors of the course notes from a past version of CS 324.
Table of Contents
- Fundamentals
- What are foundation models (FMs) and why are they interesting?
- How does data impact FMs and what are the downstream effects?
- How do we train FMs and what are the downstream effects?
- Model Architectures and Training Objectives for FMs
- Emergent Behaviors and Capabilities
- Adapting FMs to New Tasks and Data Domains
- Training Methods and Infrastructure
- Survey of Existing FMs and their Applications
- Societal Considerations & Impact
Fundamentals
What are foundation models (FMs) and why are they interesting?
- Background on Neural Networks (course series from Andrej Karpathy):
- Course Notes:
- Blog Posts:
- Papers:
How does data impact FMs and what are the downstream effects?
- Course Notes:
- Blog Posts:
- Papers:
Model Architectures and Training Objectives for FMs
- Course Notes:
- Blog Posts:
- Papers:
Emergent Behaviors and Capabilities
- Course Notes:
- Blog Posts:
- Papers:
  - Scaling Laws for Neural Language Models
  - Show Your Work: Scratchpads for Intermediate Computation with Language Models
  - Chain of Thought Prompting Elicits Reasoning in Large Language Models
  - Ask Me Anything: A simple strategy for prompting language models
  - Emergent Abilities of Large Language Models
  - Data Distributional Properties Drive Emergent Few-Shot Learning in Transformers
  - An Explanation of In-Context Learning as Implicit Bayesian Inference
  - Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Adapting FMs to New Tasks and Data Domains
- Course Notes:
- Blog Posts:
- Papers:
  - Multitask Prompted Training Enables Zero-Shot Task Generalization
  - Finetuned Language Models Are Zero-Shot Learners
  - Prefix-Tuning: Optimizing Continuous Prompts for Generation
  - Training Language Models to Follow Instructions with Human Feedback
  - The Power of Scale for Parameter-Efficient Prompt Tuning
  - LoRA: Low-Rank Adaptation of Large Language Models
  - Fast Model Editing at Scale
Training Methods and Infrastructure
- Course Notes:
- Blog Posts:
- Papers:
  - Decentralized Training of Foundation Models in Heterogeneous Environments
  - Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
  - FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  - OPT: Open Pre-trained Transformer Language Models, Section 2.5 (Training Processes); also see the full training logbook
Survey of Existing FMs and their Applications
Text and (Masked) Language Modeling FMs
- Course Notes:
- Blog Posts:
- Papers:
  - Language Models are Unsupervised Multitask Learners
  - Language Models are Few-Shot Learners
  - Training Language Models to Follow Instructions with Human Feedback
  - CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
  - Highly Accurate Protein Structure Prediction with AlphaFold
  - Language Models of Protein Sequences at the Scale of Evolution Enable Accurate Structure Prediction
  - GatorTron: A Large Language Model for Clinical Natural Language Processing
  - Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset
  - LegalBench: Prototyping a Collaborative Benchmark for Legal Reasoning
Image-Text and Multimodal FMs
- Course Notes:
- Blog Posts:
- Papers:
  - Learning Transferable Visual Models From Natural Language Supervision
  - Zero-Shot Text-to-Image Generation
  - Denoising Diffusion Probabilistic Models
  - High-Resolution Image Synthesis with Latent Diffusion Models
  - Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
  - Imagen Video: High Definition Video Generation with Diffusion Models
  - Make-A-Video: Text-to-Video Generation without Text-Video Data
  - DreamFusion: Text-to-3D using 2D Diffusion
  - Point-E: A System for Generating 3D Point Clouds from Complex Prompts
  - Flamingo: a Visual Language Model for Few-Shot Learning
  - CM3: A Causal Masked Multimodal Model of the Internet
  - A Generalist Agent
  - Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
  - MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Societal Considerations & Impact
Security and Privacy
- Course Notes:
- Blog Posts:
- Papers:
Environmental Impact
- Course Notes:
- Blog Posts:
- Papers:
Legal Considerations
- Course Notes:
- Blog Posts:
- Papers / Articles: