CS 723 Reading List

  • Please submit your reviews to martijn.devos@epfl.ch by Monday noon.
  • Week #1 (Feb. 21): Introduction
      The Task of the Referee
  • Week #2 (Feb. 28): Benchmarks and Analytics
      MLPerf Training Benchmark [Google et al., MLSys 20]
      MLPerf Inference Benchmark [Harvard et al., ISCA 20]
  • Week #3 (Mar. 07): ML Inference at Scale
      Efficiently Scaling Transformer Inference [arXiv 22]
      Orca: A Distributed Serving System for Transformer-Based Generative Models [OSDI 22]
  • Week #4 (Mar. 14): Large Language Models (LLMs)
      BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [arXiv 22]
      LLaMA: Open and Efficient Foundation Language Models [Meta AI, arXiv 23]
  • Week #5 (Mar. 21): Sustainability
      Sustainable AI: Environmental Implications, Challenges and Opportunities [MLSys 22]
      Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training [NSDI 23]
  • Week #6 (Mar. 28): Systems & ML
      A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms [Harvard, MLSys 20]
      Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning [OSDI 22]
  • Week #7 (Apr. 04): Deep Learning with Low-Precision Encoding
      LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale [NeurIPS 22]
      FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding [HPCA 22]
  • Week #8 (Apr. 18): Hardware Accelerators for Deep Learning
      Ten Lessons From Three Generations Shaped Google’s TPUv4i [Google, ISCA 21]
      RaPiD: AI Accelerator for Ultra-low Precision Training and Inference [IBM, ISCA 21]
  • Week #9 (Apr. 25): Sparsity in Deep Neural Networks
      Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models [ICLR 22]
      CrAM: A Compression-Aware Minimizer [arXiv 22]
  • Week #10 (May 02): Domain-Specific Languages for ML
      Blink: Fast and Generic Collectives for Distributed ML [Microsoft et al., MLSys 20]
      MSCCLang: Microsoft Collective Communication Language [Microsoft, ASPLOS 23]
  • Week #11 (May 09): New Training Paradigms
      Beyond Data and Model Parallelism for Deep Neural Networks [Stanford, MLSys 19]
      Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning [OSDI 22]
  • Week #12 (May 23): Federated Learning
      FedScale: Benchmarking Model and System Performance of Federated Learning at Scale [ICML 22]
      PAPAYA: Practical, Private, and Scalable Federated Learning [MLSys 22]
  • Week #13 (May 30): Decentralized Learning in Heterogeneous Environments
      Decentralized Training of Foundation Models in Heterogeneous Environments [NeurIPS 22]
      SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient [arXiv 23]