#1 | Feb. 21: Introduction
|
The Task of the Referee |
|
#2 | Feb. 28: Benchmarks and Analytics |
|
MLPerf Training Benchmark [Google et al., MLSys 20]
MLPerf Inference Benchmark [Harvard et al., ISCA 20]
|
#3 | Mar. 07: ML Inference at Scale
|
Efficiently Scaling Transformer Inference [arXiv 22]
Orca: A Distributed Serving System for Transformer-Based Generative Models [OSDI 22]
|
#4 | Mar. 14: Large Language Models (LLMs) |
|
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [arXiv 22]
LLaMA: Open and Efficient Foundation Language Models [Meta AI, arXiv 23]
|
#5 | Mar. 21: Sustainability |
|
Sustainable AI: Environmental Implications, Challenges and Opportunities [MLSys 22]
Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training [NSDI 23]
|
#6 | Mar. 28: Systems & ML |
|
A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms [Harvard, MLSys 20]
Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning [OSDI 22]
|
#7 | Apr. 04: Deep Learning with Low-Precision Encoding |
|
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale [NeurIPS 22]
FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding [HPCA 22]
|
#8 | Apr. 18: Hardware Accelerators for Deep Learning |
|
Ten Lessons From Three Generations Shaped Google’s TPUv4i [Google, ISCA 21]
RaPiD: AI Accelerator for Ultra-low Precision Training and Inference [IBM, ISCA 21]
|
#9 | Apr. 25: Sparsity in Deep Neural Networks |
|
Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models [ICLR 22]
CrAM: A Compression-Aware Minimizer [arXiv 22]
|
#10 | May 02: Domain-Specific Languages for ML
|
Blink: Fast and Generic Collectives for Distributed ML [Microsoft et al., MLSys 20]
MSCCLang: Microsoft Collective Communication Language [Microsoft, ASPLOS 23]
|
#11 | May 09: New Training Paradigms
|
Beyond Data and Model Parallelism for Deep Neural Networks [Stanford, MLSys 19]
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning [OSDI 22]
|
#12 | May 23: Federated Learning
|
FedScale: Benchmarking Model and System Performance of Federated Learning at Scale [ICML 22]
PAPAYA: Practical, Private, and Scalable Federated Learning [MLSys 22]
|
#13 | May 30: Decentralized Learning in Heterogeneous Environments
|
Decentralized Training of Foundation Models in Heterogeneous Environments [NeurIPS 22]
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient [arXiv 23]
|