Reader 1: Introduction
PDF Chapter 6 (Sections 6.1 & 6.2) of Hennessy & Patterson's Computer Architecture
Reader 2: Evaluation Methodologies
PDF A. R. Alameldeen and D. A. Wood, IPC Considered Harmful for Multiprocessor Workloads
PDF T. F. Wenisch, R. E. Wunderlich, M. Ferdman, A. Ailamaki, B. Falsafi, and J. C. Hoe, SimFlex: Statistical Sampling of Computer System Simulation
Reader 3: Programming Models
PDF Chapter 1 (Sections 1.3.2 & 1.3.3) of Culler, Singh & Gupta’s Parallel Computer Architecture
PDF M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, et al., TensorFlow: A System for Large-Scale Machine Learning
Reader 4: Coherence
PDF Chapters 6 and 7 of Sorin, Hill & Wood’s A Primer on Memory Consistency and Cache Coherence
PDF M. Ferdman, P. Lotfi-Kamran, K. Balet, and B. Falsafi, Cuckoo Directory: A Scalable Directory for Many-Core Systems
Reader 5: Consistency
PDF S. Adve and K. Gharachorloo, Shared Memory Consistency Models: A Tutorial
PDF C. Blundell, M. M. K. Martin, and T. F. Wenisch, InvisiFence: Performance-Transparent Memory Ordering in Conventional Multiprocessors
PDF A. Singh, S. Narayanasamy, D. Marino, T. Millstein, and M. Musuvathi, End-to-End Sequential Consistency
Reader 6: Synchronization
PDF T. David, R. Guerraoui, and V. Trigonakis, Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask
Reader 7: Transactional Memory
PDF Chapter 5 (Sections 5.1 & 5.2) of Harris, Larus & Rajwar's Transactional Memory
Reader 8: CMP Caches
PDF Chapter 2 of Balasubramonian, Jouppi & Muralimanohar's Multi-Core Cache Hierarchies
PDF N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki, Reactive NUCA: Near-Optimal Block Placement and Replication in Distributed Caches
Reader 9: Interconnects
Chapters 1, 2, and 6 of Jerger & Peh's On-Chip Networks
L. M. Ni and P. K. McKinley, A Survey of Wormhole Routing in Direct Networks
Reader 10: Scaling Trends
N. Hardavellas, M. Ferdman, B. Falsafi, and A. Ailamaki, Toward Dark Silicon in Servers
S. Borkar and A. Chien, The Future of Microprocessors
Reader 11: Specialization
V. Govindaraju, C. Ho, and K. Sankaralingam, Dynamically Specialized Datapaths for Energy Efficient Computing
Reader 12: Server Processors
M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki, and B. Falsafi, A Case for Specialized Processors for Scale-Out Workloads
P. Lotfi-Kamran, B. Grot, M. Ferdman, S. Volos, Y. O. Koçberber, J. Picorel, A. Adileh, D. Jevdjic, S. Idgunji, E. Ozer, and B. Falsafi, Scale-Out Processors
Reader 13: Emerging Memory
M. Drumond, A. Daglis, N. Mirzadeh, D. Ustiugov, J. Picorel, B. Falsafi, and B. Grot, The Mondrian Data Engine
Reader 14: Distributed Memory Systems
S. Novakovic, A. Daglis, E. Bugnion, B. Falsafi, and B. Grot, Scale-Out NUMA
Reader 15: Datacenters
Chapters 1 and 2 of Barroso & Hölzle's The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
A. Caulfield, E. Chung, A. Putnam, et al., A Cloud-Scale Acceleration Architecture
Reader 16: GPUs
E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, NVIDIA Tesla: A Unified Graphics and Computing Architecture