Publications

2023

IIBLAST: Speeding Up Commercial FPGA Routing by Decoupling and Mitigating the Intra-CLB Bottleneck

S. Shrivastava, S. Nikolic, C. Ravishankar, D. Gaitonde, M. Stojilovic

2023-08-21. IEEE/ACM International Conference on Computer-Aided Design (IEEE/ACM ICCAD 2023), San Francisco, CA, USA, October 29 - November 2, 2023. DOI : 10.1109/ICCAD57390.2023.10323897.

Temperature Impact on Remote Power Side-Channel Attacks on Shared FPGAs

O. Glamocanin, H. Bazaz, M. Payer, M. Stojilovic

2023-04-19. Design, Automation and Test in Europe Conference DATE 2023, Antwerp, Belgium, April 17-19, 2023. DOI : 10.23919/DATE56975.2023.10136979.

GRAMM: Fast CGRA Application Mapping Based on A Heuristic for Finding Graph Minors

G. Zhou, M. Stojilovic, J. H. Anderson

2023-09-04. 33rd International Conference on Field-Programmable Logic and Applications (FPL), Gothenburg, Sweden, Sep 04-08, 2023. p. 305-310. DOI : 10.1109/FPL60245.2023.00052.

Imprecise Store Exceptions

S. Gupta, Y. Li, Q. Kang, A. Bhattacharjee, B. Falsafi et al.

2023. The 50th Annual International Symposium on Computer Architecture (ISCA ’23), Orlando, FL, USA, June 17–21, 2023. DOI : 10.1145/3579371.3589087.

SecureCells: A Secure Compartmentalized Architecture

A. Bhattacharyya, F. Hofhammer, Y. Li, S. Gupta, A. Sánchez Marín et al.

2023. 44th IEEE Symposium on Security and Privacy, San Francisco, USA, May 22-24, 2023. p. 2921-2939. DOI : 10.1109/SP46215.2023.00125.

Cooperative Concurrency Control for Write-Intensive Key-Value Workloads

M. J. Sutherland, B. Falsafi, A. Daglis

2023. The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'23), Vancouver, BC, Canada, March 25–29, 2023. p. 30-46. DOI : 10.1145/3567955.3567957.

Active Wire Fences for Multitenant FPGAs

O. Glamocanin, A. Kostic, S. Kostic, M. Stojilovic

2023. 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS), Tallinn, Estonia, May 3-5, 2023. p. 13-20. DOI : 10.1109/DDECS57882.2023.10138941.

AstriFlash: A Flash-Based System for Online Services

S. Gupta, Y. Oh, L. Yan, M. J. Sutherland, A. Bhattacharjee et al.

2023. The 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA-29), Montreal, QC, Canada, Feb 25 – March 01, 2023. DOI : 10.1109/HPCA56546.2023.10070955.

2022

FPGA-to-CPU Undervolting Attacks

D. G. A. S. Mahmoud, S. Hussein, V. Lenders, M. Stojilovic

2022-03-22. 25th Design, Automation and Test in Europe, Antwerp, Belgium [Virtual], March 14-23, 2022. p. 999-1004. DOI : 10.23919/DATE54114.2022.9774663.

Deep Learning Detection of GPS Spoofing

O. Jullian, B. Otero, M. Stojilović, J. J. Costa, J. Verdú et al.

2022-02-02. 7th International Conference Machine Learning, Optimization, and Data Science (LOD 2021), Grasmere, UK, October 4-8, 2021. p. 527-540. DOI : 10.1007/978-3-030-95467-3_38.

A Deep-Learning Approach to Side-Channel Based CPU Disassembly at Design Time

H. Fendri, M. Macchetti, J. Perrine, M. Stojilovic

2022-03-22. 25th Design, Automation and Test in Europe Conference and Exhibition (DATE), Antwerp, Belgium [Virtual], March 14-23, 2022. p. 670-675. DOI : 10.23919/DATE54114.2022.9774531.

2021

Equinox: Training (for Free) on a Custom Inference Accelerator

M. P. Drumond Lages De Oliveira, L. Coulon, A. Pourhabibi Zarandi, A. C. Yüzügüler, B. Falsafi et al.

2021-10-18. 54th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’21), Virtual Event, Greece, October 18–22, 2021. DOI : 10.1145/3466752.3480057.

Cerebros: Evading the RPC Tax in Datacenters

A. Pourhabibi Zarandi, M. J. Sutherland, A. Daglis, B. Falsafi

2021-10-18. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, Virtual Event, Greece, October 18–22, 2021. p. 407-420. DOI : 10.1145/3466752.3480055.

Shared FPGAs and the Holy Grail: Protections against Side-Channel and Fault Attacks

O. Glamocanin, D. Mahmoud, F. Regazzoni, M. Stojilovic

2021-02-04. DATE 2021 Design, Automation and Test in Europe, Virtual, February 1-5, 2021. p. 1645-1650. DOI : 10.23919/DATE51398.2021.9473947.

Improving First-Order Threshold Implementations of SKINNY

A. F. Caforio, D. P. Collins, S. Banik, O. Glamocanin

2021. 22nd International Conference on Cryptology in India (INDOCRYPT21), Remote, December 12-15, 2021. p. 246-267. DOI : 10.1007/978-3-030-92518-5_1.

Rebooting Virtual Memory with Midgard

S. Gupta, A. Bhattacharyya, Y. Oh, A. Bhattacharjee, B. Falsafi et al.

2021. ISCA 2021 48th International Symposium on Computer Architecture, Online conference, June 14-19, 2021. DOI : 10.1109/ISCA52012.2021.00047.

NetCracker: A Peek into the Routing Architecture of Xilinx 7-Series FPGAs

M. B. Petersen, S. Nikolic, M. Stojilovic

2021-03-01. International Symposium on Field-Programmable Gate Arrays, Virtual Conference, February 28 - March 2, 2021. DOI : 10.1145/3431920.3439285.

2020

Nonintrusive and Adaptive Monitoring for Locating Voltage Attacks in Virtualized FPGAs

S. S. Mirzargar, G. Renault, A. Guerrieri, M. Stojilovic

2020-12-01. 19th International Conference on Field-Programmable Technology (ICFPT), Maui, HI, USA (Virtual conference), December 7-11, 2020. p. 288-289. DOI : 10.1109/ICFPT51103.2020.00050.

X-Attack: Remote Activation of Satisfiability Don’t-Care Hardware Trojans on Shared FPGAs

D. Mahmoud, W. Hu, M. Stojilovic

2020. 30th International Conference on Field-Programmable Logic and Applications (FPL), ELECTR NETWORK, August 31 - September 4, 2020. p. 185-192. DOI : 10.1109/FPL50879.2020.00039.

The NEBULA RPC-Optimized Architecture

M. Sutherland, S. Gupta, B. Falsafi, V. Marathe, D. Pnevmatikatos et al.

2020. 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), Valencia, Spain, May 30 - June 3, 2020. p. 199-212. DOI : 10.1109/ISCA45697.2020.00027.

Optimus Prime: Accelerating Data Transformation in Servers

A. Pourhabibi Zarandi, S. Gupta, H. Kassir, M. J. Sutherland, Z. Tian et al.

2020. Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, March 16–20, 2020. p. 1203-1216. DOI : 10.1145/3373376.3378501.

A Shared-Memory Parallel Implementation of the RePlAce Global Cell Placer

F. Gessler, P. Brisk, M. Stojilovic

2020-01-08. 33rd International Conference on VLSI Design and 19th International Conference on Embedded Systems (VLSID), Bangalore, India, January 4-8, 2020. DOI : 10.1109/VLSID49098.2020.00031.

Are Cloud FPGAs Really Vulnerable to Power Analysis Attacks?

O. Glamocanin, L. Coulon, F. Regazzoni, M. Stojilovic

2020-03-09. Design, Automation and Test in Europe (DATE), Grenoble, France, March 9-13, 2020. p. 1007-1010. DOI : 10.23919/DATE48585.2020.9116481.

Built-in Self-Evaluation of First-Order Power Side-Channel Leakage for FPGAs

O. Glamocanin, L. Coulon, F. Regazzoni, M. Stojilovic

2020-02-23. 28th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2020), Seaside, California, USA, February 23-25, 2020. DOI : 10.1145/3373087.3375318.

Closing Leaks: Routing Against Crosstalk Side-Channel Attacks

Z. Seifoori, S. S. Mirzargar, M. Stojilovic

2020-02-23. 28th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2020), Seaside, California, USA, February 23-25, 2020. DOI : 10.1145/3373087.3375319.

2019

Distributed Logless Atomic Durability with Persistent Memory

S. Gupta, A. Daglis, B. Falsafi

2019-10-16. The 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-52), Columbus, OH, USA, October 12–16, 2019. DOI : 10.1145/3352460.3358321.

Linebacker: Preserving Victim Cache Lines in Idle Register Files of GPUs

Y. Oh, G. Koo, M. Annavaram, W. W. Ro

2019-01-01. 46th International Symposium on Computer Architecture (ISCA), Phoenix, AZ, Jun 22-26, 2019. p. 183-196. DOI : 10.1145/3307650.3322222.

A machine learning approach for power gating the FPGA routing network

S. Zeinab, H. Asadi, M. Stojilovic

2019-12-11. 2019 International Conference on Field-Programmable Technology (ICFPT), Tianjin, China, December 9-13, 2019. p. 10-18. DOI : 10.1109/ICFPT47387.2019.00010.

Physical Side-Channel Attacks and Covert Communication on FPGAs: A Survey

S. S. Mirzargar, M. Stojilovic

2019-09-08. 29th International Conference on Field Programmable Logic and Applications (FPL), Barcelona, Spain, September 9 - 13, 2019. DOI : 10.1109/FPL.2019.00039.

FPGA-Assisted Deterministic Routing for FPGAs

D. Korolija, M. Stojilovic

2019-05-20. 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, May 20-24, 2019. p. 155-162. DOI : 10.1109/IPDPSW.2019.00034.

SMoTherSpectre: Exploiting Speculative Execution through Port Contention

A. Bhattacharyya, A. Sandulescu, M. Neugschwandtner, A. Sorniotti, B. Falsafi et al.

2019. The 26th ACM Conference on Computer and Communications Security - ACM CCS 2019, London, UK, November 11-15, 2019. p. 785–800. DOI : 10.1145/3319535.3363194.

Timing Violation Induced Faults in Multi-Tenant FPGAs

D. Mahmoud, M. Stojilovic

2019-03-25. Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, Mar 25-29, 2019. p. 1745-1750. DOI : 10.23919/DATE.2019.8715263.

Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores

A. Margaritov, S. Gupta, R. Gonzalez-Alberquilla, B. Grot

2019-01-01. 25th IEEE International Symposium on High Performance Computer Architecture (HPCA), Washington, DC, Feb 16-20, 2019. p. 15-27. DOI : 10.1109/HPCA.2019.00024.

RPCValet: NI-Driven Tail-Aware Balancing of µs-Scale RPCs

A. Daglis, M. Sutherland, B. Falsafi

2019-04-15. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '19, Providence, Rhode Island, USA, April 13-17, 2019. p. 35-48. DOI : 10.1145/3297858.3304070.

2018

Design Guidelines for High-Performance SCM Hierarchies

D. Ustiugov, A. Daglis, J. Picorel Obando, M. J. Sutherland, E. Bugnion et al.

2018-10-01. 4th International Symposium on Memory Systems (MEMSYS), Old Town Alexandria, VA, USA, October 1-4, 2018. DOI : 10.1145/3240302.3240310.

Deterministic Parallel Routing for FPGAs based on Galois Parallel Execution Model

Y. Moctar, M. Stojilovic, P. Brisk

2018-01-01. 28th International Conference on Field Programmable Logic and Applications (FPL), Dublin, Ireland, Aug 26-31, 2018. p. 21-25. DOI : 10.1109/FPL.2018.00011.

Towards Commoditizing Simulations of System Models Using Recurrent Neural Networks

A. C. Yuzuguler, A. Moga, C. Franke

2018-01-01. IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Aalborg, Denmark, Oct 29-31, 2018. DOI : 10.1109/SmartGridComm.2018.8587599.

Training DNNs with Hybrid Block Floating Point

M. Drumond, T. Lin, M. Jaggi, B. Falsafi

2018-01-01. NeurIPS 2018 - 32nd Conference on Neural Information Processing Systems, Montreal, Canada, Dec 02-08, 2018.

LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching

M. Sadrosadati, A. Mirhosseini, S. B. Ehsani, H. Sarbazi-Azad, M. Drumond et al.

2018. Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '18, Williamsburg, VA, USA, March 24th – March 28th, 2018. p. 489-502. DOI : 10.1145/3173162.3173211.

2017

Parallel FPGA routing: Survey and challenges

M. Stojilovic

2017-01-01. 2017 27th International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium, September 4-8, 2017. p. 1-8. DOI : 10.23919/FPL.2017.8056782.

Near-Memory Address Translation

J. Picorel, D. Jevdjic, B. Falsafi

2017. 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), Portland, OR, Sep 09-13, 2017. p. 303-317. DOI : 10.1109/Pact.2017.56.

The Mondrian Data Engine

M. P. Drumond Lages De Oliveira, A. Daglis, N. Mirzadeh, D. Ustiugov, J. Picorel Obando et al.

2017. The 44th International Symposium on Computer Architecture, Toronto, ON, Canada, June 24-28, 2017. DOI : 10.1145/3079856.3080233.

2016

Unlocking Energy

B. Falsafi, R. Guerraoui, J. Picorel Obando, V. Trigonakis

2016. 2016 USENIX Annual Technical Conference, Denver, Colorado, USA, June 22-24, 2016. p. 393-406.

The Case for RackOut: Scalable Data Serving Using Rack-Scale Systems

S. Novakovic, A. Daglis, E. Bugnion, B. Falsafi, B. Grot

2016. ACM Symposium on Cloud Computing, Santa Clara, USA, October 05-07, 2016. DOI : 10.1145/2987550.2987577.

SABRes: Atomic Object Reads for In-Memory Rack-Scale Computing

A. Daglis, D. Ustiugov, S. Novakovic, E. Bugnion, B. Falsafi et al.

2016. 49th Annual IEEE/ACM International Symposium on Microarchitecture, Taipei, Taiwan, October 15-19, 2016. DOI : 10.1109/MICRO.2016.7783709.

An Analysis of Load Imbalance in Scale-out Data Serving

S. Novakovic, A. Daglis, E. Bugnion, B. Falsafi, B. Grot

2016. ACM SIGMETRICS, Antibes Juan-Les-Pins, France, June 14-18, 2016. p. 367–368. DOI : 10.1145/2896377.2901501.

Towards Near-Threshold Server Processors

A. Pahlevan, J. Picorel Obando, A. Pourhabibi Zarandi, D. Rossi, M. Zapater Sancho et al.

2016. Design, Automation and Test in Europe Conference (DATE '16), Dresden, Germany, March 14-18, 2016. p. 7-12.

2015

Confluence: unified instruction supply for scale-out servers

C. Kaynak, B. Grot, B. Falsafi

2015. the 48th International Symposium, Waikiki, Hawaii, 05-09 December 2015. p. 166-177. DOI : 10.1145/2830772.2830785.

Sort vs. Hash Join Revisited for Near-Memory Execution

N. Mirzadeh, O. Kocberber, B. Falsafi, B. Grot

2015. 5th Workshop on Architectures and Systems for Big Data (ASBD 2015), Portland, Oregon, USA, June 13, 2015.

Manycore Network Interfaces for In-Memory Rack-Scale Computing

A. Daglis, S. Novakovic, E. Bugnion, B. Falsafi, B. Grot

2015. 42nd International Symposium in Computer Architecture, Portland, Oregon, USA, June 13-17, 2015. DOI : 10.1145/2749469.2750415.

2014

Unison Cache: A Scalable and Effective Die-Stacked DRAM Cache

D. Jevdjic, G. H. Loh, C. Kaynak, B. Falsafi

2014. 47th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, UK, December 13-17, 2014. p. 25-37. DOI : 10.1109/MICRO.2014.51.

BuMP: Bulk Memory Access Prediction and Streaming

S. Volos, J. Picorel, B. Falsafi, B. Grot

2014. 47th Annual IEEE/ACM International Symposium on Microarchitecture, December 13-17, 2014. p. 545-557. DOI : 10.1109/MICRO.2014.44.

FADE: A Programmable Filtering Accelerator for Instruction-Grain Monitoring

S. Fytraki, E. Vlachos, O. Kocberber, B. Falsafi, B. Grot

2014. 20th IEEE International Symposium On High Performance Computer Architecture (HPCA-2014), Orlando, Florida, USA, February 15-19, 2014. p. 108-119. DOI : 10.1109/HPCA.2014.6835922.

Scale-Out NUMA

S. Novakovic, A. Daglis, E. Bugnion, B. Falsafi, B. Grot

2014. Nineteenth International Conference on Architectural Support for Programming Languages and Operating Systems, Salt Lake City, Utah, USA, March 1-5, 2014. DOI : 10.1145/2541940.2541965.

2013

Multi-Grain Coherence Directory

J. Zebchuk, B. Falsafi, A. Moshovos

2013. 46th Annual IEEE/ACM International Symposium on Microarchitecture, Davis, CA, USA, December 7-11, 2013. DOI : 10.1145/2540708.2540739.

Meet the Walkers: Accelerating Index Traversals for In-Memory Databases

O. Kocberber, B. Grot, J. Picorel, B. Falsafi, K. Lim et al.

2013. 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'13), Davis, CA, USA, December 7-11, 2013. DOI : 10.1145/2540708.2540748.

SHIFT: Shared History Instruction Fetch for Lean-Core Server Processors

C. Kaynak, B. Grot, B. Falsafi

2013. 46th Annual IEEE/ACM International Symposium on Microarchitecture, Davis, CA, USA, December 7-11, 2013. DOI : 10.1145/2540708.2540732.

Die-Stacked DRAM Caches for Servers: Hit Ratio, Latency, or Bandwidth? Have It All with Footprint Cache

D. Jevdjic, S. Volos, B. Falsafi

2013. 40th International Symposium on Computer Architecture, Tel-Aviv, Israel, June 23-27, 2013. p. 404–415. DOI : 10.1145/2485922.2485957.

From A to E: Analyzing TPC’s OLTP Benchmarks -- The obsolete, the ubiquitous, the unexplored

P. Tözün, I. Pandis, I. C. Kaynak, D. Jevdjic, A. Ailamaki

2013. 16th International Conference on Extending Database Technology, Genoa, Italy, March 18-22, 2013. p. 17-28. DOI : 10.1145/2452376.2452380.

2012

Dark Silicon Accelerators for Database Indexing

O. Kocberber, B. Falsafi, K. Lim, P. Ranganathan, S. Harizopoulos

2012. 1st Dark Silicon Workshop, Portland, Oregon, USA, June 10, 2012.

Thermal Characterization of Cloud Workloads on a Power-Efficient Server-on-Chip

D. Milojevic, S. Idgunji, D. Jevdjic, E. Ozer, P. Lotfi-Kamran et al.

2012. 30th IEEE International Conference on Computer Design, Montreal, Quebec, Canada, September 30 - October 3, 2012. DOI : 10.1109/ICCD.2012.6378637.

NOC-Out: Microarchitecting a Scale-Out Processor

P. Lotfi-Kamran, B. Grot, B. Falsafi

2012. 45th International Symposium on Microarchitecture, Vancouver, BC, Canada, December 1-5, 2012. DOI : 10.1109/MICRO.2012.25.

Scale-Out Processors

P. Lotfi-Kamran, B. Grot, M. Ferdman, S. Volos, O. Kocberber et al.

2012. 39th Annual International Symposium on Computer Architecture, Portland, Oregon, USA, June 9-13, 2012. DOI : 10.1145/2366231.2337217.

CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers

S. Volos, C. Seiculescu, B. Grot, N. Khosro Pour, B. Falsafi et al.

2012. 6th International Symposium on Networks-on-Chip, Lyngby, Denmark, May 9-11, 2012.

Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware

M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee et al.

2012. Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, London, UK, March 3-7, 2012.

2011

Reliability in the Dark Silicon Era

B. Falsafi

2011. 17th IEEE International On-Line Testing Symposium (IOLTS), Athens, Greece, Jul 13-15, 2011. p. V-V.

Proactive Instruction Fetch

M. Ferdman, C. Kaynak, B. Falsafi

2011. 44th Annual IEEE/ACM Symposium on Microarchitecture (MICRO 2011), Porto Alegre, Brazil, December 3-7. p. 152-162. DOI : 10.1145/2155620.2155638.

CCNoC: On-Chip Interconnects for Cache-Coherent Manycore Server Chips

C. Seiculescu, S. Volos, N. Khosro Pour, B. Falsafi, G. De Micheli

2011. Workshop on Energy-Efficient Design (WEED 2011), San Jose, California, USA, June 5, 2011.

Cuckoo Directory: A Scalable Directory for Many-Core Systems

M. Ferdman, P. Lotfi-Kamran, K. Balet, B. Falsafi

2011. HPCA 2011, San Antonio, Texas, USA, February 12-16, 2011. DOI : 10.1109/HPCA.2011.5749726.

2010

ParaLog: enabling and accelerating online parallel monitoring of multithreaded applications

E. Vlachos, M. L. Goodstein, M. A. Kozuch, S. Chen, B. Falsafi et al.

2010. ASPLOS 2010, Pittsburgh, Pennsylvania, USA, March 13-17, 2010. p. 271-284. DOI : 10.1145/1736020.1736051.

TurboTag: Lookup Filtering to Reduce Coherence Directory Power

P. Lotfi-Kamran, M. Ferdman, D. Crisan, B. Falsafi

2010. 16th International Symposium on Low Power Electronics and Design (ISLPED 10), Austin, Texas, USA, August 18-20. p. 377-382. DOI : 10.1145/1840845.1840929.

2009

Chip-Level Redundancy in Distributed Shared-Memory Multiprocessors

B. T. Gold, B. Falsafi, J. C. Hoe

2009. p. 195-201. DOI : 10.1109/PRDC.2009.39.

Spatio-Temporal Memory Streaming

S. Somogyi, T. F. Wenisch, A. Ailamaki, B. Falsafi

2009. 36th ACM/IEEE Annual International Symposium on Computer Architecture, Austin, TX. p. 69-80. DOI : 10.1145/1555754.1555766.

Practical Off-chip Meta-data for Temporal Memory Streaming

T. F. Wenisch, M. Ferdman, A. Ailamaki, B. Falsafi, A. Moshovos

2009. 15th International Symposium on High-Performance Computer Architecture, Raleigh, NC. p. 79-90. DOI : 10.1109/HPCA.2009.4798239.

Reactive NUCA: Near-Optimal Block Placement and Replication in Distributed Caches

N. Hardavellas, M. Ferdman, B. Falsafi, A. Ailamaki

2009. 36th ACM/IEEE Annual International Symposium on Computer Architecture, Austin, TX. p. 184-195. DOI : 10.1145/1555754.1555779.

ReSim, a Trace-Driven, Reconfigurable ILP Processor Simulator

S. Fytraki, D. Pnevmatikatos

2009. DATE 2009, Nice, France, April 20-24, 2009. p. 536-541. DOI : 10.1109/DATE.2009.5090722.

2008

A Rate-based Prefiltering Approach to BLAST Acceleration

P. Afratis, E. Sotiriades, G. Chrysos, S. Fytraki, D. Pnevmatikatos

2008. International Conference on Field Programmable Logic and Applications (FPL), Heidelberg, Germany, September 08-10, 2008. p. 631-634. DOI : 10.1109/FPL.2008.4630026.

Stall Power Reduction in Pipelined Architecture Processors

P. Lotfi-Kamran, A.-M. Rahmani, A.-A. Salehpour, A. Afzali-Kusha, Z. Navabi

2008. 21st International Conference on VLSI Design, Hyderabad, January 4-8, 2008. p. 541-546. DOI : 10.1109/VLSI.2008.34.

BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs to Avoid Congestion

P. Lotfi-Kamran, M. Daneshtalab, C. Lucas, Z. Navabi

2008. Design, Automation and Test in Europe, 2008. DATE '08, Munich, March 10-14, 2008. p. 1408-1413. DOI : 10.1109/DATE.2008.4484871.

A Complexity-Effective Architecture for Accelerating Full-System Multiprocessor Simulations Using FPGAs

E. S. Chung, E. Nurvitadhi, J. C. Hoe, B. Falsafi, K. Mai

2008. 16th international ACM/SIGDA symposium on Field programmable gate arrays (FPGA), Monterey, CA, February. p. 77–86. DOI : 10.1145/1344671.1344684.

Temporal instruction fetch streaming

M. Ferdman, T. F. Wenisch, A. Ailamaki, B. Falsafi, A. Moshovos

2008. the 41st annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Lake Como, Italy, November. p. 1-10. DOI : 10.1109/MICRO.2008.4771774.

Flexible hardware acceleration for instruction-grain program monitoring

S. Chen, M. Kozuch, T. Strigkos, B. Falsafi, P. B. Gibbons et al.

2008. the 35th Annual International Symposium on Computer Architecture (ISCA), Beijing, China, June. p. 377-388. DOI : 10.1109/ISCA.2008.20.

Predictor virtualization

I. Burcea, S. Somogyi, A. Moshovos, B. Falsafi

2008. the 13th international conference on Architectural support for programming languages and operating systems (ASPLOS), Seattle, WA, March. p. 157-167. DOI : 10.1145/1346281.1346301.

Temporal streams in commercial server applications

T. F. Wenisch, M. Ferdman, A. Ailamaki, B. Falsafi, A. Moshovos

2008. IEEE International Symposium on Workload Characterization (IISWC), Seattle, WA, September. p. 99-108. DOI : 10.1109/IISWC.2008.4636095.

2007

A UML Based System Level Failure Rate Assessment Technique for SoC Designs

M. Hosseinabady, M.-H. Neishaburi, P. Lotfi-Kamran, Z. Navabi

2007. 25th IEEE VLSI Test Symposium, Berkeley, May 6-10, 2007. p. 243-248. DOI : 10.1109/VTS.2007.9.

An Analysis of Database System Performance on Chip Multiprocessors

N. Hardavellas, I. Pandis, R. Johnson, N. Mancheril, A. Ailamaki et al.

2007. Athens, Greece, July.

PAI: A lightweight mechanism for single-node memory recovery in DSM servers

J. Kim, J. C. Smolens, B. Falsafi, J. C. Hoe

2007. Melbourne, Australia, December. p. 298-305. DOI : 10.1109/PRDC.2007.53.

Multi-bit error tolerant caches using two-dimensional error coding

J. Kim, N. Hardavellas, K. Mai, B. Falsafi, J. C. Hoe

2007. Chicago, IL, December. p. 197-209. DOI : 10.1109/MICRO.2007.19.

Last-touch correlated data streaming

M. Ferdman, B. Falsafi

2007. San Jose, CA, April. p. 105-115. DOI : 10.1109/ISPASS.2007.363741.

PROTOFLEX: FPGA-accelerated hybrid functional simulator

E. S. Chung, E. Nurvitadhi, J. C. Hoe, B. Falsafi, K. Mai

2007. Long Beach, CA, March. DOI : 10.1109/IPDPS.2007.370516.

Scheduling threads for constructive cache sharing on CMPs

S. Chen, P. B. Gibbons, M. Kozuch, V. Liaskovitis, A. Ailamaki et al.

2007. San Diego, CA, June. p. 105-115. DOI : 10.1145/1248377.1248396.

To Share or Not To Share?

R. Johnson, N. Hardavellas, I. Pandis, N. Mancheril, S. Harizopoulos et al.

2007. 33rd International Conference on Very Large Data Bases, Vienna, Austria, September. p. 351-362.

Mechanisms for store-wait-free multiprocessors

T. F. Wenisch, A. Ailamaki, B. Falsafi, A. Moshovos

2007. San Diego, CA, June. p. 266-277. DOI : 10.1145/1250662.1250696.

Database Servers on Chip Multiprocessors: Limitations and Opportunities

N. Hardavellas, I. Pandis, R. Johnson, N. Mancheril, A. Ailamaki et al.

2007. Asilomar, CA, January.

2006

ProtoFlex: Co-simulation for Component-wise FPGA Emulator Development

E. S. Chung, J. C. Hoe, B. Falsafi

2006. Austin, TX, February.

The Granularity of Soft-Error Containment in Shared-Memory Multiprocessors

B. T. Gold, J. C. Smolens, B. Falsafi, J. C. Hoe

2006. Urbana-Champaign, IL, April.

Simulation sampling with live-points

T. F. Wenisch, R. E. Wunderlich, B. Falsafi, J. C. Hoe

2006. Austin, TX, March. p. 2-12. DOI : 10.1109/ISPASS.2006.1620785.

Reunion: Complexity-effective multicore redundancy

J. C. Smolens, B. T. Gold, B. Falsafi, J. C. Hoe

2006. Orlando, FL, December. p. 223-234. DOI : 10.1109/MICRO.2006.42.

Parallel depth first vs. work stealing schedulers on CMP architectures

V. Liaskovitis, S. Chen, P. B. Gibbons, A. Ailamaki, G. E. Blelloch et al.

2006. Cambridge, MA, August. p. 330. DOI : 10.1145/1148109.1148167.

Log-based architectures for general-purpose monitoring of deployed code

S. Chen, B. Falsafi, P. B. Gibbons, M. Kozuch, T. C. Mowry et al.

2006. San Jose, CA, October. p. 63-65. DOI : 10.1145/1181309.1181319.

Spatial Memory Streaming

S. Somogyi, T. F. Wenisch, A. Ailamaki, B. Falsafi, A. Moshovos

2006. Boston, MA, June 17-21, 2006. p. 252-263. DOI : 10.1109/ISCA.2006.38.

2005

TED+: A Data Structure for Microprocessor Verification

P. Lotfi-Kamran, M. Hosseinabady, H. Shojaei, M. Massoumi, Z. Navabi

2005. Asia and South Pacific Design Automation Conference, Shanghai, January 18-21, 2005. p. 567-572. DOI : 10.1109/ASPDAC.2005.1466228.

TurboSMARTS: Accurate microarchitecture simulation sampling in minutes

T. F. Wenisch, R. E. Wunderlich, B. Falsafi, J. C. Hoe

2005. p. 408-409. DOI : 10.1145/1064212.1064278.

Understanding the performance of concurrent error detecting superscalar microarchitectures

J. C. Smolens, K. Jangwoo, J. C. Hoe, B. Falsafi

2005. Athens, Greece, December. p. 13-18. DOI : 10.1109/ISSPIT.2005.1577062.

ReCast: Boosting tag line buffer coverage in low-power high-level caches "for free"

W.-H. Park, A. Moshovos, B. Falsafi

2005. San Jose, CA, October. p. 609-616. DOI : 10.1109/ICCD.2005.90.

Store-Ordered Streaming of Shared Memory

T. F. Wenisch, S. Somogyi, N. Hardavellas, J. Kim, C. Gniady et al.

2005. St. Louis, MO, USA, 17-21 September. p. 75-86. DOI : 10.1109/PACT.2005.37.

DBmbench: fast and accurate database workload representation on modern microarchitecture

M. Shao, A. Ailamaki, B. Falsafi

2005. Toronto, Canada, October. p. 254-267. DOI : 10.1145/1105634.1105653.

Accelerating Database Operations Using a Network Processor

B. T. Gold, A. Ailamaki, L. Huston, B. Falsafi

2005. Baltimore, USA, June.

Temporal Streaming of Shared Memory

T. F. Wenisch, S. Somogyi, N. Hardavellas, J. Kim, A. Ailamaki et al.

2005. Madison, WI, June 4-8, 2005. p. 222-233. DOI : 10.1109/ISCA.2005.50.

2004

An Evaluation of Stratified Sampling of Microarchitecture Simulations

R. E. Wunderlich, T. F. Wenisch, B. Falsafi, J. C. Hoe

2004. Munich, Germany, June.

Fingerprinting: Bounding the Soft-Error Detection Latency and Bandwidth

J. Smolens, B. Gold, J. Kim, B. Falsafi, J. C. Hoe et al.

2004. Boston, MA, October.

Efficient resource sharing in concurrent error detecting superscalar microarchitectures

J. C. Smolens, J. Kim, J. C. Hoe, B. Falsafi

2004. Portland, OR, December. p. 257-268. DOI : 10.1109/MICRO.2004.19.

Accurate and complexity-effective spatial pattern prediction

C. F. Chen, S.-H. Yang, B. Falsafi, A. Moshovos

2004. Madrid, Spain, February. p. 276-287.

Memory coherence activity prediction in commercial workloads

S. Somogyi, T. F. Wenisch, N. Hardavellas, J. Kim, A. Ailamaki et al.

2004. Munich, Germany, June. p. 37-45. DOI : 10.1145/1054943.1054949.

2003

Performance and Energy Trade-Offs of Bitline Isolation in Nanoscale CMOS Caches

S.-H. Yang, B. Falsafi

2003. San Diego, CA, June.

Near-optimal precharging in high-performance nanoscale CMOS caches

S.-H. Yang, B. Falsafi

2003. San Diego, CA, December. p. 67-78. DOI : 10.1109/MICRO.2003.1253184.

SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling

R. Wunderlich, T. Wenisch, B. Falsafi, J. Hoe

2003. ISCA 2003: 30th International Symposium on Computer Architecture, San Diego, CA, USA. p. 84-95. DOI : 10.1109/ISCA.2003.1206991.

Implicitly-multithreaded processors

I. Park, B. Falsafi, T. N. Vijaykumar

2003. San Diego, CA, June. p. 39-50. DOI : 10.1145/859618.859624.

2002

Gated Precharge: Using Temporal Locality of Subarrays to Save Deep-Submicron Cache Energy

S.-H. Yang, B. Falsafi

2002. Anchorage, AK, May.

Exploiting choice in resizable cache design to optimize deep-submicron processor energy-delay

S.-H. Yang, M. D. Powell, B. Falsafi, T. N. Vijaykumar

2002. Boston, MA, February. p. 151-161. DOI : 10.1109/HPCA.2002.995706.

Speculative sequential consistency with little custom storage

C. Gniady, B. Falsafi

2002. Charlottesville, VA, September. p. 179-188. DOI : 10.1109/PACT.2002.1106016.

2001

An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches

S.-H. Yang, M. D. Powell, B. Falsafi, K. Roy, T. N. Vijaykumar

2001. Monterrey, Mexico, January 20-24, 2001. p. 147-157. DOI : 10.1109/HPCA.2001.903259.

Dual Use of Superscalar Datapath for Transient-Fault Detection and Recovery

J. Ray, J. C. Hoe, B. Falsafi

2001. 34th Annual IEEE/ACM International Symposium on Microarchitecture, Austin, Texas, December 1-5, 2001. p. 214-224. DOI : 10.1109/MICRO.2001.991120.

Reducing set-associative cache energy via way-prediction and selective direct-mapping

M. D. Powell, A. Agarwal, T. N. Vijaykumar, B. Falsafi, K. Roy

2001. Monterrey, Mexico, January 20-24, 2001. p. 54-65.

Multiplex: Unifying conventional and speculative thread-level parallelism on a chip multiprocessor

C.-L. Ooi, S. W. Kim, I. Park, R. Eigenmann, B. Falsafi et al.

2001. Yorktown Heights, NY, USA, June. p. 368-380. DOI : 10.1145/377792.377863.

JETTY: Filtering snoops for reduced energy consumption in SMP servers

A. Moshovos, G. Memik, B. Falsafi, A. Choudhary

2001. Monterrey, Mexico, January. p. 85-96. DOI : 10.1109/HPCA.2001.903254.

Dead-block prediction & dead-block correlating prefetchers

A.-C. Lai, C. Fide, B. Falsafi

2001. Göteborg, Sweden, June. p. 144-154. DOI : 10.1109/ISCA.2001.937443.

Reference idempotency analysis: A framework for optimizing speculative execution

S. W. Kim, C.-L. Ooi, R. Eigenmann, B. Falsafi, T. N. Vijaykumar

2001. Snowbird, Utah, USA, June. p. 2-11. DOI : 10.1145/379539.379547.

2000

Low-Overhead and High-Performance Implementations of Sequential Consistency

C. Gniady, B. Falsafi

2000. Vancouver, BC, June.

Address partitioning in DSM clusters with parallel coherence controllers

I. Pragaspathy, B. Falsafi

2000. Philadelphia, PA, October. p. 47-56. DOI : 10.1109/PACT.2000.888330.

Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories

M. D. Powell, S.-H. Yang, B. Falsafi, K. Roy, T. N. Vijaykumar

2000. International Symposium on Low Power Electronics and Design (ISLPED), Rapallo, Italy, July. p. 90-95. DOI : 10.1109/LPE.2000.876763.

Comparing the effectiveness of fine-grain memory caching against page migration/replication in reducing traffic in DSM clusters

A.-C. Lai, B. Falsafi

2000. Bar Harbor, ME, July. p. 79-88. DOI : 10.1145/341800.341811.

Selective, accurate, and timely self-invalidation using last-touch prediction

A.-C. Lai, B. Falsafi

2000. Vancouver, BC, June. p. 139-148. DOI : 10.1109/ISCA.2000.854385.

1999

Memory sharing predictor: the key to a speculative coherent DSM

A.-C. Lai, B. Falsafi

1999. Atlanta, GA, May. p. 172-183. DOI : 10.1109/ISCA.1999.765949.

Is SC+ILP=RC?

C. Gniady, B. Falsafi, T. N. Vijaykumar

1999. ISCA, Atlanta, GA, May. p. 162-171. DOI : 10.1109/ISCA.1999.765948.

Parallel Dispatch Queue: a queue-based programming abstraction to parallelize fine-grain communication protocols

B. Falsafi, D. A. Wood

1999. North Beach, CA, August. p. 182-192. DOI : 10.1109/HPCA.1999.744362.

1998

Sirocco: cost-effective fine-grain distributed shared memory

I. Schoinas, B. Falsafi, M. D. Hill, J. R. Larus, D. A. Wood

1998. Paris, France, October. p. 40-49. DOI : 10.1109/PACT.1998.727144.

1997

Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator

S. S. Mukherjee, S. K. Reinhardt, B. Falsafi, M. Litzkow, S. Huss-Lederman et al.

1997. Denver, CO, June.

Reactive NUMA: A design for unifying S-COMA and CC-NUMA

B. Falsafi, D. A. Wood

1997. Denver, CO, June. p. 229-240. DOI : 10.1145/264107.264205.

Scheduling communication on an SMP node parallel machine

B. Falsafi, D. A. Wood

1997. San Antonio, TX, February. p. 128-138. DOI : 10.1109/HPCA.1997.569649.

1996

Coherent network interfaces for fine-grain communication

S. S. Mukherjee, B. Falsafi, M. D. Hill, D. A. Wood

1996. Philadelphia, PA, May. p. 247-258. DOI : 10.1145/232973.232999.

1994

Fine-grain access control for distributed shared memory

I. Schoinas, B. Falsafi, A. R. Lebeck, S. K. Reinhardt, J. R. Larus et al.

1994. ASPLOS'94. 6th International Conference on Architectural support for Programming Languages and Operating Systems, San Jose, CA, October. p. 297-306. DOI : 10.1145/195470.195575.

Cost/performance of a parallel computer simulator

B. Falsafi, D. A. Wood

1994. Edinburgh, Scotland, July. p. 173-182.

Application-specific protocols for user-level shared memory

B. Falsafi, A. R. Lebeck, S. K. Reinhardt, I. Schoinas, M. D. Hill et al.

1994. Supercomputing '94, Washington D.C., USA, November 14-18. p. 380-389. DOI : 10.1109/SUPERC.1994.344301.

1993

Mechanisms for cooperative shared memory

D. A. Wood, S. Chandra, B. Falsafi, M. D. Hill, J. R. Larus et al.

1993. 20th International Symposium on Computer Architecture, San Diego, CA, May. p. 156-167. DOI : 10.1145/165123.165151.

Kernel support for the Wisconsin Wind Tunnel

S. K. Reinhardt, B. Falsafi, D. A. Wood

1993. San Diego, CA, January. p. 73-89.

1990

Component Labeling Algorithms on an Intel iPSC/2 Hypercube

B. Falsafi, R. Miller

1990. Charleston, SC, April. p. 159-164.
