Conference Papers
A Deep-Learning Approach to Side-Channel Based CPU Disassembly at Design Time (2022-03-22). 25th Design, Automation and Test in Europe - DATE 2022, Antwerp, Belgium [Virtual], March 14-23, 2022.
FPGA-to-CPU Undervolting Attacks (2022-03-22). 25th Design, Automation and Test in Europe - DATE 2022, Antwerp, Belgium [Virtual], March 14-23, 2022.
Deep Learning Detection of GPS Spoofing (2022-02-02). 7th International Conference on Machine Learning, Optimization, and Data Science (LOD 2021), Grasmere, UK, October 4-8, 2021. p. 527-540. DOI : 10.1007/978-3-030-95467-3_38.
Nonintrusive and Adaptive Monitoring for Locating Voltage Attacks in Virtualized FPGAs (2020-12-01). 19th International Conference on Field-Programmable Technology (ICFPT), Maui, HI, USA (Virtual conference), December 7-11, 2020. p. 288-289. DOI : 10.1109/ICFPT51103.2020.00050.
Optimus Prime: Accelerating Data Transformation in Servers (2020). Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, March 16–20, 2020. p. 1203-1216. DOI : 10.1145/3373376.3378501.
FPGA-Assisted Deterministic Routing for FPGAs (2019-05-20). 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, May 20-24, 2019. p. 155-162. DOI : 10.1109/IPDPSW.2019.00034.
RPCValet: NI-Driven Tail-Aware Balancing of µs-Scale RPCs (2019-04-15). Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '19, Providence, Rhode Island, USA, April 13-17, 2019. p. 35-48. DOI : 10.1145/3297858.3304070.
Linebacker: Preserving Victim Cache Lines in Idle Register Files of GPUs (2019-01-01). 46th International Symposium on Computer Architecture (ISCA), Phoenix, AZ, Jun 22-26, 2019. p. 183-196. DOI : 10.1145/3307650.3322222.
Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores (2019-01-01). 25th IEEE International Symposium on High Performance Computer Architecture (HPCA), Washington, DC, Feb 16-20, 2019. p. 15-27. DOI : 10.1109/HPCA.2019.00024.
Towards Commoditizing Simulations of System Models Using Recurrent Neural Networks (2018-01-01). IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Aalborg, Denmark, Oct 29-31, 2018. DOI : 10.1109/SmartGridComm.2018.8587599.
Training DNNs with Hybrid Block Floating Point (2018-01-01). NeurIPS 2018 - 32nd Conference on Neural Information Processing Systems, Montreal, Canada, Dec 02-08, 2018.
LTRF: Enabling High-Capacity Register Files for GPUs via Hardware/Software Cooperative Register Prefetching (2018). Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '18, Williamsburg, VA, USA, March 24th – March 28th, 2018. p. 489-502. DOI : 10.1145/3173162.3173211.
Parallel FPGA routing: Survey and challenges (2017-01-01). 2017 27th International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium, September 4-8, 2017. p. 1-8. DOI : 10.23919/FPL.2017.8056782.
Near-Memory Address Translation (2017). 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), Portland, OR, Sep 09-13, 2017. p. 303-317. DOI : 10.1109/Pact.2017.56.
Unlocking Energy (2016). 2016 USENIX Annual Technical Conference, Denver, Colorado, USA, June 22-24, 2016. p. 393-406.
Towards Near-Threshold Server Processors (2016). Design, Automation and Test in Europe Conference (DATE '16), Dresden, Germany, March 14-18, 2016. p. 7-12.
Sort vs. Hash Join Revisited for Near-Memory Execution (2015). 5th Workshop on Architectures and Systems for Big Data (ASBD 2015), Portland, Oregon, USA, June 13, 2015.
From A to E: Analyzing TPC’s OLTP Benchmarks -- The obsolete, the ubiquitous, the unexplored (2013). 16th International Conference on Extending Database Technology, Genoa, Italy, March 18-22, 2013. p. 17-28.
Dark Silicon Accelerators for Database Indexing (2012). 1st Dark Silicon Workshop, Portland, Oregon, USA, June 10, 2012.
CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers (2012). 6th International Symposium on Networks-on-Chip, Lyngby, Denmark, May 9-11, 2012.
Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware (2012). Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems, London, UK, March 3-7, 2012.
Reliability in the Dark Silicon Era (2011). 17th IEEE International On-Line Testing Symposium (IOLTS), Athens, Greece, Jul 13-15, 2011. p. V-V.
CCNoC: On-Chip Interconnects for Cache-Coherent Manycore Server Chips (2011). Workshop on Energy-Efficient Design (WEED 2011), San Jose, California, USA, June 5, 2011.
ReSim, a Trace-Driven, Reconfigurable ILP Processor Simulator (2009). DATE 2009, Nice, France, April 20-24, 2009.
A Rate-based Prefiltering Approach to BLAST Acceleration (2008). International Conference on Field Programmable Logic and Applications (FPL), Heidelberg, Germany, September 08-10, 2008.
Stall Power Reduction in Pipelined Architecture Processors (2008). 21st International Conference on VLSI Design, Hyderabad, January 4-8, 2008. p. 541-546. DOI : 10.1109/VLSI.2008.34.
BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs to Avoid Congestion (2008). Design, Automation and Test in Europe, 2008. DATE '08, Munich, March 10-14, 2008. p. 1408-1413. DOI : 10.1109/DATE.2008.4484871.
A UML Based System Level Failure Rate Assessment Technique for SoC Designs (2007). 25th IEEE VLSI Test Symposium, Berkeley, May 6-10, 2007. p. 243-248. DOI : 10.1109/VTS.2007.9.
An Analysis of Database System Performance on Chip Multiprocessors (2007). Athens, Greece, July.
PROTOFLEX: FPGA-accelerated hybrid functional simulator (2007). Long Beach, CA, March. DOI : 10.1109/IPDPS.2007.370516.
To Share or Not To Share? (2007). 33rd International Conference on Very Large Data Bases, Vienna, Austria, September. p. 351-362.
Mechanisms for store-wait-free multiprocessors (2007). San Diego, CA, June. p. 266-277.
Database Servers on Chip Multiprocessors: Limitations and Opportunities (2007). Asilomar, CA, January.
ProtoFlex: Co-simulation for Component-wise FPGA Emulator Development (2006). Austin, TX, February.
The Granularity of Soft-Error Containment in Shared-Memory Multiprocessors (2006). Urbana-Champaign, IL, April.
Simulation sampling with live-points (2006). Austin, TX, March. p. 2-12.
Spatial Memory Streaming (2006). Boston, MA, June 17-21, 2006. p. 252-263.
TED+: A Data Structure for Microprocessor Verification (2005). Asia and South Pacific Design Automation Conference, Shanghai, January 18-21, 2005. p. 567-572. DOI : 10.1109/ASPDAC.2005.1466228.
Accelerating Database Operations Using a Network Processor (2005). Baltimore, USA, June.
Temporal Streaming of Shared Memory (2005). Madison, WI, June 4-8, 2005. p. 222-233.
An Evaluation of Stratified Sampling of Microarchitecture Simulations (2004). Munich, Germany, June.
Fingerprinting: Bounding the Soft-Error Detection Latency and Bandwidth (2004). Boston, MA, October.
Accurate and complexity-effective spatial pattern prediction (2004). Madrid, Spain, February. p. 276-287.
Performance and Energy Trade-Offs of Bitline Isolation in Nanoscale CMOS Caches (2003). San Diego, CA, June.
Gated Precharge: Using Temporal Locality of Subarrays to Save Deep-Submicron Cache Energy (2002). Anchorage, AK, May.
Reducing set-associative cache energy via way-prediction and selective direct-mapping (2001). Monterrey, Mexico, January 20-24, 2001. p. 54-65.
JETTY: Filtering snoops for reduced energy consumption in SMP servers (2001). Monterrey, Mexico, January. p. 85-96.
Low-Overhead and High-Performance Implementations of Sequential Consistency (2000). Vancouver, BC, June.
Address partitioning in DSM clusters with parallel coherence controllers (2000). Philadelphia, PA, October. p. 47-56.
Comparing the effectiveness of fine-grain memory caching against page migration/replication in reducing traffic in DSM clusters (2000). Bar Harbor, ME, July. p. 79-88.
Wisconsin Wind Tunnel II: A Fast and Portable Parallel Architecture Simulator (1997). Denver, CO, June.
Reactive NUMA: A design for unifying S-COMA and CC-NUMA (1997). Denver, CO, June. p. 229-240.
Coherent network interfaces for fine-grain communication (1996). Philadelphia, PA, May. p. 247-258.
Cost/performance of a parallel computer simulator (1994). Edinburgh, Scotland, July. p. 173-182.
Application-specific protocols for user-level shared memory (1994). Supercomputing '94, Washington D.C., USA, November 14-18. p. 380-389. DOI : 10.1109/SUPERC.1994.344301.
Kernel support for the Wisconsin Wind Tunnel (1993). San Diego, CA, January. p. 73-89.
Component Labeling Algorithms on an Intel iPSC/2 Hypercube (1990). Charleston, SC, April. p. 159-164.