Publications

Selected papers

   Vega: A Ten-Core SoC for IoT Endnodes With DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode
Davide Rossi, Francesco Conti, Manuel Eggimann, Alfio Di Mauro, Giuseppe Tagliavini, Stefan Mach, Marco Guermandi, Antonio Pullini, Igor Loi, Jie Chen, Eric Flamand, Luca Benini
IEEE Journal of Solid-State Circuits, (Early Access): 1 - 1, New York, NY: IEEE, 2021.
DOI: 10.1109/JSSC.2021.3114881

   MemPool: A Shared-L1 Memory Many-Core Cluster with a Low-Latency Interconnect
Matheus Cavalcante, Samuel Riedel, Antonio Pullini, Luca Benini
5 December 2020.
arXiv:2012.02973

   Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores
Fabian Schuiki, Florian Zaruba, Torsten Hoefler and Luca Benini
1 April 2020.
arXiv:1911.08356

   Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads
Florian Zaruba, Fabian Schuiki, Torsten Hoefler and Luca Benini
24 Feb 2020.
arXiv:2002.10143

   Mr.Wolf: An Energy-Precision Scalable Parallel Ultra Low Power SoC for IoT Edge Processing
Antonio Pullini, Davide Rossi, Igor Loi, Giuseppe Tagliavini, Luca Benini
IEEE Journal of Solid-State Circuits, (Early Access): 1 - 12, New York, NY: IEEE, 2019.
DOI: 10.1109/JSSC.2019.2912307

  An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics 
Francesco Conti, Robert Schilling, Pasquale D. Schiavone, Antonio Pullini, Davide Rossi, Frank K. Gürkaynak, Michael Muehlberghuber, Michael Gautschi, Igor Loi, Germain Haugou, Stefan Mangard and Luca Benini
IEEE Transactions on Circuits and Systems I, Regular Papers, 64 (9): 2481-2494, New York, NY: IEEE, 2017.
DOI: 10.1109/TCSI.2017.2698019

   Energy-Efficient Near-Threshold Parallel Computing: The PULPv2 Cluster
Davide Rossi, Antonio Pullini, Igor Loi, Michael Gautschi, Frank K. Gürkaynak, Adam Teman, Jeremy Constantin, Andreas Burg, Ivan Miro-Panades, Edith Beigné, Fabien Clermidy, Philippe Flatresse and Luca Benini
IEEE Micro, 37 (5): 20-31, Piscataway, NJ: IEEE, 2017.
DOI: 10.1109/MM.2017.3711645

  HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA
Andreas Kurth, Pirmin Vogel, Alessandro Capotondi, Andrea Marongiu, Luca Benini
Proceedings of Computer Architecture Research with RISC-V Workshop (CARRV '17), Boston, MA: 2017.
DOI: 10.3929/ethz-b-000219249

  μDMA: An autonomous I/O subsystem for IoT end-nodes 
Antonio Pullini, Davide Rossi, Germain Haugou and Luca Benini
2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS), Piscataway, NJ: IEEE, 2017
DOI: 10.1109/PATMOS.2017.8106971

  Near-Threshold RISC-V core with DSP extensions for scalable IoT endpoint devices 
Michael Gautschi, Pasquale D. Schiavone, Andreas Traber, Igor Loi, Antonio Pullini, Davide Rossi, Eric Flamand, Frank K. Gürkaynak and Luca Benini
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 25 (10): 2700-2713, New York, NY: IEEE, 2017.
DOI: 10.1109/TVLSI.2017.2654506

  Slow and steady wins the race? A comparison of ultra-low-power RISC-V cores for Internet-of-Things applications 
Pasquale D. Schiavone, Francesco Conti, Davide Rossi, Michael Gautschi, Antonio Pullini, Eric Flamand and Luca Benini
2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS), 8106976, Piscataway, NJ: IEEE, 2017.
DOI: 10.1109/PATMOS.2017.8106976

2022

  A “New Ara” for Vector Computing: An Open Source Highly Efficient RISC-V V 1.0 Vector Processor Design
Matteo Perotti, Matheus Cavalcante, Nils Wistoff, Renzo Andri, Lukas Cavigelli, Luca Benini
17 October 2022.
arXiv:2210.08882

   Darkside: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training
Angelo Garofalo, Yvan Tortorella, Matteo Perotti, Luca Valente, Alessandro Nadalini, Luca Benini, Davide Rossi and Francesco Conti
IEEE Open Journal of the Solid-State Circuits Society (Early Access): 1 - 1, New York, NY: IEEE, 2022.
DOI: 10.1109/OJSSCS.2022.3210082

  Kraken: A Direct Event/Frame-Based Multi-sensor Fusion SoC for Ultra-Efficient Visual Processing in Nano-UAVs
Alfio Di Mauro, Moritz Scherer, Davide Rossi, Luca Benini
18 August 2022.
arXiv:2209.01065

  Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters
Gianna Paulin, Matheus Cavalcante, Paul Scheffler, Luca Bertaccini, Yichao Zhang, Frank Gürkaynak, Luca Benini
2 September 2022.
arXiv:2209.00889

  Spatz: A Compact Vector Processing Unit for High-Performance and Energy-Efficient Shared-L1 Clusters
Matheus Cavalcante, Domenic Wüthrich, Matteo Perotti, Samuel Riedel, Luca Benini
16 July 2022.
arXiv:2207.07970

  MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V cores
Luca Bertaccini, Gianna Paulin, Tim Fischer, Stefan Mach, Luca Benini
7 July 2022.
arXiv:2207.03192

  On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster
Michael Rogenmoser, Nils Wistoff, Pirmin Vogel, Frank Gürkaynak, Luca Benini
25 May 2022.
arXiv:2205.12580

  Monte Cimone: Paving the Road for the First Generation of RISC-V High-Performance Computers
Andrea Bartolini, Federico Ficarelli, Emanuele Parisi, Francesco Beneventi, Francesco Barchi, Daniele Gregori, Fabrizio Magugliani, Marco Cicala, Cosimo Gianfreda, Daniele Cesarini, Andrea Acquaviva, Luca Benini
7 May 2022.
arXiv:2205.03725

  SNE: an Energy-Proportional Digital Accelerator for Sparse Event-Based Convolutions
Alfio Di Mauro, Arpan Suravi Prasad, Zhikai Huang, Matteo Spallanzani, Francesco Conti, Luca Benini
29 April 2022.
arXiv:2204.10687

  RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs
Yvan Tortorella, Luca Bertaccini, Davide Rossi, Luca Benini, Francesco Conti
24 April 2022.
arXiv:2204.11192

  Energy-Efficient Tree-Based EEG Artifact Detection
Thorir Mar Ingolfsson, Andrea Cossettini, Simone Benatti, Luca Benini
19 April 2022.
arXiv:2204.09577

   TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference
Alessio Burrello, Alberto Dequino, Daniele Jahier Pagliari, Francesco Conti, Marcello Zanghieri, Enrico Macii, Luca Benini, Massimo Poncino
24 March 2022.
arXiv:2203.12925

   GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors
Nazareno Bruschi, Germain Haugou, Giuseppe Tagliavini, Francesco Conti, Luca Benini, Davide Rossi
20 January 2022.
arXiv:2201.08166

   HEROv2: Full-Stack Open-Source Research Platform for Heterogeneous Computing
Andreas Kurth, Björn Forsberg, Luca Benini
11 January 2022.
arXiv:2201.03861

   Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks
Gianmarco Cerutti, Lukas Cavigelli, Renzo Andri, Michele Magno, Elisabetta Farella, Luca Benini
10 January 2022.
arXiv:2201.03386

   A Heterogeneous In-Memory Computing Cluster For Flexible End-to-End Inference of Real-World Deep Neural Networks
Angelo Garofalo, Gianmarco Ottavi, Francesco Conti, Geethan Karunaratne, Irem Boybat, Luca Benini, Davide Rossi
4 January 2022.
arXiv:2201.01089

2021

   MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration
Matheus Cavalcante, Anthony Agnesina, Samuel Riedel, Moritz Brunion, Alberto Garcia-Ortiz, Dragomir Milojevic, Francky Catthoor, Sung Kyu Lim, Luca Benini
2 December 2021.
arXiv:2112.01168

   Banshee: A Fast LLVM-Based RISC-V Binary Translator
Samuel Riedel; Fabian Schuiki; Paul Scheffler; Florian Zaruba; Luca Benini
2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD): 1-9, Munich, Germany: IEEE, 2021.
DOI: 10.1109/ICCAD51958.2021.9643546

   A 1.3TOPS/W @ 32GOPS Fully Integrated 10-Core SoC for IoT End-Nodes with 1.7μW Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode
Davide Rossi, Francesco Conti, Manuel Eggimann, Stefan Mach, Alfio Di Mauro, Marco Guermandi, Giuseppe Tagliavini, Antonio Pullini, Igor Loi, Jie Chen, Eric Flamand, Luca Benini
2021 IEEE International Solid-State Circuits Conference (ISSCC): 60-62, San Francisco, CA: IEEE, 2021.
DOI: 10.1109/ISSCC42613.2021.9365939

   A TinyML Platform for On-Device Continual Learning with Quantized Latent Replays
Leonardo Ravaglia, Manuele Rusci, Davide Nadalini, Alessandro Capotondi, Francesco Conti, Luca Benini
20 October 2021.
arXiv:2110.10486

   End-to-end 100-TOPS/W Inference With Analog In-Memory Computing: Are We There Yet?
Gianmarco Ottavi, Geethan Karunaratne, Francesco Conti, Irem Boybat, Luca Benini and Davide Rossi
3 September 2021.
arXiv:2109.01404

   DNN is not all you need: Parallelizing Non-Neural ML Algorithms on Ultra-Low-Power IoT Processors
Enrico Tabanelli, Giuseppe Tagliavini, Luca Benini
16 July 2021.
arXiv:2107.09448

   Towards Long-term Non-invasive Monitoring for Epilepsy via Wearable EEG Devices
Thorir Mar Ingolfsson, Andrea Cossettini, Xiaying Wang, Enrico Tabanelli, Giuseppe Tagliavini, Philippe Ryvlin, Luca Benini, Simone Benatti
17 June 2021.
arXiv:2106.08008

   PsPIN: A high-performance low-power architecture for flexible in-network compute
Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler
1 June 2021.
arXiv:2010.03536

   Tiny-FPU: Low-Cost Floating-Point Support for Small RISC-V MCU Cores
Luca Bertaccini, Matteo Perotti, Stefan Mach, Pasquale Davide Schiavone, Florian Zaruba, Luca Benini
2021 IEEE International Symposium on Circuits and Systems (ISCAS): 1-5, Piscataway, NJ: IEEE, 2021.
DOI: 10.1109/ISCAS51556.2021.9401149

   ECG-TCN: Wearable Cardiac Arrhythmia Detection with a Temporal Convolutional Network
Thorir Mar Ingolfsson, Xiaying Wang, Michael Hersche, Alessio Burrello, Lukas Cavigelli, Luca Benini
25 March 2021.
arXiv:2103.13740

   Fully Onboard AI-powered Human-Drone Pose Estimation on Ultra-low Power Autonomous Flying Nano-UAVs
Daniele Palossi, Nicky Zimmerman, Alessio Burrello, Francesco Conti, Hanna Müller, Luca Maria Gambardella, Luca Benini, Alessandro Giusti, Jérôme Guzzi
19 March 2021.
arXiv:2103.10873

   RISC-V for Real-time MCUs - Software Optimization and Microarchitectural Gap Analysis
Robert Balas, Luca Benini
2021 Design, Automation & Test in Europe Conference & Exhibition (DATE): 874-877, Piscataway, NJ: IEEE, 2021.
DOI: 10.23919/DATE51398.2021.9474114

   A 5 μW Standard Cell Memory-based Configurable Hyperdimensional Computing Accelerator for Always-on Smart Sensing
Manuel Eggimann, Abbas Rahimi, Luca Benini
4 February 2021.
arXiv:2102.02758

   Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices
Gianmarco Cerutti, Renzo Andri, Lukas Cavigelli, Michele Magno, Elisabetta Farella, Luca Benini
12 January 2021.
arXiv:2101.04446

2020

   XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Network on RISC-V based IoT End Nodes
Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Luca Benini, Davide Rossi
29 November 2020.
arXiv:2011.14325

   Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra
Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini
16 November 2020.
arXiv:2011.08070

   CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration with Better-than-Binary Energy Efficiency
Moritz Scherer, Georg Rutishauser, Lukas Cavigelli, Luca Benini
3 November 2020.
arXiv:2011.01713

   An Energy-Efficient Low-Voltage Swing Transceiver for mW-Range IoT End-Nodes
Hayate Okuhara, Ahmed Elnaqib, Davide Rossi, Alfio Di Mauro, Philipp Mayer, Pierpaolo Palestri, Luca Benini
9 October 2020.
arXiv:2010.04566

   ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor
Andreas Kurth, Samuel Riedel, Florian Zaruba, Torsten Hoefler, Luca Benini
2020 57th ACM/IEEE Design Automation Conference (DAC): 1-6, San Francisco, CA: IEEE, 2021.
DOI: 10.1109/DAC18072.2020.9218661

   PsPIN: A high-performance low-power architecture for flexible in-network compute
Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler
8 October 2020.
arXiv:2010.03536

   A Mixed-Precision RISC-V Processor for Extreme-Edge DNN Inference
Gianmarco Ottavi, Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Luca Benini and Davide Rossi
8 October 2020.
arXiv:2010.04073
DOI: 10.1109/ISVLSI49217.2020.000-5

   An Open-Source Platform for High-Performance Non-Coherent On-Chip Communication
Andreas Kurth, Wolfgang Ronninger, Thomas Benz, Matheus Cavalcante, Fabian Schuiki, Florian Zaruba and Luca Benini
11 September 2020.
arXiv:2009.05334

   Manticore: A 4096-core RISC-V Chiplet Architecture for Ultra-efficient Floating-point Computing
Florian Zaruba, Fabian Schuiki, Luca Benini
14 August 2020.
arXiv:2008.06502
DOI: 10.1109/MM.2020.3045564

   Memory-Latency-Accuracy Trade-offs for Continual Learning on a RISC-V Extreme-Edge Node
Leonardo Ravaglia, Manuele Rusci, Alessandro Capotondi, Francesco Conti, Lorenzo Pellegrini, Vincenzo Lomonaco, Davide Maltoni and Luca Benini
22 July 2020.
arXiv:2007.13631
DOI: 10.1109/SiPS50750.2020.9195220

   Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node
Alfio Di Mauro, Francesco Conti, Pasquale Davide Schiavone, Davide Rossi, Luca Benini
17 July 2020.
arXiv:2007.08952
DOI: 10.1109/TCSI.2020.3012576

   FPnew: An Open-Source Multi-Format Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing
Stefan Mach, Fabian Schuiki, Florian Zaruba, Luca Benini
3 July 2020.
arXiv:2007.01530

   XwattPilot: A Full-stack Cloud System Enabling Agile Development of Transprecision Software for Low-power SoCs
Dionysios Diamantopoulos, Florian Scheidegger, Stefan Mach, Fabian Schuiki, Germain Haugou, Michael Schaffner, Frank K. Gürkaynak, Christoph Hagleitner, A. Cristiano I. Malossi, Luca Benini
2020 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS): 1-3, Piscataway, NJ: IEEE, 2020.
DOI: 10.1109/COOLCHIPS49199.2020.9097644

   Enabling Mixed-Precision Quantized Neural Networks in Extreme-Edge Devices
Nazareno Bruschi, Angelo Garofalo, Francesco Conti, Giuseppe Tagliavini, Davide Rossi
17th ACM International Conference on Computing Frontiers (CF ’20): 217-220, New York, NY: ACM, 2020.
DOI: 10.1145/3387902.3394038

   Combining Learning and Optimization for Transprecision Computing
Andrea Borghesi, Giuseppe Tagliavini, Michele Lombardi, Luca Benini, Michela Milano
17th ACM International Conference on Computing Frontiers (CF ’20): 10-18, New York, NY: ACM, 2020.
DOI: 10.1145/3387902.3392615

   Design of an Open-Source Bridge Between Non-Coherent Burst-Based and Coherent Cache-Line-Based Memory Systems
Matheus Cavalcante, Andreas Kurth, Fabian Schuiki, Luca Benini
17th ACM International Conference on Computing Frontiers (CF ’20): 81-88, New York, NY: ACM, 2020.
DOI: 10.1145/3387902.3392631

   Arnold: an eFPGA-Augmented RISC-V SoC for Flexible and Low-Power IoT End-Nodes
Pasquale Davide Schiavone, Davide Rossi, Alfio Di Mauro, Frank Gürkaynak, Timothy Saxe, Mao Wang, Ket Chong Yap, Luca Benini
25 June 2020.
arXiv:2006.14256

   HW/SW approaches for RISC-V code size reduction
Matteo Perotti, Pasquale Davide Schiavone, Giuseppe Tagliavini, Davide Rossi, Tariq Kurd, Mark Hill, Liu Yingying, Luca Benini
Fourth Workshop on Computer Architecture Research with RISC-V (CARRV 2020). May 2020. Virtual Workshop.

   Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core
Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, Gernot Heiser
1 May 2020.
arXiv:2005.02193

   Energy-Efficient Hardware-Accelerated Synchronization for Shared-L1-Memory Multiprocessor Clusters
Florian Glaser, Giuseppe Tagliavini, Davide Rossi, Germain Haugou, Qiuting Huang and Luca Benini
14 April 2020.
arXiv:2004.06662

   LLHD: A Multi-level Intermediate Representation for Hardware Description Languages
Fabian Schuiki, Andreas Kurth, Tobias Grosser and Luca Benini
7 April 2020.
arXiv:2004.03494

   Extending the RISC-V ISA for Efficient RNN-based 5G Radio Resource Management
Renzo Andri, Tomas Henriksson and Luca Benini
5 April 2020.
arXiv:2002.12877

   XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors Through ISA Extensions
Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Davide Rossi, Luca Benini
2020 Design, Automation & Test in Europe Conference & Exhibition (DATE): 186-191, Piscataway, NJ: IEEE, 2020.
DOI: 10.23919/DATE48585.2020.9116529

   An On-the-Fly Feature Map Compression Engine for Background Memory Access Cost Reduction in DNN Inference
Georg Rutishauser, Lukas Cavigelli, Luca Benini
Working Paper. ETH Research Collection, 2020.
DOI: 10.3929/ethz-b-000388819

2019

   A PULP-based Parallel Power Controller for Future Exascale Systems
Andrea Bartolini, Davide Rossi, Antonio Mastrandrea, Christian Conficoni, Simone Benatti, Andrea Tilli, Luca Benini
2019 26th IEEE International Conference on Electronics, Circuits and Systems (ICECS): 771-774, Piscataway, NJ: IEEE, 2019.
DOI: 10.1109/ICECS46596.2019.8964699

   FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things
Xiaying Wang, Michele Magno, Lukas Cavigelli, Luca Benini
8 Nov 2019.
arXiv:1911.03314

   Network-Accelerated Non-Contiguous Memory Transfers
Salvatore Di Girolamo, Konstantin Taranov, Andreas Kurth, Michael Schaffner, Timo Schneider, Jakub Beranek, Maciej Besta, Luca Benini, Duncan Roweth, Torsten Hoefler
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19): 56:1-56:14, New York, NY: ACM, 2019.
DOI: 10.1145/3295500.3356189

   PULP-NN: accelerating quantized neural networks on parallel ultra-low-power RISC-V processors
Angelo Garofalo, Manuele Rusci, Francesco Conti, Davide Rossi, Luca Benini
Phil. Trans. R. Soc. A 378: 20190155.
DOI: 10.1098/rsta.2019.0155

   A RISC-V Based Open Hardware Platform for Always-On Wearable Smart Sensing
Manuel Eggimann, Stefan Mach, Michele Magno, Luca Benini
2019 IEEE 8th International Workshop on Advances in Sensors and Interfaces (IWASI): 169 - 174, Piscataway, NJ: IEEE, 2019.
DOI: 10.1109/IWASI.2019.8791364

   Ara: A 1 GHz+ Scalable and Energy-Efficient RISC-V Vector Processor with Multi-Precision Floating Point Support in 22 nm FD-SOI
Matheus Cavalcante, Fabian Schuiki, Florian Zaruba, Michael Schaffner, Luca Benini
2 June 2019.
arXiv:1906.00478

   An Open Source and Open Hardware Deep Learning-Powered Visual Navigation Engine for Autonomous Nano-UAVs
Daniele Palossi, Francesco Conti, Luca Benini
2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS): 604-611, Piscataway, NJ: IEEE, 2019.
DOI: 10.1109/DCOSS.2019.00111

   The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-ready 1.7GHz 64bit RISC-V Core in 22nm FDSOI Technology
Florian Zaruba, Luca Benini
10 Apr 2019. Submitted to IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
arXiv:1904.05442

   An Energy Efficient System for Touch Modality Classification in Electronic Skin Applications
M. Osta, A. Ibrahim, M. Magno, M. Eggimann, A. Pullini, P. Gastaldo, M. Valle
2019 IEEE International Symposium on Circuits and Systems (ISCAS): 1-4, Piscataway, NJ: IEEE, 2019.
DOI: 10.1109/ISCAS.2019.8702113

   Design and Evaluation of SmallFloat SIMD extensions to the RISC-V ISA
Giuseppe Tagliavini, Stefan Mach, Davide Rossi, Andrea Marongiu, Luca Benini
2019 Design, Automation & Test in Europe Conference & Exhibition (DATE): 654-657, Piscataway, NJ: IEEE, 2019.
DOI: 10.23919/DATE.2019.8714897

   An Energy-Efficient IoT node for HMI applications based on an ultra-low power Multicore Processor
Victor Kartsch, Marco Guermandi, Simone Benatti, Fabio Montagna, Luca Benini
2019 IEEE Sensors Applications Symposium (SAS): 1-6, Piscataway, NJ: IEEE, 2019.
DOI: 10.1109/SAS.2019.8705984

2018

   A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets
Fabian Schuiki, Michael Schaffner, Frank K. Gürkaynak, Luca Benini
IEEE Transactions on Computers, 68 (4): 484 - 497, New York, NY: IEEE, 2018.
DOI: 10.1109/TC.2018.2876312

   Quentin: an Ultra-Low-Power PULPissimo SoC in 22nm FDX
Pasquale D. Schiavone, Davide Rossi, Antonio Pullini, Alfio Di Mauro, Francesco Conti and Luca Benini
IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S 2018): 6.1, San Francisco, CA, USA, October 15-18, 2018.
DOI: 10.3929/ethz-b-000314427

   Exploring Shared Virtual Memory for FPGA Accelerators with a Configurable IOMMU 
Pirmin Vogel, Andrea Marongiu, Luca Benini
IEEE Transactions on Computers, Early Access: 1-1, New York, NY: IEEE, 2018.
DOI: 10.1109/TC.2018.2879080

   High speed ASIC implementations of leakage-resilient cryptography 
Robert Schilling, Thomas Unterluggauer, Stefan Mangard, Frank K. Gürkaynak, Michael Muehlberghuber, Luca Benini
2018 Design, Automation & Test in Europe Conference & Exhibition (DATE): 1259-1264, Piscataway, NJ: IEEE, 2018.
DOI: 10.23919/DATE.2018.8342208

   GAP-8: A RISC-V SoC for AI at the Edge of the IoT
Eric Flamand, Davide Rossi, Francesco Conti, Igor Loi, Antonio Pullini, Florent Rotenberg, Luca Benini
2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP): 1-4, Piscataway, NJ: IEEE, 2018.
DOI: 10.1109/ASAP.2018.8445101

   The Quest for Energy-Efficient I$ Design in Ultra-Low-Power Clustered Many-Cores
Igor Loi, Alessandro Capotondi, Davide Rossi, Andrea Marongiu, Luca Benini
IEEE Transactions on Multi-Scale Computing Systems, 4(2): 99-112, New York, NY: IEEE, 2018.
DOI: 10.1109/TMSCS.2017.2769046

   A sensor fusion approach for drowsiness detection in wearable ultra-low-power systems
Victor Javier Kartsch, Simone Benatti, Pasquale Davide Schiavone, Davide Rossi, Luca Benini
Information Fusion, 43: 66-76, Amsterdam, Elsevier BV, 2018.
DOI: 10.1016/j.inffus.2017.11.005

   A Heterogeneous Cluster with Reconfigurable Accelerator for Energy Efficient Near-Sensor Data Analytics
Satyajit Das, Kevin J. M. Martin, Philippe Coussy, Davide Rossi
2018 IEEE International Symposium on Circuits and Systems (ISCAS): 1-5, Piscataway, NJ: IEEE, 2018.
DOI: 10.1109/ISCAS.2018.8351749

   PULP-HD: accelerating brain-inspired high-dimensional computing on a parallel ultra-low power platform
Fabio Montagna, Abbas Rahimi, Simone Benatti, Davide Rossi, and Luca Benini
Proceedings of the 55th Annual Design Automation Conference (DAC '18): 111:1-111:6, New York, NY: ACM, 2018.
DOI: 10.1145/3195970.3196096

   Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes
Erfan Azarkhish, Davide Rossi, Igor Loi and Luca Benini
IEEE Transactions on Parallel Distributed Systems, 29 (2): 420-434, New York, NY: IEEE, 2018.
DOI: 10.1109/TPDS.2017.2752706

  XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference (Best Paper Award)
Francesco Conti, Pasquale Davide Schiavone, Luca Benini
Accepted for presentation at CODES'18 and for publication in IEEE Transactions on Computer-Aided Design of Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issue
arXiv:1807.03010 [cs.NE]

  Ultra Low Power Deep-Learning-powered Autonomous Nano Drones
Daniele Palossi, Antonio Loquercio, Francesco Conti, Eric Flamand, Davide Scaramuzza, Luca Benini
Under review at IEEE Internet of Things Journal (IEEE IOTJ)
arXiv:1805.01831 [cs.RO]

2017

   A Sub-mW IoT-Endnode for Always-On Visual Monitoring and Smart Triggering
Manuele Rusci, Davide Rossi, Elisabetta Farella and Luca Benini
IEEE Internet of Things Journal, 4 (5): 1284-1295, New York, NY: IEEE, 2017.
DOI: 10.1109/JIOT.2017.2731301

   Flexible, Scalable and Energy Efficient Bio-Signals Processing on the PULP Platform: A Case Study on Seizure Detection
Fabio Montagna, Simone Benatti, Davide Rossi
Journal of Low Power Electronics and Applications, 7 (2): 16, Basel: MDPI, 2017.
DOI: 10.3390/jlpea7020016

   A machine learning approach for automated wide-range frequency tagging analysis in embedded neuromonitoring systems
Fabio Montagna, Marco Buiatti, Simone Benatti, Davide Rossi, Elisabetta Farella, Luca Benini
Methods, 129: 96 - 107, Amsterdam, Elsevier BV, 2017.
DOI: 10.1016/j.ymeth.2017.06.019

  A Self-Aware Architecture for PVT Compensation and Power Nap in Near-Threshold Processors
Davide Rossi, Igor Loi, Antonio Pullini, Christoph Müller, Andreas Burg, Francesco Conti, Luca Benini and Philippe Flatresse
IEEE Design & Test, 34 (6): 46-53, New York, NY: IEEE, 2017.
DOI: 10.1109/MDAT.2017.2750907

  Lightweight Virtual Memory Support for Zero-Copy Sharing of Pointer-Rich Data Structures in Heterogeneous Embedded SoCs
Pirmin Vogel, Andrea Marongiu, Luca Benini
IEEE Transactions on Parallel and Distributed Systems 28 (7): 1947 - 1959, New York, NY: IEEE, 2017.
DOI: 10.1109/TPDS.2016.2645219

  Efficient Virtual Memory Sharing via On-Accelerator Page Table Walking in Heterogeneous Embedded SoCs
Pirmin Vogel, Andreas Kurth, Johannes Weinbuch, Andrea Marongiu, Luca Benini
ACM Transactions on Embedded Computing Systems 16 (5s): 154:1 - 154:19, New York, NY: ACM, 2017.
DOI: 10.1145/3126560

  Enabling Zero-Copy OpenMP Offloading on the PULP Many-Core Accelerator
Alessandro Capotondi, Andrea Marongiu
Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES '17), 68-71, New York, NY: ACM, 2017.
DOI: 10.1145/3078659.3079071

2016

  PULP: A Ultra-Low Power Parallel Accelerator for Energy-Efficient and Flexible Embedded Vision 
Francesco Conti, Davide Rossi, Antonio Pullini, Igor Loi and Luca Benini
Journal of Signal Processing Systems, 84 (3): 339-354, Berlin: Springer, 2016
DOI: 10.1007/s11265-015-1070-9

  Scalable EEG seizure detection on an ultra low power multi-core architecture
Simone Benatti, Fabio Montagna, Davide Rossi and Luca Benini
2016 IEEE Biomedical Circuits and Systems Conference (BioCAS 2016), 86-89, Piscataway, NJ: IEEE, 2016.
DOI: 10.1109/BioCAS.2016.7833731

  193 MOPS/mW @ 162 MOPS, 0.32V to 1.15V voltage range multi-core accelerator for energy efficient parallel and sequential digital processing
Davide Rossi, Antonio Pullini, Igor Loi, Michael Gautschi, Frank K. Gürkaynak, Adam Teman, Jeremy Constantin, Andreas Burg, Ivan Miro-Panades, Edith Beigné, Fabien Clermidy, Fady Abouzeid, Philippe Flatresse and Luca Benini
Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2016 (IEEE COOL CHIPS XIX), 7503670, Piscataway, NJ: IEEE, 2016.
DOI: 10.1109/CoolChips.2016.7503670

  An Event-Driven Ultra-Low-Power Smart Visual Sensor
Manuele Rusci, Davide Rossi, Michela Lecca, Massimo Gottardi, Elisabetta Farella, Luca Benini
IEEE Sensors Journal, 16 (13): 5344-5353, Piscataway, NJ: IEEE, 2016.
DOI: 10.1109/JSEN.2016.2556421

  A 65nm CMOS 6.4-to-29.2 pJ/FLOP@ 0.8 V shared logarithmic floating point unit for acceleration of nonlinear function kernels in a tightly coupled processor cluster
Michael Gautschi, Michael Schaffner, Frank Kagan Gürkaynak, Luca Benini
2016 IEEE International Solid-State Circuits Conference (ISSCC), 82-83, San Francisco, CA: IEEE, 2016.
DOI: 10.1109/ISSCC.2016.7417917

  High-Efficiency Logarithmic Number Unit Design based on an Improved Cotransformation Scheme
Youri Popoff, Florian Scheidegger, Michael Schaffner, Michael Gautschi, Frank Kagan Gürkaynak, Luca Benini
2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1387-1392, Piscataway, NJ: IEEE, 2016.
DOI: 10.3850/9783981537079_0174

  Enabling the heterogeneous accelerator model on ultra-low power microcontroller platforms
Francesco Conti, Daniele Palossi, Andrea Marongiu, Davide Rossi and Luca Benini
2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1201-1206, Piscataway, NJ: IEEE, 2016.
DOI: 10.3850/9783981537079_0626

  A heterogeneous multi-core system-on-chip for energy efficient brain inspired vision
Antonio Pullini, Francesco Conti, Davide Rossi, Igor Loi, Michael Gautschi and Luca Benini
2016 IEEE International Symposium on Circuits and Systems (ISCAS), 2910-2910, Piscataway, NJ: IEEE, 2016.
DOI: 10.1109/ISCAS.2016.7539213

  PULP: A Parallel Ultra Low Power platform for next generation IoT Applications
Davide Rossi, Francesco Conti, Andrea Marongiu, Antonio Pullini, Igor Loi, Michael Gautschi, Giuseppe Tagliavini, Alessandro Capotondi, Philippe Flatresse, Luca Benini
2015 IEEE Hot Chips 27 Symposium (HCS), 7477325, New York, NY: IEEE, 2016.
DOI: 10.1109/HOTCHIPS.2015.7477325

  A 60 GOPS/W, -1.8V to 0.9V body bias ULP cluster in 28nm UTBB FD-SOI technology
Davide Rossi, Antonio Pullini, Igor Loi, Michael Gautschi, Frank K. Gürkaynak, Andrea Bartolini, Philippe Flatresse and Luca Benini
Solid-State Electronics, 117: 170-184, Kidlington: Elsevier Science, 2016.
DOI: 10.1016/j.sse.2015.11.015

2015

  Power, Area, and Performance Optimization of Standard Cell Memory Arrays through Controlled Placement
Adam Teman, Davide Rossi, Pascal Meinerzhagen, Luca Benini, Andreas Burg
ACM Transactions on Design Automation of Electronic Systems (TODAES), 21 (4): 59:1-59:25, New York, NY: ACM, 2016.
DOI: 10.1145/2890498

  Exploring multi-banked shared-L1 program cache on ultra-low power, tightly coupled processor clusters
Igor Loi, Davide Rossi, Germain Haugou, Michael Gautschi and Luca Benini
Proceedings of the 12th ACM International Conference on Computing Frontiers, 64:1-64:8, New York, NY: ACM, 2015.
DOI: 10.1145/2742854.2747288

  Controlled placement of standard cell memory arrays for high density and low power in 28nm FD-SOI
Adam Teman, Davide Rossi, Pascal Meinerzhagen, Luca Benini, Andreas Burg
The 20th Asia and South Pacific Design Automation Conference, 81-86, Piscataway, NJ: IEEE, 2015.
DOI: 10.1109/ASPDAC.2015.7058985

  Tailoring instruction-set extensions for an ultra-low power tightly-coupled cluster of OpenRISC cores
Michael Gautschi, Andreas Traber, Antonio Pullini, Luca Benini, Michele Scandale, Alessandro Di Federico, Michele Beretta, Giovanni Agosta
2015 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), 25-30: IEEE, 2015.
DOI: 10.1109/VLSI-SoC.2015.7314386

  A −1.8V to 0.9V body bias, 60 GOPS/W 4-core cluster in low-power 28nm UTBB FD-SOI technology
Davide Rossi, Antonio Pullini, Michael Gautschi, Igor Loi, Frank K. Gürkaynak, Philippe Flatresse and Luca Benini
2015 IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S), 1-3, Piscataway, NJ: IEEE, 2015.
DOI: 10.1109/S3S.2015.7333483

  A ultra-low-energy convolution engine for fast brain-inspired vision in multicore clusters
Francesco Conti, Luca Benini
2015 Design, Automation & Test in Europe Conference & Exhibition (DATE), 683-688, Piscataway, NJ: IEEE, 2015.
DOI: 10.7873/DATE.2015.0404

  Lightweight Virtual Memory Support for Many-Core Accelerators in Heterogeneous Embedded SoCs
Pirmin Vogel, Andrea Marongiu, Luca Benini
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES '15), 45-54, Piscataway, NJ: IEEE, 2015.
DOI: 10.1109/CODESISSS.2015.7331367

2014

  Customizing an open source processor to fit in an ultra-low power cluster with a shared L1 memory
Michael Gautschi, Davide Rossi and Luca Benini
Proceedings of the 24th edition of the great lakes symposium on VLSI, 87-88, New York, NY: ACM, 2014
DOI: 10.1145/2591513.2591569

  Ultra-low-latency lightweight DMA for tightly coupled multi-core clusters
Davide Rossi, Igor Loi, Germain Haugou and Luca Benini
Proceedings of the 11th ACM Conference on Computing Frontiers, 15, Piscataway, NJ: IEEE, 2014.
DOI: 10.1145/2597917.2597922

  Energy-efficient vision on the PULP platform for ultra-low power parallel computing
Francesco Conti, Davide Rossi, Antonio Pullini, Igor Loi and Luca Benini
Proceedings of the 2014 IEEE Workshop on Signal Processing Systems, Piscataway, NJ: IEEE, 2014.
DOI: 10.1109/SiPS.2014.6986099

  Energy efficient parallel computing on the PULP platform with support for OpenMP
Davide Rossi, Igor Loi, Francesco Conti, Giuseppe Tagliavini, Antonio Pullini and Andrea Marongiu
IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI), 3-5 Dec. 2014, Eilat, Piscataway, NJ: IEEE, 2014.
DOI: 10.1109/EEEI.2014.7005803