Doctoral student Dimitra Papagiannopoulou Awarded Best Paper at SAMOS Conference

Dimitra Papagiannopoulou, a fifth-year doctoral student in the electrical sciences and computer engineering group at the Brown University School of Engineering, has won the Stamatis Vassiliadis Best Paper Award at the 2014 International Conference on Embedded Computer Systems: Architecture, MOdeling and Simulation (SAMOS XIV), held on Samos Island, Greece.

Papagiannopoulou's co-authors on the paper are Tali Moreshet, R. Iris Bahar, Andrea Marongiu, Luca Benini, and Maurice Herlihy.

Her paper is titled "Speculative Synchronization for Coherence-free Embedded NUMA Architectures."

ABSTRACT:

High-end embedded systems are turning to architectural configurations consisting of clusters of multi-processors, where each cluster also includes a portion of the shared memory. Such a configuration is known as a non-uniform memory access (NUMA) architecture. In order to keep the processors and memory hierarchy simple, such embedded systems tend to employ simple, scratchpad-like memories, rather than hardware-managed caches that require some form of cache coherence management. These "coherence-free" systems still require some means to synchronize memory accesses and guarantee memory consistency. Conventional lock-based approaches may be employed to accomplish the synchronization, but may lead to both usability and performance issues. Instead, speculative synchronization, such as hardware transactional memory, may be a more attractive approach. However, hardware speculative techniques traditionally rely on the underlying cache-coherence protocol to synchronize memory accesses among the cores. The lack of a cache-coherence protocol adds new challenges in the design of hardware speculative support. In this paper, we present a new scheme for hardware transactional memory support within a cluster-based NUMA system that lacks an underlying cache-coherence protocol. To the best of our knowledge, this is the first design for speculative synchronization for this type of architecture. Through a set of benchmark experiments, we show that our design can achieve significant performance improvements over traditional lock-based schemes.