DeNovo: Rethinking Hardware for Disciplined Parallelism  
Sarita Adve's Research Group
University of Illinois at Urbana-Champaign

EE Times article about DeNovo

A large part of the complexity and inefficiency in current hardware concurrency mechanisms arguably arises from a software-oblivious approach to hardware design. The DeNovo project seeks to rethink concurrent hardware from the ground up, given the assumption that most future software will use disciplined concurrency models for better dependability. We are pursuing the following directions:

Disciplined hardware concurrency model: The history of work on memory consistency models (which lie at the heart of concurrency semantics) exemplifies the pitfalls of software-oblivious hardware design. Our previous work on the data-race-free model took a combined hardware/software approach and laid the foundation for the convergence in memory models adopted by high-level languages and most hardware vendors today (e.g., the Java and C++ memory models). Nevertheless, these models are still too complex and fragile, particularly in the context of safe programming languages. Like data-race-free, DeNovo takes the approach that hardware should provide semantics that are easy to reason about (sequential consistency or stronger) only for disciplined programs. DeNovo requires that violations of the discipline preferably be caught at compile time and, in the worst case, generate an exception at runtime. We are defining the notion of "discipline" at the hardware level in collaboration with the UPCRC disciplined languages work. In particular, we espouse deterministic-by-default semantics: non-determinism, when required, should be explicitly requested and well encapsulated. We are exploring how to express and implement such semantics for hardware, given its need to balance the requirements of a variety of programs, including legacy programs.
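
To make the idea of a checkable discipline concrete, here is a minimal sketch of a check that rejects a parallel phase whose tasks interfere. It is an illustration only, not the DeNovo or DPJ mechanism: the names `Task`, `DisciplineViolation`, and `check_parallel_phase`, and the use of strings as memory regions, are assumptions made for this example. Each task declares the regions it reads and writes, and a violation raises an exception, mirroring the worst-case runtime behavior described above.

```python
from dataclasses import dataclass


class DisciplineViolation(Exception):
    """Raised when two tasks in one parallel phase declare interfering effects."""


@dataclass(frozen=True)
class Task:
    name: str
    reads: frozenset = frozenset()   # regions the task may read (hypothetical labels)
    writes: frozenset = frozenset()  # regions the task may write


def interferes(a, b):
    # Two tasks conflict if either one writes a region the other reads or writes.
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


def check_parallel_phase(tasks):
    # Pairwise check: every pair of tasks in the phase must be non-interfering.
    for i, a in enumerate(tasks):
        for b in tasks[i + 1:]:
            if interferes(a, b):
                raise DisciplineViolation(f"{a.name} and {b.name} interfere")
    return True


# Two tasks that share a read-only input and write disjoint outputs are accepted.
left = Task("left", reads=frozenset({"in"}), writes=frozenset({"out[0]"}))
right = Task("right", reads=frozenset({"in"}), writes=frozenset({"out[1]"}))
check_parallel_phase([left, right])
```

A language-level checker would ideally establish the same property at compile time; the runtime exception here stands in for the fallback case.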

Rewarding and enforcing discipline: We believe there are many opportunities to exploit disciplined programming models to build simpler and more effective hardware. For example, current hardware cache coherence protocols are designed for "wild" shared memory, making them unnecessarily complex and hard to scale. We are exploring how knowledge of a disciplined programming approach can simplify the maintenance of cache coherence, and how such information can best be used to represent and manage tasks in the hardware and the runtime to maximize locality and load balance. Hardware can also be used to aid disciplined programming languages. For example, there will invariably be untrusted and unverified code that may not obey the required discipline; we are working on hardware support for "sandboxing" such code. Other opportunities include hardware and runtime support for fine-grained synchronization and direct support for deterministic constructs.
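
As one hypothetical illustration of how discipline might simplify coherence, the toy model below assumes that each region has at most one writer per parallel phase and that the written regions are known at the phase boundary. Under those assumptions, each core can simply drop its own stale copies at the end of a phase, with no sharer lists or invalidation messages. This is a sketch under stated assumptions, not the project's protocol; `Core` and `end_phase` are names invented for this example.

```python
class Core:
    def __init__(self, memory):
        self.memory = memory  # shared backing store: region -> value
        self.cache = {}       # private cache: region -> value

    def read(self, region):
        # On a miss, fetch the value from shared memory.
        if region not in self.cache:
            self.cache[region] = self.memory[region]
        return self.cache[region]

    def write(self, region, value):
        # Writes stay in the private cache until the phase ends.
        self.cache[region] = value


def end_phase(cores, written_by):
    """written_by maps a core index to the regions that core wrote this phase.

    Assumes the discipline guarantees at most one writer per region.
    """
    # Step 1: each writer publishes its regions to shared memory.
    for i, regions in written_by.items():
        for region in regions:
            if region in cores[i].cache:
                cores[i].memory[region] = cores[i].cache[region]
    # Step 2: every other core self-invalidates its copy of those regions.
    for i, core in enumerate(cores):
        for j, regions in written_by.items():
            if j != i:
                for region in regions:
                    core.cache.pop(region, None)


memory = {"x": 0, "y": 0}
cores = [Core(memory), Core(memory)]
cores[1].read("x")              # core 1 caches the old value of x
cores[0].write("x", 5)          # core 0 is the only writer of x in this phase
end_phase(cores, {0: {"x"}})
latest = cores[1].read("x")     # core 1 misses and refetches the new value
```

The point of the sketch is that the invalidation decision is local to each core: it depends only on the declared write sets, not on tracking which caches hold which lines.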

Interface mechanism - a typed virtual instruction set: Virtualization is perhaps the only viable means for supporting the expected variety of heterogeneous architectures as well as implementation-specific mechanisms that vendors may be reluctant to make part of their software-exposed ISA. A virtual instruction set computer (VISC) provides a separate low-level but rich and machine-independent ISA for software, which is then translated to a hardware ISA. The latter is implementation-specific and never exposed to the software (other than the dynamic translator). We are exploring the design of such an ISA and integrated runtime for DeNovo. Given our emphasis on safety and dependability, this ISA will be typed, to allow expressing rich high-level information in a structured way and to check safety properties at install time or at runtime.
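
The sketch below illustrates what an install-time check over a typed virtual ISA could look like, followed by translation to an implementation-specific form. Everything in it is hypothetical: the four opcodes, the `i32` and `ptr` types, and the names `Instr`, `verify`, `translate`, and `InstallError` are assumptions chosen for the example and do not describe the DeNovo ISA.

```python
from dataclasses import dataclass


class InstallError(Exception):
    """Raised when a virtual program fails the install-time type check."""


@dataclass(frozen=True)
class Instr:
    op: str            # "const", "alloc", "add", or "load" (hypothetical opcodes)
    dst: str           # destination virtual register
    args: tuple = ()   # source registers, or a literal for "const"


# Result type produced by each hypothetical opcode.
RESULT_TYPE = {"const": "i32", "alloc": "ptr", "add": "i32", "load": "i32"}
# Types each opcode requires of its source registers (None: operand is a literal).
OPERAND_TYPES = {"const": None, "alloc": (), "add": ("i32", "i32"), "load": ("ptr",)}


def verify(program):
    """Check that every instruction's operands have the required types."""
    types = {}
    for ins in program:
        if ins.op not in RESULT_TYPE:
            raise InstallError(f"unknown opcode {ins.op!r}")
        expected = OPERAND_TYPES[ins.op]
        if expected is not None:
            actual = tuple(types.get(a) for a in ins.args)
            if actual != expected:
                raise InstallError(f"{ins.op} {ins.dst}: expected {expected}, got {actual}")
        types[ins.dst] = RESULT_TYPE[ins.op]
    return types


def translate(program):
    """Translate only after verification; the output format is a made-up hardware ISA."""
    verify(program)
    return [" ".join([ins.op.upper(), ", ".join([ins.dst, *map(str, ins.args)])])
            for ins in program]


program = [
    Instr("const", "a", (1,)),
    Instr("alloc", "p"),
    Instr("load", "v", ("p",)),
    Instr("add", "s", ("a", "v")),
]
hardware_code = translate(program)
```

A program that tries to `load` through an `i32` register would be rejected by `verify` before any translation happens, which is the kind of safety property a typed ISA makes checkable at install time.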

The DeNovo project is in close collaboration with the Deterministic Parallel Java (DPJ) project.