The long latencies introduced by remote accesses in a
large multiprocessor can be hidden by caching. Caching also decreases the network load. We introduce a new class of
architectures called Cache Only Memory Architectures (COMA). These architectures provide the programming paradigm of the
shared-memory architectures, but have no physically shared memory; instead, the
caches attached to the processors contain all the memory in the system, and their size is therefore large. A
datum is allowed to be in any or many of the caches, and will automatically be moved to where it is needed by a cache-coherence protocol, which also ensures that the last copy of a datum is never lost. The location of a datum in the
machine is completely decoupled from its address. We also introduce one example of a COMA, the Data Diffusion Machine (DDM), and present its simulated performance for large applications. The DDM is based on a hierarchical network structure, with processor/memory pairs at its tips. Remote accesses generally cause only a
limited amount of traffic over a limited part of the machine.
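
The behaviour described above (a datum may reside in any or many caches, moves to where it is needed, and its last copy is never dropped) can be illustrated with a minimal sketch. This is a toy, flat model written for illustration only and is not the DDM protocol: the class `ToyComa`, its methods, and the "relocate to the next node" eviction policy are all assumptions, and the hierarchical network is not modelled.

```python
class ToyComa:
    """Toy flat model of per-processor caches that together hold all memory.

    Hypothetical illustration; not the DDM coherence protocol.
    """

    def __init__(self, num_nodes):
        # One large cache (address -> value) per processor node.
        self.memories = [dict() for _ in range(num_nodes)]

    def holders(self, addr):
        # Nodes currently holding a copy; location is independent of the address.
        return [n for n, mem in enumerate(self.memories) if addr in mem]

    def read(self, node, addr):
        mem = self.memories[node]
        if addr not in mem:
            owners = self.holders(addr)
            if not owners:
                raise KeyError(addr)
            # Replicate the datum into the reader's cache.
            mem[addr] = self.memories[owners[0]][addr]
        return mem[addr]

    def write(self, node, addr, value):
        # Invalidate every other copy, then keep the only copy locally.
        for other in self.holders(addr):
            if other != node:
                del self.memories[other][addr]
        self.memories[node][addr] = value

    def evict(self, node, addr):
        value = self.memories[node].pop(addr)
        if not self.holders(addr):
            # Last copy: relocate it instead of losing it (assumed policy).
            target = (node + 1) % len(self.memories)
            self.memories[target][addr] = value
```

In this sketch, a read replicates a datum, a write leaves a single copy at the writer, and evicting the last copy moves it to another node rather than discarding it.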