Mesh-of-Trees interconnection network for an eXplicitly Multi-Threaded parallel computer architecture
As the multiple-decade long increase in clock rates starts to slow down, main-stream general-purpose processors evolve towards single-chip parallel processing. On-chip interconnection networks are essential components of such machines, supporting the communication between processors and the memory system. This task is especially challenging for some easy-to-program parallel computers, which are designed with performance-demanding memory systems.
This study proposes an interconnection network, with a novel implementation of the Mesh-of-Trees (MoT) topology. The MoT network is evaluated relative to metrics such as wire area complexity, total register count, bandwidth, network diameter, single switch delay, maximum throughput per area, trade-offs between throughput and latency, and post-layout performance. It is also compared with some other traditional network topologies, such as mesh, ring, hypercube, butterfly, fat trees, butterfly fat trees, and replicated butterfly networks. Concrete results show that MoT provides higher throughput and lower latency especially when the input traffic (or the on-chip parallelism) is high, at comparable area cost. The layout of MoT network is evaluated using standard cell design methodology. A prototype chip with 8-terminal MoT network was taped out at 90nm technology and tested. In the context of an easy-to-program single-chip parallel processor, MoT network is embedded in the eXplicit Multi-Threading (XMT) architecture, and evaluated by running parallel applications. In addition to the basic MoT architecture, a novel hybrid extension of MoT is proposed, which allows significant area savings with a small reduction in throughput.
0984: Computer science