"Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors"
As computing and embedded systems evolve towards highly parallel multiprocessors, major
research and development efforts are being focused on network interface (NI) architectures that
enable efficient interprocessor communication (IPC). This thesis focuses on NI architecture,
prototyping and design issues for cluster and chip multiprocessors. It covers the
development of an NI queue manager, a discussion of key NI design issues with respect to IPC,
and a proposed NI design well suited to chip multiprocessors.
The first part of this thesis presents the architecture, design and implementation of an NI queue
manager that supports Virtual Output Queuing (VOQ), Variable-Size Multi-Packet Segmentation and
QFC flow control. To increase the available network buffer space, VOQs can migrate to external
memory in the form of memory blocks connected in linked lists. Free-List Bypass and Free Block
Preallocation optimization techniques are employed to minimize the required number of accesses
to external memory and achieve higher bandwidth. In addition, a novel packet processing
mechanism that converts arbitrary traffic segments into autonomous network packets was
implemented. Detailed FPGA hardware cost results are presented for each individual module, as
well as for three different implementations of the queue manager. Network performance
experiments were carried out using the developed queue manager on an FPGA-based prototyping
platform and confirmed previous theoretical and simulation results on the behavior and
performance of the buffered crossbar switch.
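
As an illustration of queues stored as memory blocks connected in linked lists, the following is a minimal software sketch, not the hardware design described in this thesis. The names BlockMemory and BlockQueue, the one-payload-per-block layout and the block count are assumptions made for the example. The Free-List Bypass and Free Block Preallocation optimizations are not modeled; the sketch only shows the baseline in which every enqueue takes a block from a shared free list and every dequeue returns one to it.

```python
class BlockMemory:
    """Toy model of external memory split into fixed-size blocks (hypothetical)."""

    def __init__(self, num_blocks):
        self.next_ptr = [None] * num_blocks   # per-block link to the next block
        self.data = [None] * num_blocks       # per-block payload
        # Initially every block sits on the free list, chained in index order.
        for i in range(num_blocks - 1):
            self.next_ptr[i] = i + 1
        self.free_head = 0 if num_blocks > 0 else None

    def alloc(self):
        """Pop one block from the free list; return None if no block is free."""
        blk = self.free_head
        if blk is not None:
            self.free_head = self.next_ptr[blk]
            self.next_ptr[blk] = None
        return blk

    def free(self, blk):
        """Push a block back onto the head of the free list."""
        self.data[blk] = None
        self.next_ptr[blk] = self.free_head
        self.free_head = blk


class BlockQueue:
    """One FIFO queue stored as a linked list of blocks in a shared BlockMemory."""

    def __init__(self, mem):
        self.mem = mem
        self.head = None
        self.tail = None

    def enqueue(self, payload):
        """Append a payload; return False if the shared memory has no free block."""
        blk = self.mem.alloc()
        if blk is None:
            return False
        self.mem.data[blk] = payload
        if self.tail is None:
            self.head = blk                    # queue was empty
        else:
            self.mem.next_ptr[self.tail] = blk  # link behind the old tail
        self.tail = blk
        return True

    def dequeue(self):
        """Remove and return the oldest payload, or None if the queue is empty."""
        if self.head is None:
            return None
        blk = self.head
        payload = self.mem.data[blk]
        self.head = self.mem.next_ptr[blk]
        if self.head is None:
            self.tail = None
        self.mem.free(blk)
        return payload
```

Because all queues draw from one free list, buffer space is shared dynamically among them: a block released by one queue can immediately be reused by another.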
The second part starts with a discussion of fundamental NI design issues that affect IPC.
Various approaches and solutions are evaluated based on performance, scalability, reliability and
protection. The issues addressed include NI placement, virtualization and protection, address
translation, data transfer mechanisms and the software interface. Relevant academic and
commercial approaches and solutions are referenced throughout the discussion.
The second part also proposes an NI design that is lightweight and tightly
coupled to the processor, making it well suited to future chip multiprocessors. Two powerful
communication primitives are offered: Message Queues and Remote DMA. Message Queues are
intended for low-latency communication, mainly synchronization and control messages or small
low-overhead data transfers. Remote DMA minimizes processor involvement in communication,
is well suited for bulk data transfers and facilitates zero-copy protocols. Furthermore, the
proposed NI supports a versatile protection and security solution based on
protection zones that can easily be adapted to the specific security requirements of a system.
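
The difference between the two communication primitives can be sketched with a toy software model. This is an illustration only and does not reflect the interface of the proposed NI. The names Node, send_message and rdma_write, the byte-array memory and the bounds check are all assumptions introduced for the example; protection zones are not modeled.

```python
from collections import deque


class Node:
    """Toy processor node with local memory and an incoming message queue."""

    def __init__(self, mem_size):
        self.memory = bytearray(mem_size)
        self.msg_queue = deque()

    def receive(self):
        """Processor-side poll: return the oldest message, or None if empty."""
        return self.msg_queue.popleft() if self.msg_queue else None


def send_message(dst, payload):
    """Message-queue primitive: deliver a small payload to dst's queue."""
    dst.msg_queue.append(bytes(payload))


def rdma_write(src, src_addr, dst, dst_addr, length):
    """Remote-DMA primitive: copy bytes from src memory to dst memory.

    Models a transfer that lands directly at the destination address
    without passing through the destination's message queue or software.
    """
    if src_addr < 0 or dst_addr < 0 or length < 0:
        raise ValueError("negative address or length")
    if src_addr + length > len(src.memory) or dst_addr + length > len(dst.memory):
        raise ValueError("transfer exceeds memory bounds")
    dst.memory[dst_addr:dst_addr + length] = src.memory[src_addr:src_addr + length]
```

A typical pattern combines the two: the bulk data is moved with rdma_write, and a short message is then sent with send_message to tell the receiver that the data is in place.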
Keywords: network interface, chip multiprocessors, virtual output queues, hardware
linked-lists, variable-size multi-packet segmentation
Thesis Advisor: Manolis Katevenis, Professor