[Images: "Beowulf: The Original" and the original text]
For SC96 the two Beowulf clusters, 16 processors each, from Caltech and LANL, were joined into a single larger cluster, interconnected in a not-entirely-obvious topology. Because of differences in component pricing, the two systems had started with different topologies.

The two machines were connected by 16 full-duplex Fast Ethernet point-to-point links (100 Mbps transmit and 100 Mbps receive on each of 16 channels), completing a fifth-degree (32-node) hypercube with bypasses. The routing was configured so that most traffic went through the switches.
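A minimal sketch of how such a topology can be enumerated (Python; the 0-31 node numbering, with the top address bit distinguishing the two machines, is an assumption for illustration, not the actual wiring plan):

```python
# Toy enumeration of a 5-dimensional (32-node) hypercube topology.
# Assumption (not from the original text): nodes 0-15 are one machine,
# nodes 16-31 the other, so bit 4 distinguishes the two clusters and the
# 16 point-to-point links between them form the fifth hypercube dimension.

DIMENSIONS = 5
NODES = 1 << DIMENSIONS          # 32 processors total

def neighbors(node):
    """Hypercube neighbors of `node`: flip each of the 5 address bits."""
    return [node ^ (1 << d) for d in range(DIMENSIONS)]

def inter_machine_links():
    """The 16 point-to-point links: pairs differing only in the top bit."""
    return [(n, n | (1 << (DIMENSIONS - 1))) for n in range(NODES // 2)]

if __name__ == "__main__":
    print("node 0 neighbors:", neighbors(0))          # [1, 2, 4, 8, 16]
    print("inter-machine links:", inter_machine_links())
```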
The processing nodes were all single-processor 200 MHz P6 (Pentium Pro) systems using the Intel VS440FX "Venus" boards with the "Natoma" chipset. Each node had 128 MB of memory and 5 GB of disk. The Fast Ethernet adapters (six channels in the LANL machine, two in the Caltech machine) used the excellent DEC "Tulip" bus-master chips.
[[ Note: both Ethernet switch brands are effectively no-contention crossbars with both input and output buffering, resulting in much more effective bandwidth use than, e.g., a Myrinet switch. ]]
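The effect of that buffering can be illustrated with a toy simulation (Python; the port count and uniform random traffic are made-up parameters, and neither model is meant to represent the actual switches or Myrinet hardware):

```python
import random

# Toy model, illustration only: under uniform random traffic a bufferless
# crossbar loses slots whenever two inputs contend for one output, while an
# output-buffered crossbar queues the losers and drains them later.

PORTS = 16
SLOTS = 100_000

def bufferless_throughput(ports=PORTS, slots=SLOTS):
    """Each slot every input targets a random output; only one packet per
    contended output gets through, the rest are blocked."""
    delivered = 0
    for _ in range(slots):
        targets = [random.randrange(ports) for _ in range(ports)]
        delivered += len(set(targets))      # one winner per distinct output
    return delivered / (ports * slots)

def buffered_throughput(ports=PORTS, slots=SLOTS):
    """Contending packets go into per-output FIFOs; each output drains one
    packet per slot, so almost nothing is lost."""
    queues = [0] * ports
    delivered = 0
    for _ in range(slots):
        for _ in range(ports):              # every input offers one packet
            queues[random.randrange(ports)] += 1
        for out in range(ports):            # each output sends one packet
            if queues[out]:
                queues[out] -= 1
                delivered += 1
    return delivered / (ports * slots)

if __name__ == "__main__":
    print(f"bufferless crossbar: ~{bufferless_throughput():.2f} of offered load")
    print(f"buffered crossbar:   ~{buffered_throughput():.2f} of offered load")
```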
Most other extant Beowulf systems (Drexel, GWU, Clemson, several at GSFC) are P5 systems using "channel bonding": multiple parallel channels of 100 Mbps Fast Ethernet. This allows scalable bandwidth and avoids the high cost of an Ethernet switch. But for the systems above, the price/performance tradeoff, combined with the faster P6 boards, pointed to a switched, richer-interconnect-topology approach.
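Conceptually, channel bonding stripes outbound traffic across the parallel channels and reassembles it on the receiving node. A minimal round-robin sketch (Python; the Channel and BondedLink classes are hypothetical stand-ins for illustration, not the actual Beowulf kernel driver):

```python
from itertools import cycle

# Conceptual sketch of channel bonding: stripe outgoing packets round-robin
# across N parallel Fast Ethernet channels so aggregate bandwidth scales
# with the channel count.

class Channel:
    """Stand-in for one physical 100 Mbps Fast Ethernet adapter."""
    def __init__(self, name, mbps=100):
        self.name = name
        self.mbps = mbps
        self.sent = []

    def send(self, packet):
        self.sent.append(packet)

class BondedLink:
    """Presents several physical channels as one logical link."""
    def __init__(self, channels):
        self.channels = channels
        self._next = cycle(channels)

    @property
    def bandwidth_mbps(self):
        return sum(ch.mbps for ch in self.channels)

    def send(self, packet):
        next(self._next).send(packet)       # round-robin striping

if __name__ == "__main__":
    link = BondedLink([Channel(f"eth{i}") for i in range(2)])
    for seq in range(6):
        link.send(f"packet-{seq}")
    print("aggregate bandwidth:", link.bandwidth_mbps, "Mbps")
    for ch in link.channels:
        print(ch.name, "->", ch.sent)
```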
Two "white papers" written by Thomas Sterling are available here and here.