    Computer Models
    The first nCUBE machines to be released were the nCUBE 10 series of late 1985. These were based on a set of custom chips, including a 32-bit ALU and a 64-bit IEEE 754 FPU, combined with 128 kB of RAM onto a board known as a module. Each module delivered 2 MIPS and 500 kiloflops (32-bit single precision) or 300 kiloflops (64-bit double precision), and ran the Vertex operating system.

    The name referred to the machine's ability to build an order-ten hypercube, supporting 1024 CPUs in a single machine. Some of the modules would be used strictly for input/output, which included the nChannel storage-control card, frame buffers, and the InterSystem card that allowed nCUBEs to be attached to each other. At least one host board needed to be installed, acting as the terminal driver. The host board could also partition the machine into sub-cubes and allocate them separately to different users.
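
As a rough illustration of the hypercube layout described above, the sketch below numbers the nodes of an order-n hypercube from 0 to 2^n - 1, treats two nodes as neighbors when their numbers differ in exactly one bit, and carves the machine into sub-cubes by fixing the high-order bits. The function names and the numbering scheme are assumptions made for illustration; the post does not describe how the nCUBE host board actually numbered or allocated nodes.

```python
from typing import List


def hypercube_size(order: int) -> int:
    """Number of nodes in a hypercube of the given order (2**order)."""
    return 1 << order


def neighbors(node: int, order: int) -> List[int]:
    """Nodes directly linked to `node`: flip one address bit per dimension."""
    return [node ^ (1 << d) for d in range(order)]


def subcube_nodes(order: int, sub_order: int, index: int) -> List[int]:
    """Nodes of the `index`-th sub-cube of order `sub_order`.

    Hypothetical allocation scheme: the high (order - sub_order) bits select
    the sub-cube, and the low sub_order bits address nodes inside it.
    """
    if not 0 <= sub_order <= order:
        raise ValueError("sub_order must be between 0 and order")
    if not 0 <= index < (1 << (order - sub_order)):
        raise ValueError("sub-cube index out of range")
    base = index << sub_order
    return list(range(base, base + (1 << sub_order)))


if __name__ == "__main__":
    print(hypercube_size(10))           # nodes in an order-ten machine
    print(neighbors(0, 3))              # neighbors of node 0 in a 3-cube
    print(len(subcube_nodes(10, 8, 1))) # nodes in one order-eight sub-cube
```

Under this numbering, an order-ten machine splits cleanly into four order-eight sub-cubes, each of which is itself a hypercube over its low eight address bits.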

    Researchers Robert Benner, John Gustafson and Gary Montry of the Parallel Processing Division of Sandia National Laboratories won the first Gordon Bell Prize in 1987 using the nCUBE 10.

    For the second series the naming was changed, and the company created the single-chip nCUBE 2 processor. This was otherwise similar to the nCUBE 10's CPU, but ran faster at 25 MHz to provide about 7 MIPS and 3.5 megaflops. The clock was later raised to 30 MHz in the 2S model. RAM was increased as well, with 4 to 16 MB of RAM on a "single wide" 1" x 3.5" module, double that on the "double wide" module, and quadruple that on a double-wide, double-sided module. The I/O cards generally had less RAM, with different backend interfaces to support SCSI, HIPPI, etc.

    Each nCUBE-2 CPU also included thirteen I/O channels running at 20 Mbit/s. One of these was dedicated to I/O duties, while the other twelve were used as the interconnect system between CPUs. Each channel used wormhole routing to forward messages. The machines themselves were wired up as order-twelve hypercubes, allowing for up to 4096 CPUs in a single machine.
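
To make the message path through such a hypercube concrete, here is a minimal sketch of dimension-order ("e-cube") routing, a common fixed routing scheme on hypercubes. The post does not say which path-selection rule the nCUBE-2 used, so this is an assumption for illustration; the function name `ecube_route` is hypothetical, and wormhole flow control itself (splitting a message into pieces that follow the header through the switches) is not modeled here.

```python
from typing import List


def ecube_route(src: int, dst: int, order: int) -> List[int]:
    """Return the node sequence from src to dst in an order-`order` hypercube.

    Assumed scheme: correct the differing address bits one dimension at a
    time, lowest dimension first. Each hop crosses one interconnect channel.
    """
    limit = 1 << order
    if not (0 <= src < limit and 0 <= dst < limit):
        raise ValueError("node address outside the hypercube")
    path = [src]
    node = src
    for d in range(order):
        if (node ^ dst) & (1 << d):  # addresses differ in dimension d
            node ^= 1 << d           # hop across that dimension's channel
            path.append(node)
    return path


if __name__ == "__main__":
    # Route between two nodes of an order-twelve machine.
    route = ecube_route(0b000000000101, 0b100000000000, 12)
    print(route, "hops:", len(route) - 1)
```

The hop count equals the number of bits in which the two addresses differ, so no route in an order-twelve machine needs more than twelve hops.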

    Each module ran a 200 kB microkernel called nCX, but the system now used a Sun Microsystems workstation as the front end and no longer needed the Host Controller. nCX included a parallel filesystem that could do 96-way striping for high performance. C and C++ were available, as were NQS, Linda, and Parasoft's Express. These were supported by an in-house compiler team.

    The largest nCUBE-2 system installed was at Sandia National Laboratories, a 1024-CPU system that reached 1.91 gigaflops in testing.
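
For a sense of scale, the measured figure can be compared with a naive peak estimate. The sketch below assumes every CPU in the Sandia machine delivered the roughly 3.5 megaflops quoted earlier for the 25 MHz part, and that the measured and per-CPU figures refer to the same precision; the post confirms neither, so the resulting ratio is only indicative.

```python
def peak_flops(cpus: int, flops_per_cpu: float) -> float:
    """Naive aggregate peak: every CPU at its quoted rate, no overheads."""
    return cpus * flops_per_cpu


def fraction_of_peak(measured: float, cpus: int, flops_per_cpu: float) -> float:
    """Measured throughput as a fraction of the naive aggregate peak."""
    return measured / peak_flops(cpus, flops_per_cpu)


if __name__ == "__main__":
    CPUS = 1024              # from the post
    FLOPS_PER_CPU = 3.5e6    # assumption: 25 MHz nCUBE 2 figure applies
    MEASURED = 1.91e9        # from the post

    print(f"naive peak: {peak_flops(CPUS, FLOPS_PER_CPU) / 1e9:.3f} gigaflops")
    print(f"fraction of peak: {fraction_of_peak(MEASURED, CPUS, FLOPS_PER_CPU):.0%}")
```

Under these assumptions the naive peak is about 3.58 gigaflops, so the measured 1.91 gigaflops is a little over half of it.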

    The nCUBE-3 CPU included several improvements, and moved to a 64-bit ALU. Among the other improvements was a process shrink to 0.5 µm, allowing the speed to be increased to 50 MHz (with plans for 66 and 100 MHz). The CPU was also superscalar and included 16 kB instruction and data caches, and an MMU for virtual memory support.

    Additional I/O links were added, with two dedicated to I/O and sixteen for interconnects, allowing for up to 65,536 CPUs in the hypercube. The channels operated at 100 Mbit/s, due to the use of 2-bit parallel lines instead of the serial lines used previously. The nCUBE-3 also added fault-tolerant adaptive routing support, in addition to fixed routing, although in retrospect it's not entirely clear why.

    A fully loaded nCUBE-3 machine could use up to 65k processors, for 3 TIPS and 6.5 teraflops. The maximum memory would be 65 Tb, with a network I/O capability of 24 TB/second. Thus, the processor is biased toward I/O, which is usually the limiting factor. The nChannel board provided 16 I/O channels, each of which could support transfers at 20 Mbyte/s.
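
The aggregate figures above can be broken down per processor with simple division. The sketch below takes "65k" to mean the 65,536 CPUs of an order-sixteen hypercube and treats TIPS, teraflops and TB/second as decimal (10^12) units; both readings are assumptions, and the per-CPU numbers are derived from the quoted totals, not independently sourced.

```python
def per_cpu(total: float, cpus: int) -> float:
    """Even share of an aggregate figure across all CPUs."""
    return total / cpus


if __name__ == "__main__":
    CPUS = 1 << 16        # assumption: "65k" means 65,536

    TOTAL_IPS = 3e12      # 3 TIPS, from the post
    TOTAL_FLOPS = 6.5e12  # 6.5 teraflops, from the post
    TOTAL_IO = 24e12      # 24 TB/second, from the post (decimal TB assumed)

    print(f"MIPS per CPU:      {per_cpu(TOTAL_IPS, CPUS) / 1e6:.1f}")
    print(f"megaflops per CPU: {per_cpu(TOTAL_FLOPS, CPUS) / 1e6:.1f}")
    print(f"I/O MB/s per CPU:  {per_cpu(TOTAL_IO, CPUS) / 1e6:.1f}")
```

On these assumptions the totals work out to roughly 46 MIPS, 99 megaflops and 366 MB/s of network I/O per processor.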

    Tue, Jan 8, 2008
    Categories: Parallel Computing