As with previous Nvidia nForce chipsets, the new
nForce 4 is a single-chip solution. Instead of dividing the
chipset into a Northbridge (which handles video and memory traffic and
communicates directly with the processor) and a Southbridge (which handles
peripherals and drives and communicates with the Northbridge), all functions
have been placed on a single integrated core logic chip.
This design helps to reduce data
bottlenecks by completely eliminating the data bus between the Northbridge and
Southbridge chips. Given the unique architecture of the Athlon 64 processor,
in which the memory controller is on the CPU itself rather than in the chipset,
this kind of one-chip solution makes a lot of sense.
SLI and nForce 4
Nvidia's new SLI (Scalable Link Interface)
technology is used to link two Nvidia-based cards together, splitting the
rendering load between them to increase 3D performance. The technology requires
a pair of compatible video cards (Nvidia GeForce 6600GT models and above) with
SLI connectors (which must be implemented by the video card manufacturer) and an
nForce 4 SLI chipset-based motherboard.
Typical PCI-Express-based motherboards use the PCI-Express
x16 slot to interface with video cards. As you'd imagine, this provides 16
PCI-Express lanes to the single card for a total available bandwidth of 8GB/s
(4GB/s in each direction). The Nvidia nForce 4 SLI solution provides two
physical PCI-Express video slots, and uses a switch to
divert 8 PCI-Express data lanes to serve each slot.
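To put numbers on the lane split, here is a minimal Python sketch of the arithmetic. It assumes first-generation PCI-Express signalling at 250MB/s per lane in each direction, which is what the 8GB/s figure for an x16 slot implies; the function and constant names are our own and purely illustrative.

```python
# Assumption: first-generation PCI-Express, 250 MB/s per lane in each direction.
MB_PER_LANE_PER_DIRECTION = 250


def pcie_bandwidth_gb(lanes: int, both_directions: bool = True) -> float:
    """Return the peak theoretical bandwidth in GB/s for a link of `lanes` lanes."""
    per_direction = lanes * MB_PER_LANE_PER_DIRECTION / 1000
    return per_direction * 2 if both_directions else per_direction


single_card = pcie_bandwidth_gb(16)  # one card with the full x16 link
per_sli_slot = pcie_bandwidth_gb(8)  # each card when the lanes are split 8/8

print(f"x16 slot: {single_card} GB/s, each x8 SLI slot: {per_sli_slot} GB/s")
```

In other words, each card in SLI mode gets half the theoretical slot bandwidth of a single card, although the two slots together still add up to the same 16 lanes.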
A single card can also be used in either slot, and
in this case the full 16 PCI-Express lanes are available. In a typical SLI
solution, the cards themselves are also linked by way of an SLI cable attached
to the special MIO 'video bus' connector on the top of each card.
In the nForce 4 motherboards
we have seen, the SLI switch is implemented on a small card which must be
physically flipped around to go from 'normal mode', in which the full 16 lanes
of PCI-Express goodness are available to a single card, to 'SLI mode', in which
8 lanes are directed to each physical slot.
How It All Works
Nvidia's SLI works by allowing the two graphics
processors to share the rendering workload, governed by the Nvidia Detonator
software drivers. The CPU passes all necessary 3D information to the 'primary'
GPU, which then shares the information with the second card via the video bus
interface cable. This removes the overhead of synchronizing the two processors
from the PCI-Express bus, allowing improved performance. The video bus link
itself apparently runs at up to 10GB/s, though we doubt that this bandwidth is
fully utilized.
Currently, the only Nvidia SLI-compatible video
processors are the GeForce 6600GT, 6800, 6800GT and 6800 Ultra. The graphics
processors in each video card must be identical, as must the video BIOS
revisions, though the cards can be clocked at different speeds (the SLI system
will run both cards at the lower of the two clock speeds). This means that it
is going to be pretty much essential to have two identical cards from the same
manufacturer to get SLI working correctly. Nvidia has introduced a
certification program to ensure that users can find compatible products.
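The pairing rules described above can be sketched in a few lines of Python. This is only an illustration of the logic as we understand it, not Nvidia's actual driver code; the `Card` type, the BIOS revision strings and the clock figures in the example are all made up for the purpose.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Card:
    gpu: str        # e.g. "GeForce 6600GT"
    bios: str       # video BIOS revision (hypothetical format)
    core_mhz: int   # core clock
    mem_mhz: int    # memory clock


def sli_clocks(a: Card, b: Card) -> Optional[Tuple[int, int]]:
    """Return the (core, memory) clocks both cards would run at in SLI,
    or None if the pair does not meet the matching rules."""
    # Both the GPU model and the video BIOS revision must be identical.
    if a.gpu != b.gpu or a.bios != b.bios:
        return None
    # Clocks may differ; the pair runs at the lower of the two.
    return min(a.core_mhz, b.core_mhz), min(a.mem_mhz, b.mem_mhz)


# Illustrative values only: one stock card and one factory-overclocked card.
stock = Card("GeForce 6600GT", "rev-A", 500, 1000)
overclocked = Card("GeForce 6600GT", "rev-A", 525, 1050)
print(sli_clocks(stock, overclocked))
```

The practical upshot is that a faster card paired with a slower one gains nothing from its higher clocks while SLI is active.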
The actual SLI rendering process uses one of two modes: Alternate Frame
Rendering (AFR) and Split Frame Rendering (SFR). AFR has
each video card render a separate frame, while SFR, the method that has gotten
more publicity, uses each GPU to render part of one frame. Interestingly, the
choice of which method to use in which games is pre-programmed into the
Detonator driver suite, meaning that if there is no existing profile for the
game you are playing, SLI will not work with that game. In these cases, a
compatibility mode is used, which cuts off the SLI process completely,
using only a single GPU (and we'd assume only 8 PCI-Express lanes) for all
rendering tasks. Nvidia claims that it has already created profiles for more
than 100 of the most popular 3D games, and more will follow with Detonator
driver updates.
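Conceptually, the profile system behaves like a lookup table with a single-GPU fallback. The Python sketch below shows that idea only; the game names and the table contents are hypothetical, and the real driver's profile format is not something we have seen.

```python
from enum import Enum


class Mode(Enum):
    AFR = "alternate frame rendering"
    SFR = "split frame rendering"
    SINGLE_GPU = "compatibility mode (one GPU)"


# Hypothetical profile table; a real driver would ship its own list.
PROFILES = {
    "example_shooter": Mode.AFR,
    "example_flight_sim": Mode.SFR,
}


def select_mode(game: str) -> Mode:
    """Pick the rendering mode for a game, falling back to a single GPU
    when the game has no profile."""
    return PROFILES.get(game, Mode.SINGLE_GPU)


for title in ("example_shooter", "example_flight_sim", "unprofiled_game"):
    print(f"{title}: {select_mode(title).value}")
```

The fallback branch is the important part: a game missing from the table gets no benefit from the second card at all.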
The Split Frame Rendering mode is probably the most
interesting part of Nvidia's SLI technology. Using the Detonator driver to
balance and allocate the video load, each GPU handles about half of the
rendering work for each frame, then the completed frame is assembled by the
primary GPU and sent to the display. Obviously this will not be 100%
efficient, as different parts of each graphical frame will vary in complexity
and some overhead is added in assembling the frame at the end, but overall this
method should result in a considerable performance increase. You can expect CPU
load to increase as well, since the Detonator software is responsible for
balancing the video load between the cards at all times.
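To illustrate why the split cannot simply be down the middle of the screen, here is a rough Python sketch of one way a load balancer could choose the dividing line. It assumes the driver has some per-row estimate of rendering cost, which is our own simplification; how Nvidia's driver actually measures and balances load is not documented here.

```python
from typing import Sequence


def balanced_split(row_costs: Sequence[float]) -> int:
    """Return the number of rows (counted from the top of the frame) to give
    the first GPU so that both GPUs get as close to equal work as possible.
    `row_costs` is a hypothetical per-row rendering cost estimate."""
    total = sum(row_costs)
    running = 0.0
    best_row, best_diff = 0, total
    for row, cost in enumerate(row_costs, start=1):
        running += cost
        # Difference between the top portion and the remaining bottom portion.
        diff = abs(running - (total - running))
        if diff < best_diff:
            best_row, best_diff = row, diff
    return best_row


# A frame with a cheap sky at the top and expensive scenery at the bottom.
costs = [1, 1, 1, 1, 2, 2, 4, 4]
split = balanced_split(costs)
print(f"GPU 1 renders rows 0-{split - 1}, GPU 2 renders rows {split}-{len(costs) - 1}")
```

With these example costs the first GPU ends up with six of the eight rows, because the bottom two rows are as expensive as everything above them.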
Alternate Frame Rendering mode, where the two video cards take turns rendering
whole frames, should give even higher performance, but this technology cannot
always be used in modern 3D games due to certain graphical effects which require
multiple frames to be blended together. Split Frame Rendering has no such
limitation as both cards are always working on a single graphical frame.
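For comparison, the frame assignment in AFR is about as simple as scheduling gets. The Python sketch below shows the basic round-robin idea under our own naming; it is an illustration rather than anything taken from the driver.

```python
from typing import List


def afr_schedule(frame_count: int, gpu_count: int = 2) -> List[int]:
    """Return, for each frame index, the index of the GPU that renders it."""
    return [frame % gpu_count for frame in range(frame_count)]


schedule = afr_schedule(6)
for frame, gpu in enumerate(schedule):
    print(f"frame {frame} -> GPU {gpu}")
```

Because consecutive frames always land on different GPUs, any effect that blends frame N with frame N-1 needs data held by the other card, which is where the limitation mentioned above comes from.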
The major benefit of Nvidia's SLI is its ability to
more fully utilize the massive bandwidth of the PCI-Express x16 video solution.
A pair of GPUs can process information up to twice as fast (minus the overhead
of the communication between them) and use the available bandwidth more
efficiently, considerably boosting 3D performance. This should also enable
users to get top-tier performance out of a pair of mid-range 6600GT cards.
Interested users should note that having two video cards also considerably
increases power consumption, and using a pair of 6800 Ultras will mandate a
hefty power supply.