Intel Pentium 4 1.5 GHz (m478) Review
When Intel released their Pentium 4 processor about 1-1/2 years
ago, it was met with mixed reactions. While boasting an incredible clock speed of
1.5 GHz (which was a lot back then), its performance was close to that of their Pentium
3 1.0 GHz CPU, even though it had a 500 MHz advantage! With a current street price
of $240 CDN, the P4 1.5 GHz is hardly cheap, but it is quite inexpensive compared
to the top-of-the-line Pentium 4 models.
Alongside the introduction of the Pentium 4, Intel stated that the
heart of the processor is its NetBurst micro-architecture. What that
incorporates is pictured below.
The 400 MHz system bus is really a quad pumped (QDR) 100 MHz FSB. This means
that for every clock cycle it sends 4 pieces of data to the CPU, delivering an
incredible 3.2GB/s worth of bandwidth. That's about three times the bandwidth
a Pentium 3 would get with a 133 MHz FSB (1.06GB/s).
Hyper-pipelined technology: With a 20 stage pipeline, twice the depth of the
Pentium 3's 10 stage pipeline, and better branch prediction / recovery to go
with it, the Pentium 4 significantly increases the frequency and scalability of
the processor.
Rapid Execution engine: With two ALUs (Arithmetic
Logic Units) running at twice the processor speed, integer
instructions such as Add, Subtract, Logical AND and Logical OR can run every
1/2 clock cycle, speeding up everyday processes.
Advanced transfer cache: With extremely low
latencies, the 8K data cache increases performance over its predecessor because
of its speed.
Advanced dynamic execution: An out-of-order
speculative execution engine that reduces the number of branch
mis-predictions by about 33% over the P6 generation.
Enhanced floating point/multimedia: A second
register has been added and the data pathway has been expanded to
128 bits.
Streaming SIMD (Single Instruction Multiple Data)
Extensions 2 (SSE2):
With 144 new instructions, 128-bit SIMD
integer arithmetic and 128-bit SIMD double-precision floating-point operations
will greatly enhance and accelerate video, audio, 3D rendering, etc.
Intel has always
been a very innovative company, trying to introduce new things to the CPU sector.
With its longer pipeline, they hope the Pentium 4 architecture will drive
their processors for the next few GHz.
The 400 MHz (100 MHz QDR) FSB alleviates the
bus speed bottleneck the Pentium 3 had. Often, overclocking a Pentium 3's FSB to,
say, 150 MHz would boost its performance past CPUs that were two classes higher in
terms of raw core MHz! Still, problems did arise with the 400 MHz
FSB.
To take advantage of that great amount of
bandwidth, you also need memory that can deliver the same amount;
otherwise the processor would be starved of data. RDRAM was chosen to be
standard issue with the P4s because of licensing agreements with Rambus Inc.
Only dual channel RDRAM could deliver enough data to satisfy the Pentium 4.
Eventually Intel did release a chipset that used SDRAM, but the performance
penalty for that was severe (about 30%!), and only now are DDR chipsets starting
to surface.
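To put rough numbers on the bandwidth matching described above, here is a small back-of-the-envelope sketch. The 3.2GB/s and 1.06GB/s figures come from the text; the bus widths (a 64-bit FSB, 16-bit RDRAM channels running at 400 MHz double pumped for PC800, and a 64-bit PC133 SDRAM bus) are assumptions filled in from the commonly quoted specs, and the function name is made up for illustration. These are peak theoretical numbers only, not measured results.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak theoretical bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8

# Front side buses (64-bit width is an assumption, not stated in the review)
p4_fsb = peak_bandwidth_mb_s(100, 4, 64)   # quad pumped 100 MHz
p3_fsb = peak_bandwidth_mb_s(133, 1, 64)   # plain 133 MHz

# Memory (channel widths and clocks are assumptions, see above)
rdram_single = peak_bandwidth_mb_s(400, 2, 16)   # one PC800 RDRAM channel
rdram_dual = 2 * rdram_single                    # dual channel
sdram_pc133 = peak_bandwidth_mb_s(133, 1, 64)    # PC133 SDRAM

print(f"P4 FSB:       {p4_fsb:.0f} MB/s")
print(f"P3 FSB:       {p3_fsb:.0f} MB/s")
print(f"Dual RDRAM:   {rdram_dual:.0f} MB/s")
print(f"PC133 SDRAM:  {sdram_pc133:.0f} MB/s")
print(f"P4 / P3 FSB:  {p4_fsb / p3_fsb:.2f}x")
```

Under these assumptions dual channel RDRAM lines up with the FSB, while PC133 SDRAM can feed only about a third of it, which fits with the SDRAM chipset taking such a big hit.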
The Hyper-pipelined technology is the
most controversial part of the P4. Since the pipeline in the P4
is 20 stages long versus the P3's 10 stages, an instruction
has a lot further to travel before it is completed.
What Intel has basically done is trade IPC (Instructions Per Clock) for raw MHz.
Because of the performance penalty of longer
pipelines, the Pentium 4 relies heavily on branch prediction. What the processor does is
guess which way an upcoming branch in the code will go and start working on those
instructions before the outcome is known.
With a 94% success rate (according to Intel), this hides the problem with
longer pipelines most of the time. When the processor predicts wrong, however,
the whole pipeline must be flushed clean and refilled, and because the pipeline
is deeper the performance penalty for that is relatively high.
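A very simplified model gives a feel for why a wrong guess hurts a deeper pipeline more. The 94% accuracy and the 20 and 10 stage depths come from the text; everything else is an assumption for illustration only: a base of 1.0 cycles per instruction, one instruction in five being a branch, a flush penalty equal to the pipeline depth, and the same prediction accuracy for both chips (used here just to isolate the effect of depth). Real processors are far more complicated than this.

```python
def effective_cpi(base_cpi, branch_fraction, accuracy, flush_penalty_cycles):
    """Average cycles per instruction once mispredicted branches are counted."""
    mispredicts_per_instruction = branch_fraction * (1.0 - accuracy)
    return base_cpi + mispredicts_per_instruction * flush_penalty_cycles

BASE_CPI = 1.0          # hypothetical
BRANCH_FRACTION = 0.20  # hypothetical: 1 in 5 instructions is a branch
ACCURACY = 0.94         # Intel's claimed success rate, per the review

deep = effective_cpi(BASE_CPI, BRANCH_FRACTION, ACCURACY, 20)     # 20 stage pipeline
shallow = effective_cpi(BASE_CPI, BRANCH_FRACTION, ACCURACY, 10)  # 10 stage pipeline

print(f"20 stage pipeline: {deep:.2f} cycles per instruction")
print(f"10 stage pipeline: {shallow:.2f} cycles per instruction")
print(f"Extra cost of the deeper pipeline: {(deep / shallow - 1) * 100:.1f}%")
```

With these made-up inputs the deeper pipeline needs roughly 10% more cycles for the same work, which it has to win back through clock speed.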
One of the biggest things that sticks out when
viewing the specs of the Pentium 4 is its extremely small L1 cache. Because
of its small 8KB size, it has an extremely low latency of 2 cycles, compared to
3 cycles for the Athlon's 64KB L1 cache (50% higher). However, as you'll see in the
benchmarks, the lower latency doesn't do much to make up for the small
size.
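The trade-off between a small, fast cache and a larger, slower one can be sketched with the usual average access time formula: hit time plus miss rate times miss penalty. The latencies of 2 and 3 cycles are from the text; the 10 cycle miss penalty and the miss rates are purely hypothetical placeholders, not measured figures for either CPU.

```python
def average_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average L1 access time: hit latency plus the amortized cost of misses."""
    return hit_cycles + miss_rate * miss_penalty_cycles

def break_even_extra_miss_rate(latency_saving_cycles, miss_penalty_cycles):
    """Extra miss rate at which a latency advantage is completely cancelled."""
    return latency_saving_cycles / miss_penalty_cycles

MISS_PENALTY = 10  # hypothetical cost in cycles of going to the next cache level

# The small cache saves 3 - 2 = 1 cycle on every hit.
extra = break_even_extra_miss_rate(3 - 2, MISS_PENALTY)
print(f"Break-even extra miss rate: {extra:.0%}")

# Hypothetical miss rates: the small cache misses 'extra' more often.
small_fast = average_access_cycles(2, 0.02 + extra, MISS_PENALTY)
large_slow = average_access_cycles(3, 0.02, MISS_PENALTY)
print(f"small/fast: {small_fast:.2f} cycles, large/slow: {large_slow:.2f} cycles")
```

In other words, the faster the small cache has to go to the next level, the sooner its one cycle advantage evaporates.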
SSE2 is very
important to the Pentium 4 and will go a long way in determining how
well the CPU does. SSE2 adds 144 new instructions that can take over
for the FPU (Floating Point Unit); when developers use SSE2, it (like its
predecessor SSE) allows for accelerated floating-point operations. The problem, however, is that current
applications have to be rewritten to take advantage of SSE2; otherwise they won't benefit
from it.
Intel doesn't really have to worry though. Software
developers were quick to adopt SSE, and their arch rival AMD has seen the
value of SSE2 and actually licensed it from Intel for their Hammer CPUs.
SSE2 is here to stay.
On the right of each picture is a Socket 423 processor, and on the left of
each picture is the Socket m478 version.
When the 42 million transistor P4 "Willamette" was originally released, it
used the Socket 423 format. However, everyone knew it would be short lived
because its replacement, the Socket m478, was already on the horizon. The "Northwood" P4s built on 0.13 micron technology
will only be produced in Socket m478 packaging, so if you are buying a P4 system
now, you'd be wisest going Socket m478 for upgradeability and peace of mind.
Still using OGA (Organic Grid Array) technology,
one of the first things you'll notice is the rather large heat spreader on top
of the CPU. Intel calls this FC-PGA2. Under the spreader is the flip chip
core, and what the spreader does is protect the chip from damage. Intel really
hasn't had the same problem as AMD and their weaker cores. My trusty P3 550E has
a ton of cracks on it but still runs at 850 MHz like a charm! =)
On that note, we were quite disappointed with the
overclockability of our 1.5 GHz test sample. It would only reach 1.64 GHz in our
test board, the Shuttle AV45GTR. Of course, the board didn't have CPU
voltage controls, which made things a little more difficult.
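For anyone curious what that overclock works out to, here is the arithmetic as a quick sketch. The 1.5 GHz, 1.64 GHz and quad pumped 100 MHz FSB figures are from this review; the multiplier is simply derived from them, and the assumption (not stated above) is that the overclock was reached by raising the FSB with the multiplier left alone.

```python
STOCK_CORE_MHZ = 1500        # 1.5 GHz test sample
STOCK_FSB_MHZ = 100          # base FSB clock before quad pumping
OVERCLOCKED_CORE_MHZ = 1640  # best result on the test board

# Derived values (assuming a fixed multiplier and an FSB-only overclock)
multiplier = STOCK_CORE_MHZ / STOCK_FSB_MHZ
oc_fsb_mhz = OVERCLOCKED_CORE_MHZ / multiplier
oc_quad_pumped_mhz = oc_fsb_mhz * 4
gain_pct = (OVERCLOCKED_CORE_MHZ / STOCK_CORE_MHZ - 1) * 100

print(f"Multiplier:        {multiplier:.0f}x")
print(f"Overclocked FSB:   {oc_fsb_mhz:.1f} MHz ({oc_quad_pumped_mhz:.0f} MHz quad pumped)")
print(f"Clock speed gain:  {gain_pct:.1f}%")
```

That is a gain of a little over 9%, which is why we came away disappointed.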