In HP: Last Itanium man Standing; Nehalem Lives the Dream (http://www.theregister.co.uk/2010/04/26/itanium_hp_last_standing) Timothy Prickett Morgan wrote, “Without the threat of Itanium, which was never really fulfilled, perhaps IBM would have never knuckled down and put some money into decent Power chip development, which allowed the company to go from joke to dominance in the Unix server racket.” While the title of his article may be somewhat accurate, this and some of its other claims need to be addressed.
The Itanium was developed to address what was perceived at the time to be a classic “technology wall”. From the late 1980s into the mid-1990s, some CPU architects assumed that RISC processor technology was running out of intrinsic processing capability. (Ironically, RISC was developed to address an earlier “technology wall”.) It appeared that RISC compilers exposed more instruction-level parallelism (ILP) than 1980s-vintage RISC processors could exploit, leaving on the table independent operations that could otherwise have been executed in parallel. It was incorrectly assumed that RISC technology could not mature. It did mature, through capabilities such as instruction pipelining, superscalar execution, out-of-order execution, register renaming, speculative execution, and advanced branch prediction. These capabilities, combined with ever-advancing fabrication technology, allowed RISC processors to overcome all their earlier perceived limitations. CISC processors have also adopted such capabilities.
The Itanium's Explicitly Parallel Instruction Computing (EPIC) architecture used Very Long Instruction Word (VLIW) technology -- the same approach used in the Elbrus, the last Soviet supercomputer of the late 1980s -- and included many more execution units than could ever be effectively used. The Itanium had so many execution units (both integer and floating point) that it was actually designed to execute instructions down both sides of a conditional branch simultaneously while the branch condition was being evaluated. Relative to RISC processors, its 33% longer instruction word forced the adoption of larger caches, created code bloat, and demanded higher data bandwidth.
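As a rough illustration of the two points above, the following Python sketch models the idea of running both sides of a branch and keeping only one result, and then checks the 33% figure. It is not Itanium code: the function names are hypothetical, and the arithmetic assumes the IA-64 layout of three instruction slots per 128-bit bundle compared against a 32-bit RISC instruction.

```python
# Illustrative model only; the function names here are hypothetical.

def branchy(a: int, b: int) -> int:
    """Conventional control flow: only one side of the branch runs."""
    if a > b:
        return a - b
    return b - a


def both_sides(a: int, b: int) -> int:
    """Both sides are computed up front; the condition selects the survivor."""
    p = a > b            # the branch condition, evaluated alongside the work
    taken = a - b        # result from the "then" side
    not_taken = b - a    # result from the "else" side
    return taken if p else not_taken


# Instruction-word arithmetic (assumption: 128-bit bundle, 3 slots, 32-bit RISC).
BUNDLE_BITS = 128
SLOTS_PER_BUNDLE = 3
RISC_INSTRUCTION_BITS = 32

bits_per_instruction = BUNDLE_BITS / SLOTS_PER_BUNDLE
growth = bits_per_instruction / RISC_INSTRUCTION_BITS - 1.0

if __name__ == "__main__":
    print(both_sides(7, 3), branchy(7, 3))
    print(f"{growth:.1%} more bits per instruction")
```

The second function always does twice the arithmetic, which is only a win when spare execution units would otherwise sit idle; that is the trade the paragraph above describes.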
Intel and HP actually overshot the ability of compilers to extract enough ILP for the Itanium to execute code effectively. There is only so much ILP inherent in human-scribed source code. Because of the complexity of Intel's EPIC architecture, it could not be run at a state-of-the-art clock rate. The Merced (the first Itanium) was introduced at 733 MHz in 2001, when most state-of-the-art processors were running at 1 GHz or more. Almost a decade after the Merced was released, today’s Itanium operates at only 1.7 GHz. This is in stark contrast with IBM's POWER, which hit 5 GHz several years ago. Had the Merced been available in the late 1990s, things might have been radically different.
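The limit on extractable ILP can be sketched with a toy dependency graph. The Python below is a minimal illustration under stated assumptions: the five operations and their dependencies are made up, every operation is assumed to take one cycle, and the machine is assumed to have unlimited execution units. Under those assumptions, the best possible ILP is the operation count divided by the length of the longest dependency chain.

```python
from functools import lru_cache

# Hypothetical dependency graph: each operation lists the results it needs.
DEPS = {
    "t1": [],            # t1 = a + b
    "t2": [],            # t2 = c + d
    "t3": ["t1", "t2"],  # t3 = t1 * t2
    "t4": ["t3"],        # t4 = t3 - e
    "t5": ["t4"],        # t5 = t4 >> 1
}


def critical_path(deps):
    """Length of the longest dependency chain, assuming unit latency."""

    @lru_cache(maxsize=None)
    def depth(op):
        # An operation can start only after its deepest input has finished.
        return 1 + max((depth(d) for d in deps[op]), default=0)

    return max(depth(op) for op in deps)


def ideal_ilp(deps):
    """Upper bound on operations per cycle with unlimited execution units."""
    return len(deps) / critical_path(deps)


if __name__ == "__main__":
    print(critical_path(DEPS), ideal_ilp(DEPS))
```

In this toy case only t1 and t2 can run side by side; everything else waits on an earlier result, so extra execution units beyond the second add nothing.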
Interestingly enough, today's Itanium uses hyperthreading (two simultaneous threads per core), which helps fill otherwise empty execution units. Even so, the Itanium's benchmark results are on the order of 0.5X those of the Nehalem EX. On the RISC side, current benchmarks of IBM's POWER7, with eight cores and four threads per core, demonstrate a 4X performance advantage over the Itanium.
To even hint that the Itanium was the motivating force behind IBM's “joke to dominance” rise in the RISC UNIX marketplace ignores advances in process technology, intelligent architecture, and IBM's ability to model and simulate the dynamics of code compilation and execution. In the very late 1990s, Sun Microsystems spent millions of dollars developing a VLIW processor similar to the Itanium. That chip was called the MAJC. Its performance was so poor that Solaris wasn't even ported to it. The MAJC ended up as a high-end graphics processor and was eventually dropped as another failure.
Morgan also suggests, “And without Intel relegating 64-bit processing to Itaniums and leaving Xeons to 32-bits, there would not have been a gap in which Advanced Micro Devices could leap and create the Opterons, which are the inspiration for the Nehalem family of processors that have put Intel back in the driver's seat when it comes to server CPUs.”
Intel clearly learned lessons from the Itanium, as seen in such places as micro-op fusion on the Nehalem, but high-volume 64-bit computing was on its way with or without the existence or demise of the Itanium. Intel clearly stumbled by staying with very long pipeline Xeons, leaving AMD's Opteron and its 64-bit support to fill that void.
To say that the Opteron “inspired” the Nehalem as the Itanium enters the coffin is a rather wide leap. The Itanium was simply too little, too late and should be afforded a proper burial. Of more interest would be to imagine a 3.5 GHz HP PA-RISC -- a processor the Itanium really did eliminate.