Friday, May 9, 2008

Future Turion Ultra


The Turion Ultra (codenamed Griffin) is AMD's first processor family designed solely for the mobile platform. It is based on the Athlon 64 (K8) architecture, with architectural enhancements similar to those in upcoming Opteron processors, aimed at lower power consumption and longer battery life.

Features:

The Turion Ultra is a dual-core processor to be fabricated on 65 nm technology using 300 mm SOI wafers. It will support DDR2-800 SO-DIMMs and features a DRAM prefetcher to improve performance and a mobile-enhanced northbridge (memory controller, HyperTransport controller, and crossbar switch). Each processor core comes with 1 MiB of L2 cache, for a total of 2 MiB of L2 cache for the entire processor. This is double the L2 cache found on the current Turion 64 X2 processor. Clock rates will range from 2.0 GHz to 2.4 GHz, and thermal design power (TDP) will range from 32 watts to 35 watts.[1]

A notable feature of the Turion Ultra processor is that it implements three voltage planes: one for the northbridge and one for each core.[2] This, along with multiple phase-locked loops (PLLs), allows each core to alter its voltage and operating frequency independently of the other core and of the northbridge. In a matter of microseconds, the processor can switch to one of 8 frequency levels and one of 5 voltage levels. By adjusting frequency and voltage during use, the processor can adapt to different workloads and reduce power consumption. It can operate as low as 250 MHz to conserve power during light use.
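To illustrate why lowering both voltage and frequency saves so much power, here is a minimal sketch based on the standard approximation that dynamic CMOS power scales with V² × f. The function name and the two operating points are hypothetical; in particular, the voltages are illustrative assumptions, not AMD specifications, and leakage power is ignored.

```python
def relative_dynamic_power(freq_mhz, volts, ref_freq_mhz, ref_volts):
    """Ratio of dynamic power at one operating point to a reference point.

    Uses the common approximation P_dynamic ~ C * V^2 * f, with the
    switched capacitance C assumed equal at both points.
    """
    return (volts / ref_volts) ** 2 * (freq_mhz / ref_freq_mhz)


# Hypothetical operating points: (frequency in MHz, core voltage in V).
# The frequencies come from the text; the voltages are assumed for illustration.
FULL_SPEED = (2400, 1.10)
LIGHT_LOAD = (250, 0.80)

ratio = relative_dynamic_power(*LIGHT_LOAD, *FULL_SPEED)
print(f"Light-load dynamic power relative to full speed: {ratio:.1%}")
```

Because voltage enters squared, a modest voltage drop combined with a large frequency drop cuts dynamic power far more than frequency scaling alone would.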

Additionally, the processor features deep sleep state C3, deeper sleep state C4 (AltVID), and HyperTransport 3.0 at up to 2.6 GHz, which gives up to 10.4 GB/s in each direction (20.8 GB/s aggregate) per link at 16-bit link width. The HT link width can be scaled dynamically down to 0-bit ("disconnected") in both directions, to and from the chipset, for four different usage scenarios.[3] The processor also implements multiple on-die thermal sensors through an integrated SMBus (SB-TSI) interface, which replaces the separate SMBus thermal monitor chip used by its predecessors. An additional MEMHOT signal, sent from the embedded controller to the processor, is used to reduce memory temperature.
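The link bandwidth arithmetic can be checked with a short sketch. It assumes double-data-rate signaling (two transfers per clock), which is how HyperTransport links operate, and decimal gigabytes; the function name is hypothetical. Under these assumptions a 16-bit link at 2.6 GHz carries 10.4 GB/s in each direction, while the 41.6 GB/s figure commonly quoted for HyperTransport 3.0 corresponds to a 32-bit link counted in both directions.

```python
def ht_bandwidth_gb_s(clock_ghz, link_width_bits, bidirectional=False):
    """Peak HyperTransport throughput in GB/s (1 GB = 10^9 bytes).

    Assumes double data rate: two transfers per clock cycle per lane.
    """
    transfers_per_ns = clock_ghz * 2                 # giga-transfers per second
    per_direction = transfers_per_ns * link_width_bits / 8
    return per_direction * 2 if bidirectional else per_direction


for width in (32, 16, 8, 0):                         # 0 bits models a disconnected link
    one_way = ht_bandwidth_gb_s(2.6, width)
    both = ht_bandwidth_gb_s(2.6, width, bidirectional=True)
    print(f"{width:2d}-bit link: {one_way:5.1f} GB/s per direction, {both:5.1f} GB/s aggregate")
```

The loop also shows why narrowing the link saves power at the cost of throughput: peak bandwidth falls in direct proportion to the active link width.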

The Turion Ultra processor will use the same Socket S1 as its predecessor (Turion 64 X2) but will not have the same pinout.[4] It is designed to work with the RS780M chipset.

It is worth noting that, despite the above architectural enhancements, the cores themselves were minimally modified and are based on the K8 rather than the K10 microarchitecture.[4] AMD Fellow Maurice Steinman has said the cores are almost transistor-for-transistor identical to those found in the 65 nm Turion 64 X2 processors. This makes it more likely that the Turion Ultra will avoid the clock rate scaling difficulties present in AMD's K10 products.

Availability:

The Turion Ultra processor is still under development and is expected to be released as part of the "Puma" mobile platform in the second quarter of 2008. AMD executives have said that the Puma platform has already secured over 100 design wins, more than any other new product in AMD's history.

Future products

The Athlon 64 line is expected to continue to evolve. In particular, new models scheduled to be launched in the third quarter of 2007 are to be based on the "K10" microarchitecture. The initial offerings are expected to be based on the Agena (quad-core, 2 MiB L3 cache), Agena FX and Kuma (dual-core, 2 MiB L3 cache) cores. These processors will be packaged in the Socket F+ (Agena FX) and Socket AM2+ form factors, but are expected to function in Socket AM2 motherboards as well, with the loss of HyperTransport 3.0 enhancements, which will only be available with Socket AM2+ motherboards. Processor model information has been reported as follows:[60]

According to the most recent report, the range of K10 desktop microprocessors will no longer use the "Athlon" trademark, but will instead carry the new name "Phenom". Subsequent models will use the name "Phenom X2" for dual-core variants, "Phenom X4" for quad-core variants, and "Phenom FX" to replace the current Athlon 64 FX.


Athlon 64 models

Clawhammer (130 nm SOI)

Newcastle (130 nm SOI)

Also possible: ClawHammer-512 (Clawhammer with partially disabled L2-Cache)

Winchester (90 nm SOI)

Venice (90 nm SOI)

San Diego (90 nm SOI)

Orleans (90 nm SOI)

Lima (65 nm SOI)

Athlon 64 FX models

Sledgehammer (130 nm SOI)

  • CPU-Stepping: C0, CG
  • L1-Cache: 64 + 64 KiB (Data + Instructions)
  • L2-Cache: 1024 KiB, fullspeed
  • MMX, Extended 3DNow!, SSE, SSE2, AMD64
  • Socket 940, 800 MHz HyperTransport (HT800)
  • Registered DDR-SDRAM required
  • VCore: 1.50/1.55 V
  • Power Consumption (TDP): 89 Watt max
  • First Release: September 23, 2003
  • Clockrate: 2200 MHz (FX-51, C0), 2400 MHz (FX-53, C0 and CG)

Clawhammer (130 nm SOI)

  • CPU-Stepping: CG
  • L1-Cache: 64 + 64 KiB (Data + Instructions)
  • L2-Cache: 1024 KiB, fullspeed
  • MMX, Extended 3DNow!, SSE, SSE2, AMD64
  • Socket 939, 1000 MHz HyperTransport (HT1000)
  • VCore: 1.50 V
  • Power Consumption (TDP): 89 Watt (FX-55: 104 Watt)
  • First Release: June 1, 2004
  • Clockrate: 2400 MHz (FX-53), 2600 MHz (FX-55)

San Diego (90 nm SOI)

Toledo (90 nm SOI)

Dual-core CPU

Windsor (90 nm SOI)

Dual-core CPU

Windsor (90 nm SOI) - Quad FX platform

Dual-core, dual CPUs (four cores total)

Mobile Athlon 64

A line for mobile computing.

Sockets


At the introduction of the Athlon 64 in September 2003, only Socket 754 and Socket 940 (Opteron) were ready and available. The onboard memory controller was not capable of running unbuffered (non-registered) memory in dual-channel mode at the time of release. As a stopgap measure, AMD introduced the Athlon 64 on Socket 754 and brought out a non-multiprocessor version of the Opteron, called the Athlon 64 FX, as a multiplier-unlocked enthusiast part for Socket 940, comparable to Intel's Pentium 4 Extreme Edition for the high-end market.

In June 2004, AMD released Socket 939 as the mainstream Athlon 64 platform with a dual-channel memory interface, leaving Socket 940 solely for the server market (Opterons) and relegating Socket 754 to a value/budget line for Semprons and slower versions of the Athlon 64. Eventually Socket 754 replaced Socket A for Semprons.

In May 2006, AMD released Socket AM2, which provided support for the DDR2 memory interface. This also marked the release of AMD's virtualization technology.

In August 2006, AMD released Socket F for Opteron server CPUs, which uses the LGA chip form factor.

In November 2006, AMD released a specialized version of Socket F, called 1207 FX, for dual-socket, dual-core Athlon 64 FX processors on the Quad FX platform. While Socket F Opterons already allowed for four processor cores, Quad FX allowed unbuffered RAM and expanded CPU/chipset configuration in the BIOS. Consequently, Socket F and Socket F 1207 FX are incompatible and require different processors, chipsets, and motherboards.

Athlon 64 X2

The Athlon 64 X2 is the first dual-core desktop CPU manufactured by AMD. It is essentially a processor consisting of two Athlon 64 cores joined together on one die with additional control logic. The cores share one dual-channel memory controller, are based on the E-stepping model of the Athlon 64 and, depending on the model, have either 512 or 1024 KiB of L2 cache per core. The Athlon 64 X2 is capable of decoding SSE3 instructions (except those few specific to Intel's architecture), so it can run and benefit from software optimizations that were previously supported only by Intel chips. This enhancement is not unique to the X2; it is also available in the Venice and San Diego single-core Athlon 64s.

In June 2007, AMD released low-voltage variants of its low-end 65 nm Athlon 64 X2, named "Athlon X2".[1] The Athlon X2 processors feature a reduced TDP of 45 W.[2]


The main benefit of dual-core processors like the X2 is their ability to process more software threads at the same time. The ability of processors to execute multiple threads simultaneously is called thread-level parallelism (TLP). By placing two cores on the same die, the X2 effectively doubles the TLP over a single-core Athlon 64 of the same speed. How much this matters depends heavily on the workload: some situations benefit from TLP far more than others. Many programs are currently written with only one thread and are therefore unable to use the processing power of the second core.

Programs that are often written with multiple threads and can use both cores include many music and video encoding applications and, especially, professional rendering programs. High-TLP applications currently correspond to server and workstation workloads more than to the typical desktop. These applications can realize almost twice the performance of a single-core Athlon 64 of the same specifications. Multi-tasking also runs a sizable number of threads; intense multi-tasking scenarios have actually shown improvements of considerably more than two times [2]. This is primarily because a single core incurs excessive overhead from constantly switching between threads, overhead that could potentially be reduced by adjustments to operating system scheduling code.

In the consumer segment of the market as well, the X2 improves upon the performance of the original Athlon 64, especially for multi-threaded software applications. The overall increase in performance of the entry-level Athlon 64 X2 chip (the Athlon 64 X2 3800+) over the single-core Athlon 64 3800+ chip is almost 10%. The spread between the latter and the Athlon 64 X2 5000+ is almost 40% [3]. One can infer from these numbers that the majority of applications (at least in the benchmark test) are still largely single-thread dominated, hence the absence of a larger gap between the two 3800+ processors. As software programmers begin to take advantage of multi-core processing, the spread between single- and multi-core processors will increase.
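One way to see why largely single-threaded software gains little from a second core is Amdahl's law, a standard model that is not part of the benchmark cited above. The sketch below is illustrative only: the function name is hypothetical, the parallel fractions are made-up examples, and the model ignores clock speed differences, cache effects, and scheduling overhead.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup when only part of a program can run in parallel.

    parallel_fraction: share of single-core run time that can be split
    evenly across cores (between 0.0 and 1.0).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative parallel fractions, evaluated for a dual-core processor.
for fraction in (0.0, 0.2, 0.5, 0.9, 1.0):
    print(f"parallel fraction {fraction:.0%}: "
          f"{amdahl_speedup(fraction, cores=2):.2f}x on two cores")
```

Only a fully parallel workload reaches the ideal factor of two; a program that spends most of its time in a single thread sees a gain much closer to one.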


Manufacturing costs:


Having two cores, the Athlon 64 X2 has an increased number of transistors. The 1-MiB-L2-cache 90 nm Athlon 64 X2 processor is 219 mm² in size with 243 million transistors [3], whereas its 1-MiB-L2-cache 90 nm Athlon 64 counterpart is 103.1 mm² and has 164 million transistors [4]. The 65 nm Athlon 64 X2, with only 512 KiB of L2 cache per core, reduced this to 118 mm² with 221 million transistors, compared to the 65 nm Athlon 64 with 77.2 mm² and 122 million transistors. In each case the dual-core die is larger, so a larger area of silicon must be defect free. These size requirements demand a more complex fabrication process, which further reduces the number of functional processors obtained from a single silicon wafer. This lower yield makes the X2 more expensive to produce than the single-core processor.
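The effect of die area on yield can be sketched with a simple Poisson defect model and a common dies-per-wafer approximation. Everything beyond the two die areas quoted above is an assumption made for illustration: the 300 mm wafer size, the defect density of 0.5 defects per cm², and the function names are hypothetical and do not describe AMD's actual process.

```python
import math


def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Approximate gross die count, with a correction for edge loss."""
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a Poisson defect model."""
    return math.exp(-(die_area_mm2 / 100.0) * defects_per_cm2)


WAFER_MM = 300          # assumed wafer diameter
DEFECT_DENSITY = 0.5    # assumed defects per cm^2, purely illustrative

for name, area in (("Athlon 64 X2, 90 nm", 219.0), ("Athlon 64, 90 nm", 103.1)):
    gross = dies_per_wafer(WAFER_MM, area)
    good = gross * poisson_yield(area, DEFECT_DENSITY)
    print(f"{name}: {gross} dies per wafer, about {good:.0f} defect free")
```

Under these assumptions the larger die loses twice: fewer candidates fit on each wafer, and a smaller fraction of them is free of defects.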

In the middle of June 2006, AMD stated that it would no longer make any non-FX Athlon 64 or Athlon 64 X2 models with 1 MiB L2 caches [4]. As a result, only a small number of Socket AM2 Athlon 64 X2 processors with 1 MiB of L2 cache per core were produced; these are known as the 4000+, 4400+, 4800+, and 5200+. The Athlon 64 X2 models with 512 KiB per core, known as the 3800+, 4200+, 4600+, and 5000+, were produced in far greater numbers. The introduction of the F3 stepping then saw several models with 1 MiB of L2 cache per core, as production refinements resulted in an increased yield.

Athlon 64 FX

The Athlon 64 FX is positioned as a hardware enthusiast product, marketed by AMD especially toward gamers.[55] Unlike the standard Athlon 64, all of the Athlon 64 FX processors have their multipliers completely unlocked.[56] The FX line is now dual-core, starting with the FX-60.[57] The FX always has the highest clock speed of all Athlons at its release.[58] From the FX-70 onwards, the processors also support a dual-processor setup with NUMA, called the AMD Quad FX platform.