The Athlon 64 X2 is the first dual-core desktop CPU manufactured by AMD. It is essentially two Athlon 64 cores joined on a single die with additional control logic. The cores share one dual-channel memory controller, are based on the E-stepping model of the Athlon 64 and, depending on the model, have either 512 or 1024 KiB of L2 cache per core. The Athlon 64 X2 can decode SSE3 instructions (except the few specific to Intel's architecture), so it can run and benefit from software optimizations that were previously supported only by Intel chips. This enhancement is not unique to the X2; it is also available in the Venice and San Diego single-core Athlon 64s.
In June 2007, AMD released low-voltage variants of its low-end 65 nm Athlon 64 X2, named "Athlon X2".[1] The Athlon X2 processors feature a reduced TDP of 45 W.[2]
The main benefit of dual-core processors like the X2 is their ability to process more software threads at the same time. The ability of a processor to execute multiple threads simultaneously is called thread-level parallelism (TLP). By placing two cores on the same die, the X2 effectively doubles the TLP of a single-core Athlon 64 of the same speed. How much this extra TLP helps depends heavily on the workload, and some workloads benefit far more than others. Some programs are currently written with only one thread and are therefore unable to use the processing power of the second core.
Programs that are often written with multiple threads, and can therefore make use of both cores, include many music and video encoding applications and especially professional rendering programs. High-TLP applications currently correspond to server and workstation workloads more than to the typical desktop. These applications can realize almost twice the performance of a single-core Athlon 64 of the same specifications. Multi-tasking also runs a sizable number of threads; intense multi-tasking scenarios have actually shown improvements of considerably more than two times [2]. This is primarily because a single core incurs excessive overhead from constantly switching between threads, overhead that could potentially be reduced by adjustments to operating system scheduling code.
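The "almost twice" ceiling for well-threaded programs can be illustrated with Amdahl's law, a standard model that is not taken from the cited benchmarks. The sketch below assumes a program whose work splits cleanly into a serial part and a perfectly parallel part; the function name and the sample fractions are illustrative only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores under Amdahl's law.

    parallel_fraction is the share of single-core run time that can be
    spread evenly across cores; the remainder runs on one core only.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workloads: single-threaded, half parallel, mostly parallel,
# fully parallel. None of these fractions are measured values.
for p in (0.0, 0.5, 0.95, 1.0):
    print(f"parallel fraction {p:.2f}: {amdahl_speedup(p, 2):.2f}x on two cores")
```

Under this model a single-threaded program gains nothing from the second core, and only a fully parallel one reaches 2x. The model cannot produce the better-than-2x multi-tasking results mentioned above, because it ignores the thread-switching overhead that a single core pays.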
In the consumer segment of the market, too, the X2 improves on the performance of the original Athlon 64, especially for multi-threaded software. The overall performance increase of the entry-level Athlon 64 X2 chip (the Athlon 64 X2 3800+) over the single-core Athlon 64 3800+ is almost 10%. The spread between the latter and the Athlon 64 X2 5000+ is almost 40% [3]. One can infer from these numbers that the majority of applications (at least in the benchmark test) are still largely single-threaded, hence the absence of a larger gap between the two 3800+ processors. As software programmers begin to take advantage of multi-core processing, the spread between single- and multi-core processors will increase.
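As a rough back-of-the-envelope check on that inference, Amdahl's law can be inverted to ask what share of the benchmark workload would have to run in parallel to produce a given two-core speedup. This is only a sketch: it treats the quoted figure of almost 10% as a pure core-count effect at equal per-core speed, which the benchmark may not satisfy, and the function name is hypothetical.

```python
def implied_parallel_fraction(speedup: float, cores: int = 2) -> float:
    """Parallel share of the workload implied by an observed speedup.

    Solves speedup = 1 / ((1 - p) + p / cores) for p. Valid only for
    1 <= speedup <= cores, the range Amdahl's law can produce.
    """
    if cores < 2:
        raise ValueError("cores must be at least 2")
    if not 1.0 <= speedup <= cores:
        raise ValueError("speedup must lie between 1 and the core count")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


# Assumption: the "almost 10%" gain is modelled as a speedup of exactly 1.10.
p = implied_parallel_fraction(1.10)
print(f"implied parallel share of the workload: {p:.0%}")
```

Under these assumptions, less than a fifth of the benchmark's run time would be benefiting from the second core, which is consistent with the reading that the tested applications are largely single-threaded.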
Manufacturing costs:
With two cores, the Athlon 64 X2 has more transistors than its single-core counterpart. The 90 nm Athlon 64 X2 with 1 MiB of L2 cache per core measures 219 mm² and has 243 million transistors [3], whereas the 90 nm Athlon 64 with 1 MiB of L2 cache measures 103.1 mm² and has 164 million transistors [4]. The 65 nm Athlon 64 X2, with only 512 KiB of L2 cache per core, reduced this to 118 mm² with 221 million transistors, compared to 77.2 mm² and 122 million transistors for the 65 nm Athlon 64. Because the dual-core die is larger, a larger area of silicon must be defect-free for each working processor, and fewer dies fit on each wafer. The larger die also makes fabrication more demanding, which further reduces the number of functional processors obtained per silicon wafer. This lower yield makes the X2 more expensive to produce than the single-core processor.
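The effect of die area on yield can be sketched with the simple Poisson yield model, in which the chance that a die has no defects falls exponentially with its area. The die areas below are the 90 nm figures quoted above; the defect density and the wafer diameter are placeholder assumptions chosen for illustration, not AMD data, and the dies-per-wafer estimate ignores edge loss and scribe lines.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Expected fraction of dies with zero defects (Poisson model)."""
    area_cm2 = die_area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    return math.exp(-area_cm2 * defects_per_cm2)


def good_dies_per_wafer(die_area_mm2: float, defects_per_cm2: float,
                        wafer_diameter_mm: float = 300.0) -> float:
    """Rough count of defect-free dies per wafer.

    Divides wafer area by die area (no edge loss, no scribe lines),
    then scales by the Poisson yield. The 300 mm default is an assumption.
    """
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2.0) ** 2
    gross_dies = wafer_area_mm2 / die_area_mm2
    return gross_dies * poisson_yield(die_area_mm2, defects_per_cm2)


ASSUMED_DEFECTS_PER_CM2 = 0.5  # hypothetical value, for illustration only

dual_core = good_dies_per_wafer(219.0, ASSUMED_DEFECTS_PER_CM2)
single_core = good_dies_per_wafer(103.1, ASSUMED_DEFECTS_PER_CM2)
print(f"good dual-core dies per wafer:   {dual_core:.0f}")
print(f"good single-core dies per wafer: {single_core:.0f}")
print(f"single-core advantage: {single_core / dual_core:.2f}x")
```

Whatever defect density is assumed, the model penalizes the larger die twice: fewer dies fit on the wafer, and each one is less likely to be defect-free. The single-core advantage in good dies therefore exceeds the plain area ratio of 219 to 103.1.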
In mid-June 2006, AMD stated that it would no longer make any non-FX Athlon 64 or Athlon 64 X2 models with 1 MiB L2 caches [4]. As a result, the Socket AM2 Athlon 64 X2 models with 1 MiB of L2 cache per core, known as the 4000+, 4400+, 4800+, and 5200+, were produced only in small numbers. The Athlon 64 X2 models with 512 KiB per core, known as the 3800+, 4200+, 4600+, and 5000+, were produced in far greater numbers. The introduction of the F3 stepping then brought several models with 1 MiB of L2 cache per core, as production refinements resulted in an increased yield.