x86 architecture
From Wikipedia, the free encyclopedia
x86 or 80x86 is the generic name of a microprocessor architecture first developed and manufactured by Intel. It currently dominates the desktop, portable and small server markets, and has been used in personal computers since the IBM PC of the 1980s. Although x86 started off as an extension of the simple eight-bit 8085 processor, it has nevertheless often been put into the same "CISC" category as System/360 and VAX.
Operating systems that run on the x86 architecture include versions of DOS, Microsoft Windows, Unix and Linux, among many others. Recently, Apple Inc. has begun using the x86 architecture in its Macintosh line of computers.
RISC architectures such as the SPARC and PowerPC designs have challenged x86 in the workstation and server markets as well as for personal computers, but none have yet supplanted x86 in its core markets.
The architecture is called x86 because the earliest processors in this family were identified by model numbers ending in the sequence "86": the 8086, the 80186, the 80286, the 386, the 486, the Nx586 and the 6x86. Because one cannot establish trademark rights on numbers, Intel and most of its competitors began to use trademark-acceptable names such as Pentium for subsequent generations of processors, but the earlier naming scheme remains as a term for the entire family.
As hardware has evolved, the architecture has twice been extended to a larger word size. In 1985, Intel released the 32-bit 386 to replace the 16-bit 286. The 32-bit architecture was called i386 and is now called x86-32 or IA-32 (an abbreviation for Intel Architecture, 32-bit). In 2003, AMD introduced the Athlon 64, which implemented a further extension of the architecture to 64 bits, variously called x86-64, AMD64 (AMD's branding), Intel 64 (Intel's branding, formerly EM64T or IA-32e) and x64 (Microsoft and Sun Microsystems' vendor-neutral naming convention). This 64-bit extension of x86 is not to be confused with the unrelated IA-64 architecture.
History
The x86 architecture first appeared as the Intel 8086 CPU in 1978; the 8086 was a development of the Intel 8085 processor (which itself followed the 8080 and the 8008*), and programs written in 8085 or 8080 assembly could be mechanically translated to equivalent programs in 8086 assembly language. Three years later, the slightly simpler 8088 (a version of the 8086 with an eight-bit data bus) was chosen as the main CPU for the IBM PC. The ubiquity of the PC platform has since made the x86 the most commercially successful CPU architecture ever (another widespread CPU design based on 8080 compatibility is the Zilog Z80 architecture).
Companies such as Cyrix, NEC Corporation, IBM, IDT and Transmeta have manufactured CPUs conforming to the x86 architecture. Clones first started appearing in quantity in 386- and 486-based PCs. The clone manufacturers also named their chips '386' and '486'. Wanting to differentiate its product, Intel applied for trademarks. However, the legal problems with trademarking a number led Intel to start including the word 'Intel' or the prefix 'i' alongside the processor generation designation, e.g. 'Intel386' or 'i386'. This enabled consumers to easily identify which computers contained an actual Intel CPU, and was often reaffirmed to the consumer by an Intel Inside sticker on the case of the computer base unit.
From its fifth-generation chips onwards, Intel adopted a text-based name for its CPUs. This name, 'Pentium', was derived from the Greek for 'five', pent-, and the Latin ending -ium. Because of this mix of language roots, the word could be considered a barbarism. Clone manufacturers such as Cyrix and IBM took advantage of the lack of an 'Intel586' to create CPUs of their own with a name that included the number '586', but these chips were comparatively unsuccessful.
Since the Pentium generation of CPUs, the most successful of the clone manufacturers is AMD, whose Athlon series, while not as popular as the Pentium or Intel Core 2 series[citation needed], has a significant market share.
Intel introduced the IA-64, a separate 64-bit architecture used in its Itanium processors and Itanium Processor Family (IPF). IA-64 is a completely new system that bears no resemblance to the x86 architecture and should not be confused with x86-64, which is the 64-bit extension of x86.
* The 8008 was basically an LSI-implementation of the Datapoint 2200 design.
Design
Technical overview
The x86 architecture is a variable instruction length, primarily two-address, "CISC" design with an emphasis on backward compatibility. The instruction set is not typical CISC, however, but basically an extended and orthogonalized version of the simple eight-bit 8085 architecture. Words are stored in little-endian order, and 16-bit and 32-bit accesses to unaligned memory addresses are allowed. To conserve opcode space, most register fields are three bits wide, and at most one operand can be in memory (in contrast with some highly orthogonal CISC designs, such as the PDP-11, where both operands can be in memory). This memory operand may also be the destination, while the other operand, the source, can be either a register or an immediate. This contributes, among other factors, to a code footprint that rivals that of 8-bit machines and enables efficient use of instruction cache memory. During execution, current x86 processors employ a few extra decoding steps to split most instructions into smaller pieces, micro-ops, which are readily executed by a micro-architecture that could be (simplistically) described as a RISC machine without the usual load/store limitations. The small number of general registers (also inherited from the 8085) has made register-relative addressing (using small immediate offsets) an important method of accessing operands, especially on the stack. Much work has therefore been invested in making such accesses as fast as register accesses, i.e. a one-cycle instruction throughput in most circumstances.
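As a rough illustration of the three-bit register fields, the Python sketch below splits one addressing byte into its parts. The byte layout (a two-bit mode field followed by two three-bit fields, commonly called the ModR/M byte) and the register numbering are not given in this article; they are assumptions based on the standard 16-bit encoding. The function name decode_modrm is made up for the example.

```python
# Register names for 16-bit operands, indexed by the 3-bit register field.
# (Assumed standard numbering; not stated in the article text.)
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

def decode_modrm(byte):
    """Split an addressing byte into mod (2 bits), reg (3 bits), r/m (3 bits)."""
    mod = (byte >> 6) & 0b11    # addressing mode; 3 means register-to-register
    reg = (byte >> 3) & 0b111   # one register operand
    rm = byte & 0b111           # the other operand (register or memory form)
    return mod, reg, rm

# 0xD8 is binary 11 011 000: mod=3, reg=3 (BX), r/m=0 (AX).
mod, reg, rm = decode_modrm(0xD8)
print(mod, REG16[reg], REG16[rm])
```

Because each field is only three bits wide, a single byte is enough to name two of the eight registers, which is one reason the encoding stays compact.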
Segmentation
Minicomputers during the late 1970s were running up against the 64 KiB limit of 16-bit addressing as memory became cheaper to install. Most minicomputer companies redesigned their processors to fully handle 32-bit addressing and data. The Intel 8086 instead adopted the much-criticized stopgap concept of segment registers, which effectively raised the memory address limit by 4 bits, from 16 bits (64 KiB) to 20 bits (1 MiB). Data and code could be managed within "near" 16-bit segments inside the larger 1 MiB address space, or a compiler could operate in a "far" mode using both segment and offset. While that limit would also prove to be too small by the mid-1980s, it was ideal for the emerging PC market, and made it very simple to translate software from the older 8080 and 8085 to the newer processor.
The original 8086 and 8088
The original Intel 8086 and 8088 have fourteen 16-bit registers. Four of them (AX, BX, CX, DX) are general registers (although each has an additional purpose; for example, only CX can be used as a counter with the loop instruction). Each can be accessed as two separate bytes (thus BX's high byte can be accessed as BH and its low byte as BL). Four segment registers (CS, DS, SS and ES) are used to form a memory address. There are two pointer registers: SP points to the top of the stack, and BP is used to point at some other place in the stack or elsewhere in memory (as an offset). Two registers (SI and DI) are for array indexing. The FLAGS register contains flags such as the carry flag, overflow flag and zero flag. Finally, the instruction pointer (IP) points to the current instruction.
The 8086 has 64 KiB of 8-bit (or alternatively 32 K-word of 16-bit) I/O space, and a 64 KiB (one segment) stack in memory supported by hardware. Only words (2 bytes) can be pushed to the stack. The stack grows downwards (toward numerically lower addresses), with its top pointed to by SS:SP. There are 256 interrupts, which can be invoked by both hardware and software. The interrupts can cascade, using the stack to store the return address.
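The byte-wise view of the general registers can be sketched in a few lines of Python. This is only a model of the arithmetic, not of the hardware; the helper names split_halves and join_halves are invented for the example.

```python
def split_halves(reg16):
    """Return the (high, low) bytes of a 16-bit register value, e.g. BX -> (BH, BL)."""
    return (reg16 >> 8) & 0xFF, reg16 & 0xFF

def join_halves(high, low):
    """Rebuild the 16-bit register value from its high and low bytes."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

bx = 0x1234
bh, bl = split_halves(bx)
print(hex(bh), hex(bl))            # 0x12 0x34
print(hex(join_halves(bh, 0xFF)))  # writing BL leaves BH unchanged: 0x12ff
```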
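The stack behaviour described here can be modelled with a small Python sketch. It assumes that a push first decreases SP by two and then stores the word in little-endian order at SS:SP, so that SS:SP always addresses the most recently pushed word. The class name Stack8086 and the flat bytearray standing in for memory are illustrative only.

```python
# 1 MiB of flat memory standing in for the 8086 address space (illustrative).
memory = bytearray(1 << 20)

class Stack8086:
    """Toy model of the word-only, downward-growing hardware stack."""

    def __init__(self, ss, sp):
        self.ss = ss  # stack segment register
        self.sp = sp  # stack pointer (offset within the stack segment)

    def _address(self, offset):
        # Segment shifted left by 4 bits plus offset, truncated to 20 bits.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF                        # grow downwards
        memory[self._address(self.sp)] = word & 0xFF            # low byte first
        memory[self._address(self.sp + 1)] = (word >> 8) & 0xFF

    def pop(self):
        low = memory[self._address(self.sp)]
        high = memory[self._address(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF                        # shrink upwards
        return (high << 8) | low

stack = Stack8086(ss=0x2000, sp=0x0100)
stack.push(0x1234)
print(hex(stack.sp))     # 0xfe
print(hex(stack.pop()))  # 0x1234
```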
Real mode
Real mode is an operating mode of 80286 and later x86-compatible CPUs. Real mode is characterized by a 20-bit segmented memory address space (meaning that only 1 MiB of memory can be addressed), direct software access to BIOS routines and peripheral hardware, and no concept of memory protection or multitasking at the hardware level. All x86 CPUs in the 80286 series and later start up in real mode at power-on; 80186 CPUs and earlier had only one operational mode, which is equivalent to real mode in later chips.
In real mode, memory access is segmented. This is done by shifting the segment address left by 4 bits and adding an offset to obtain a final 20-bit address. For example, if DS is A000h and SI is 5677h, DS:SI will point at the absolute address DS × 16 + SI = A5677h. Thus the total address space in real mode is 2^20 bytes, or 1 MiB, quite an impressive figure for 1978. All memory addresses consist of both a segment and an offset; every type of access (code, data, or stack) has a default segment register associated with it (for data the register is usually DS, for code it is CS, and for stack it is SS). For data accesses, the segment register can be explicitly specified (using a segment override prefix) to use any of the four segment registers.
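The address calculation can be written out as a short Python sketch. The function name linear_address is made up for the example, and truncating the result to 20 bits is an assumption that models the 20 address lines of the original 8086.

```python
def linear_address(segment, offset):
    """Real-mode address: segment shifted left by 4 bits, plus the 16-bit offset."""
    # Masking to 20 bits is an assumption modelling the 8086's 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF

# The worked example from the text: DS = A000h, SI = 5677h.
print(hex(linear_address(0xA000, 0x5677)))  # 0xa5677
```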
In this scheme, two different segment/offset pairs can point at a single absolute location. Thus, if DS is A111h and SI is 4567h, DS:SI will point at the same A5677h as above. With only four segment registers, it is also impossible to use more than four segments at once. CS and SS are vital for the correct functioning of the program, so only DS and ES can be used to point to data segments outside the program (or, more precisely, outside the currently executing segment of the program) or the stack. This scheme was intended as a compatibility measure with the Intel 8085.
The segmented nature can make programming and compiler design difficult because the use of near and far pointers affects performance. The introduction of bank switching schemes such as EEMS made programming even more complicated before the adoption of 32-bit addressing methods with later processors.
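To see how many segment/offset pairs share one absolute location, the following Python sketch enumerates them for a given address. It ignores address wraparound, and the generator name aliases is invented for the example.

```python
def aliases(address):
    """Yield every (segment, offset) pair with segment * 16 + offset == address.

    Wraparound above the top of the address space is ignored (an assumption).
    """
    highest_segment = min(address >> 4, 0xFFFF)
    for segment in range(highest_segment, -1, -1):
        offset = address - (segment << 4)
        if offset > 0xFFFF:  # offset no longer fits in 16 bits; stop searching
            break
        yield segment, offset

pairs = list(aliases(0xA5677))
print((0xA000, 0x5677) in pairs)  # True
print((0xA111, 0x4567) in pairs)  # True
print(len(pairs))                 # number of aliases for this address
```

Each step down in the segment value is compensated by adding 16 to the offset, which is why neighbouring segments overlap so heavily.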
16-bit protected mode
In addition to real mode, the Intel 80286 supports protected mode, expanding addressable physical memory to 16 MiB and addressable virtual memory to 1 GiB. This is done by using the segment registers only for storing an index into a segment table. There are two such tables, the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT), each holding up to 8192 segment descriptors, each segment giving access to 64 KiB of memory. The segment table provides a 24-bit base address, which is added to the desired offset to create an absolute address. Each segment can be assigned one of four ring levels used for hardware-based computer security.
Because real mode DOS programs may do direct hardware access or perform segment arithmetic, both incompatible with protected mode, an operating system (OS) is limited in its ability to run these applications as processes. To overcome these difficulties, Intel introduced the 80386 with virtual 8086 mode. While still subject to paging, it forms linear addresses in the same way as real mode and allows the OS to trap both I/O and memory access. By design, protected mode programs do not assume a relation between selector values and physical addresses.
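The sizes quoted above follow from simple arithmetic, sketched below in Python. The bit layout used to decode a selector (a 13-bit table index, one bit choosing between GDT and LDT, and a two-bit requested privilege level) is not spelled out in this article and is stated here as an assumption based on the usual 286 selector format; the function names are made up for the example.

```python
DESCRIPTORS_PER_TABLE = 8192   # 2**13 entries in each of the GDT and the LDT
SEGMENT_SIZE = 64 * 1024       # each segment covers up to 64 KiB

def decode_selector(selector):
    """Split a 16-bit selector into (index, table, privilege level).

    Assumed layout: bits 15..3 index, bit 2 table indicator, bits 1..0 level.
    """
    selector &= 0xFFFF
    index = selector >> 3
    table = "LDT" if selector & 0b100 else "GDT"
    level = selector & 0b11
    return index, table, level

def absolute_address(base, offset):
    """24-bit segment base plus 16-bit offset, kept within 16 MiB."""
    return (base + (offset & 0xFFFF)) & 0xFFFFFF

# Two tables of 8192 segments of 64 KiB each give the 1 GiB virtual space.
virtual_bytes = 2 * DESCRIPTORS_PER_TABLE * SEGMENT_SIZE
print(virtual_bytes == 1 << 30)       # True
print(decode_selector(0x002B))        # (5, 'GDT', 3)
print(hex(absolute_address(0x123456, 0x0010)))  # 0x123466
```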
Operating systems like OS/2 1.x try to switch the processor between protected and real modes. This is both slow and unsafe, because a real mode program can easily crash a computer. OS/2 1.x defines restrictive programming rules allowing a Family API or bound program to run in either real or protected mode.
Windows 3.0 could run real mode programs in 16-bit protected mode. When transitioning to protected mode, Windows 3.0 preserved the single privilege level model that was used in real mode, which is why Windows applications and DLLs can hook interrupts and do direct hardware access. That lasted through the Windows 9x series. If a Windows 1.x or 2.x program is written properly and avoids segment arithmetic, it will run the same way in both real and protected modes. Windows programs generally avoid segment arithmetic because Windows implements a software virtual memory scheme, moving program code and data in memory when programs are not running, so manipulating absolute addresses is dangerous; programs should only keep handles to memory blocks when not running. Starting an old program while Windows 3.0 is running in protected mode triggers a warning dialog, suggesting either running Windows in real mode or obtaining an updated version of the application. Updating well-behaved programs using the MARK utility with the MEMORY parameter avoids this dialog. It is not possible to have some GUI programs running in 16-bit protected mode and other GUI programs running in real mode. In Windows 3.1, real mode disappeared.
32-bit protected mode
The Intel 80386 introduced a significant advance in x86 architecture: an all 32-bit design supporting paging. The general registers, instructions and memory addressing are 32-bit; the I/O space remains 64 KiB of port addresses, although ports can be accessed 32 bits at a time. Memory is accessed through a 32-bit extension of protected mode. As in the 286, segment registers are used to index a segment table describing the division of memory. With a 32-bit offset, every application may access up to 4 GiB (or more with memory segments). Paging, in turn, makes it possible to use virtual memory. An exception to this design is the Intel 80386SX, which is 32-bit internally but has 24-bit addressing and a 16-bit data bus.
No new general-purpose registers were added. All 16-bit registers except the segment registers were expanded to 32 bits. This is represented by prefixing an "E" (for Extended) to the register names (thus the expanded AX became EAX, SI became ESI and so on). With a greater number of registers, instructions and operands, the machine code format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa.
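As a sketch of what paging does with a 32-bit address, the Python fragment below splits a linear address into the three fields used by a two-level page table walk. The 10/10/12-bit split for 4 KiB pages is not described in this article and is stated here as an assumption about the 386 paging format; the function name split_linear is made up for the example.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

def split_linear(address):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory = (address >> 22) & 0x3FF  # top 10 bits select a page directory entry
    table = (address >> 12) & 0x3FF      # next 10 bits select a page table entry
    offset = address & 0xFFF             # low 12 bits locate the byte within the page
    return directory, table, offset

directory, table, offset = split_linear(0x12345678)
print(hex(directory), hex(table), hex(offset))  # 0x48 0x345 0x678

# 1024 directory entries x 1024 table entries x 4 KiB pages cover 4 GiB.
print(1024 * 1024 * PAGE_SIZE == 1 << 32)       # True
```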
Paging and segmented memory access are required for modern multitasking operating systems. Linux, 386BSD and Windows NT were developed for the 386 because it was the first Intel architecture CPU to support paging and 32-bit segment offsets. The 386 architecture became the basis of all further development in the x86 series. The success of Windows 3.1, the first widely accepted version of Microsoft Windows, was largely due to its ability to take advantage of 386 features, even though it was used mainly to run multiple sessions rather than to take advantage of the native 32-bit instruction set.
The Intel 80387 math co-processor was integrated into the next CPU in the series, the Intel 80486 (the 486SX, sold as a budget processor, had its co-processor disabled or removed). The new floating point unit (FPU) performs floating point calculations, important for scientific applications and graphic design.
MMX and beyond
MMX is a SIMD instruction set designed by Intel, introduced in 1997 for Pentium MMX microprocessors. It developed out of a similar unit first used on the Intel i860. It is supported on most subsequent IA-32 processors by Intel and other vendors. MMX is typically used for video applications.
MMX added 8 new 64-bit registers to the architecture, known as MM0 through MM7 (generically MMn). In reality, these new registers are aliases for the existing x87 FPU stack registers. Hence, anything done to the floating point stack also affects the MMX registers. Unlike the floating point stack, these MMn registers are randomly accessible.
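To give a feel for what SIMD means on a 64-bit MMn register, the Python sketch below adds eight packed bytes lane by lane, with each lane wrapping around independently. It models the behaviour usually associated with the packed byte add instruction (PADDB); that instruction is not discussed in this article, and the function name paddb is only a label for the example.

```python
def paddb(a, b):
    """Add two 64-bit values as eight independent 8-bit lanes (wraparound)."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        lane_sum = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane_sum << shift  # a carry never spills into the next lane
    return result

# Lane 0: 0xFF + 0x01 wraps to 0x00; lane 1: 0x01 + 0x01 = 0x02.
print(hex(paddb(0x01FF, 0x0101)))  # 0x200
```

An ordinary 64-bit addition of the same operands would give 0x300, because the carry out of the low byte would propagate into the next one.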
3DNow!
In 1998, AMD introduced 3DNow!, which consisted of SIMD floating point instruction enhancements to MMX. The introduction of this technology coincided with the rise of 3D entertainment applications and was designed to improve the CPU's vector processing performance in graphics-intensive applications. 3D video game developers and 3D graphics hardware vendors use 3DNow! to enhance the performance of their products on AMD's K6 and Athlon series of processors.
SSE
In 1999, Intel introduced the Streaming SIMD Extensions (SSE) instruction set, which added eight new 128-bit registers (not overlaid with other registers) and 70 new instructions, most of them for floating point.
SSE2
In 2000 Intel introduced the SSE2 instruction set, adding a complete complement of integer instructions (analogous to MMX) to the original SSE registers and 64-bit SIMD floating point instructions to the original SSE registers. The first addition made MMX almost obsolete, and the second allowed the instructions to be realistically targeted by conventional compilers.
SSE3
Introduced in 2004 along with the Prescott revision of the Pentium 4 processor, SSE3 added specific memory and thread-handling instructions to boost the performance of Intel's HyperThreading technology. AMD licensed the SSE3 instruction set and implemented most of the SSE3 instructions for its revision E and later Athlon 64 processors. The Athlon 64 does not support HyperThreading and lacks those SSE3 instructions used only for HyperThreading.
64-bit Long Mode
By 2002, it was obvious that the 32-bit address space of the x86 architecture was limiting its performance in applications requiring large data sets. A 32-bit address space allows the processor to directly address only 4 GiB of data, a size surpassed by applications such as video processing and database engines, while a 64-bit address can directly address 16777216 TiB (more than 16 billion GiB) of data.
Intel introduced the IA-64 architecture, the basis for its Itanium line of processors. IA-64 provides backward compatibility with older 32-bit x86 software in emulation mode only; however, this mode of operation is in practice exceedingly slow.[citation needed]
AMD, which had traditionally followed the lead of Intel, took the initiative of extending the 32-bit x86 architecture to 64 bits, initially calling it x86-64 and later renaming it AMD64. The Opteron, Athlon 64, Turion 64 and later Sempron families of processors use this architecture. The success of the AMD64 line of processors, coupled with the lukewarm reception of the IA-64 architecture, prompted Intel to reverse-engineer and adopt the instruction set, adding new extensions of its own and branding it the EM64T architecture.
In their literature and product version names, Microsoft and Sun refer to AMD64/EM64T collectively as x64 in the Windows and Solaris operating systems respectively. Linux distributions refer to it as "x86-64", its variant "x86_64", or "amd64". BSD systems use "amd64" while Mac OS X uses "x86_64".
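The sizes involved are easy to check with a few lines of Python. This is plain arithmetic on powers of two; the constant names are chosen for the example.

```python
GIB = 1 << 30  # bytes in one gibibyte
TIB = 1 << 40  # bytes in one tebibyte

space_32 = 1 << 32  # bytes reachable with a 32-bit address
space_64 = 1 << 64  # bytes reachable with a full 64-bit address

print(space_32 // GIB)  # 4
print(space_64 // TIB)  # 16777216
print(space_64 // GIB)  # 17179869184
```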
This was the first time that a major upgrade of the x86 architecture was initiated and originated by a manufacturer other than Intel. It was also the first time that Intel accepted technology of this nature from an outside source.
Virtualization
x86 virtualization is difficult because the architecture did not meet the Popek and Goldberg requirements until recently. Nevertheless, there are several commercial x86 virtualization products, such as VMware, Parallels and Microsoft Virtual PC, as well as open source virtualization projects such as Xen and QEMU. Other solutions, such as the Kernel-based Virtual Machine ("KVM"), require newer processors which provide better hardware support for virtualization.
Intel and AMD have introduced x86 processors with hardware-based virtualization extensions that overcome the classical virtualization limitations of the x86 architecture. These extensions are known as Intel VT (IVT or simply VT), code-named "Vanderpool", and AMD-V, code-named "Pacifica". Although most modern x86 server processors and many modern x86 desktop processors include these extensions, the technology is generally considered immature at this point, with most software-based virtualization outperforming these extensions [1]. This is expected to change as the technology matures.
System-on-a-chip (SOC)
An x86 system-on-a-chip is a combination of an x86 CPU core with a northbridge (memory controller) and a southbridge (input/output (I/O) controller) in a single integrated circuit (IC).
Manufacturers
x86 processors and compatibles have been designed, manufactured and sold by a number of companies, including Intel, AMD, Cyrix, NEC, IBM, IDT and Transmeta.
List of x86 generations
- initial/first generation - first member is Intel 8086 (and derivatives), later multiple clones appeared.
- update to first generation - first member is Intel 80186 (and derivatives), later multiple clones appeared.
- second generation - first member is Intel 80286, later multiple clones appeared.
- third generation - first member is Intel 80386 (and derivatives), later multiple clones appeared.
- fourth generation - first member is Intel 80486 (and derivatives), later multiple clones appeared.
- fifth generation ("i586") - first member is Pentium (and derivatives), later appeared Nx586, 5x86, 5k86, WinChip, mP6
- sixth generation ("i686") - first member is Pentium Pro (and derivatives, including Pentium II, Celeron (PII), Xeon (PII), Pentium III, Pentium M and Intel Core), later appeared 6x86, K6, K6-2, K6-III, C3, Crusoe
- seventh generation ("i786") - first member is Athlon (and derivatives), later appeared Pentium 4 (and derivatives), C7, Efficeon
- eighth generation - first member is Opteron (and derivatives, including Athlon 64), later appeared Xeon 5100 series (and derivatives, including Core 2)
See also
- IA-32
- x86 assembly language
- x86 instruction listings
- x87
- Real mode — Unreal mode — Virtual 8086 mode — Protected mode — Long mode
- x86-64
- IA-64
- Microarchitecture
References
- Adams, Keith; Agesen, Ole (2006). "A Comparison of Software and Hardware Techniques for x86 Virtualization". Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, USA, 2006. ACM 1-59593-451-0/06/0010. Retrieved on 2006-12-22.