What is hyper-threading?
Hyper-threading is a set of technologies introduced by Intel in 2002 that gives each core a "double personality". DEC initially developed much of the underlying technology for its Alpha processors, and that work inspired later designs. AMD provides a similar capability but calls it "simultaneous multithreading" (the generic term), since Hyper-Threading is an Intel trademark.
A good introduction to Intel hyper-threading: Intel's Hyper-Threading by Tinker and Valerino
Primer on the core architecture
The way all modern processors, starting with the Intel Pentium, work is:
- They have a set of registers: internal processor storage that is very fast
- They have various caches (L1, L2, L3)
- They have sophisticated mechanisms to re-order and parallelize instructions
- They have extensive instruction sets that allow various types of processing (mathematical and logical operations, etc.). More importantly, instruction arguments are primarily registers, but they can be memory references as well
- They have execution units (up to 3 per core) that execute instructions; the process of scheduling instructions onto the various execution units is complex
Execution in modern processors (without hyper-threading) consists of running instructions that change the state of registers and of memory. Register access is fast (no waiting), but memory access can be very slow (depending on how well the cache hierarchy performs). While a slow memory access is in flight, the core essentially does nothing until the data arrives; this is somewhat mitigated by the ability of cores to run up to 3 instructions at a time, but the picture above remains accurate for the most part.
Key ideas in hyper-threading
- Provide two independent sets of registers in every core; only one set is active at a time.
- When the code running on the active register set stalls (due to an expensive memory access), switch to the other register set and run a different program.
- Switching register sets is a context switch performed in hardware; unlike an OS task switch, it does not need to save state to memory or flush the cache, which is what makes it cheap.
All the machinery that executes instructions is shared between the two hyper-threads except the register sets; as such, the two hyper-threads cannot execute at the same time.
The OS can switch threads on a core without any hyper-threading mechanism; the hardware simply performs the switch much faster.
The benefits of hyper-threading are therefore application-dependent, since it can only help when cache performance is poor enough that threads stall on memory often.
Misconceptions about hyper-threading
The biggest misconception about hyper-threading is that it "doubles" the number of cores. If hyper-threading is enabled (usually in the BIOS), the number of cores reported to the OS is indeed double the physical core count. The trouble is that these "hyper-threaded" cores do not have the performance of the original cores. As the section above explained, only the registers are thread-specific; everything else is shared. The combined performance of two hyper-threaded cores therefore depends fundamentally on how often execution stalls on memory accesses.
The second misconception is that hyper-threading cannot hurt, even if it does not provide its full benefit. For certain workloads (see the section below), hyper-threading can add a 5-10% performance penalty. It also makes thread scheduling more complicated: the two "sibling" hyper-threaded cores should not run programs at the same time unless strictly necessary.
Hyper-threading technologies have been hyped by both Intel and AMD over the last two decades. There are many marketing "stories" around them that do not necessarily translate into actual performance. Processors capable of hyper-threading target enterprises (and, increasingly, consumers) and are usually more expensive than processors without the capability.
Real hyper-threading performance
The only way to determine whether hyper-threading is beneficial in specific circumstances is to turn it on and off and benchmark the applications. The interplay between thread switching, mapping onto the underlying execution units, and cache performance is too complex to predict.
Some of the relevant studies/benchmarks on hyper-threading and their main findings are:
- The Impact of Hyper-Threading on Processor Resource Utilization in Production Applications (NASA) found no significant performance improvement but a significant efficiency improvement (instructions/cycle). Note that most users only care about performance improvements
- Intel's Hyper-Threading by Tinker and Valerino found that for transaction processing (i.e. specific database workloads), hyper-threading can add a performance benefit
- An Empirical Study of Hyper-Threading in High Performance Computing Clusters found significant performance degradation (up to 50%) with hyper-threading for linear algebra tasks
- Evaluating Hyper-Threading and VMWare found a small (11%) improvement when hyper-threading is enabled
Pitfalls of using hyper-threading for VM Hosts
- Will not result in significantly better performance overall (11% at best)
- Will double-book the actual processors and offer half the performance to users. Most VM users will perceive this as a performance degradation, since most of their tasks will run at half speed. This is very problematic because users are highly sensitive to per-core performance
- Adds security risks, such as Spectre-like side-channel attacks between sibling hyper-threads
- The CPU can be over-subscribed for VMs without hyper-threading, since the VMs run as processes in the underlying OS
Overall, enabling hyper-threading on VM hosts adds too few benefits and comes with too many pitfalls to be recommended for production work.