Intel 3rd Gen Xeon Scalable (Ice Lake SP) Review: Generationally Big, Competitively Small
by Andrei Frumusanu on April 6, 2021 11:00 AM EST
Section by Ian Cutress
Ice Lake Xeon Processor List
Intel is introducing around 40 new processors across the Xeon Platinum (8300 series), Xeon Gold (6300 and 5300 series) and Xeon Silver (4300 series). Xeon Bronze no longer exists with Ice Lake. Much like the previous generation, the leading 8/6/5/4 signifies the series, and the 3 indicates the generation. Beyond that, the last two digits are somewhat meaningless, as before.
That being said, there is a significant change. In the past, Platinum/Gold/Silver also indicated socket support, with Platinum supporting up to 8P configurations. This time around, as Ice Lake does not support 8P, all the processors will support only up to 2P, with a few select models being uniprocessor only. This makes the Platinum/Gold/Silver segmentation largely arbitrary, serving only to indicate which performance/price bracket a processor sits in.
On top of this, Intel is adding more suffixes to the equation. If you work with Xeon Scalable processors day in and day out, there is now a need to differentiate a Q processor from a P processor, and an S processor from an M processor. There’s a handy list down below.
SKU List
The easiest way to approach this is to jump straight into the processor list. RCP stands for recommended customer price, and SGX GB stands for how large Software Guard Extension enclaves can be: either 8 GB, 64 GB, or 512 GB. Cells highlighted in green mark notable entries in the stack.
Intel 3rd Gen Xeon Scalable: Ice Lake Xeon Only
AnandTech | Cores w/HT | Base Freq (MHz) | 1T Freq (MHz) | nT Freq (MHz) | L3 (MB) | TDP (W) | SGX (GB) | RCP 1ku | DC PMM
Xeon Platinum (8x DDR4-3200)
8380 | 40 | 2300 | 3400 | 3000 | 60 | 270 | 512 | $8099 | Yes
8368Q | 38 | 2600 | 3700 | 3300 | 57 | 270 | 512 | $6743 | Yes
8368 | 38 | 2400 | 3400 | 3200 | 57 | 270 | 512 | $6302 | Yes
8362 | 32 | 2800 | 3600 | 3500 | 48 | 265 | 64 | $5488 | Yes
8360Y | 36 | 2400 | 3500 | 3100 | 54 | 250 | 64 | $4702 | Yes
8358P | 32 | 2600 | 3400 | 3200 | 48 | 240 | 8 | $3950 | Yes
8358 | 32 | 2600 | 3400 | 3300 | 48 | 250 | 64 | $3950 | Yes
8352Y | 32 | 2200 | 3400 | 2800 | 48 | 205 | 64 | $3450 | Yes
8352V | 36 | 2100 | 3500 | 2500 | 54 | 195 | 8 | $3450 | Yes
8352S | 32 | 2200 | 3400 | 2800 | 48 | 205 | 512 | $4046 | Yes
8352M | 32 | 2300 | 3500 | 2800 | 48 | 185 | 64 | $3864 | Yes
8351N | 36 | 2400 | 3500 | 3100 | 54 | 225 | 64 | $3027 | Yes
Xeon Gold 6300 (8x DDR4-3200)
6354 | 18 | 3000 | 3600 | 3600 | 39 | 205 | 64 | $2445 | Yes
6348 | 28 | 2600 | 3500 | 3400 | 42 | 235 | 64 | $3072 | Yes
6346 | 16 | 3100 | 3600 | 3600 | 36 | 205 | 64 | $2300 | Yes
6342 | 24 | 2800 | 3500 | 3300 | 36 | 230 | 64 | $2529 | Yes
6338T | 24 | 2100 | 3400 | 2700 | 36 | 165 | 64 | $2742 | Yes
6338N | 32 | 2200 | 3500 | 2700 | 48 | 185 | 64 | $2795 | Yes
6338 | 32 | 2000 | 3200 | 2600 | 48 | 205 | 64 | $2612 | Yes
6336Y | 24 | 2400 | 3600 | 3000 | 36 | 185 | 64 | $1977 | Yes
6334 | 8 | 3600 | 3700 | 3600 | 18 | 165 | 64 | $2214 | Yes
6330N | 28 | 2200 | 3400 | 2600 | 42 | 165 | 64 | $2029 | Yes
6330 | 28 | 2000 | 3100 | 2600 | 42 | 205 | 64 | $1894 | Yes
6326 | 16 | 2900 | 3500 | 3300 | 24 | 185 | 64 | $1300 | Yes
6314U | 32 | 2300 | 3400 | 2900 | 48 | 205 | 64 | $2600 | Yes
6312U | 24 | 2400 | 3600 | 3100 | 36 | 185 | 64 | $1450 | Yes
Xeon Gold 5300 (8x DDR4-2933)
5320T | 20 | 2300 | 3500 | 2900 | 30 | 150 | 64 | $1727 | Yes
5320 | 26 | 2200 | 3400 | 2800 | 39 | 185 | 64 | $1555 | Yes
5318Y | 24 | 2100 | 3400 | 2600 | 36 | 165 | 64 | $1273 | Yes
5318S | 24 | 2100 | 3400 | 2600 | 36 | 165 | 512 | $1667 | Yes
5318N | 24 | 2100 | 3400 | 2700 | 36 | 150 | 64 | $1375 | Yes
5317 | 12 | 3000 | 3600 | 3400 | 18 | 150 | 64 | $950 | Yes
5315Y | 8 | 3200 | 3600 | 3500 | 12 | 140 | 64 | $895 | Yes
Xeon Silver (8x DDR4-2666)
4316 | 20 | 2300 | 3400 | 2800 | 30 | 150 | 8 | $1002 | No
4314 | 16 | 2400 | 3400 | 2900 | 24 | 135 | 8 | $694 | Yes
4310T | 10 | 2300 | 3400 | 2900 | 15 | 105 | 8 | $555 | No
4310 | 12 | 2100 | 3300 | 2700 | 18 | 120 | 8 | $501 | No
4309Y | 8 | 2800 | 3600 | 3400 | 12 | 105 | 8 | $501 | No
Q = Liquid Cooled SKU
Y = Supports Intel SST-PP 2.0
P = IaaS Cloud Specialised Processor
V = SaaS Cloud Specialised Processor
N = Networking/NFV Optimized
M = Media Processing Optimized
T = Long-Life and Extended Thermal Support
U = Uniprocessor (1P Only)
S = 512 GB SGX Enclave per CPU Guaranteed (...but not all 512 GB are labelled S)
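As a rough illustration of the naming scheme and the suffix legend, the sketch below splits a model number into its series, generation digit, and suffix. The `decode_model` helper and the layout of its result are hypothetical and written only for this example; the digit meanings and suffix descriptions are the ones given in the text and legend above.

```python
import re

# Leading digit -> series, per the 8/6/5/4 segmentation described in the text.
TIERS = {"8": "Xeon Platinum", "6": "Xeon Gold", "5": "Xeon Gold", "4": "Xeon Silver"}

# Suffix meanings copied from the legend under the SKU table.
SUFFIXES = {
    "Q": "Liquid Cooled SKU",
    "Y": "Supports Intel SST-PP 2.0",
    "P": "IaaS Cloud Specialised Processor",
    "V": "SaaS Cloud Specialised Processor",
    "N": "Networking/NFV Optimized",
    "M": "Media Processing Optimized",
    "T": "Long-Life and Extended Thermal Support",
    "U": "Uniprocessor (1P Only)",
    "S": "512 GB SGX Enclave per CPU Guaranteed",
}


def decode_model(model):
    """Hypothetical helper: split a model string such as '8352Y' into parts."""
    match = re.fullmatch(r"([4568])(\d)(\d{2})([A-Z]?)", model.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised model number: {model!r}")
    tier_digit, generation, sku_digits, suffix = match.groups()
    if suffix and suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix {suffix!r} in {model!r}")
    return {
        "tier": TIERS[tier_digit],
        "generation": int(generation),    # 3 for the Ice Lake parts listed here
        "sku_digits": sku_digits,         # last two digits carry no fixed meaning
        "suffix": suffix or None,
        "suffix_meaning": SUFFIXES.get(suffix),  # None when there is no suffix
    }


if __name__ == "__main__":
    for name in ("8380", "8368Q", "6314U", "5318S"):
        print(name, decode_model(name))
```

Note that, as the legend warns, the absence of an S suffix does not imply a smaller enclave: the 8380 carries 512 GB of SGX without it, so a decoder like this cannot replace the table.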
The peak turbo on these processors is 3.7 GHz, which is much lower than what we saw with the previous generation. Despite this, Intel seems to be keeping prices reasonable, and is enabling Optane support through most of the stack, except for the Silver processors (which have a single exception of their own).
New suffixes include Q, for a liquid cooled processor model with higher all-core frequencies at 270 W; Intel said this part came about based on customer demand. The T processors are extended life / extended thermal support, which usually means -40°C to 125°C support, useful when working at the poles or in other extreme conditions. The M/N/P/V specialized processors, according to our chat with Lisa Spelman, GM of the Xeon and Memory Group, are the focal points for software stack optimizations. Users who want an extra 2-10%+ of performance on a specific workload can buy these processors, for which the software will be specifically tuned. Lisa stated that while all processors will receive uplifts, the segmented parts are the ones those uplifts will be targeted at. This means balancing turbo behavior against the use case and adapting code accordingly, which can only really be done for a known turbo profile.
Competition
It’s hard not to notice that the server market over the last couple of years has become more competitive. Not only is Intel competing with its own high market share, but x86 alternatives from AMD have scored big wins when it comes to per-core performance, and Arm implementations such as the Ampere Altra can enable unprecedented density at competitive performance as well. Here’s how they all stand, looking at top-of-stack offerings.
Top-of-Stack Competition
AnandTech | EPYC 7003 | Amazon Graviton2 | Ampere Altra | Intel Xeon
Platform | Milan | Graviton2 | QuickSilver | Ice Lake
Processor | 7763 | Graviton2 | Q80-33 | 8380
uArch | Zen 3 | N1 | N1 | Sunny Cove
Cores | 64 | 64 | 80 | 40
TDP | 280 W | ? | 250 W | 270 W
Base Freq (MHz) | 2450 | 2500 | 3300 | 2300
Turbo Freq (MHz) | 3500 | 2500 | 3300 | 3400
All-Core (MHz) | ~3200 | 2500 | 3300 | 3000
L3 Cache | 256 MB | 32 MB | 32 MB | 60 MB
PCIe | 4.0 x128 | ? | 4.0 x128 | 4.0 x64
Chipset | On CPU | ? | On CPU | External
DDR4 | 8 x 3200 | 8 x 3200 | 8 x 3200 | 8 x 3200
DRAM Cap | 4 TB | ? | 4 TB | 4 TB
Optane | No | No | No | Yes
Price | $7890 | N/A | $4050 | $8099
At 40 cores, Intel does look a little behind, especially as Ampere is currently at 80 cores and a higher frequency, and will come out with a 128-core Altra Max version very shortly. This means Ampere will be able to enable more cores in a single socket than Intel can in two sockets. Intel’s competitive advantage, however, will be its large current install base and decades of optimization, as well as new security features and its total offering to the market.
On a pure x86 level, AMD launched Milan only a few weeks ago, with its new Zen 3 core, which has been highly impressive. Using a chiplet-based approach, AMD has over 1000 mm² of silicon to spread across 64 high performance cores and massive amounts of IO. Compared to Intel, which is around 660 mm² and monolithic, AMD has the chipset onboard in its IO die, whereas Intel keeps it external, which saves a good amount of idle power. Top-of-stack pricing between AMD and Intel is similar now; however, AMD is also focusing on the mid-range with products like the 7F53, which really impressed us. We’ll see what Intel can respond with.
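To put the core-count and pricing points into numbers, here is a minimal sketch that computes list price per core for the three priced parts in the table and checks the single-socket versus dual-socket core claim. The dictionary layout and function names are made up for illustration; the core counts and prices are the ones quoted above, and Graviton2 is left out because it has no list price.

```python
# Core counts and list prices copied from the top-of-stack table.
TOP_OF_STACK = {
    "AMD EPYC 7763": {"cores": 64, "price_usd": 7890},
    "Ampere Altra Q80-33": {"cores": 80, "price_usd": 4050},
    "Intel Xeon Platinum 8380": {"cores": 40, "price_usd": 8099},
}


def price_per_core(part):
    """List price divided by core count, in USD per core."""
    return part["price_usd"] / part["cores"]


def cores_in_system(cores_per_socket, sockets):
    """Total cores for a system with identical processors in each socket."""
    return cores_per_socket * sockets


if __name__ == "__main__":
    for name, part in TOP_OF_STACK.items():
        print(f"{name}: {part['cores']} cores, ${price_per_core(part):.2f} per core")

    # The upcoming 128-core Altra Max in one socket vs. two 40-core 8380s.
    altra_max_cores = 128
    dual_8380_cores = cores_in_system(40, 2)
    print("Altra Max 1P has more cores than 8380 2P:", altra_max_cores > dual_8380_cores)
```

Price per core is only a crude yardstick here: these are list prices, and a core is not an equal unit of performance across the three architectures.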
In our numbers today, we’ll be comparing Intel’s top-of-stack to everyone else. The battle royale of behemoths.
Gen on Gen Improvements: ISO Power
It is also important to look at what Intel is offering generationally in a like-for-like comparison. Intel’s 28-core 205 W point for the previous generation Cascade Lake is a good stake in the ground, and the Intel Xeon Gold 6258R is the dual socket equivalent of the Platinum 8280. We reviewed the two and they performed identically.
For this review, we’ve put the 40-core Xeon Platinum 8380 down to 205 W to see the effect on performance. Perhaps more directly comparable, we also have the Xeon Gold 6330, which is a direct 28-core, 205 W replacement.
Intel Xeon Comparison: 3rd Gen vs 2nd Gen (2P, 205 W vs 205 W)
Xeon Gold 6330 | Xeon Plat 8352Y | AnandTech | Xeon Gold 6258R
28 / 56 | 32 / 64 | Cores / Threads | 28 / 56
2000 MHz | 2200 MHz | Base Freq | 2700 MHz
3100 MHz | 3400 MHz | ST Freq | 4000 MHz
2600 MHz | 2800 MHz | MT Freq | 3300 MHz
35 MB + 42 MB | 40 MB + 48 MB | L2 + L3 Cache | 28 MB + 38.5 MB
205 W | 205 W | TDP | 205 W
PCIe 4.0 x64 | PCIe 4.0 x64 | PCIe | PCIe 3.0 x48
8 x DDR4-3200 | 8 x DDR4-3200 | DRAM Support | 6 x DDR4-2933
4 TB | 4 TB | DRAM Capacity | 1 TB
200-series | 200-series | Optane | 100-series
4 TB Optane + 2 TB DRAM | 4 TB Optane + 2 TB DRAM | Optane Capacity Per Socket | 1 TB DDR4-2666 + 1.5 TB
64 GB | 64 GB | SGX Enclave | None
1P, 2P | 1P, 2P | Socket Support | 1P, 2P
3 x 11.2 GT/s | 3 x 11.2 GT/s | UPI Links | 3 x 10.4 GT/s
$1894 | $3450 | Price (1ku) | $3950
The 6330 might seem like the natural fit; however, the 8352Y feels like the better comparison, given that it is closer in price and offers more performance. Intel is promoting a +20% raw performance boost with the new generation, which is important here, because the 8352Y still loses 500 MHz to the previous generation in all-core frequency. The 8352Y and 6330 do make up for it in extra features, such as more DDR4 channels, higher memory capacity, PCIe 4.0, Optane support, SGX enclave support, and faster UPI links.
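The trade-off can be made concrete with a little arithmetic on the table's figures. The sketch below computes the all-core frequency deficit against the 6258R, the list price per core, and the theoretical peak DRAM bandwidth per socket. The names are illustrative, and the bandwidth figure assumes the standard 64-bit (8-byte) data path per DDR4 channel; it is an upper bound on paper, not a measured result.

```python
# Figures copied from the gen-on-gen comparison table.
PARTS = {
    "Xeon Gold 6330": {"cores": 28, "mt_mhz": 2600, "channels": 8, "ddr4": 3200, "price_usd": 1894},
    "Xeon Plat 8352Y": {"cores": 32, "mt_mhz": 2800, "channels": 8, "ddr4": 3200, "price_usd": 3450},
    "Xeon Gold 6258R": {"cores": 28, "mt_mhz": 3300, "channels": 6, "ddr4": 2933, "price_usd": 3950},
}


def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s.

    Assumes each DDR4 channel moves 8 bytes (64 bits) per transfer.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


def all_core_deficit_mhz(new_part, old_part):
    """How many MHz of all-core (MT) frequency the new part gives up."""
    return old_part["mt_mhz"] - new_part["mt_mhz"]


if __name__ == "__main__":
    baseline = PARTS["Xeon Gold 6258R"]
    for name, part in PARTS.items():
        bandwidth = peak_bandwidth_gb_s(part["channels"], part["ddr4"])
        print(
            f"{name}: {all_core_deficit_mhz(part, baseline)} MHz below the 6258R all-core, "
            f"${part['price_usd'] / part['cores']:.2f} per core, "
            f"{bandwidth:.1f} GB/s peak DRAM bandwidth"
        )
```

On these numbers, the eight DDR4-3200 channels give the new parts roughly 45% more theoretical memory bandwidth per socket than the six DDR4-2933 channels of the 6258R, which is part of how the frequency deficit gets offset.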
This review has a few of our 6330 numbers that we’ve been able to run in the short time we’ve had the system.
169 Comments
TomWomack - Wednesday, April 7, 2021 - link
Is it known whether there will be an IceLake-X this time round? The list of single-Xeon motherboard launches suggests possibly not; it would obviously be appealing to have a 24-core HEDT without paying the Xeon premium.
EthiaW - Wednesday, April 7, 2021 - link
Boeings and Airbuses are never actually sold at their nominal prices, they cost far less, a non-disclosed number, for big buyers after gruesome haggling, sometimes less than half the “catalogue” price.
I think this is exactly what's intel doing now: set the catalogue price high to avoid losing face, and give huge discount to avoid losing market share.
duploxxx - Wednesday, April 7, 2021 - link
well easy conclusion.
EPYC 75F3 is the clear winner SKU and the must have for most of the workloads.
This is based on price - performance - cores and its related 3rd party sw licensing...
I wonder when Intel will be able to convince VMware to move from a 32core licensing schema to a 40core :)
They used to get all the dev favor when PAT was still in the house, I had several senior engineers in escalation calls stating that the hypervisor was optimised for Intel ...guess what even under optimised looking for a VM farm in 2020-2021-....you are way better off with an AMD build.
WaltC - Wednesday, April 7, 2021 - link
If you can't beat the competition, then what? Ian seems to be impressed that Intel was finally able to launch a Xeon that's a little faster than its previous Xeon, but not fast enough to justify the price tag in relation to what AMD has been offering for a while. So here we are congratulating Intel on burning through wads more cash to produce yet-another-non-competitive result. It really seems as if Intel *requires* AMD to set its goals and to tell it where it needs to go--and that is sad. It all began with x86-64 and SDRAM from AMD beating out Itanium and RDRAM years ago. And when you look at what Intel has done since it's just not all that impressive. Well, at least we can dispense with the notion that "Intel's 10nm is TSMC's 7nm" as that clearly is not the case.
JayNor - Wednesday, April 7, 2021 - link
What about the networking applications of this new chip? Dan Rodriguez's presentation showed gains of 1.4x to 1.8x for various networking benchmarks. Intel's entry into 5G infrastructure, NFV, vRAN, ORAN, hybrid cloud is growing faster than they originally predicted. They are able to bundle Optane, SmartNICs, FPGAs, eASIC chips, XeonD, P5900 family Atom chips... I don't believe they have a competitor that can provide that level of solution.
Bagheera - Thursday, April 8, 2021 - link
Patr!ck Patr!ck Partr!ck?
evilpaul666 - Saturday, April 10, 2021 - link
It only works in front of a mirror. Donning a hoodie helps, too.
Oxford Guy - Wednesday, April 7, 2021 - link
There is some faulty logic at work in many of the comments, with claims like it's cheating to use a more optimized compiler.
It's not cheating unless:
• the compiler produces code that's so much more unstable/buggy that it's quite a bit more untrustworthy than the less-optimized compiler
• you don't make it clear to readers that the compiler may make the architecture look more performant simply because the other architectures may not have had compiler optimizations on the same level
• you use the same compiler for different architectures when using a different compiler for one or more other architectures will produce more optimized code for those architectures as well
• the compiler sabotages the competition, via things like 'genuine Intel'
Fact is that if a CPU can accomplish a certain amount of work in a certain amount of time, using a certain amount of watts under a certain level of cooling — that is the part's actual performance capability.
If that means writing machine code directly (not even assembly) to get to that performance level, so what? That's an entirely different matter, which is how practical/economical/profitable/effortful it is to get enough code to measure all of the different aspects of the part's maximum performance capability. The only time one can really cite that as a deal-breaker is if one has hard data to demonstrate that by the time the hand-tuned/optimized code is written changes to the architecture (and/or support chips/hardware) will obsolete the advantage — making the effort utterly fruitless, beyond intellectual curiosity concerning the part's ability. For instance, if one knows that Intel, for instance, is going to integrate new instructions (very soon) that will make various types of hand-tuned assembly obsolete in short order, it can be argued that it's not worth the effort to write the code. People made this argument with some of AMD's Bulldozer/Piledriver instructions, on the basis that enough industry adoption wasn't going to happen. But, frankly... if you're going to make claims about the part's performance, you really should do what you can to find out what it is.
Oxford Guy - Wednesday, April 7, 2021 - link
One can, though, of course... include a disclaimer that 'it seems clear enough that, regardless of how much hand-tuned code is done, the CPU isn't going to deliver enough to beat the competition, if the competition's code is similarly hand-tuned' — if that's the case. Even if a certain task is tuned to run twice as fast, is it going to be twice as fast as tuned code for the competition's stuff? Is its performance per watt deficit going to be erased? Will its pricing no longer be a drag on its perceived competitiveness?
For example, one could have wrung every last drop of performance out of Bulldozer but it wasn't going to beat Sandy Bridge E — a chip with the same number of transistors. Piledriver could beat at least the desktop version of Sandy in certain workloads when clocked well outside of the optimal (for the node's performance per watt) range but that's where it's very helpful to have tests at the same clock. It was discovered, for instance, that the Fury X and Vega had basically identical performance at the same clock. Since desktop Sandy could easily clock at the same 4.0 GHz Piledriver initially shipped with it could be tested at that rate, too.
Ideally, CPU makers would release benchmarks that demonstrate every facet of their chip's maximum performance. The concern about those being best-case and synthetic is less of a problem in that scenario because all aspects of the chip's performance would be tested and published. That makes cherry-picking impossible.
mode_13h - Thursday, April 8, 2021 - link
The faulty logic I see is that you seem to believe it's the review's job to showcase the product in the best possible light. No, that's Intel's job, and you can find plenty of that material at intel.com, if that's what you want.
Articles like this should focus on representing the performance of the CPUs as the bulk of readers are likely to experience it. So, even if using some vendor-supplied compiler with trick settings might not fit your definition of "cheating", that doesn't mean it's a service to the readers.
I think it could be appropriate to do that sort of thing, in articles that specifically analyze some narrow aspect of a CPU, for instance to determine the hardware's true capabilities or if it was just over-hyped. But, not in these sort of overall reviews.