As with any processor vendor, having a detailed list of what the processor does and how to optimize for it is important. Helping programmers plan for what’s coming is just as vital. To that end, we often get glimpses of future products by keeping track of these manual updates: not only do they detail the new instructions, they often confirm code names for products that haven’t been ‘officially’ recognized. Intel’s latest update to its ISA Extensions Reference manual does just this, confirming Alder Lake as a future product and identifying which new instructions are coming in future platforms. Perhaps the biggest news of all is the continuation of BFLOAT16 support: originally supposed to be Cooper Lake only (and bear in mind, Cooper Lake will have a limited launch), it will now also be included in the upcoming Sapphire Rapids generation, set for deployment in the Aurora supercomputer in late 2021.

In the 38th Edition of Intel’s ISA Extensions Reference manual, the company has a table front and center with all the latest updates and instructions coming to future platforms. From this, we can plot which platforms will be getting which instructions.

Intel Instruction Support
Platforms: Tremont Atom, CPR (Cooper Lake) Xeon, ICL (Ice Lake) Xeon, SPR (Sapphire Rapids) Xeon, Tiger Lake, Alder Lake
Instructions: PCONFIG, WBNOINVD, Intel MKTME, ENCLV, MOVDIR*, AVX512_BF16, AVX512_VP2INTERSECT, CET, ENQCMD*, PTWRITE, TPAUSE/UM*, Arch LBRs, HLAT, SERIALIZE, TSXLDTRK
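Rather than keying off product code names, software can detect these features directly at runtime. Below is a minimal detection sketch in C, assuming a recent GCC or Clang on x86-64 (it uses the compiler-supplied cpuid.h helper); the bit positions match Intel's documentation for these features, but double-check them against the manual itself before relying on this.

    #include <stdio.h>
    #include <cpuid.h>   /* GCC/Clang helper for the CPUID instruction */

    int main(void) {
        unsigned int eax, ebx, ecx, edx;

        /* CPUID leaf 7, subleaf 0: structured extended feature flags */
        if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
            printf("AVX512_VP2INTERSECT: %s\n", (edx >> 8)  & 1 ? "yes" : "no");
            printf("SERIALIZE:           %s\n", (edx >> 14) & 1 ? "yes" : "no");
            printf("TSXLDTRK:            %s\n", (edx >> 16) & 1 ? "yes" : "no");
        }

        /* CPUID leaf 7, subleaf 1: where AVX512_BF16 is enumerated */
        if (__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
            printf("AVX512_BF16:         %s\n", (eax >> 5) & 1 ? "yes" : "no");
        }
        return 0;
    }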

Starting with what I think is the big news: BF16 support in Sapphire Rapids. It is clear from this manual that BF16 will not be supported in Ice Lake Server, which means that technically BF16 will skip a generation, going from Cooper Lake to Sapphire Rapids. But as we’ve previously reported, Cooper Lake has changed from a wide launch to a minimal launch for select customers only, specifically those that focus on 4S and 8S topologies (like Facebook). So this could be considered more of a ‘delayed’ launch, assuming Sapphire Rapids is going to be widely used. Anyone planning to use Cooper Lake for BF16-compatible workloads will have to wait an extra couple of years.
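For context, bfloat16 keeps the same sign bit and 8-bit exponent as a standard fp32 value and simply truncates the mantissa from 23 bits to 7, which is why conversion between the two formats is so cheap. A minimal software sketch of that conversion (function names are illustrative; NaN special-casing is omitted for brevity):

    #include <stdint.h>
    #include <string.h>

    /* fp32 -> bf16: keep the top 16 bits, rounding the discarded
       half to nearest-even. NaN inputs need special-casing (omitted). */
    static uint16_t fp32_to_bf16(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);               /* safe type-pun */
        uint32_t round = 0x7FFFu + ((bits >> 16) & 1);
        return (uint16_t)((bits + round) >> 16);
    }

    /* bf16 -> fp32: shift back into the high half; exact, no rounding. */
    static float bf16_to_fp32(uint16_t h) {
        uint32_t bits = (uint32_t)h << 16;
        float f;
        memcpy(&f, &bits, sizeof f);
        return f;
    }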

The other big news is the mention of Alder Lake. Up until this point, Alder Lake had only been mentioned in unconfirmed slides, or in the LinkedIn profiles of engineers who have worked on it (references that were subsequently removed). As far as we understand, Alder Lake is the 10nm product following on from Tiger Lake. Tiger Lake (what we know so far) is a quad-core mobile chip due for launch at the end of 2020, which means Alder Lake is likely to arrive at the tail end of 2021.

What Alder Lake (and Sapphire Rapids) gets for instructions includes Architectural LBRs (Last Branch Records), which standardize how the most recently taken branches are recorded for profiling tools; HLAT (Hypervisor-managed Linear Address Translation), which lets a hypervisor enforce how certain linear addresses are translated; and SERIALIZE, which forces the core to complete all prior instructions and drain all buffered writes to memory before the next instruction is fetched. The LBR update helps with performance analysis, HLAT is primarily for Sapphire Rapids, and SERIALIZE is there to assist with recent security issues.
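As a sketch of how SERIALIZE would be used from C (assuming a recent GCC or Clang, compiled with -mserialize; the function name here is illustrative): the classic trick of executing CPUID purely for its serializing side effect clobbers four registers, which SERIALIZE avoids.

    #include <immintrin.h>   /* _serialize(), requires -mserialize */

    /* After patching instruction bytes in place (e.g. in a JIT), a
       serializing operation is needed before running the new code.
       SERIALIZE does this without clobbering registers like CPUID does. */
    void finish_code_patch(void) {
        _serialize();   /* complete all prior instructions and drain
                           buffered writes before fetching what follows */
    }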

Also of note are some of the Ice Lake Server updates. The manual now lists Ice Lake Server as getting Intel’s MKTME, Multi-Key Total Memory Encryption: a set of memory encryption techniques that support multiple encrypted environments, each with its own key, a step toward matching or surpassing AMD’s prowess in this area. The other one to note (though not new for this document) is ENCLV support, an SGX-related instruction for managing secure enclaves.

Another point of security is the new TSXLDTRK extension for Sapphire Rapids. This adds XSUSLDTRK, a TSX load tracking ‘suspend’ instruction, with a corresponding XRESLDTRK to resume load tracking. (TSX is Transactional Synchronization Extensions, Intel’s hardware transactional memory.)
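A sketch of the intended usage pattern, assuming GCC/Clang RTM and TSXLDTRK intrinsics (compiled with -mrtm -mtsxldtrk; the function is illustrative): loads issued between the suspend/resume pair are not added to the transaction's read set, so unrelated writes to those locations won't abort it.

    #include <immintrin.h>   /* _xbegin/_xend (-mrtm), _xsusldtrk/_xresldtrk (-mtsxldtrk) */

    int transactional_add(int *shared, const int *side_data) {
        if (_xbegin() == _XBEGIN_STARTED) {
            _xsusldtrk();            /* suspend TSX load tracking */
            int s = *side_data;      /* this load skips the read set */
            _xresldtrk();            /* resume load tracking */
            *shared += s;            /* still fully transactional */
            _xend();
            return 1;                /* committed */
        }
        return 0;                    /* aborted: fall back to a lock */
    }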

Full information about these new instructions can be found on Intel’s Developer Zone.

Source: Instlatx64 on Twitter

Comments

  • Deicidium369 - Tuesday, April 28, 2020 - link

    Not a single one of those "security issues" can be exploited in the wild, and cannot even be reliably exploited in a lab environment.

    Intel didn't need to salvage every single tiny little piece of silicon because they made a TERRIBLE deal with TSMC. Intel was able to make very large monolithic dies that TSMC can only approach with a GPU (a lot more forgiving of defects).

    If Intel 14nm followed the TSMC model, 14+ would be 13nm, 14++ would be 12nm, and 14+++ would be 11nm. And we know Intel's 10nm+ is denser than any of TSMC's "7nm" fluff.
  • Jorgp2 - Wednesday, April 1, 2020 - link

    It's crazy how many people have no idea what they're going on about
  • mode_13h - Wednesday, April 1, 2020 - link

    Thanks for the useless comment. Try enlightening us.
  • Qasar - Thursday, April 2, 2020 - link

    why would he? timecop1818 says things like this, and he never does :-) :-)
  • Sahrin - Wednesday, April 1, 2020 - link

    >Tiger Lake (what we know so far) is a quad-core mobile chip due for launch at the end of 2020

    Uh...good luck with that, Intel.
  • mode_13h - Wednesday, April 1, 2020 - link

    I believe it'll launch. Whether or not you can actually buy them is a separate question.
  • PaulHoule - Wednesday, April 1, 2020 - link

    If I understand this right, it seems that BF16 will be hard to find in the near term. That is, if I want to use it for inference on the client, no dice.

    Is that right?
  • mode_13h - Wednesday, April 1, 2020 - link

    Yes, the only CPUs set to have it are server chips, one of which won't even be generally available (i.e. Cooper Lake).

    For inference, you'll probably find fp16 more than adequate, and still better than int8. The two main reasons BFloat16 is gaining popularity are better training efficiency and a smaller silicon footprint. Easy conversion to fp32 is an added bonus, but Intel has had fp16/fp32 conversion instructions since Ivy Bridge.
  • p1esk - Thursday, April 2, 2020 - link

    No one in their right mind would use fp16 or bf16 for inference. It makes zero sense because even INT4 is enough. Nvidia gets it, Intel still doesn't.
  • Machinus - Thursday, April 2, 2020 - link

    14nm is not a real product anymore.
