GPU & Gaming Performance

Sitting alongside the different CPU cores in the Galaxy S8’s two SoCs are two different GPU configurations. The Snapdragon 835 includes an Adreno 540 GPU that uses the same basic architecture as the Adreno 530 found in Snapdragon 820/821. While the new Adreno 540 remains a black box, Qualcomm says it improved performance and efficiency by eliminating some bottlenecks, tweaking the register file and ALUs, and improving depth rejection. Qualcomm also used the move to 10nm to raise the max GPU frequency to 710MHz, a roughly 14% increase over S820’s peak operating point; however, Samsung caps the Adreno 540 in the Galaxy S8 to 670MHz for both the FHD+ and WQHD+ resolution settings.
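To put the clock figures in perspective, the short Python sketch below works out how much frequency the Galaxy S8 gives up relative to Qualcomm's stated maximum. The S820 peak is not quoted above, so it is backed out from the "roughly 14%" uplift and should be read as an estimate; the variable names are illustrative only.

```python
# Clock figures quoted in the text (MHz).
QUALCOMM_MAX_MHZ = 710   # Adreno 540 peak frequency per Qualcomm
GALAXY_S8_CAP_MHZ = 670  # Samsung's cap in the Galaxy S8

# How far below Qualcomm's peak Samsung runs the GPU.
reduction_pct = (1 - GALAXY_S8_CAP_MHZ / QUALCOMM_MAX_MHZ) * 100

# Assumption: S820 peak implied by the "roughly 14%" uplift.
# This is an estimate derived from the text, not a quoted spec.
implied_s820_mhz = QUALCOMM_MAX_MHZ / 1.14

# Clock uplift the S8 actually keeps over that implied S820 peak.
s8_vs_s820_pct = (GALAXY_S8_CAP_MHZ / implied_s820_mhz - 1) * 100

print(f"S8 cap is {reduction_pct:.1f}% below Qualcomm's peak")
print(f"S8 keeps roughly {s8_vs_s820_pct:.0f}% over the implied S820 peak")
```

Under these assumptions, Samsung's cap leaves the S8 with only about half of the frequency uplift Qualcomm advertises for the Adreno 540.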

The S8’s Exynos 8895 SoC comes with an ARM Mali-G71 GPU that uses ARM’s latest Bifrost architecture. We first saw the Mali-G71 in action when we looked at Huawei’s Mate 9 and P10, which both use the Kirin 960 SoC from HiSilicon. Unlike Huawei’s offerings that use the G71 in an 8-core configuration with a peak operating point of 1037MHz, Samsung went wide and slow for its E8895, with 20 cores running at up to 546MHz.

GFXBench ALU 2 (Offscreen)

A Bifrost GPU core can process 1 pixel and up to 12 FP32 FMAs per clock. After accounting for differences in core count and frequency, the S8’s E8895 holds a 32% theoretical throughput advantage over the Kirin 960 in Huawei’s flagships. In the GFXBench ALU 2 test (run offscreen at a fixed 1080p resolution), however, the S8 (E8895) does even better, managing to outperform the Mate 9 and P10 by 59%. The S8 is using a newer GPU driver than Huawei’s phones, which likely accounts for some of this additional performance. The S8 with E8895 is also 27% faster here than the S835 version, which is a bit of an upset considering Adreno’s historically strong ALU performance. The S8 (E8895) even bests the iPhone 7 in this test.
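The 32% theoretical figure follows from core count multiplied by clock speed, since both SoCs use the same Bifrost core. A minimal Python sketch of that arithmetic, using only the core counts, frequencies, and per-core FMA rate quoted in the text (the helper name is hypothetical):

```python
# Both SoCs use the same Bifrost core, so peak throughput scales
# with cores x clock. Per-core rate is the figure quoted in the text.
FMAS_PER_CORE_PER_CLOCK = 12


def peak_fma_rate(cores: int, mhz: float) -> float:
    """Peak FP32 FMAs per second (hypothetical helper for illustration)."""
    return cores * mhz * 1e6 * FMAS_PER_CORE_PER_CLOCK


exynos_8895 = peak_fma_rate(cores=20, mhz=546)   # Galaxy S8 (E8895)
kirin_960 = peak_fma_rate(cores=8, mhz=1037)     # Mate 9 / P10

advantage_pct = (exynos_8895 / kirin_960 - 1) * 100
print(f"E8895 theoretical advantage: {advantage_pct:.0f}%")
```

The per-core rate cancels out in the ratio, so the same 32% applies to pixel and texel throughput as well as ALU throughput. It is a peak figure only and ignores memory bandwidth, drivers, and thermal limits.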

The Adreno 540 in the S835 version of the S8 is not much faster than the Adreno 530 in the Snapdragon 820 phones in this synthetic ALU test, giving up about 5fps relative to the S835 mobile development platform we previously tested due to its lower GPU frequency. Our previous testing also showed that the Adreno 540’s microarchitecture tweaks provide no advantage here, because Adreno 530 and 540 give the same performance at the same frequency. The S8 (S835) is still more than 3x faster than the Galaxy S6 and S5, though.

GFXBench Texturing (Offscreen)

The two different S8 versions swap positions in the offscreen texturing test, with the S8 (S835) pulling ahead of the E8895 version by 22%. The S8 (E8895) offers about the same level of performance as the iPhone 7 and phones using the S820/S821. It also holds a 31% advantage over the Mate 9’s G71 GPU, which happens to be very close to the 32% theoretical value based on core count and frequency.

GFXBench T-Rex HD (Onscreen)

GFXBench T-Rex HD (Offscreen)

In the synthetic tests above, the E8895 S8 had an advantage over the S835 S8 in ALU performance, but the S835 version had the edge in texturing. Now it’s time to see if this holds true while running some strenuous 3D workloads. First up is the older OpenGL ES 2.0-based GFXBench T-Rex game simulation, where the last couple generations of flagship phones have not only been hitting the 60fps V-Sync limit when running at their native onscreen resolution but averaging 60fps over the duration of the test. The S8 is no exception, averaging 60fps while running at its highest WQHD+ resolution. Even though the Galaxy S7, S6, and S5 have lower-resolution displays, they still fall short of the 60fps barrier.

Moving to the 1080p offscreen test, the Galaxy S8 (E8895) tops the chart, pulling ahead of the S835 version by 11% and the Kirin 960 in the Mate 9 and P10 by 20%. Compared to the previous generation S820 phones, including the Galaxy S7, the S8 is either 14% (S835) or 27% (E8895) faster. It’s also interesting to see that the S8’s peak performance is more than 4x higher than the S5 in T-Rex, which is the least strenuous of our game simulation tests (and the only one the S5 can even run).

GFXBench Car Chase ES 3.1 / Metal (On Screen)

GFXBench Car Chase ES 3.1 / Metal (Off Screen 1080p)

The GFXBench Car Chase game simulation uses a more modern rendering pipeline and the latest features, including tessellation, found in OpenGL ES 3.1 plus Android Extension Pack (AEP). Like many current games, it stresses ALU performance to deliver advanced effects.

In the onscreen test, with the S8 set to its highest WQHD+ resolution, the two versions perform roughly the same. The E8895 S8’s small advantage is due to a slight difference in resolution: on the E8895 S8 the game ran at 2560x1440, keeping the nav buttons visible, while the S835 version defaulted to running the game full screen at 2678x1440. As expected, the S8 is faster than the S7, which uses a WQHD 2560x1440 display: it leads the S820 version by at least 15% and the E8890 version, with its Mali-T880MP12 GPU, by a more noticeable 57%. Stepping back one more generation to the Galaxy S6 shows that peak performance has more than doubled. It should be noted that all of the phones above the S8 in this chart benefit from using lower-resolution 1080p displays. When set to its FHD+ display mode, the S8 does outperform the OnePlus 3T by 1-2fps.
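The size of that resolution difference is easy to quantify. The Python sketch below compares the pixel counts of the two render resolutions reported above; treating frame cost as proportional to pixel count is a simplifying assumption, not something this test measures directly.

```python
# Onscreen render resolutions reported for the two S8 versions.
navbar_visible = 2560 * 1440   # E8895 S8, nav buttons shown
full_screen = 2678 * 1440      # S835 S8, game running full screen

# Extra pixels the S835 version has to shade each frame.
extra_pixels_pct = (full_screen / navbar_visible - 1) * 100
print(f"S835 version renders {extra_pixels_pct:.1f}% more pixels per frame")
```

A gap of under 5% in pixel count is consistent with the two versions landing close together in this chart.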

The S8 jumps to the top of the offscreen chart where resolution is no longer a factor. The E8895 version outperformed the S835 version in the GFXBench ALU 2 test, so it’s not too surprising to see the same hierarchy in this workload, although the E8895 S8’s margin of victory is narrower at only 8%. The S8 (E8895) is also faster than the Mate 9 and P10 by 55%, almost the same difference we saw in the GFXBench ALU 2 test. Again, at least some of this advantage comes from the S8’s newer GPU driver. With the S835 inside, the S8 is at least 16% faster than the S7 (S820) and the other S820 phones.

3DMark Sling Shot 3.1 Extreme Unlimited - Overall

3DMark Sling Shot 3.1 Extreme Unlimited - Graphics

3DMark Sling Shot Extreme uses either OpenGL ES 3.1 on Android or Metal on iOS and stresses the GPU and memory subsystems by rendering offscreen at 1440p (instead of 1080p like our other tests).

The Galaxy S8 delivers the highest peak graphics performance in this test. The E8895 version performs as well as the iPhone 7, while the S835 version does even better, topping the Exynos SoC by 17%. In GFXBench Car Chase, which also stresses ALU performance, the E8895’s 20-core Mali-G71 GPU outperformed the S835’s Adreno 540, but the order flips in this workload. In the first graphics subtest, which emphasizes geometry processing and uses simpler shaders, the S835 is 9% faster than E8895, while in the second graphics subtest, which uses more mathematically complex shaders and adds volumetric illumination, the S835 is 21% faster than E8895.

Compared to the previous generation, the S8 (E8895) is only 9% faster than the S7 (E8890) in the combined graphics test, which is a little disappointing considering the E8890 uses the older Mali-T880 GPU with only 12 cores. The gap between the S8 (E8895) and the S7 (S820) is not much different, but the S8 (E8895) is 2.6x faster than the Galaxy S6.

Basemark ES 3.1 / Metal

Basemark ES 3.1 / Metal Onscreen Test

Basemark ES 3.1 / Metal Offscreen Test

The demanding Basemark ES 3.1 game simulation uses either OpenGL ES 3.1 on Android or Metal on iOS. It includes a number of post-processing, particle, and lighting effects, but does not include tessellation like GFXBench 4.0 Car Chase.

The iPhones take the lead in the onscreen test, partially because of their lower-resolution displays and partially because they are using Apple’s Metal graphics API, which dramatically reduces driver overhead when issuing draw calls. The Mate 9 and P10 also pull ahead of the S8 when running this test onscreen purely because their displays top out at 1080p. The S8 does deliver the best onscreen performance among the QHD resolution phones, with the S835 version outpacing the Galaxy S7 (S820), LG G6, Pixel XL, and other S820 phones by at least 18%. The E8895 S8 performs particularly well, posting a result 56% higher than the S835 version.

Hardware comparisons are a little easier when rendering at a fixed resolution offscreen. The S8 (E8895) is the fastest Android phone in this test, and its wider Mali-G71 GPU configuration bests the Kirin 960’s high-frequency approach by 16%. This is about half the E8895’s theoretical advantage when looking solely at compute/texturing resources, suggesting the E8895 is bottlenecked elsewhere. The S8 (E8895) is also considerably faster than the S835 version in this test, with the gap growing to 50%. We’ve already seen the E8895 outperform the S835 in other shader-intensive workloads, and ARM’s Mali GPUs historically handle this test’s workloads well, so this is not a huge surprise.

In addition to the results shown above, Basemark ES 3.1 also measures the performance of specific graphical features. The E8895 outperforms the S835 in all of these subtests, but it does particularly well when performing SSAO (screen space ambient occlusion), a technique used for calculating soft shadows, where it’s 58% faster than the S8 (S835). The delta between the E8895 and S835 versions shrinks to only 16% in the post-processing test (depth of field, antialiasing, etc.) and 8% in the particle instancing test.

Overall the Galaxy S8 delivers excellent peak graphical performance. It offers a significant performance uplift over the Galaxy S5 and S6, although its gains over the S7 and last year’s crop of S820/S821 flagships are not as impressive. The performance delta between the E8895 and S835 versions of the S8 varies depending on workload, but the E8895 S8 is faster in most of our tests.


  • goatfajitas - Friday, July 28, 2017 - link

    /edit - buy what suits you
  • zodiacfml - Friday, July 28, 2017 - link

    It is not that big. The "taller" aspect ratio exaggerates the diagonal. To the article, the 10nm SoC now seems more valuable than benchmarks/reviews I've seen from other sites. Since the Pixel is going to be expensive, taller, no storage expansion and without a headphone jack, I have no ideal phone yet this year. The Mi Mix 2 or the LG V30 might.
  • philehidiot - Saturday, July 29, 2017 - link

    Just as a side point, I went from a HTC M9 to an S8. I tried and tested the S8 and S8+. Bear in mind I have small hands to the point where I also pack a pair of socks to compensate. If you're American or not quite so crude that means I prefer a 9mm to a .45. I found the elongated screen of the S8 to be just about tolerable and the advantages for multitasking do outweigh the occasional situation where I need to reach the far end of the screen and can't do it. I suspect most people with normal hands will find the S8 to be perfectly fine from a usability standpoint. For the S8+, I would strongly recommend you try a live model before you buy and perhaps consider waiting for the new Note if big screens are your bag.

    As for carrying something that big, you haven't heard the worst of it. It's well built - teardowns show this. Equally it's still made of glass for crying out loud. You NEED a case (and what's the point in making something so aesthetically amazing when you have to cover it??!!) and not a light one either. I have a leather fold out case which allows me to watch stuff on the phone at an angle and also takes some cards. Interestingly, it has two magnets right next to where the cards live. I got locked out of my hotel room due to this. Regardless, the necessary beef and size of case required to protect such a fragile device means the size is doubled. If HTC had continued down the metal line I'd have gone with them but it's all about bloody glass these days and I'm sick of it.
  • Tttimothy2355 - Friday, July 28, 2017 - link

    Apple stocks galaxy awesome
  • syxbit - Friday, July 28, 2017 - link

    >>"Our initial look at Snapdragon 835 revealed that its Kryo 280 performance cores are loosely based on ARM’s Cortex-A73 while the efficiency cores are loosely based on the Cortex-A53"

    Why would you write such a blatant lie? It's not LOOSELY based at all. It's >95% the same chip. QCOM have made minor tweaks just to be able to market it as their own design.
  • tipoo - Friday, July 28, 2017 - link

    Where would the A10 fall on the ratio/GHz chart I wonder?
  • name99 - Friday, July 28, 2017 - link

    We can guess.
    You can see the A9 results here:

    Eyeballing it, they are on average about 1.5x the current A73 results.
    A10 results are about 50% faster again, while running at the same frequency as the CPUs referenced in the article, so basically about twice the IPC of the current A73 crop of champions.

    One thing that stands out in comparing the SPEC results across all these devices is the massive jump in 175.vpr. A9 (which, like I said, is at around 1.5x for most results) has a value of 2017. This is about in line with what we see for Snapdragon 821. Then we get these massively (2x larger than I'd expect) scores for the other high-end ARM cores.

    My guess is that something changed in the compiler in the past year or so. (Since the article doesn't say whether gcc or llvm was used, I can't investigate further.) My guess is likewise that this wasn't something nefarious, some "cheat" to make SPEC results look better --- no-one cares about SPEC2000 on ARM64 anyway --- but rather some general improvement in the compiler (perhaps loop unrolling/data placement, but most likely autovectorization) that managed to MASSIVELY improve ARM64 performance on this particular piece of code.
    Presumably (if the change is in LLVM...) Apple picks up the same improvement, but sadly we never got to see the A10 SPEC results. Maybe A11?

    So summary
    - Apple's IPC seems to now be at around 2x ARM competitors for most purposes. (It's at around 1.25x Intel's; but to be fair Intel can clock higher; but to be fair Intel uses more juice)
    - something interesting happened to 175.vpr on ARM64 in the past year or so, and if anyone knows, they should speak up!
  • Nullify - Saturday, July 29, 2017 - link

    I was hoping for Anand to do a deep dive on the A10. Perhaps they're saving it for the A11? Should be the first ARM core in the world to break 4,000 single core on Geekbench, making it a full 2X faster than the 8895 or 835. It's truly amazing how much further ahead Apple is.
  • tuxRoller - Saturday, July 29, 2017 - link

    How big are those same cores, again?
    It's not like this is magic, and these companies know how to make very high IPC if you don't care about cost. Apple has built a massive core, and they pay the price in silicon.
    ARM, and most of their licensees, are optimizing for silicon area efficiency, not absolute performance.
  • Meteor2 - Saturday, July 29, 2017 - link

    I think that, as alluded to above, Android and iOS are diverging so much that there's little point comparing Apple IPC to ARM or whoever, you may as well compare it to Power or SPARC.
