CPU Compute

One side I like to exploit on CPUs is their ability to compute, and whether a variety of mathematical loads can stress the system in a way that real-world usage might not.  For these benchmarks we are using ones developed for testing MP servers and workstation systems back in early 2013, such as grid solvers and Brownian motion code.  Please head over to the first of those reviews, where the mathematics and small snippets of code are available.

3D Movement Algorithm Test

The algorithms in 3DPM employ uniform random number generation or normal distribution random number generation, and differ in how heavily they rely on trigonometric operations, conditional statements, generation and rejection, fused operations, etc.  The benchmark runs through six algorithms for a specified number of particles and steps, calculates the speed of each algorithm, then sums them all for a final score.  This is an example of a real world situation that a computational scientist may find themselves in, rather than a pure synthetic benchmark.  The benchmark is also parallel across the particles simulated, and we test single-threaded as well as multi-threaded performance.  Results are expressed in millions of particles moved per second, and a higher number is better.
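To give a feel for this kind of workload, here is a minimal sketch of one possible movement algorithm: uniform random numbers plus trigonometric operations to pick a random unit step in 3D.  It is not the 3DPM source; the function names (random_unit_step, move_particles, score) and the scoring formula are assumptions made for illustration, and it runs single threaded.

```python
import math
import random
import time


def random_unit_step(rng):
    """Return a random unit vector, uniform over the sphere (trigonometric method)."""
    z = rng.uniform(-1.0, 1.0)              # cos(theta), uniform on [-1, 1]
    phi = rng.uniform(0.0, 2.0 * math.pi)   # azimuthal angle
    r = math.sqrt(1.0 - z * z)              # radius of the circle at height z
    return r * math.cos(phi), r * math.sin(phi), z


def move_particles(n_particles, n_steps, seed=0):
    """Move each particle from the origin by n_steps random unit steps."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            dx, dy, dz = random_unit_step(rng)
            x += dx
            y += dy
            z += dz
        positions.append((x, y, z))
    return positions


def score(n_particles, n_steps):
    """Millions of particle movements per second (assumed scoring formula)."""
    start = time.perf_counter()
    move_particles(n_particles, n_steps)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero timer delta
    return n_particles * n_steps / elapsed / 1e6


if __name__ == "__main__":
    print(f"{score(1000, 100):.2f} million particle movements per second")
```

Each particle is independent of the others, which is why a workload like this splits cleanly across threads.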

3D Particle Movement: Single Threaded

3D Particle Movement: Multi-Threaded

N-Body Simulation

When a series of heavy mass elements are in space, they interact with each other through the force of gravity.  Thus when a star cluster forms, the interaction of every large mass with every other large mass defines the speed at which these elements approach each other.  When dealing with millions and billions of stars on such a large scale, the movement of each of these stars can be simulated through the physical laws that describe the interactions.  The benchmark detects whether the processor is SSE2 or SSE4 capable, and runs the relevant code path.  We run a simulation of 10240 particles of equal mass.  The output for this code is in GFLOPs, and the result recorded is the peak GFLOPs value.
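The all-pairs interaction described above can be sketched as a direct O(N^2) sum.  This is an illustrative sketch only, not the benchmark's SSE code: the softening term, the unit gravitational constant, the semi-implicit Euler update and the function names are all assumptions, and no GFLOPs figure is computed.

```python
import math


def accelerations(pos, mass=1.0, g=1.0, softening=1e-3):
    """Gravitational acceleration on every body from every other body (equal masses)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    eps2 = softening * softening  # assumed softening, avoids division by zero
    for i in range(n):
        xi, yi, zi = pos[i]
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            acc[i][0] += g * mass * dx * inv_r3
            acc[i][1] += g * mass * dy * inv_r3
            acc[i][2] += g * mass * dz * inv_r3
    return acc


def step(pos, vel, dt):
    """Advance positions and velocities in place by one semi-implicit Euler step."""
    acc = accelerations(pos)
    for i in range(len(pos)):
        for k in range(3):
            vel[i][k] += acc[i][k] * dt
            pos[i][k] += vel[i][k] * dt


if __name__ == "__main__":
    pos = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    vel = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    step(pos, vel, 0.01)
    print(pos)
```

The inner loop runs N*(N-1) times per step, so the work per step grows quadratically with the particle count.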

N-Body Simulation

Grid Solvers - Explicit Finite Difference

For any grid of regular nodes, the simplest way to calculate a node's value at the next time step is to use the current values of the nodes around it.  This makes for easy mathematics and parallel simulation, as each node calculated is only dependent on the previous time step, not the nodes around it on the current calculated time step.  By choosing a regular grid, we avoid the extra levels of memory access that irregular grids require.  We test both 2D and 3D explicit finite difference simulations with 2^n nodes in each dimension, using OpenMP for threading, in single precision.  The grid is isotropic and the boundary conditions are sinks.  We iterate through a series of grid sizes, and results are shown in terms of ‘million nodes per second’ where the peak value is given in the results; higher is better.
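A minimal sketch of one explicit time step in 2D follows.  It assumes a diffusion-type equation with a five-point stencil, which the text does not state; the function name explicit_step and the parameter alpha (diffusion coefficient times time step divided by grid spacing squared) are hypothetical.  For this particular scheme the step is only stable when alpha is at most 0.25.

```python
def explicit_step(u, alpha):
    """One explicit finite difference step on a square 2D grid.

    u is a list of rows; boundary nodes are held at zero (sinks).
    alpha = D * dt / dx**2 is an assumed parameter; keep alpha <= 0.25 in 2D.
    """
    size = len(u)
    new = [[0.0] * size for _ in range(size)]
    for i in range(1, size - 1):
        for j in range(1, size - 1):
            # Every new value depends only on values from the previous time step.
            new[i][j] = u[i][j] + alpha * (
                u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1] - 4.0 * u[i][j]
            )
    return new


if __name__ == "__main__":
    n = 3
    size = 2 ** n  # 2^n nodes in each dimension
    grid = [[0.0] * size for _ in range(size)]
    grid[size // 2][size // 2] = 1.0  # point source in the interior
    grid = explicit_step(grid, 0.2)
    print(grid[size // 2][size // 2])
```

Because no node reads a value from the step currently being computed, the two loops can be split across threads without any ordering constraints.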

Explicit Finite Difference Solver (2D)

Explicit Finite Difference Solver (3D)

Grid Solvers - Implicit Finite Difference + Alternating Direction Implicit Method

The implicit method takes a different approach to the explicit method: instead of considering one unknown in the new time step to be calculated from known elements in the previous time step, we consider that an old point can influence several new points by way of simultaneous equations.  This adds to the complexity of the simulation, as the grid of nodes is solved as a series of rows and columns rather than points, reducing the parallel nature of the simulation by a dimension and drastically increasing the memory requirements of each thread.  The upside is the less stringent stability rules related to time steps and grid spacing.  For this we simulate a 2D grid of 2^n nodes in each dimension, using OpenMP in single precision.  Again our grid is isotropic with the boundaries acting as sinks.  We iterate through a series of grid sizes, and results are shown in terms of ‘million nodes per second’ where the peak value is given in the results; higher is better.
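The row-then-column solve can be sketched with a tridiagonal (Thomas) solver and a Peaceman-Rachford style alternating direction step.  This is an assumed formulation for a diffusion-type equation with zero boundaries, not the benchmark's own code; thomas, half_sweep, transpose, adi_step and the parameter r (diffusion coefficient times time step divided by twice the grid spacing squared) are hypothetical names.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system; a, b, c are the sub-, main and super-diagonals."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for k in range(1, n):          # forward elimination
        m = b[k] - a[k] * cp[k - 1]
        cp[k] = c[k] / m
        dp[k] = (d[k] - a[k] * dp[k - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for k in range(n - 2, -1, -1):  # back substitution
        x[k] = dp[k] - cp[k] * x[k + 1]
    return x


def half_sweep(u, r):
    """Half step: implicit along each row, explicit across rows; boundaries stay zero."""
    size = len(u)
    m = size - 2  # interior nodes per row
    a = [-r] * m
    b = [1.0 + 2.0 * r] * m
    c = [-r] * m
    new = [[0.0] * size for _ in range(size)]
    for i in range(1, size - 1):
        rhs = [
            u[i][j] + r * (u[i - 1][j] - 2.0 * u[i][j] + u[i + 1][j])
            for j in range(1, size - 1)
        ]
        new[i][1:size - 1] = thomas(a, b, c, rhs)  # one whole row solved at once
    return new


def transpose(u):
    return [list(row) for row in zip(*u)]


def adi_step(u, r):
    """One full step: a row sweep, then the same sweep applied to the columns."""
    half = half_sweep(u, r)
    return transpose(half_sweep(transpose(half), r))


if __name__ == "__main__":
    size = 2 ** 3
    grid = [[0.0] * size for _ in range(size)]
    grid[3][3] = 1.0
    grid = adi_step(grid, 0.5)
    print(grid[3][3])
```

Each row (or column) is one simultaneous system, so the unit of parallel work is a whole line of nodes rather than a single node, and each thread needs working storage for that line.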

Implicit Finite Difference Solver (2D)

48 Comments

  • mfenn - Tuesday, November 19, 2013

    Expanding upon this point: the data in Anandtech articles is always top notch, but it is becoming more and more obvious that there are two tiers of reviewers when it comes to delivering insight. Anand, Brian, Ryan, and Jarred write good conclusions based on their data and don't care about any blowback from the manufacturers. Dustin and Ian seem beholden to the manufacturers' PR departments and just parrot whatever talking points they're given. It's really disappointing.
  • Gen-An - Tuesday, November 19, 2013

    And who exactly do you think is going to be interested in a kit like this, other than overclockers? The review fit the product and the target audience. It's not for general users, never will be, and doesn't need to be reviewed as if it were.
  • Gen-An - Monday, November 18, 2013

    Patriot has changed the ICs on this kit without changing the SKU. I have two of the 2x4GB kits that only have 8 chips on a single side of the PCB and none on the other, and use Hynix H5TQ4G83MFR 4Gbit ICs (the same ones that are on those DDR3-3000+ kits) and clock accordingly. One kit I bought but took back was like these in the review, double-sided sticks with 16 chips per stick (8 per side) and using a relatively new IC, Hynix H5TQ2G83DFR, which can't clock as high as H5TQ2G83CFR unfortunately.
  • IanCutress - Monday, November 18, 2013

    With the demand for MFR seemingly strong, and companies other than Patriot going after 2400MHz and up, I guess going to CFR was more a financial choice.

    Companies seem rather reluctant to tell me which ICs they use, and popping a heatspreader off is no mean feat nowadays, with accidents happening regularly: http://forum.hwbot.org/showpost.php?p=207472&p...

    That's compounded by the fact that sometimes the IC # is removed and replaced with the company name over and over. Any suggestions?
  • Gen-An - Tuesday, November 19, 2013

    CFR would have been preferable to DFR, which, as your review and this one show, doesn't like going higher than 2600: http://www.techstation.it/articoli/patriot-viper-i...
  • chekk42 - Monday, November 18, 2013

    Ian, whatever happened to low latencies? I'm currently running a 1600MHz CL7 kit which I bought 2+ years ago, but I only ever see CL9 (and up) kits in reviews or for sale these days.
  • joos2000 - Monday, November 18, 2013

    Lower latencies don't yield the same performance returns as upping the clock frequencies, that's why.
    http://www.anandtech.com/show/7364/memory-scaling-...
  • ShieTar - Tuesday, November 19, 2013

    The reason is that defining latency as a multiple of clocks is rather silly with a large range of clock speeds available concurrently. What your CL7 means is that you have a latency of 4.38 ns (7/1600MHz). The fastest latencies in other clockings available are:

    1066 CL7 => 6.56 ns
    1333 CL7 => 5.25 ns
    1600 CL6 => 3.75 ns (But only on 2GB kits)
    1866 CL8 => 4.29 ns
    2133 CL9 => 4.22 ns
    2400 CL9 => 3.75 ns
    2666 CL10 => 3.75 ns
    2800 CL11 => 3.93 ns
    3000 CL11 => 3.67 ns

    So as a matter of fact, all kits tested in this review, except for the ADATA ones, have shorter latencies than your own set.
  • Gigaplex - Tuesday, November 19, 2013

    You missed the part where they asked for low latency 1600 and you quoted a 1600 at CL6 without saying where it's from. Like they said, most 1600 kits come at around CL9 which is around 5.63ns. This matters somewhat when Intel CPUs such as the i7 4770K are rated at 1600, any higher and you're running out of spec.
  • ShieTar - Tuesday, November 19, 2013

    Not sure how I "missed" that, it doesn't say anything about a 1600 kit at CL9 in the question :
    " I only ever see CL9 (and up) kits in reviews "
    Well, most kits in reviews and announced sales are probably not 1600 at this point in time. In the review above you see 5 kits at 2400+, with only a single 1600 kit thrown in for completeness. So I assumed that the original poster was expecting DDR3 2400 to also come with CL7. Sorry if that assumption was incorrect.

    The quoted CL6 kits are "OCZ Reaper HPC Edition" (OCZ3RPR1600C6LV4GK) and "Super Talent Chrome Series" (WB160UX6G6). I think both are actually discontinued, because you can buy a 2400 CL9 set and just run it at 1600 CL6. As shown above, you could even buy a 2400 CL10 set and get a little lucky and still run it at 1600 CL6 (same latency as the tested 1866 CL7)

    So sure, DDR3 1600 kits are rarely sold with very low latencies today, that's because low-latency kits are validated and sold at higher frequencies. This does not matter very much, since all kits come with a JEDEC setting to run 1600 initially, and everybody who knows he needs better latencies can lower them by hand to the actual achievable latency. Kits sold as 1600 are really mainly for people looking for cheap memory. Which is fine, as most reviews show little to no relevant gain from faster memory for most tasks anyways.
