Cezanne is monolithic
Cezanne is designed so that all units of the CPU are located on a single die. This applies in particular to the memory controller. In contrast, the chiplet design of Matisse and Vermeer implements a clear separation of the functional units: the memory controller is located on a separate I/O die, and the dies exchange data via the Infinity Fabric. Knowledgeable users know that this has a negative impact on access latencies. Can the monolithic design help to at least partially compensate for this?
A look at a die shot of Cezanne tells us that the memory controller is connected to the cache slices via the Infinity Fabric (IF). Since Hot Chips 2021 it has been known that Zen 3 uses a ring bus to connect the cache slices. The transition from the ring bus to the IF costs additional cycles, so it is to be expected that the memory latencies do not turn out as good as with an Intel CPU.
Memory latencies in comparison
We will now take a look at the memory latencies with the AIDA64 memory benchmark. A few remarks in advance: the tool has recently faced a certain amount of criticism from well-known reviewers regarding its consistency and methodology. In essence, the criticism is that the values are too low, which could be because the test depth is set too low. Comparisons with other tools indeed suggest that the criticism is justified. We will therefore use other tools in the future. For now, relative statements should still be possible based on the AIDA64 results. In absolute terms, we publish the figures with reservations.
To better assess how well Cezanne's design is optimized, a fixed clock speed of 4GHz is applied to both the 5700G and the 3950X. This gives us comparable conditions, because the clock speed affects the measured RAM latency, most likely because AIDA64's test depth does not go far enough. The Zen 2 processor is configured in the BIOS so that only one CCD with 8 cores remains active. The reason why the Ryzen 5000 APU competes against a Zen 2 CPU is obvious: both CPUs offer a single thread at most 16MB of L3 cache without having to perform expensive remote accesses to the L3 cache of the other CCX. Of course, the remote-access caveat only applies to the 3950X. The 5700G has a 16MB composite of cache slices whose interconnect topology is implemented as a ring bus.
The results are quite surprising and open the first line of criticism of Cezanne's design. With the same clock speed, comparable cache size and the same memory kit with identical settings, the RAM latencies are very similar to those of the previous generation, which is an MCM design. This raises the question of how this is possible. Were the memory interface and the memory controller designed with certain compromises for the sake of energy efficiency? Whatever the reason, the negative impact on memory latencies is obvious.
With the memory overclocked to a 2166MHz memory clock and a synchronous IF, on the other hand, the latencies improve drastically: the value drops to about 50ns in total. The i9-10900K, however, achieves a latency of almost 36ns at a 2200MHz memory clock. This clearly shows the advantage of a memory controller that is directly connected via a ring bus.
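The relationship between transfer rate, memory clock and timings can be sketched in a few lines of Python. DDR memory transfers data twice per clock, so 4333MT/s corresponds to the roughly 2166MHz memory clock mentioned above, and with a synchronous IF the fabric runs at that same clock. The function names are our own, and the CAS figure only covers the CL timing of the kits listed under Test system, not the total latency that AIDA64 reports.

```python
def memory_clock_mhz(transfer_rate_mts: float) -> float:
    """DDR transfers data twice per clock, so the clock is half the transfer rate."""
    return transfer_rate_mts / 2


def cas_latency_ns(cas_cycles: int, transfer_rate_mts: float) -> float:
    """Duration of the CAS latency (CL) alone, in nanoseconds."""
    return cas_cycles / memory_clock_mhz(transfer_rate_mts) * 1000


# (CL, transfer rate in MT/s) of the kits listed under "Test system"
kits = {
    "3200MT/s CL16": (16, 3200),
    "4333MT/s CL16": (16, 4333),
    "3800MT/s CL14": (14, 3800),
}

for name, (cl, rate) in kits.items():
    clock = memory_clock_mhz(rate)
    print(f"{name}: memory clock {clock:.1f} MHz, CAS {cas_latency_ns(cl, rate):.2f} ns")
```

The CAS time is only one part of a full memory access, which is why the measured total latency is several times larger than these values.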
Methodology
Frametimes were captured using CapFrameX version v1.6.4.
Configuration of CX
- Overlay refresh rate 1000ms
- Auto-disable OSD active
- Run History and Aggregation active
- Outlier tolerance 3%
- 3 valid runs with a time of 20 seconds = 1 valid session
- Sensor logging active with a refresh rate of 500ms
- Default configuration of sensor logging for standard benchmarks
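How the outlier tolerance and the number of valid runs interact can be illustrated with a minimal sketch. This is not CapFrameX's actual implementation: we assume here that a run counts as an outlier when its average FPS deviates by more than the tolerance from the median of all runs, and the function names and FPS values are purely hypothetical.

```python
from statistics import median


def valid_runs(run_avg_fps, tolerance_pct=3.0):
    """Keep runs whose average FPS lies within tolerance_pct of the median run.

    Assumption: deviation is measured against the median of all recorded runs.
    """
    mid = median(run_avg_fps)
    return [fps for fps in run_avg_fps if abs(fps - mid) / mid * 100 <= tolerance_pct]


def session_complete(run_avg_fps, required_runs=3, tolerance_pct=3.0):
    """A session is valid once enough runs survive the outlier check."""
    return len(valid_runs(run_avg_fps, tolerance_pct)) >= required_runs


# Hypothetical average FPS of four 20-second runs; the third one is an outlier.
runs = [101.0, 99.5, 92.0, 100.2]
print(valid_runs(runs))
print(session_complete(runs))
```

In this sketch the outlier run is simply discarded and a further run is needed to complete the session, which mirrors the idea of "3 valid runs = 1 valid session".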
Game settings
- 720p resolution
- Reduction of render scaling if required
- Reduce AA/AF/AO to a minimum
- Disable Post Processing
- Maximize all other settings to maximize drawcalls.
Overall, demanding custom scenes were used to maximize stress on CPUs and memory.
Metrics
- Average FPS
- 1% percentile, which is insensitive to reproducible and, in particular, random outliers.
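Both metrics can be derived from a list of frametimes. The following is a minimal sketch under our own assumptions: the percentile uses a simple nearest-rank method, which may differ from the interpolation CapFrameX applies, and the frametimes are made-up values.

```python
import math


def average_fps(frametimes_ms):
    """Number of frames divided by elapsed time (not the mean of per-frame FPS)."""
    return 1000 * len(frametimes_ms) / sum(frametimes_ms)


def percentile_fps(frametimes_ms, p=1.0):
    """FPS value at the p-th percentile of all frames (nearest-rank method)."""
    fps_sorted = sorted(1000 / ft for ft in frametimes_ms)
    rank = max(1, math.ceil(len(fps_sorted) * p / 100))
    return fps_sorted[rank - 1]


# Hypothetical capture: 98 frames at 10 ms plus two slow frames.
frametimes = [10.0] * 98 + [20.0, 40.0]
print(f"Average FPS: {average_fps(frametimes):.2f}")
print(f"1% percentile FPS: {percentile_fps(frametimes):.2f}")
```

The percentile reads a single value off the sorted distribution, so the result does not depend on how extreme the frames below that point are.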
Test system
An open test bench without further active cooling was used for the gaming benchmarks and the cooler tests. The board is mounted horizontally on the test bench. The ambient temperature during the cooler test was 24°C with fluctuations in the range of ±0.5°C. A be quiet! Dark Power Pro 650 watt power supply provided the components with power. The CPUs were cooled with the Noctua NH-D15 chromax.black. All Ryzen CPUs ran on a Gigabyte X570 Aorus Master. Graphics output was handled by a PowerColor Red Devil Limited Edition AMD Radeon RX 6800 XT, which was overclocked to 2500/2100MHz.
5700G (4GHz)
CPU: AMD Ryzen 7 5700G@4GHz
RAM: G.Skill 4x8GB 3200MT/s CL16-16-16-36 1T Gear Down Mode enabled
3950X (4GHz)
CPU: AMD Ryzen 9 3950X@4GHz 1 CCD 8 Cores
RAM: G.Skill 4x8GB 3200MT/s CL16-16-16-36 1T Gear Down Mode enabled
5700G OC
CPU: AMD Ryzen 7 5700G 100MHz Offset PBO
RAM: G.Skill 2x16 4333MT/s CL16-17-16-34 1T Gear Down Mode enabled
5700G RAM OC Timings
5900X OC
CPU: AMD Ryzen 9 5900X 100MHz Offset PBO
RAM: G.Skill 2x16 3800MT/s CL14-15-14-28 1T Gear Down Mode enabled
Game benchmarks
The benchmark suite used is significantly smaller than the basic suite normally used for launch reviews. The selected games represent a subset that puts heavy stress on the CPUs. In some cases, the achieved frame rates are significantly below what modern monitors can display. The scenes and settings can be viewed on YouTube, so everyone who is interested can reproduce the benchmarks.
- Crysis Remastered
- Cyberpunk 2077
- Hitman 3
- Metro Exodus Enhanced Edition
- Shadow of the Tomb Raider
- Star Wars: Jedi Fallen Order
IPC Comparison at 4GHz
Averaged over the percentage deviations of all games, the R7 5700G leads by 4.22%. The APU is not faster in all games, however: Hitman 3 and Star Wars: Jedi Fallen Order go to the Zen 2 CPU. The lead in Metro Exodus and Crysis Remastered is interesting. The Zen 3 cores with their higher IPC have a significant advantage here.
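For readers who want to reproduce such an average from their own measurements, here is a minimal sketch. We assume the average lead is the arithmetic mean of the per-game percentage differences. The game names and FPS values below are placeholders, not our measured results.

```python
def percent_lead(fps_a, fps_b):
    """Percentage by which result A is ahead of result B (negative if behind)."""
    return (fps_a / fps_b - 1) * 100


def mean_lead(results):
    """Arithmetic mean of the per-game percentage leads."""
    leads = [percent_lead(a, b) for a, b in results.values()]
    return sum(leads) / len(leads)


# Placeholder values: (CPU A average FPS, CPU B average FPS) per game.
placeholder_results = {
    "Game A": (110.0, 100.0),
    "Game B": (95.0, 100.0),
    "Game C": (104.0, 100.0),
}

print(f"Average lead of CPU A: {mean_lead(placeholder_results):.2f}%")
```

Averaging percentages per game weights every title equally, regardless of its absolute frame rate.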
5700G 4GHz vs. stock settings
At stock settings, the R7 5700G boosts to 4.5-4.55GHz in games. That is a clock speed more than 10% higher than the fixed 4GHz. On average, this only leads to 3.22% more gaming performance in terms of average FPS. According to the sensor data from CapFrameX, the package power always stays well below the maximum of 88 watts. The exact reason for the poor scaling is unclear.
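The poor scaling can be put into numbers with a small calculation that relates the performance gain to the clock gain. The function name is our own. The inputs are the 4GHz baseline, the observed boost range and the 3.22% average gain from this section.

```python
def clock_scaling(base_ghz, boost_ghz, perf_gain_pct):
    """Return (clock gain in percent, share of that gain that shows up as performance)."""
    clock_gain_pct = (boost_ghz / base_ghz - 1) * 100
    return clock_gain_pct, perf_gain_pct / clock_gain_pct


for boost in (4.5, 4.55):
    clock_gain, efficiency = clock_scaling(4.0, boost, 3.22)
    print(f"{boost} GHz: +{clock_gain:.2f}% clock, scaling factor {efficiency:.2f}")
```

A scaling factor of 1.0 would mean that performance grows in proportion to the clock speed. Values around a quarter indicate that something other than the core clock is limiting performance.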
5700G vs. 5700G OC
The performance increase is remarkable in some cases. Shadow of the Tomb Raider runs 28.6% faster when using faster memory with tighter timings. On average, overclocking the memory and CPU enables around 20% more performance. This corresponds to 1-2 generational leaps.
It should be explicitly noted that this comparison was made with different RAM kits. This is intentional, however: the aim is to show how much additional performance an expensive RAM kit can deliver compared to a "normal" one. The fact that 4x8GB competes against 2x16GB has no measurable effect on the results outside the margin of error, because the 2x16GB kit is dual-ranked. Due to 4-way interleaving, there is no advantage or disadvantage here. A difference in performance only becomes measurable when the kits run at different clock speeds and with different timings.
5900X OC vs. 5700G OC
Although the 5700G has much more potential in terms of RAM OC and its L3 cache is much larger than that of its predecessor (Renoir), the Cezanne APU doesn't stand a chance against the R9 5900X, whose memory overclocking tops out at a transfer rate of 3800MT/s. To be more precise, it is the IF of the R9 that reaches its clock speed limit here, because the memory itself can clock significantly higher, as the results of the 5700G show. Thus, the memory and the motherboard can be ruled out as limiting factors.
All in all, the R9 outperforms the APU by 13.8% when both CPUs are heavily overclocked. Hitman 3 in particular stands out here with a lead of over 25%. In contrast, the difference in Metro Exodus is only 4.7%. The difference does not come from the clock speed, because the 5900X boosted up to 4.7GHz on average in the tested gaming workloads.
How good is the boxed cooler?
The 5700G comes with a boxed cooler. For this test, the cooler was mounted with the factory-applied thermal paste. The CPU was put under very high load for 10 minutes using Cinebench R23. Since no sound level meter was available for the test, no quantitative statements can be made about the noise level. The fan produced high-frequency noise during the load test, but it was not subjectively unpleasant at any time.
Unfortunately, the temperatures are rather poor. The CPU (Tctl/Tdie) settles relatively quickly at 95°C. The Noctua NH-D15, on the other hand, performs almost 30°C better at about 68°C. Since it leaves no reserve for higher ambient temperatures, the boxed cooler cannot be recommended.
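To compare the two coolers independently of the room temperature, the readings can be expressed as a delta over the 24°C ambient temperature of our test. The sketch below also projects the CPU temperature for a warmer room under the simplifying assumption that the delta stays constant. That assumption is ours and ignores fan curves and throttling.

```python
AMBIENT_C = 24.0  # ambient temperature during the cooler test


def delta_over_ambient(cpu_temp_c, ambient_c=AMBIENT_C):
    """Temperature rise of the CPU above the surrounding air."""
    return cpu_temp_c - ambient_c


def projected_temp(cpu_temp_c, new_ambient_c, ambient_c=AMBIENT_C):
    """Naive projection: assumes the delta over ambient does not change."""
    return delta_over_ambient(cpu_temp_c, ambient_c) + new_ambient_c


for name, temp in (("Boxed cooler", 95.0), ("Noctua NH-D15", 68.0)):
    print(f"{name}: {delta_over_ambient(temp):.0f} K over ambient, "
          f"naive projection at 30°C ambient: {projected_temp(temp, 30.0):.0f}°C")
```

For the boxed cooler the projected value is only theoretical. A CPU that already sits at its settling temperature would be expected to lower its clocks rather than run hotter.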
Conclusion with disillusionment
So for whom is Cezanne disappointing? Basically for everyone who expects the same from the 5700G as from one of the newer Intel CPUs with an integrated graphics unit: the advantages of a monolithic design and high gaming performance. Despite the fast Zen 3 cores, Cezanne can hardly set itself apart from a Zen 2 CPU with 8 cores. Even stock settings are not particularly helpful here, because performance scales only moderately with clock speed. The optimization potential is considerable, but an overclocked 5900X is still significantly faster, by almost 14%, and we would expect the same from an R7 5800X.
It does not help the attractiveness of the product that the 5700G does not support PCIe Gen 4 and that AV1 encoding also falls by the wayside. The (speculative) question is why AMD brings a design with compromises on the interface and the memory controller to the desktop. How costly would a corresponding optimization have been?
If you can do without the included boxed cooler and the features just mentioned, and if you are instead looking for an efficient CPU with high application performance and solid graphics performance (iGPU), the 5700G is a very good choice.
Viewed from this specific perspective, the R7 5700G can only be partly recommended for ambitious gamers despite its high optimization potential. It would be desirable for AMD to offer stronger desktop models in the future that take better advantage of a monolithic design.