Impact of memory settings on server performance using the example of Transcend DDR4-2400 ECC modules (TS1GHR72V4H)
There is an opinion that in a server all BIOS settings related to RAM should be left on 'Auto', and that modules should be installed in pairs, of the same capacity, from the same manufacturer, and even from the same batch.
We received two Transcend DDR4-2400 ECC RDIMMs for testing, and today we will check these old myths and see what memory actually affects for two types of virtual machines: VDI (virtual desktop) and SQL. Our test bench has the following configuration:
- Intel Xeon E5-2603 V4
- Motherboard based on the Intel C612 chipset
- VMware ESXi 6.7 Hypervisor
- FreeNAS 11.2 NFS DataStorage
- Debian 9 Stretch (Sysbench OLTP)
- MariaDB 10.1.26
- Windows 7 x64 (AIDA64)
To maximize RAM usage in the OLTP test, which simulates a database load, we used the following database server settings. During testing the virtual machine consumed 1.5 GB of RAM on average, so the results do not depend on the amount of memory.
query_cache_type = on
query_cache_limit = 2M
query_cache_size = 32M
query_cache_min_res_unit = 8
join_buffer_size = 1M
read_rnd_buffer_size = 1M
max_heap_table_size = 32M
tmp_table_size = 32M
thread_cache_size = 32
innodb_sort_buffer_size = 2M
max_allowed_packet = 16M
innodb_log_file_size = 128M
expire_logs_days = 10
max_binlog_size = 100M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
transaction-isolation = READ-COMMITTED
default-storage-engine = innodb
innodb_buffer_pool_size = 4G
innodb_file_per_table = 1
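To compare these settings with the 1.5 GB the virtual machine actually used, it helps to convert the size suffixes to bytes. The following is a minimal Python sketch, not part of the original test setup; the helper names `parse_size` and `parse_settings` are hypothetical, and only a subset of the settings is included for illustration.

```python
import re

# A subset of the settings listed in the article, for illustration only.
MY_CNF = """
query_cache_size = 32M
join_buffer_size = 1M
tmp_table_size = 32M
innodb_log_file_size = 128M
innodb_buffer_pool_size = 4G
"""

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value: str) -> int:
    """Convert a MySQL-style size such as '32M' or '4G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"not a size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)


def parse_settings(text: str) -> dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


settings = parse_settings(MY_CNF)
pool_gib = parse_size(settings["innodb_buffer_pool_size"]) / 1024**3
print(f"InnoDB buffer pool limit: {pool_gib:.1f} GiB")
```

The buffer pool limit is well above the observed 1.5 GB, which is consistent with the statement that the test was not constrained by the amount of memory.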
The virtual machine with the test was stored on a volume created by FreeNAS, which was installed on the same test bench and connected via NFS. This way we isolated ourselves from the disk subsystem as much as possible, hoping that the ARC cache of this storage operating system would absorb the database's requests to storage and, again, put the load on memory. Each OLTP test was run three times with the following parameters:
sysbench --test=oltp --oltp-table-size=100000 --mysql-db=test --mysql-user=root --mysql-password=12345 --max-time=300 --oltp-read-only=off --max-requests=0 --num-threads=16 --percentile=99 run
The results of the third test were recorded as final results.
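The three-run procedure can be scripted. Below is a rough Python sketch, assuming `sysbench` is on the PATH and accepts the legacy-style flags quoted in the article; the function names `build_sysbench_cmd` and `run_series` are hypothetical and were not part of the original test.

```python
import subprocess


def build_sysbench_cmd(threads: int = 16, seconds: int = 300) -> list[str]:
    """Assemble the OLTP command line using the flags given in the article."""
    return [
        "sysbench",
        "--test=oltp",
        "--oltp-table-size=100000",
        "--mysql-db=test",
        "--mysql-user=root",
        "--mysql-password=12345",
        f"--max-time={seconds}",
        "--oltp-read-only=off",
        "--max-requests=0",
        f"--num-threads={threads}",
        "--percentile=99",
        "run",
    ]


def run_series(runs: int = 3) -> str:
    """Run the test `runs` times and return the output of the last run only."""
    output = ""
    for _ in range(runs):
        result = subprocess.run(
            build_sysbench_cmd(), capture_output=True, text=True, check=True
        )
        output = result.stdout  # earlier runs are discarded, as in the article
    return output


# Printing the command is safe anywhere; run_series() needs a real test bench.
print(" ".join(build_sysbench_cmd()))
```

Calling `run_series()` on the database host would return the report of the third run, which is the one the article records.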
Transcend TS1GHR72V4H Memory Modules
In registered memory modules (RDIMMs), data travels directly between the memory controller built into the processor and the DRAM chips on the module; only the address and command signals are buffered in an additional chip on the module, called a register. This arrangement makes it possible to build 4-rank memory modules for servers with 12 or more DIMM slots. Simply put, the register lets you install more memory per server.
The TS1GHR72V4H memory module is a 2-rank DDR4-2400 DIMM. Its capacity is provided by 16 Samsung chips marked SEC 725 K4A4G085WE-BCRC. These chips belong to the E-die generation, which is itself designed for frequencies above 3000 MHz, so on DDR4-2400 modules they can be said to be "resting".
The CAS latency is 17 clock cycles and the operating temperature range is 0 to 85 degrees Celsius, although according to our tests, as will be shown later, the memory does not heat up to even half of the maximum allowed value.
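For reference, a CAS latency given in clock cycles can be converted to nanoseconds. The Python sketch below uses the standard DDR relationship (two transfers per clock); the function name `cas_latency_ns` is hypothetical, and the 2400 MT/s figure is the module's rated speed, not necessarily the speed it runs at in a particular server.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: int) -> float:
    """Absolute CAS latency in nanoseconds.

    DDR transfers data twice per clock, so the clock in MHz is half the
    data rate in MT/s and one cycle lasts 2000 / data_rate nanoseconds.
    """
    return cl_cycles * 2000 / data_rate_mts


print(f"CL17 at DDR4-2400: {cas_latency_ns(17, 2400):.2f} ns")
```

At the rated speed this works out to roughly 14 ns.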
Interestingly, our two test modules used different register chips: an Inphi iDDR4RCD2-GS01 on one module and an IDT 4RCD0124KCO on the other. For the first, the manufacturer specifies a permissible temperature of as much as 125 degrees Celsius; for the second there is no such information, but having studied the documentation for both chips, we can say that they are essentially identical. In both cases, Microchip AT30TSE004A temperature sensors were installed on the modules.
As for the design of the memory modules, there is perhaps nothing more to say: the DRAM chips have a huge frequency margin, and the register has a huge thermal margin. With server components, it is important to know that the hardware operates far from its limits.
So, the first thing we will look at is what disabling ECC in the BIOS does. Technically, error correction is performed in hardware by the memory controller, so I don't expect any difference in speed.
The synthetic test and the real workload showed diametrically opposite results. In AIDA64 the latency increased slightly, while in the database test, on the contrary, it decreased.
The next step is to enable ADR (Asynchronous DRAM Refresh), a hardware feature designed for non-volatile NVDIMM memory. When enabled, the memory controller triggers a hardware interrupt and automatically refreshes the memory pages contained in the DRAM modules. This keeps the information stored in memory intact in the event of a power outage. It should not matter for regular RDIMMs.
I would even say that this function slows down the memory subsystem, so you should not enable it.
RAS Mirror (memory mirroring) is a function designed to create redundancy at the level of memory banks. Here you can draw an analogy with RAID 1 for hard drives: the controller divides the total amount of RAM into 2 channels, one of which duplicates the other. If data corruption occurs in one of the channels, the controller restores the data from the second channel. Unlike full mirroring, RAS Mirroring requires support from the operating system, which must be told at boot time how much memory will work in fail-safe mode. Let's say you have 128 GB of memory: you can enable mirroring for the address space below 4 GB and for 20 GB of address space above 4 GB. Thus, the total memory available to the operating system will be 113 GB, but the memory occupied by the OS kernel will be protected from unrecoverable errors, which increases the reliability of the server.
If you do not touch the operating system settings, then enabling RAS Mirror in the BIOS will do nothing except slightly reduce performance.
Rank Sparing is another form of failure protection. Unlike RAS Mirror, it is a purely hardware function: half of the memory is kept in reserve in case of a failure in the main memory channel. If a recoverable error occurs in a DIMM, its contents are copied to the spare memory, and the failed rank or the entire DIMM is disabled. With Rank Sparing enabled, exactly half of the total RAM is available to the operating system.
In other words, RAS Mirror protects against a memory error and keeps the operating system from crashing, whereas Rank Sparing takes effect after an error has been detected and prevents further use of the faulty DIMM. As the test results show, enabling this feature hurts both the synthetic test and the real application.
Manual interleaving of channels and ranks. I have no doubt that the BIOS automatically determines the optimal interleaving mode for ranks and memory channels for maximum performance. We have two 2-rank modules installed in our server, so let's manually set 2 Channel Interleaving + 4 Rank Interleaving.
There is a slight performance impact, but within the margin of error.
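Channel and rank interleaving spread consecutive memory accesses across channels and ranks so that they can be serviced in parallel. The Python sketch below is a deliberately simplified toy model of that idea, not the mapping the platform really uses: the 64-byte granularity, the plain modulo scheme and the function name `locate` are all assumptions made for illustration.

```python
CACHE_LINE = 64  # bytes; assumed interleave granularity for this toy model


def locate(address: int, channels: int = 2, rank_ways: int = 4) -> tuple[int, int]:
    """Map a physical address to (channel, rank way) with modulo interleaving.

    Consecutive cache lines alternate between channels first, then rotate
    through the rank ways. Real controllers use more elaborate mappings.
    """
    line = address // CACHE_LINE
    channel = line % channels
    rank_way = (line // channels) % rank_ways
    return channel, rank_way


for addr in range(0, 8 * CACHE_LINE, CACHE_LINE):
    print(hex(addr), locate(addr))
```

In this model, eight consecutive cache lines land on eight different (channel, rank way) pairs, which is the effect interleaving is meant to achieve.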
First conclusion: everything has already been tuned for us
The first conclusion suggests itself: in automatic mode, the memory is already tuned for maximum performance. Manual tuning leads at most to slight fluctuations in real application performance, which are within the margin of error. If you do not need to configure memory mirroring, it is better not to touch the server memory subsystem at all: there are no advanced latency settings, and the BIOS of a server board will not let you set the DIMM frequency higher than the processor documentation specifies.
What to do?
If we need to squeeze the maximum out of the memory subsystem, we should work not with the BIOS settings but with the hypervisor settings, in particular the distribution of vCPU resources between virtual machines. The recommended values are not always the best in terms of performance, and if there are fewer physical processor cores than have been allocated to guest operating systems, the rule "more is better" no longer works, and the optimal resource ratio has to be found empirically.
The AIDA64 test looks very interesting. It is quite light on the server and does not touch the disk subsystem at all. But when we configure 6 vCPUs for Windows 7, we also have 2 more vCPUs allocated to FreeNAS, which is idle. Nevertheless, we see a tangible hit to the memory subsystem, and this is exactly the price of overprovisioning virtual processor cores. The 99th-percentile OLTP latency shows that the performance degradation also occurs in real-world applications.
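The degree of overprovisioning in this scenario can be put into a number. The Python sketch below assumes that the Xeon E5-2603 v4 has 6 physical cores without Hyper-Threading; that figure comes from Intel's published specification, not from the article, and the function name `vcpu_overcommit` is hypothetical.

```python
def vcpu_overcommit(vcpus_per_vm: dict[str, int], physical_cores: int) -> float:
    """Ratio of allocated vCPUs to physical cores (above 1.0 = overprovisioned)."""
    return sum(vcpus_per_vm.values()) / physical_cores


# vCPU allocation described in the article; the core count is an assumption.
vms = {"Windows 7 (AIDA64)": 6, "FreeNAS": 2}
ratio = vcpu_overcommit(vms, physical_cores=6)
print(f"{ratio:.2f} vCPUs per physical core")
```

Under that assumption, 8 vCPUs share 6 cores, so even an idle second VM leaves the hypervisor with more virtual processors to schedule than there are cores.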
Memory Power Saving
Each Transcend TS1GHR72V4H memory module has a built-in temperature sensor, which can be used to track component heating. The main source of heat is, as you might guess, the register chip: although its electrical power is low, its surface area is very small, so the heat has to be removed by an intense air flow. In the BIOS of server motherboards you can enable power saving for RDIMMs, but some administrators fear that this will degrade machine performance. Let's check.
The answer is clear: memory power saving should be enabled. The difference in RDIMM heating under a real workload is almost 10%, and the effect on speed is smaller than the measurement error.
Unbalanced third channel
At the beginning of the article we mentioned the recommendation to install only memory modules of the same size, from the same manufacturer, and so on in a server. Let's add another module, a 2-rank 16 GB Kingston KVR21R15D4/16, to our two 8 GB Transcend TS1GHR72V4H modules. Theoretically, this should let us use 3 memory channels, which should give a significant increase in system performance.
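The theoretical gain from a third channel is easy to estimate: each DDR4 channel has a 64-bit (8-byte) data bus, so peak bandwidth scales linearly with the channel count. In the Python sketch below the function name `peak_bandwidth_gbs` is hypothetical, and 2400 MT/s is only the Transcend modules' rated speed; the rate actually negotiated in a mixed configuration may be lower, but the ratio between 2 and 3 channels stays the same.

```python
def peak_bandwidth_gbs(data_rate_mts: int, channels: int) -> float:
    """Theoretical peak in GB/s: transfers per second x 8 bytes x channels."""
    return data_rate_mts * 8 * channels / 1000


two = peak_bandwidth_gbs(2400, 2)
three = peak_bandwidth_gbs(2400, 3)
print(f"2 channels: {two:.1f} GB/s, 3 channels: {three:.1f} GB/s")
print(f"theoretical gain: {three / two:.2f}x")
```

This is an upper bound on raw transfer rate only; it says nothing about how much of that bandwidth a given application can actually use.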
The synthetic test shows that all the rumors about needing identical memory modules in different channels are not worth a cent: three channels give a huge increase in memory speed compared to two. But the real workload shows that this increase brings no practical benefit at all, and this is a good moment to sum up our article.
TS1GHR72V4H memory modules come with a lifetime warranty, which is not surprising given that they use E-die generation DRAM chips. Today virtualization lets you extend the life of a server: over time, less resource-intensive applications can be moved to it, or it can be used as storage, so a long warranty period is one of the most important characteristics to look for when choosing components.
Any IT administrator knows that with the release of DDR4, the impact of memory speed on computer performance has become very high. In gaming PCs and desktops, the right choice of RAM modules can squeeze out an extra 10-15% in games or resource-intensive applications such as Photoshop. But in servers, which do not allow overclocking the RAM, there is no point in trying to increase the speed of the memory subsystem. Even an additional memory channel provides less of a speed gain than a proper virtual machine configuration.
Therefore, when choosing RDIMM modules for a compute node, you can simply take Transcend TS1GHR72V4H memory modules with a lifetime warranty. These modules work fine at frequencies below their rated 2400 MHz, so you can count on them surviving server upgrades. The built-in thermal sensor is detected by the BIOS and by the operating system's hardware monitoring, and the power-saving mode does not affect the performance of the machine but has a clearly positive effect on memory life.
Mikhail Degtyarev (aka LIKE OFF)
08 / 10.2018