Synology UC3200 Review: affordable dual-controller iSCSI storage

Perhaps ever since Synology released its first rackmount NAS, the company has been asked at every press conference when it would release a dual-controller solution. For a long time the manufacturer explained that it already had a High Availability solution for clustering NAS units, and that in the modern world of file access and local clouds fault tolerance can and should be achieved in software. But, apparently surrendering under the onslaught of questions, at the end of last year it released its own dual-controller Active-Active storage system, stepping into the new territory of SAN devices.

Synology UC3200

As is often the case in our work, getting a UC3200 for testing was not easy: demand from integrators for test samples was too high. But thanks to the worldwide quarantine, I managed to get hands-on with this storage system in all its glory.

IP SAN is the correct term for iSCSI storage

Synology UC3200 is a relatively new type of storage system designed for converged networks, which are increasingly prevalent in the mid-market. We have written about this many times, but let me recall the basic principle of network convergence: all of your company's network traffic is carried over a single Ethernet network, usually on copper twisted pair. This includes storage traffic (NFS, FC, iSCSI), application traffic and voice, while prioritization and fault tolerance are handled by conventional network switches. As a result, you save on equipment by unifying the transmission medium: copper twisted pair is used everywhere, or optics at most, but it is still just an Ethernet network, which is easy to manage and maintain.

Usage pattern

And although networks have been called "converged" ever since they learned to carry FC traffic, FCoE offers no advantages over "native" Ethernet protocols. In practice, if the task is to set up centralized storage for virtualization hosts, you will choose between NFS and iSCSI. Arguing about which protocol is better is pointless: it comes down to the integrator's preferences and the requirements of a particular installation. In terms of speed the two are roughly equal, both support multipath to survive link failures, and both are offloaded by modern network controllers.

Synology has implemented only the iSCSI protocol in the UC3200, so this device can no longer be classified as a NAS, because it provides block access to LUNs. Yet it should not be classified as a classic SAN either, because there is no FC protocol here and you don't have to buy separate SAN switches. To learn how to "properly prepare" the UC3200, it is important to understand that we are dealing with a distinct type of storage system, very cheap by the standards of Synology servers, built for those who need to configure iSCSI volumes once and forget about the storage infrastructure for several years. The developer positions this device as storage for Microsoft Hyper-V, OpenStack Cinder and VMware ESXi servers; we will use the latter in our tests.

DSM UC: new operating system

We are accustomed to the fact that a Synology NAS is cool, and therefore expensive, because one box usually gives you powerful, scalable video surveillance software, a file backup system, a backup system for servers and workstations, and built-in hardware and container virtualization. None of this is present in the DSM UC operating system, nor is there a "package" feature that lets you install applications from the Synology repository. Why? Because this is a highly specialized device focused on reliability and speed. Fortunately, the familiar DSM interface has been preserved, with the same Storage Manager, iSCSI Manager and search.

At the center of the web interface is High Availability Manager, which keeps two identical Synology controllers synchronized with each other. HA Manager was already available and proven on Synology NAS, but here it works inside a single device, synchronizing its two controllers.

For each "head", statistics on CPU load, memory usage, disks, partitions and iSCSI volumes are available, and you can configure "performance alarms": warnings about excessive CPU load or increased network or disk access latency. Notifications can be sent by e-mail, SMS or push, which I find the most convenient.

iSCSI storage organization

Three years ago, Synology bet on Btrfs, and today this file system is the default on all of the company's devices. In the UC3200, all LUNs are stored as files on the volumes you create, and functions such as space reclamation and snapshots rely on the copy-on-write mechanism of Btrfs, so the good old ext4 is no longer supported. Pay attention to the LUN type when creating it: "Thick Provision" promises higher speed and is recommended for databases, while "Thin Provision" offers advanced features: defragmentation, space reclamation to recover unused space, and up to 256 snapshots per LUN. Snapshots can not only be stored and protected from deletion, but also replicated to another Synology NAS, giving you off-site copies of your LUNs.

The UC3200 has a feature not found in other Synology NAS: for LUNs, you can enable buffered access for both read and write operations. This option is useful for arrays of ordinary hard disks used for backups, video surveillance archives or video editing workstations, that is, for sequential access in general.

In total, you can create up to 128 LUNs, up to 128 iSCSI targets and up to 32 internal volumes. It is worth clarifying that in Synology's terminology a "Storage Pool" is a disk array that is divided into volumes, on which the LUNs are then stored. You can assign a pool to the first or second controller and later change the assignment to balance the load between the processors. Indeed, the beauty of dual-controller storage is that you can create multiple storage pools and distribute the load between the heads.
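The idea of spreading storage pools across the two heads can be illustrated with a small planning sketch. This is not how DSM UC assigns pools (the assignment is made by the administrator in the web interface); it is only a greedy heuristic for deciding on paper which pool to bind to which controller. The pool names and load figures are hypothetical.

```python
from typing import Dict, List


def balance_pools(pool_loads: Dict[str, float]) -> Dict[str, List[str]]:
    """Greedily split storage pools between controllers "A" and "B".

    pool_loads maps a pool name to an estimated load figure, for example
    average IOPS. Both the names and the metric are hypothetical inputs.
    """
    assignment: Dict[str, List[str]] = {"A": [], "B": []}
    totals = {"A": 0.0, "B": 0.0}
    # Place the heaviest pools first, each on the currently lighter head.
    ordered = sorted(pool_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, load in ordered:
        target = "A" if totals["A"] <= totals["B"] else "B"
        assignment[target].append(name)
        totals[target] += load
    return assignment


if __name__ == "__main__":
    plan = balance_pools({"vm": 900, "db": 700, "backup": 300, "archive": 100})
    print(plan)
```

With the sample figures, each controller ends up with the same total estimated load, which is the balance the paragraph above describes.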

Construction

In terms of topology, the Synology UC3200 is two servers connected through a switch. The head unit has 12 3.5" SAS drive bays, which accept both 2.5" SSDs and 3.5" HDDs. At the time of writing, hard drives up to 16 TB and SSDs up to 3.84 TB were supported; the full list can be found in Synology's compatibility list.

Synology UC3200

Each node is a separate server built on a 4-core Xeon D-1521 processor running at 2.4-2.7 GHz. The Xeon D family is designed specifically for NAS and low-power devices. This processor has a dual-channel DDR4-2133 ECC Registered memory controller, and each UC3200 node comes with one 8 GB RAM module.

In our article on caching in Synology servers, we found that the NAS also caches iSCSI LUN data in memory to speed up read operations, so in some cases you can avoid spending drive bays on SSDs and instead add RAM to accelerate the disk array. Speaking of SSD caching, I would note that VMware ESXi, starting from version 6.7 U2, has a very good caching mechanism of its own backed by a storage volume.

Interestingly, the storage system can synchronize data cached in memory. It works as follows: suppose you had to restart one of the controllers to install updates or because of a failure. As soon as it returns to the active state, the second controller transfers the data cached in its RAM to it, so you do not have to spend time warming up the cache, even if this controller was a standby and had no active storage pools. Thus the storage system not only keeps working while one controller is down, but also stays "warmed up" at all times. The same is true for the SSD cache, the only difference being that dual-port SAS SSDs "do not notice" the shutdown of one controller and run no risk of data loss.

When choosing drives, we recommend taking a closer look at the Seagate Exos X16 series. These are helium-filled hard drives with capacities of 10 TB and up, designed for round-the-clock operation in data centers. They support 512e, so they can work with both 512-byte and 4 KB sectors. Sequential speed reaches 260 MB/s, so four drives are already enough to come close to saturating a 10-gigabit link. These drives have very low idle power consumption of only 5 W, which for the 16 TB model gives a record low 0.31 W per 1 TB of capacity. The drives' reliability is backed by a 5-year warranty.
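The figures in this paragraph are easy to verify with a few lines of arithmetic. The sketch below uses the 260 MB/s and 5 W values from the text; the 16 TB capacity is an assumption (it is the model that matches the quoted 0.31 W/TB), and the link rate is the nominal 10 Gbit/s without protocol overhead.

```python
DRIVE_MB_S = 260      # sequential speed per drive, from the text
IDLE_WATTS = 5        # idle power per drive, from the text
CAPACITY_TB = 16      # assumption: the 16 TB model of the series
LINK_GBIT_S = 10      # nominal 10GbE line rate, overhead ignored


def link_capacity_mb_s(gbit_s: float) -> float:
    """Convert a nominal link rate in Gbit/s to MB/s (decimal units)."""
    return gbit_s * 1000 / 8


def link_utilization(drives: int, drive_mb_s: float, gbit_s: float) -> float:
    """Fraction of the link that the drives' combined sequential speed fills."""
    return drives * drive_mb_s / link_capacity_mb_s(gbit_s)


def watts_per_tb(idle_watts: float, capacity_tb: float) -> float:
    """Idle power consumption relative to capacity."""
    return idle_watts / capacity_tb


if __name__ == "__main__":
    print(f"Link capacity: {link_capacity_mb_s(LINK_GBIT_S):.0f} MB/s")
    print(f"4 drives fill {link_utilization(4, DRIVE_MB_S, LINK_GBIT_S):.0%} of the link")
    print(f"Idle power: {watts_per_tb(IDLE_WATTS, CAPACITY_TB):.4f} W/TB")
```

Four drives deliver 1040 MB/s against a nominal 1250 MB/s, or roughly 83% of the link, and 5 W over 16 TB works out to 0.3125 W per terabyte.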

Since we're on the subject of scalability, the available accessories include memory modules, Synology 10GBase-T network cards, and 10/25GbE network cards from Intel, Marvell and Mellanox. By default, each node has two regular 1-gigabit ports and one 10-gigabit 10GBase-T port, so adding network connections can become a pressing issue, especially if you use optics (read our article on the differences between 10GBase-T and SFP+ in 10-gigabit networks). For capacity expansion, the Synology UC3200 can be connected to two RX1219SAS disk enclosures, which are JBODs for 12 3.5" drives with redundant SAS expanders and power supplies.

Expansion scheme

In fact, such an expansion scheme can survive the simultaneous failure of three power supplies, two SAS expanders and one controller, but the daisy-chain connection of the expansion shelves will not let you, for example, pull out the middle RX1219SAS shelf on a running system. Of course, the likelihood that you will need to do this is negligible, unless Synology someday releases an expansion shelf for 2.5-inch drives and you decide to replace HDDs with SSDs... But let's not invent near-unrealistic use cases out of thin air. Instead, let's see how the storage system behaves under various failures, because I consider this the most important indicator for this device.

Fault tolerance testing

When configuring the storage system, you need to choose which network ports will operate in fail-safe mode, so that when one controller goes offline its IP addresses are taken over by the second controller. Simple mirroring is used here: port 1 on controller A is paired with port 2 on controller B, and so on. Note that for fault tolerance the paired ports must have a static IP address and the same subnet, gateway and even MTU. These are perfectly normal and understandable requirements, and to see how fault tolerance works in this storage system, let's start with synthetic tests.
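The pairing requirements listed above (static address, same subnet, same gateway, same MTU) can be expressed as a simple pre-flight check. This is a minimal sketch using Python's standard ipaddress module; the PortConfig structure, port names and addresses are hypothetical and do not correspond to any Synology API.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_interface
from typing import List


@dataclass
class PortConfig:
    """Hypothetical description of one controller network port."""
    name: str
    address: str   # address in CIDR form, e.g. "10.0.10.11/24"
    gateway: str
    mtu: int
    static: bool   # True if the address is assigned statically


def pairing_problems(a: PortConfig, b: PortConfig) -> List[str]:
    """Return the reasons why two ports cannot form a failover pair."""
    problems: List[str] = []
    if not (a.static and b.static):
        problems.append("both ports need a static IP address")
    if ip_interface(a.address).network != ip_interface(b.address).network:
        problems.append("ports are in different subnets")
    if ip_address(a.gateway) != ip_address(b.gateway):
        problems.append("gateways differ")
    if a.mtu != b.mtu:
        problems.append("MTU differs")
    return problems


if __name__ == "__main__":
    port_a = PortConfig("A-port1", "10.0.10.11/24", "10.0.10.1", 9000, True)
    port_b = PortConfig("B-port2", "10.0.10.12/24", "10.0.10.1", 1500, True)
    print(pairing_problems(port_a, port_b))
```

An empty list means the pair meets all four requirements; in the sample above only the MTU mismatch is reported.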

Testbed configuration:

  • Processor: AMD EPYC 7551p
  • Cooling: Noctua NH-U9 TR4-SP3
  • Motherboard: ASRock Rack EPYCD8-2T
  • Network Card: Intel X550-T2
  • Memory: 48GB Transcend DDR4-2400
  • SSD: Transcend TS1TSSD230S

NAS:

  • 4 x Seagate Exos 16 TB HDD
  • RAID 10

OS:

  • VMware ESXi 6.7U3
  • Windows Server 2016
  • iSCSI Connection
  • LUN file system - NTFS, 4 KB

For the synthetic tests, we connect a regular Thin Provision LUN to Windows Server 2016 and look at the volume access latency under different conditions. The first test is a 5-minute random read in 4K blocks, which shows good, consistently stable access throughout the interval.

Access time test

When the active controller is disabled during random 4K reads, the downtime is a record low of only 13 seconds, and I would not be mistaken in saying that 99% of applications will not even notice this slight delay and will not suffer a service outage.

Access time

Working on the backup controller in read mode is no different from working on the main one, except perhaps for slightly higher latency, which would be visible on some SSD models but from a practical point of view does not affect the service.

It takes about 120 seconds for the main controller to return, and this time the interruption in disk access is about 20 seconds; as the graph shows, access to the array is interrupted twice.

Controller access time

The results demonstrated by the Synology UC3200 are, if not a miracle, then a real breakthrough, because such a short switchover time from the main controller to the backup is typical of much more expensive machines. We could bring down the curtain here, but first we need to make sure that in real life everything goes as smoothly as in synthetic tests. Let's repeat the above for a 2-thread 50/50 4K random read/write load.

Access time

Our disk subsystem is based on hard drives with a spindle speed of 7200 RPM, so of course the access time jumps around a lot. Over time, the predictive algorithms evidently give up and the maximum latency grows. The switchover time from the active controller to the standby increases noticeably, to 20 seconds, but still remains relatively low for a device in this price range. Returning the controller to active mode interrupts the storage system for 15 seconds, after which the overall array latency drops noticeably.
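Downtime figures like the ones quoted above can be derived from a log of I/O completion times: any unusually long gap between two consecutive completed requests is an interruption. The sketch below is not the tool used for the graphs in this review; it is a minimal illustration, and the one-second threshold and the sample timestamps are assumptions.

```python
from typing import List, Tuple


def find_outages(completions: List[float],
                 min_gap_s: float = 1.0) -> List[Tuple[float, float]]:
    """Find interruptions in a series of I/O completion timestamps.

    completions holds the times (in seconds) at which requests finished,
    in ascending order. Every gap between neighbours longer than
    min_gap_s is returned as a (start_time, duration) pair.
    """
    outages: List[Tuple[float, float]] = []
    for prev, cur in zip(completions, completions[1:]):
        gap = cur - prev
        if gap > min_gap_s:
            outages.append((prev, gap))
    return outages


if __name__ == "__main__":
    # Hypothetical log: steady I/O, then a pause while the controller
    # fails over, then steady I/O again.
    log = [0.0, 0.5, 1.0, 14.0, 14.5, 15.0]
    for start, duration in find_outages(log):
        print(f"I/O stalled at t={start:.1f}s for {duration:.1f}s")
```

Run against a real latency log, the same function would report two entries for a controller return that interrupts access twice, one per stall.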

Let's move on to testing directly in VMware

Earlier I mentioned that ESXi 6.7 has a very good caching system, and I wonder how it affects the switchover time. First, let's attach a "thick" iSCSI volume with default parameters and format it entirely as VMFS 6, then create a virtual disk on it and pass it to the guest Windows Server 2016 for testing.

Mounting iSCSI volume in VMware ESXi

The virtual machine itself is on a different disk during testing, so any manipulations with the iSCSI volume will not affect its performance.

Testing with disabled cache

Disk subsystem downtime is slightly higher than under Windows.

Testing with cache enabled

And caching increases it even more. I find it surprising that VMware's iSCSI initiator performs worse than Microsoft's, so if you need fast iSCSI volumes under Windows, it is better to attach the LUN from the Synology UC3200 directly to the guest system and back it up by means of the storage system itself. This will be faster than an ESXi virtual disk. But even in this simple setup, where the virtual disk image resides on an iSCSI-attached volume, the switchover to the backup controller is short enough that the guest OS does not report a disk access error.

Warranty

Synology UC3200 comes with a 5-year warranty, during which the manufacturer undertakes to keep spare parts available in its warehouses. However, the warranty period cannot be extended, and extended "NBD" service packages are only expected in the future.

Recommendations when ordering

The Synology UC3200 has an average retail price of $7,500 in a diskless configuration, without rack rails and with 8 GB of RAM per controller. No additional licenses are required to operate the device, and even with the stock 8 GB of memory the system runs quickly and without lag; given the lack of additional functions, a memory upgrade is not required. If your company already uses a Synology server and would like to purchase the UC3200 through a tender, specify compatibility with Snapshot Replication, which only works between Synology devices, as a prerequisite. This will protect you against being supplied with substitutes.

Overall, the Synology UC3200 is an interesting replacement for SAN devices that imposes no vendor lock-in on hard drives and SSDs, which means you don't have to worry that your company will face restrictions on HDD/SSD supply in the future. This device is for those who need increased reliability with extremely short switchover times between controllers, which not every storage manufacturer achieves even in Active-Active mode. If you still need a minimal set of business packages, including Surveillance Station, then Synology has a dual-controller Active-Passive model for you, the SA3200D, but, as they say, that is a completely different story.

Mikhail Degtyarev (aka LIKE OFF)
08/04.2020

