Building a NAS, Part 1: Requirements
A friend would like a new NAS, with a combination of features and performance that would normally place them in the $10k+ range for enterprise-grade storage. I offered to see what I could do with hardware from eBay, and I figured I’d write down my thought process for this kind of thing, in case it’s of use to other homelab shoppers.
I should note that I’m writing this as I go, so I have no idea what the conclusion is going to be. We might end up with something boring like “just buy this Western Digital MyCloud thing”, or it might be a more elaborate hardware adventure. Who knows!
This post is the first in a series.
When I start on a new machine build for someone, my first questions are always aimed at teasing out requirements. Often the conversation starts like “I need a new NAS, any recommendations?” With only that to go on, I could suggest anything from a single USB external hard drive, all the way up to a hundred-drive NVMe grid computing monster in the 6-figure range.
The true need is probably somewhere in the middle. So, for a NAS, here are some conversation starters I used to try to tease out requirements.
- How much do you need to store? Are we talking about 100GiB, 10TiB, 1PiB?
- How much bandwidth in and out of the NAS? Are we looking at 100Mbps, 1Gbps, 25Gbps?
- How precious is the data? Is it scratch files you delete every day, or irreplaceable family photos?
- How available does the data need to be? Is it okay if you can’t access it for a week, or is a minute of unavailability unacceptable?
- Does it need to be rackmounted, or have some other special form factor?
- Will the server do storage only, or also colocated compute-heavy tasks?
- What’s the budget? This influences how crazy we can go in meeting other requirements.
- What are the power and noise budgets? Does the NAS need to be whisper quiet because it’s sitting in the living room? How much is it okay to add to the monthly electricity bill?
- How hands-on do you want to be? Do you want it to be set-and-forget, or will it be a project you’ll tinker with? How much tinkering?
- Any specific data access patterns you know will be happening? Large sequential reads, continuous append writes, …
- Any specific technologies you want to be using? I want to uncover hidden requirements implied by statements like “I need SSDs”.
After that chat, for this build we ended up with the following:
- Need storage for about 5TiB, with ample room for future growth. At this kind of size, I budget for a minimum 2-3x future increase. It’s relatively cheap to do, and as a society we’re not trending towards having less data.
- Several types of usage profiles mixing together:
- Recording video from several networked cameras, i.e. 24x7 sequential write load with occasional short sequential reads.
- Backup storage/syncing, a stream of random access reads and writes of varying intensity.
- General network share serving, 1-2 people worth of infrequent mixed read/write profiles.
- 10Gbps networking, and the NAS should be able to saturate that bandwidth, in both read and write, to/from a single client. (The requirement was 1Gbps when this post was first written and changed to 10Gbps after publication, so the rest of this post assumes 1Gbps. I’ll correct this in future design posts.)
- Needs to be rackmountable.
- Relatively loose constraint on power and noise. Server won’t be in main living spaces, and power is relatively cheap in the area. Quiet and efficient preferred, but no hard ceilings.
- No strict price constraint, but aiming for somewhere in the mid 4 figures. Let’s call it a ceiling of $5000.
- Data should have reasonable durability: drive failures should not lead to data loss or downtime, but beyond that, offsite backups and the restoration delay they imply are acceptable.
- Server will also run some compute workloads, notably some realtime video processing.
- Should be reasonably hands-off once installed, but some ongoing tinkering expected - doesn’t need to be completely set and forget.
- Some specific constraints on technologies:
- Runs Linux, no BSD or Illumos.
- Preferably runs ZFS for the data integrity goodness.
- Must have ECC RAM.
- Can host Docker containers and VMs.
- Should be capable of hosting all major types of storage, for future-proofing: SATA HDD, SATA SSD, NVMe SSD.
- Would be great if we can use Proxmox as the OS.
Exploring the solution space
Okay, that’s a lot to take in. Let’s sit with those requirements for a moment, and let some deductions fall out…
What kind of box?
The rackmount and ECC RAM requirements eliminate almost all the consumer off-the-shelf options. We’re looking at entry-level enterprise gear and up.
The good news is that at our price point there are enterprise “off the shelf” options. Synology, iXsystems, QNAP and Buffalo all make boxes that fit comfortably in the budget. For example, you could get a Synology RackStation RS2418RP+ for $2400, leaving $2600 for drives.
However, at this price point most appliances are anemic on RAM and CPU. The RS2418RP+, for example, comes with 4GiB installed and a wimpy Celeron processor. That’s no good on several fronts (running VMs, in-RAM disk cache for performance). Adding RAM is possible at a cost, but we’re stuck with the CPU unless we go up to the next tier of hardware… which blows out the budget.
My friend also has a love/meh relationship with appliances. They tend to enforce being too hands-off. That’s great if you’re a business, but it gets in the way of tinkering. iXsystems with FreeNAS is a reasonable compromise on that axis, but our requirement to run Linux eliminates that option.
So, this is pushing us in one of two directions: used enterprise gear, to get a higher tier of hardware at budget prices; or a “whitebox” build, in which we pick out individual components and assemble our own system.
What kind of storage?
We have to use traditional “spinning rust” hard drives as the main storage pool. For the number of bytes we need to store, they’re the only thing that hits a $/TiB that makes sense for the budget.
For reference, at time of writing you can get a 6TB NAS drive for $230. The equivalent storage in SATA SSDs costs $780, and $2400 if you want NVMe SSDs. And that’s before any kind of redundancy, or space constraints on our SATA and PCIe busses.
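To put those numbers in $/TB terms, here’s a tiny Python sketch. The prices are the ballpark figures quoted above and will drift over time; it’s a rough comparison, not a shopping list.

```python
# Rough $/TB comparison, using the ballpark prices quoted above.
capacity_tb = 6  # comparing at 6TB of raw capacity

prices_usd = {
    "SATA HDD (NAS)": 230,
    "SATA SSD": 780,
    "NVMe SSD": 2400,
}

for kind, price in prices_usd.items():
    print(f"{kind}: ${price / capacity_tb:.0f}/TB")

# SATA HDD (NAS): $38/TB
# SATA SSD: $130/TB
# NVMe SSD: $400/TB
```

Roughly a 3x premium for SATA flash and 10x for NVMe, before we’ve even talked about redundancy.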
Does this square with our requirement of saturating 1Gbps in and out of the box? Quick napkin math: a basic NAS drive is going to ballpark 100-150MiB/sec in either direction, or roughly 840-1260Mbps. So, it turns out, we don’t have to worry too much about raw throughput! We’ll need a little attention to performance when we pick out specific drives and storage layouts, but the order of magnitude is right.
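For the curious, here’s that napkin math written out in Python. The only input is the assumed 100-150MiB/s drive throughput; everything else is unit conversion.

```python
# Convert ballpark HDD sequential throughput into link-style Mbps,
# to compare against a 1Gbps (1000Mbps) network link.
link_mbps = 1000

for drive_mib_s in (100, 150):
    # MiB/s -> bits/s -> Mbps (network links are rated in decimal megabits)
    mbps = drive_mib_s * 1024 * 1024 * 8 / 1e6
    print(f"{drive_mib_s} MiB/s ≈ {mbps:.0f} Mbps "
          f"({mbps / link_mbps:.0%} of a 1Gbps link)")

# 100 MiB/s ≈ 839 Mbps (84% of a 1Gbps link)
# 150 MiB/s ≈ 1258 Mbps (126% of a 1Gbps link)
```

So a single healthy drive already sits in the neighborhood of a 1Gbps link; with several drives in a pool, the network is the bottleneck, not the platters.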
This also tells us that SSDs as primary storage would be wasteful, as we can’t utilize anywhere near their full bandwidth with 1Gbps networking.
One more thing to consider is fsync-heavy loads. These will fall short of theoretical throughput numbers, because we’re forcing the OS to wait for specific bytes to commit to drives before continuing. This means we’re bounded by the round-trip latency of our drives, which is relatively poor for spinning rust.
At this stage it’s not clear to me that the write load we’re dealing with is going to have that problem. If it does, and we’re using ZFS for storage, we can throw in dedicated ZIL (ZFS Intent Log) devices to give our server a hefty write buffer that hides the spinning rust latency.
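To get a feel for why that matters, here’s a very crude Python model of a sync-heavy workload. The latency and record-size figures are assumptions for illustration, not measurements, and it ignores the write coalescing ZFS actually does in its intent log.

```python
# Crude model of an fsync-heavy workload: every synchronous write waits for
# a full device round trip before the application can continue.
# Latency and record size are illustrative assumptions, not measurements.
record_kib = 128  # assumed size of each synchronous write

commit_latency_ms = {
    "HDD (rotational)": 10.0,   # seek + rotation, ballpark
    "NVMe log device": 0.05,    # low-latency flash, ballpark
}

for device, lat_ms in commit_latency_ms.items():
    syncs_per_sec = 1000 / lat_ms
    throughput_mib_s = syncs_per_sec * record_kib / 1024
    print(f"{device}: ~{syncs_per_sec:,.0f} syncs/sec, "
          f"~{throughput_mib_s:,.0f} MiB/s of sync writes")

# HDD (rotational): ~100 syncs/sec, ~12 MiB/s of sync writes
# NVMe log device: ~20,000 syncs/sec, ~2,500 MiB/s of sync writes
```

In practice ZFS batches intent-log writes, so the HDD case isn’t quite this dire, but the gap in orders of magnitude is why a dedicated low-latency log device can hide spinning-rust latency for sync-heavy workloads.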
To be continued
That’s all for this post. We’ve gone from requirements to a very vague idea of what we’re looking for: a rackmount server, either used enterprise gear or a custom whitebox build, with traditional HDD storage, supplemented by SSDs.
There are still a lot of details to work out, and we’ll be returning to the requirements to guide us through that. In part 2, we’ll do a bit more napkin math to work out minimum hardware requirements.