qlyoung's wiki

tiered storage

There are two schools of thought regarding storage. One prioritizes resource efficiency and consistency; data should be stored in one location, and the location made accessible to all clients. This maximizes total available storage space at the expense of availability and access speed. The other stores data where it is needed; this may be in multiple places, thus trading storage for availability and access speed, and introducing the potential for inconsistency in multihomed data.

For a long time I was firmly in the first school. Files live in one big beautiful cluster that you can access from any device. I see the appeal. You can focus all your efforts on making that cluster as big, as beautiful, and as resilient as you want. There's only one copy of the data, so you never have inconsistencies. But after a lot of time using a big beautiful cluster, I kept running into the same problem: I wanted some file and I just couldn't get to it. And when I could get to it, the experience of browsing around, searching, and transferring files over a network just…sucks. Once you buy into the paradigm, you get used to it and forget that it doesn't have to be like that.

In the past my laptop didn't have enough space to store all my data. Small form factor drives just weren't big enough, and then they got big enough but they were so expensive that it only made sense to put them in a workstation (desktop). In 2025, annoyed by the downsides of network storage, I started thinking about how to improve the experience. I realized I have two tiers of data.

working set

I have a set of files I frequently access: my music, photos, documents, and projects. This is what I think of as my “working set”. It doesn't include stuff that falls in the bucket of “huge binary files”: movies, 300gb DNxHD Resolve projects, that sort of thing. Right now my working set is ~4tb.
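One rough way to put a number on the working set is to walk a directory tree and total file sizes, keeping the “huge binary files” in a separate bucket. The sketch below is only an illustration: the function name `split_by_size` and the idea of separating the two tiers by a size threshold are assumptions made for the example, since the split described above is by kind of file rather than by a strict size cutoff.

```python
import os


def split_by_size(root, big_file_bytes):
    """Total the file sizes under `root` in two buckets.

    Returns (working_bytes, huge_bytes). Files at or above the hypothetical
    `big_file_bytes` threshold count as "huge binary files"; everything
    else counts toward the working set.
    """
    working, huge = 0, 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            if size >= big_file_bytes:
                huge += size
            else:
                working += size
    return working, huge
```

Calling `split_by_size` on a home directory with a threshold of, say, `50 * 10**9` bytes gives a first estimate of how much would need to live on every device and how much could stay on the cluster.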

other stuff

Some things are just too big to store on-device. At today's storage prices, it doesn't yet make sense to fill a laptop with 30tb of movies that you access once a year. Storing those things on a really big cluster accessed via the network is fine.

thoughts

After thinking about these tiers I realized:

  1. My working set is ~4tb
  2. M.2 NVMe drives can now accommodate that working set for a reasonable price
  3. which means I can now have my working set completely stored on my laptop

My desktop already had 10tb of storage and my home server 12tb. Due for a laptop upgrade anyway, I got 8tb of storage on my new machine and then set about collating my working set into an organized directory tree to prepare it for syncing. I then engaged in a massive synchronization campaign that took about a week to settle out. With careful preparation there were almost no sync conflicts and my working set is now replicated on all devices.
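The arithmetic behind this is simple enough to sketch. The figures below are the ones quoted in this page (~4tb working set; 8tb, 10tb, and 12tb devices); the names `DEVICES`, `can_hold_working_set`, and `headroom` are made up for the example, and the numbers are treated as total capacity rather than free space, which is a simplifying assumption.

```python
TB = 10**12  # decimal terabyte, the unit drive vendors use

WORKING_SET = 4 * TB  # approximate working set size from this page

# Total capacity per device as quoted above (not free space: an assumption).
DEVICES = {
    "laptop": 8 * TB,
    "desktop": 10 * TB,
    "home server": 12 * TB,
}


def can_hold_working_set(devices, working_set=WORKING_SET):
    """Map each device name to whether it can hold a full replica."""
    return {name: capacity >= working_set for name, capacity in devices.items()}


def headroom(capacity, working_set=WORKING_SET):
    """Bytes left on a device after storing the whole working set."""
    return capacity - working_set
```

With these numbers every device can hold a full replica, and even the smallest one (the laptop) keeps about as much space free as the working set itself occupies, which leaves room for it to grow.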

At this point in time I exist in storage nirvana. I work on my desktop. Grab my laptop and catch an airplane; all the data I was working on is on my laptop and I pick up where I left off. No need to connect to airplane wifi. Get where I'm going, connect my laptop; it's syncing everything to my home server and desktop in the background.

special case: phone

I only have 1tb of storage on my phone, so it can't store everything; and even if I wanted it to, Syncthing on iOS basically doesn't work because of the lack of background services. This is completely fine though, because 1) I don't really do much work on my phone and 2) my phone is always online. If I need access to something from my phone, I can download it from my home server. If my phone is completely offline and I *really* need access to my data, it's on my laptop. And if my phone is offline and I don't have my laptop, well, I will do something else.

further thoughts

As it turns out, this is one of the few concepts in computing that I didn't invent /s. From Wikipedia, the free encyclopedia:

HSM is a long-established concept, dating back to the beginnings of commercial data processing.

Hierarchical storage management

tiered_storage.txt · Last modified: by qlyoung
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International