qlyoung's wiki


tiered storage

There are two schools of thought regarding storage. One prioritizes resource efficiency and consistency; data should be stored in one location, and the location made accessible to all clients. This maximizes total available storage space at the expense of availability and access speed. The other stores data where it is needed; this may be in multiple places, thus trading storage for availability and access speed, and introducing the potential for inconsistency in multihomed data.

For a long time I was firmly in the first school. Files live in one big beautiful cluster that you can access from any device. I see the appeal. You can focus all your efforts on making that cluster as big, as beautiful, and as resilient as you want. There's only one copy of the data, so you never have inconsistencies. But after a lot of time using a big beautiful cluster, I kept having the experience that I wanted some file and I just couldn't get to it. And when I could get to it, the experience of browsing around, searching, and transferring files over a network just…sucks. Once you buy into the paradigm, you get used to it and forget that it doesn't have to be like that.

I finally realized that I fundamentally hate network storage as a primary storage location. It requires a network and even when you have one it's just too slow.

Yes, some things are just too big to store on device. At today's per-terabyte prices, keeping 30 TB of movies on your laptop when you watch them once a year doesn't make sense yet for most people. And for that, networked storage is fine - you rarely access those items.

After assessing my current data distribution and doing some consolidation, I established that 8 TB could store all of my frequently accessed files, which I think of as my “working set”. This covers my music, photos, documents, and projects. It doesn't include stuff that falls in the bucket of “huge binary files” - movies, 300 GB DNxHD Resolve projects, that sort of thing.
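For anyone who wants to run a similar assessment, here is a rough sketch of one way to estimate a working set. It is not the method used above, just an illustration: it walks a directory tree and sums the sizes of regular files modified within some window. The function name `working_set_bytes`, the one-year default, and the choice of modification time as a stand-in for "frequently accessed" are all assumptions. Access times would be the more natural signal, but many systems mount filesystems with `noatime` or `relatime`, so they are often unreliable.

```python
import os
import stat
import time


def working_set_bytes(root, max_age_days=365, now=None):
    """Sum the sizes of regular files under root modified in the last max_age_days.

    Hypothetical helper: mtime is only a proxy for "frequently accessed".
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices
            if st.st_mtime >= cutoff:
                total += st.st_size
    return total


if __name__ == "__main__":
    # Example: report the estimate for the home directory in decimal terabytes.
    size = working_set_bytes(os.path.expanduser("~"))
    print(f"{size / 1e12:.2f} TB modified in the last year")
```

Running it once per top-level directory (music, photos, documents, projects) gives a per-category breakdown, which makes it easier to decide what belongs on the device and what can stay on the network.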

As it turns out, this is one of the few concepts in computing that I didn't invent. From Wikipedia, the free encyclopedia:

HSM is a long-established concept, dating back to the beginnings of commercial data processing.

Hierarchical storage management

tiered_storage.1760459959.txt.gz · Last modified: by qlyoung
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International