Synchronisation [SBS.14.22]
In the previous post, we left off by defining our blocks of data and noting their primary storage host.
20 years ago, that would have been the complete picture. You had files on your computer at home. If you wanted them somewhere else, you had to put them on an external drive and move them there yourself. And now you have two copies! Careful you don't update the wrong one…
Windows 95 shipped with Microsoft Briefcase, a feature intended to make this shuttling-of-data-between-computers easier. It was a cool idea, but I don't remember it working very well; certainly, it was never very popular. 💼
Dropbox
Then, in April 2007, Dropbox launched. It changed everything: suddenly, I could access my files anywhere I wanted. Dropbox introduced the concept of synchronisation to the general public.1
The first MacBook Air was launched in January the next year. If you don't watch the whole launch video, make sure to watch Steve's reveal, another iconic moment in technology. With that, the number of people carrying a computer around with them exploded.
Suddenly, data was everywhere at once.
Everywhere?
So here's the problem. Theoretically -- in an ideal world -- your data could be everywhere at once. Nothing technically prevents it, and if you can do it, you probably should. It's way simpler.
Let's draw that diagram to see what it looks like.

In reality -- and this is my actual problem -- I have more data than can fit on a laptop. You might think that cloud storage space is the limiting factor, but it isn't. Cloud storage is essentially infinite: just pay a few extra quid a month. But this laptop has a 500GB hard drive and the only way to change that is by buying another laptop.
Selective/partial sync
The solution is obvious enough: don't synchronise everything. Synchronise to each machine only what you need on that machine. It's great that this is a solution…
…but we just cracked open the Complexity Egg. Before: nice neat situation. 🥚 After: complex mess all over the bench. Get a cloth. 🪣

The problem with selective sync
To be clear: technically, this is no bother. This is what computers excel at.
The problem is that what was once simple, clear, and unambiguous -- all of my data is here and there -- is no longer the case. And now you have to think, this piece of data … do I want it here and there? Can I afford to keep it here? How long do I need to keep it here? And so on.
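If it helps to make the "can I afford to keep it here?" question concrete, here is a minimal Python sketch. The folder names and sizes are made up for illustration; only the 500GB laptop comes from my situation. It also ignores the space the operating system and applications need, so treat it as a thinking aid rather than a real tool.

```python
from dataclasses import dataclass

GB = 10**9  # decimal gigabytes, as drive manufacturers count them


@dataclass(frozen=True)
class Folder:
    name: str
    size_bytes: int


def total_size(selected: list[Folder]) -> int:
    """Add up the size of every folder selected for a machine."""
    return sum(folder.size_bytes for folder in selected)


def fits(capacity_bytes: int, selected: list[Folder]) -> bool:
    """True if the selected folders fit within the machine's capacity."""
    return total_size(selected) <= capacity_bytes


def overflow_bytes(capacity_bytes: int, selected: list[Folder]) -> int:
    """How many bytes over capacity the selection is (0 if it fits)."""
    return max(0, total_size(selected) - capacity_bytes)


# Hypothetical folders and sizes, for illustration only.
photos = Folder("Photos", 400 * GB)
documents = Folder("Documents", 50 * GB)
archive = Folder("Archive", 300 * GB)

laptop_capacity = 500 * GB  # the 500GB laptop drive

everything = [photos, documents, archive]  # "sync it all"
selection = [photos, documents]            # selective sync
```

With these made-up numbers, `fits(laptop_capacity, everything)` is false and `fits(laptop_capacity, selection)` is true. The code is trivial; the hard part is the human decision about what goes in `selection`.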
Embedded data; the different classes of data; different sync methods
Here's another consideration, which I'll just touch on briefly lest we get too far into the weeds. Once these introductory articles are finished we can go deeper on some of these topics. Just ask.
Data isn't neat. You don't just have one folder of stuff over here and another folder of stuff over there. You have one parent folder and it contains a bunch of stuff and some of that stuff is more important, and within it you might have a subfolder which is yet more important, and it's all embedded and inter-related.
And there isn't just one way of synchronising things. I use iCloud Drive for my Documents folder. Apple Photos has its own method of synchronisation. And I use 3rd-party software called Syncthing to keep data in sync with my server. I'll talk about Syncthing in a future post.
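One way to write down this kind of nesting is as a set of per-folder rules where the most specific rule wins. The sketch below is in Python and is purely illustrative: the paths are hypothetical, and this is not how iCloud Drive, Apple Photos, or Syncthing decide anything internally. It's just a way of recording which method you intend for which folder.

```python
from pathlib import PurePosixPath

# Hypothetical layout: a parent folder with more specific rules nested inside it.
RULES = {
    PurePosixPath("/home/me"): "not synced",
    PurePosixPath("/home/me/Documents"): "iCloud Drive",
    PurePosixPath("/home/me/Documents/projects"): "Syncthing",
}


def sync_method(path: str) -> str:
    """Return the sync method of the deepest rule that contains the path."""
    p = PurePosixPath(path)
    # A rule applies if it is the path itself or one of its ancestors.
    matches = [rule for rule in RULES if p == rule or rule in p.parents]
    if not matches:
        return "unknown"
    # The rule with the most path components is the most specific one.
    most_specific = max(matches, key=lambda rule: len(rule.parts))
    return RULES[most_specific]
```

Under these assumed rules, a file in `/home/me/Documents/projects` resolves to Syncthing even though its parent folder resolves to iCloud Drive, which is exactly the kind of embedding that makes the picture harder to hold in your head.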
But for now, don't let that worry you. Baby steps.
My diagram
Here's how my real-world diagram looks.

This is where I think the diagram really starts to shine. Because I can still understand what's going on there. And as things change, I can come back and update it.
I've found that a general rule is that if you can't draw it, it's too complex. For sure I've built myself situations like that in the past. If yours is like that, it might be time to simplify.
That's more than enough for one day. Go and update your diagram, and don't forget to ask for help on the forum or Discord if you need it.
In this series
Here's the table of contents for this mini-series.
- 22.00.0101 My data storage & backup strategy
- 22.00.0115 Storage, data, & backups [SBS.14.20]
- 22.00.0116 Data [SBS.14.22]
- 22.00.0119 Synchronisation [SBS.14.22]
Footnotes
- The launch post on Hacker News is famous in nerd circles because of the top-voted response in that thread. User BrandonM responded (emphasis mine) that 'for a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem'. This massively understated just how much simpler Dropbox was. Drew Houston became very rich. Brandon did not.