The New S3 is Magic!
I recently tweeted
in between seasons, Star Trek writers freelance as persistence system marketers
And it blew up and created wild controversy across the internet (not really, it got zero likes).
What I mean by this is that "no impact" is doing a lot of work here.
Let's say this is the old system:
A) client submits write request
B) S3 enqueues job to write to all replicas
C) S3 responds to client that write is done
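To make that concrete, here's a toy sketch in Python of what a write path like that could look like. The replica stores, queue, and function names are all made up for illustration; this is not how S3 is actually built:

```python
import queue
import threading

replicas = [{}, {}, {}]            # hypothetical replica stores
replication_queue = queue.Queue()  # hypothetical background work queue

def replication_worker():
    # The enqueued job from step B: copy each object to every replica.
    while True:
        key, value = replication_queue.get()
        for replica in replicas:
            replica[key] = value
        replication_queue.task_done()

threading.Thread(target=replication_worker, daemon=True).start()

def put_object_old(key, value):
    # B) enqueue the replication work...
    replication_queue.put((key, value))
    # C) ...and tell the client the write is done before the replicas have it.
    return "200 OK"
```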
And this is the new system (Design 1; I'll consider another possible design for the new system later in the post):
A) client submits write request
B) S3 writes to all replicas
C) S3 responds to client that write is done
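And a toy sketch of the Design 1 write path, again with made-up names, where the client hears nothing until every replica has the object:

```python
replicas = [{}, {}, {}]   # hypothetical replica stores

def put_object_design1(key, value):
    # B) write to every replica synchronously...
    for replica in replicas:
        replica[key] = value
    # C) ...then tell the client the write is done.
    return "200 OK"
```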
Regarding performance: it probably means "same or better performance as previous S3 incarnation, X% of the time" and they don't tell us what X is. Surely B in the new system must take more time than B in the old system, all other things being equal?
Regarding availability: I'm not sure what is being asserted here. In the old system, I'm guessing that if the job doing the writing lost access to one of the replicas, client reads would still be able to continue. In the new system, perhaps the behavior is the same. So availability hasn't changed, but in both systems the cost of carrying on with a lost replica is durability, which isn't discussed.
Let's consider another design (Design 2) for the new system (I'll sketch both of its paths in code after the read steps below):
On client write:
A) client submits write request
B) S3 writes to a catalog indicating that the object is being written
C) S3 enqueues a job to write to all replicas
D) S3 responds to client that write is done
E) when job from C is done, S3 writes to catalog that object is not being written
On client read:
A) client submits read request
B) S3 checks if object is currently being written
B1) if it is not, S3 returns object
B2) if it is, S3 waits for write to finish and then returns object
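Here's a toy sketch of how both Design 2 paths could fit together, with a hypothetical in-memory catalog and condition variable standing in for whatever AWS actually uses:

```python
import threading

replicas = [{}, {}, {}]  # hypothetical replica stores
catalog = {}             # hypothetical catalog: key -> "writing" or "written"
catalog_cv = threading.Condition()

def put_object_design2(key, value):
    # B) record in the catalog that the object is being written
    with catalog_cv:
        catalog[key] = "writing"
    # C) hand the actual replication off to a background job
    threading.Thread(target=_replicate, args=(key, value)).start()
    # D) tell the client the write is done
    return "200 OK"

def _replicate(key, value):
    for replica in replicas:
        replica[key] = value
    # E) mark the write as finished so waiting readers can proceed
    with catalog_cv:
        catalog[key] = "written"
        catalog_cv.notify_all()

def get_object_design2(key):
    # B) check whether the object is mid-write; B2) if so, wait it out
    with catalog_cv:
        while catalog.get(key) == "writing":
            catalog_cv.wait()
    # B1) return the object
    return replicas[0].get(key)
```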
Regarding performance: AWS marketing is probably saying "for X% of the time, B2 happens infrequently enough and is fast enough to be acceptable".
Regarding availability: similar to Design 1. But there's a new moving part here, the catalog, which is a new point of failure, both because it's an additional component and because it's a database. We don't know how the catalog is implemented, but its availability has to be _multiplied_ into the availability of the rest of the system. So, removing the catalog would significantly decrease the S3 failure rate. They are asserting that even with this multiplier, the failure rate is the same or better than before. Great! But it's still higher than it otherwise would have been.
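To put rough numbers on that multiplication (the availability figures below are made up purely for illustration, not AWS's):

```python
# Hypothetical availabilities, not AWS's actual numbers.
rest_of_system = 0.9999   # everything except the catalog
catalog        = 0.99995  # the catalog itself

combined = rest_of_system * catalog   # availabilities multiply
print(f"failure rate without the catalog: {1 - rest_of_system:.4%}")
print(f"failure rate with the catalog:    {1 - combined:.4%}")
# Roughly 0.0100% vs 0.0150%: the catalog's failures stack on top of
# everything else's, so taking it out of the path can only help.
```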
To be clear, I am very happy with this new behavior in S3, and I trust that it's going to be a big win for the vast majority of workloads and situations. It's just the marketing language that I'm criticizing.