Consistency Models of Cloud Storage Services
This article compares the consistency models of all major cloud storage providers. The CloudRail Unified API makes it easy to integrate different cloud storage services into your code. It provides easy-to-use, abstracted interfaces that let you treat groups of those services alike by calling the exact same methods. Yet the services are not 100% the same, and one important distinguishing attribute is each service’s consistency model. It can affect how you write your code and is thus something every developer should be aware of.
What are consistency models?
Services that store files in the cloud adhere to one of several consistency models, or to a mix of them. A consistency model states which assumptions can be made about the availability of data after a transaction. The most common consistency models are (variations of) read-after-write consistency and eventual consistency.
Read-after-write consistency means that data is available for reading immediately after it was written. For example, if you upload a file to a service with read-after-write consistency you will be able to move or download the file immediately afterwards.
Eventual consistency, on the other hand, means that data will become available for reading at some point after it was written, but makes no assertion as to when. This can lead to stale and inconsistent views of the data. For example, if you upload a file to a service with eventual consistency, you might not be able to move or download it immediately afterwards. In fact, listing the contents of a folder you just uploaded a file to might not show the file, and if you try to delete it, the request might fail with the service claiming that the file does not exist.
Read-after-write consistency is generally easier to work with but also more expensive to maintain in terms of money and availability. If the cloud data is distributed over multiple locations, maintaining read-after-write consistency can be hard. It means letting read requests wait until the system has assured that previous writes have successfully and completely propagated. Consequently, if one data node fails, the system might have to stall the reads until the node comes back up.
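The difference between the two models can be illustrated with a small in-memory toy. This is only a sketch under stated assumptions: `ToyStore`, `lag_ticks` and `advance` are hypothetical names invented for this illustration, not part of any cloud SDK, and real services propagate writes in far more complex ways. A lag of zero behaves like read-after-write consistency; any positive lag behaves like eventual consistency.

```python
from typing import Dict, List, Optional, Tuple


class ToyStore:
    """Hypothetical in-memory store; writes become visible after `lag_ticks` ticks."""

    def __init__(self, lag_ticks: int) -> None:
        self._lag = lag_ticks
        self._tick = 0
        self._visible: Dict[str, bytes] = {}
        # Each pending write is (tick at which it becomes visible, name, data).
        self._pending: List[Tuple[int, str, bytes]] = []

    def write(self, name: str, data: bytes) -> None:
        # The write is accepted at once but may not be readable yet.
        self._pending.append((self._tick + self._lag, name, data))
        self._propagate()

    def advance(self, ticks: int = 1) -> None:
        # Simulates time passing while the write propagates.
        self._tick += ticks
        self._propagate()

    def read(self, name: str) -> Optional[bytes]:
        # Returns None while the write is still propagating.
        return self._visible.get(name)

    def list_names(self) -> List[str]:
        return sorted(self._visible)

    def _propagate(self) -> None:
        remaining = []
        for ready_at, name, data in self._pending:
            if ready_at <= self._tick:
                self._visible[name] = data
            else:
                remaining.append((ready_at, name, data))
        self._pending = remaining
```

With `ToyStore(lag_ticks=0)`, a `read` directly after a `write` returns the data. With `ToyStore(lag_ticks=2)`, the same `read` returns `None` and `list_names()` omits the file until `advance` has been called twice, which mirrors the stale folder listing described above.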
Which service supports which model?
These trade-offs are why different cloud storage providers support different consistency models. The following table shows which model each of the CloudRail-supported services uses:
Service Name | Measured Consistency | Official Consistency | Remarks | Unified API |
---|---|---|---|---|
Dropbox | Read-after-write | ? | | Cloud Storage |
Google Drive | Read-after-write | ? | | Cloud Storage |
OneDrive | Eventual | ? | | Cloud Storage |
OneDrive Business | Eventual | ? | Seems to become consistent quicker than the standard version | Cloud Storage |
Box | Read-after-write | ? | | Cloud Storage |
Egnyte | Read-after-write | Read-after-write | | Cloud Storage |
Amazon S3 | Eventual | Eventual | Read-after-write consistency for a few special cases | Enterprise Cloud Storage |
Azure | Read-after-write | Read-after-write | | Enterprise Cloud Storage |
Google Cloud | Mixed | Mixed | List operations for objects and buckets are eventually consistent, the rest is read-after-write consistent | Enterprise Cloud Storage |
Backblaze | Read-after-write | ? | | Enterprise Cloud Storage |
Rackspace (Swift) | Eventual | Eventual | | Enterprise Cloud Storage |
How do I deal with it?
Dealing with read-after-write consistency is easy: it works the way one would intuitively expect. Call a (CloudRail) cloud storage function to write a file and make sure that the SDK has raised no error or exception. Then call any reading (CloudRail) cloud storage function on that file and it will work just fine.
Dealing with eventual consistency is harder. You need to program more defensively. For example, after writing you can repeatedly check whether the change has become visible on the cloud storage before you perform any read operation. Alternatively, you can expect the read operation to fail and retry after a while if it does. This requires more implementation effort but works for both eventually consistent and read-after-write consistent storage services.
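Both defensive strategies can be sketched with two small helpers. This is a minimal sketch, not CloudRail code: `wait_until`, `retry_read` and all of their parameters are hypothetical names chosen for illustration, and the callables passed in stand for whatever SDK call your application makes. The delays grow exponentially so that a slow service is not flooded with requests.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def wait_until(
    check: Callable[[], bool],
    attempts: int = 5,
    initial_delay: float = 0.5,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it returns True; return False if it never does."""
    delay = initial_delay
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:  # no need to wait after the last check
            sleep(delay)
            delay *= backoff
    return False


def retry_read(
    read: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 0.5,
    backoff: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `read`, retrying on the given exceptions; re-raise the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return read()
        except retry_on:
            if attempt == attempts:
                raise  # give up and surface the original error
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

A call might look like `retry_read(lambda: service.download(path))`, where `service.download` stands in for the read function of your SDK. Narrowing `retry_on` to the "not found" exception your SDK actually raises is advisable, so that unrelated errors such as failed authentication are not retried.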