
AWS-S3: Data Consistency Model

Now that you understand Amazon S3 is simply a safe place to store your objects in the cloud, you really need to understand its data consistency model going into your exam. It comes up often in scenario-based questions. I will add some real AWS exam questions at the end of this article to make it clearer; for now, let's get to the concepts.

So how does S3 keep your data consistent? Basically, S3 provides two consistency models:

It has read-after-write consistency for PUTs of new objects, and eventual consistency for overwrite PUTs and DELETEs.

Now I know what you're thinking, what does that mean?

Read-after-write consistency: For PUTs of new objects, all this means is that if you upload a new file to S3, you can read it immediately. You are able to read it straight after writing it, so the first upload of a file is immediately consistent.

Eventual consistency: If you then go in and overwrite or update that object, you only get eventual consistency. Say we have version one of the file, we upload version two, and you immediately try to read that object. You might get version one, or you might get version two. But if you wait a moment, you will always get version two. The same applies to DELETEs: even when you overwrite or delete a file, it will eventually be consistent, but it might not be immediately.

Points to remember that are asked often in the exam:

  1. You have read-after-write consistency for PUTs of new objects, and eventual consistency for overwrite PUTs and DELETEs.
  2. In other words, if you write a new file and read it immediately afterwards, you will be able to view that data.
  3. If you update an existing file or delete a file and read it immediately, you may get the older version; the changes can take a little time to propagate.
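The three points above can be sketched with a toy in-memory model. To be clear, this is not real S3 or the boto3 SDK; `ToyS3`, `put`, `get`, and `settle` are hypothetical names in a simulation of the consistency semantics described here:

```python
import random

class ToyS3:
    """Toy model of S3 consistency as described above: new PUTs are
    read-after-write consistent, but overwrite PUTs are only eventually
    consistent (a read may return the previous version for a while)."""

    def __init__(self):
        self._current = {}  # latest version of each key
        self._stale = {}    # previous version a stale read may still return

    def put(self, key, value):
        if key in self._current:
            # Overwrite PUT: the old version can linger until replicas converge.
            self._stale[key] = self._current[key]
        self._current[key] = value

    def get(self, key):
        # Immediately after an overwrite, either version may be returned.
        if key in self._stale and random.random() < 0.5:
            return self._stale[key]
        return self._current.get(key)

    def settle(self):
        # Simulates waiting "a second": replicas converge, stale copies vanish.
        self._stale.clear()

s3 = ToyS3()
s3.put("report.txt", "v1")
assert s3.get("report.txt") == "v1"          # new PUT: immediately readable
s3.put("report.txt", "v2")                   # overwrite PUT
assert s3.get("report.txt") in ("v1", "v2")  # may see either version
s3.settle()
assert s3.get("report.txt") == "v2"          # eventually always the new version
```

The `settle()` call stands in for the propagation delay; in real S3 you have no explicit control over when replicas converge.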
Now time for some real questions:

Ques: A customer has written an application that uses Amazon S3 exclusively as a data store. The application works well until the customer increases the rate at which the application is updating information. The customer now reports that outdated data occasionally appears when the application accesses objects in Amazon S3.

What could be the problem, given that the application logic is otherwise correct?

A. The application is reading parts of objects from Amazon S3 using a range header.

B. The application is reading objects from Amazon S3 using parallel object requests.

C. The application is updating records by writing new objects with unique keys.

D. The application is updating records by overwriting existing objects with the same keys.

Ans: D. S3 has eventual consistency for overwrites (PUT-PUT-GET), so a read immediately after an update can return outdated data.

 

Ques: A company's development team plans to create an Amazon S3 bucket that contains millions of images. The team wants to maximize the read performance of Amazon S3.

 Which naming scheme should the company use?

A. Add a date as the prefix.

B. Add a sequential id as the suffix.

C. Add a hexadecimal hash as the suffix.

D. Add a hexadecimal hash as the prefix.

Ans: D. Adding a random hexadecimal hash as the key prefix spreads objects evenly across S3's index partitions, which maximizes read performance. A date prefix or sequential id concentrates requests on one partition, and a suffix does not affect partitioning at all.
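A hashed prefix like the one in option D can be generated along these lines (a hypothetical sketch; the hash function, prefix length, and separator are arbitrary choices, not anything the exam or AWS prescribes):

```python
import hashlib

def hashed_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hex hash so keys spread evenly across index partitions."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"

# Sequential image names now land under well-distributed prefixes:
print(hashed_key("images/2019-01-01/photo-0001.jpg"))
print(hashed_key("images/2019-01-01/photo-0002.jpg"))
```

Because the hash is derived from the key itself, the application can recompute the full object key deterministically whenever it needs to read the image back.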

Ques: A customer has a production application that frequently overwrites and deletes data. The application requires the most up-to-date version of the data every time it is requested.
Which storage should a Solutions Architect recommend to best accommodate this use case?

A. Amazon S3

B. Amazon RDS

C. Amazon RedShift

D. AWS Storage Gateway

Ans: B. As we know, S3 is eventually consistent for overwrites and deletes, so it is not suitable for this scenario. Amazon RDS always returns the latest committed data. Redshift is for analytics, and AWS Storage Gateway connects on-premises storage to AWS.

 

Ques: A web application experiences high compute costs due to serving a high amount of static web content.
How should the web server architecture be designed to be the MOST cost-efficient?

A. Create an Auto Scaling group to scale out based on average CPU usage.

B. Create an Amazon CloudFront distribution to pull static content from an Amazon S3 bucket.

C. Leverage Reserved Instances to add additional capacity at a significantly lower price.

D. Create a multi-region deployment using an Amazon Route 53 geolocation routing policy.

Ans: B. Though we have not discussed CloudFront yet, the most cost-effective way to distribute static content globally is CloudFront with S3 as the origin, since S3 is ideal for hosting static content and this offloads the compute entirely.

I will keep updating the post with more real exam questions.

