Amazon S3 Glacier
S3 provides various storage classes, so we need to understand which one suits a given situation.
Requirements differ: sometimes data is accessed frequently, and sometimes it is accessed infrequently but must still be available without delay.
Based on these requirements, we can choose from the storage classes below:
1. Standard
2. Standard-IA
3. One Zone-IA
4. Glacier Instant Retrieval
5. Glacier Flexible Retrieval
6. Glacier Deep Archive
7. Intelligent-Tiering
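When an object is uploaded through the S3 API or an SDK, the storage class is selected with a StorageClass value. As a rough sketch, the classes listed above map to the identifiers below. The dictionary name and the helper function are hypothetical names chosen for illustration, and the bucket and key in the usage note are placeholders.

```python
# Map the storage class names used in this article to the values
# accepted by the S3 API's StorageClass parameter.
STORAGE_CLASS_IDS = {
    "Standard": "STANDARD",
    "Standard-IA": "STANDARD_IA",
    "One Zone-IA": "ONEZONE_IA",
    "Glacier Instant Retrieval": "GLACIER_IR",
    "Glacier Flexible Retrieval": "GLACIER",
    "Glacier Deep Archive": "DEEP_ARCHIVE",
    "Intelligent-Tiering": "INTELLIGENT_TIERING",
}


def upload_params(bucket, key, class_name):
    """Build keyword arguments for an upload call (hypothetical helper)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "StorageClass": STORAGE_CLASS_IDS[class_name],
    }
```

With boto3, these arguments could be passed along as s3.put_object(Body=data, **upload_params("my-bucket", "logs/app.log", "Standard-IA")), where "my-bucket" and "logs/app.log" are made-up names.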
1. Standard S3:
S3 Standard provides high durability and availability for frequently accessed data, and it delivers low latency and high throughput. It is designed to deliver 99.99% availability. This is the default S3 storage class.
It is best suited for performance-sensitive use cases where latency must be kept low and data must be retrieved quickly.
1.1 Used for frequently accessed data.
1.2 Provides 99.99% Availability
1.3 Low Latency and High Throughput
2. Standard IA (Infrequent Access)
S3 Standard-IA is suitable for data that is accessed less frequently, but requires rapid access when needed.
It offers the high durability, high throughput, and low latency of S3 Standard, with a lower per-GB storage price but a per-GB retrieval charge.
It is suitable for data that we want to keep for the long term but do not need to access frequently, such as backups or disaster recovery files.
2.1. Used for data which is needed infrequently but needs to be accessed fast.
2.2. Provides 99.9% availability (0.09 percentage points lower than S3 Standard)
3. Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
S3 One Zone-IA is used when data needs to be accessed less frequently, but requires rapid access when needed.
Where other S3 storage classes store data in a minimum of three Availability Zones (AZs), this storage class stores data in a single AZ and costs 20% less than S3 Standard-IA.
It is best suited for secondary backup copies of on-premises data, for re-creatable data, or for data that is already replicated in another AWS Region, so that lower availability is acceptable.
3.1 Used for infrequently accessed data that is re-creatable.
3.2 Same low latency and high throughput performance as S3 Standard
3.3 Provides 99.5% availability
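To make the availability figures quoted so far easier to compare, the percentages can be converted into hours of permitted unavailability per year. This is a minimal arithmetic sketch; the function name is made up for illustration, and the percentages are design targets as quoted in this article, not measured values.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def max_downtime_hours(availability_percent):
    """Hours per year a service may be unavailable at the given availability."""
    return (100 - availability_percent) / 100 * HOURS_PER_YEAR


for name, pct in [("Standard", 99.99), ("Standard-IA", 99.9), ("One Zone-IA", 99.5)]:
    print(f"{name}: {max_downtime_hours(pct):.2f} hours/year")
# Standard: 0.88 hours/year
# Standard-IA: 8.76 hours/year
# One Zone-IA: 43.80 hours/year
```

The gap between 99.99% and 99.5% looks small on paper, but it is the difference between under an hour and almost two days of possible unavailability per year.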
Use case 1:
Suppose an organisation wants to store logs in an S3 bucket, the logs will not be accessed frequently, and they are not stored anywhere else. We can't go for One Zone-IA here, since we can't ensure the data will be available when required (for example, in case of a natural disaster in that zone).
So we can go for S3 Standard-IA in this case.
Use case 2:
An organisation is planning to store some images which are not accessed on a regular basis. One S3 bucket stores the raw images, which are retrieved and processed by some process, and the final images are stored in another S3 bucket.
For the raw images, we can use S3 One Zone-IA since the data is re-creatable.
Before discussing the next 3 storage classes, we should understand what exactly AWS Glacier is.
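The reasoning in the two use cases above can be sketched as a small decision function. It only covers the choice between the two infrequent-access classes, and the function and parameter names are hypothetical.

```python
def pick_infrequent_access_class(recreatable, replicated_elsewhere):
    """Choose a class for infrequently accessed data that still needs fast access.

    A single-AZ class is acceptable only when losing that zone would not
    lose the only copy of the data.
    """
    if recreatable or replicated_elsewhere:
        return "One Zone-IA"
    return "Standard-IA"


# Use case 1: logs that exist nowhere else and cannot be re-created.
print(pick_infrequent_access_class(recreatable=False, replicated_elsewhere=False))
# Standard-IA

# Use case 2: raw images that are re-creatable.
print(pick_infrequent_access_class(recreatable=True, replicated_elsewhere=False))
# One Zone-IA
```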
AWS Glacier:
It is similar to S3, but it is priced and designed for backup and archival data rather than for frequent access. It is an extremely low-cost, durable, and secure storage service that is ideal for backups and archives.
Storing the same amount of data in AWS Glacier costs far less than storing it in S3.
AWS Glacier Terminology:
Vaults: Vaults in AWS Glacier are similar to buckets in S3. These are virtual containers which are used to store data.
Archives: Data is stored in vaults in the form of archives. Archives in Glacier are similar to objects in S3. As with an S3 bucket, we can store virtually unlimited data in AWS Glacier, so a vault can hold an unlimited number of archives.
4. Glacier Instant Retrieval
S3 Glacier Instant Retrieval provides up to 68% lower storage cost than S3 Standard-Infrequent Access and is best suited for long-lived data that is accessed about once per quarter but requires millisecond retrieval.
It is designed for rarely accessed data that still needs immediate access in performance-sensitive cases. It has a minimum storage duration of 90 days.
5. S3 Glacier Flexible Retrieval
S3 Glacier Flexible Retrieval is best suited for data that does not require immediate access but needs the flexibility to retrieve large sets of data. For example, it is ideal for backup or disaster recovery.
In this storage class, data is not readily available for retrieval. We can retrieve data in 3 possible ways:
Expedited:
Retrieval time is between 1 and 5 minutes.
Standard:
Retrieval time is between 3 and 5 hours.
Bulk:
Retrieval time is between 5 and 12 hours.
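As an illustration of how the three retrieval options might be chosen in practice, the sketch below picks a tier from a deadline and builds a restore request. It assumes that slower tiers cost less, so the slowest tier that still meets the deadline is preferred; that preference, the function names, and the parameter names are assumptions made for this example. The dictionary follows the RestoreRequest shape that boto3's restore_object call accepts.

```python
# Upper bound of each retrieval window from this article, in hours,
# listed from slowest to fastest.
RETRIEVAL_TIERS = [("Bulk", 12), ("Standard", 5), ("Expedited", 5 / 60)]


def pick_tier(deadline_hours):
    """Return the slowest tier whose worst-case time fits the deadline.

    Assumes slower tiers cost less, so the slowest acceptable tier wins.
    """
    for tier, worst_case_hours in RETRIEVAL_TIERS:
        if worst_case_hours <= deadline_hours:
            return tier
    raise ValueError("no retrieval tier is fast enough for this deadline")


def restore_request(deadline_hours, keep_days):
    """Build a RestoreRequest dictionary for a restore call.

    keep_days is how long the restored copy should stay available.
    """
    return {
        "Days": keep_days,
        "GlacierJobParameters": {"Tier": pick_tier(deadline_hours)},
    }
```

With boto3, the result could be passed as s3.restore_object(Bucket="my-bucket", Key="backup.tar", RestoreRequest=restore_request(6, 7)), where the bucket and key are placeholder names.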
6. Glacier Deep Archive
S3 Glacier Deep Archive provides the lowest-cost storage, up to 75% lower than S3 Glacier Flexible Retrieval, and is best suited for long-lived archive data that is accessed less than once per year.
It is designed for organisations that retain data mainly for compliance requirements, such as those in financial services, healthcare, media and entertainment, and the public sector. Such organisations retain data for 7-10 years or longer to meet customer or regulatory requirements.
S3 Glacier Deep Archive has 2 retrieval options:
Standard: Retrieval time is within 12 hours
Bulk: Retrieval time is within 48 hours
S3 Glacier Deep Archive has a minimum storage duration of 180 days.
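A minimum storage duration affects billing when an object is deleted early. The sketch below assumes that an object removed before the minimum is still billed for the full minimum period, which is how AWS describes minimum storage duration charges. It uses the two durations mentioned in this article (90 days for Glacier Instant Retrieval, 180 days for Glacier Deep Archive); the names are hypothetical.

```python
# Minimum storage durations quoted in this article, in days.
MIN_STORAGE_DAYS = {
    "Glacier Instant Retrieval": 90,
    "Glacier Deep Archive": 180,
}


def billable_days(storage_class, days_stored):
    """Days of storage billed, assuming early deletion is charged up to the minimum."""
    return max(days_stored, MIN_STORAGE_DAYS.get(storage_class, 0))


print(billable_days("Glacier Deep Archive", 30))   # 180
print(billable_days("Glacier Deep Archive", 200))  # 200
```

This is why archive classes suit data that is known to be long-lived: short-lived objects pay for storage time they never use.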
7. Intelligent-Tiering
S3 Intelligent-Tiering automatically moves data to the most cost-effective access tier based on usage and access frequency, without any retrieval fee or performance impact.
AWS automatically monitors each stored object and, based on the above criteria, moves it to a frequently accessed or infrequently accessed tier.
Intelligent-Tiering maintains 3 automatic tiers, split according to how long an object has gone without being accessed:
Frequent Access tier: This is the default tier
Infrequent Access tier: Stores objects which have not been accessed for 30 consecutive days
Archive Instant Access tier: Stores objects which have not been accessed for 90 consecutive days
For longer-term data, it supports 2 more optional tiers:
Archive Access tier: The threshold is configurable from 90 days up to 730 days without access
Deep Archive Access tier: The threshold is also configurable, from 180 days up to 730 days without access
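The tiering rules above can be summarised as a function of how many days an object has gone without access. This is a simplified model for illustration only: the actual transitions are performed by AWS, and the function and parameter names are hypothetical. The two optional thresholds stand for the configurable archive tiers, and None means a tier is not enabled.

```python
def intelligent_tier(days_since_access, archive_days=None, deep_archive_days=None):
    """Return the Intelligent-Tiering access tier for an object idle this long.

    archive_days and deep_archive_days are the optional configured
    thresholds; None means that tier is not enabled.
    """
    if deep_archive_days is not None and days_since_access >= deep_archive_days:
        return "Deep Archive Access"
    if archive_days is not None and days_since_access >= archive_days:
        return "Archive Access"
    if days_since_access >= 90:
        return "Archive Instant Access"
    if days_since_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


print(intelligent_tier(10))                     # Frequent Access
print(intelligent_tier(45))                     # Infrequent Access
print(intelligent_tier(120))                    # Archive Instant Access
print(intelligent_tier(120, archive_days=90))   # Archive Access
```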