Exploring Storage Options on AWS (S3, EFS, EBS)

TL;DR
In this tutorial, I’m going to explore the different storage options on AWS. We’ll look into acronyms such as Amazon EFS, Amazon EBS, and Amazon S3 (which most of you are probably familiar with, or have at least heard about).
AWS
The A in AWS stands for Amazon. And while everyone these days will immediately think of the ever so slightly rich (rightfully so!) Jeff, Amazon is, in fact, not just about selling books (or, well, everything else for that matter); they do a lot of other things.
Anyways, sorry for the Jeff sidetrack. Amazon Web Services (AWS) is a flexible, cost-effective, easy-to-use cloud computing platform. What this means is that they offer a wide variety of solutions for you, the developer, so that you don’t have to manage your own servers. Pretty sweet, right?
AWS Storage Services Overview
AWS has a detailed 54-page whitepaper which explains the different storage services and features available in the AWS Cloud, and you can check it out here. It provides an overview of each storage service or feature and describes usage patterns, performance, durability and availability, scalability and elasticity, security, interfaces, and the cost model. I’ll try to do the same thing, but in a more condensed manner.
Name | TL;DR |
---|---|
Amazon Simple Storage Service (S3) | scalable and highly durable object storage in the cloud |
Amazon Elastic File System (EFS) | scalable network file storage for Amazon EC2 instances |
Amazon Elastic Block Store (EBS) | block storage volumes for Amazon EC2 instances |
Amazon CloudFront | a global content delivery network (CDN) |
Amazon Glacier | low-cost highly durable archive storage in the cloud |
Amazon EC2 Instance Storage | temporary block storage volumes for Amazon EC2 instances |
AWS Storage Gateway | on-premises storage appliance that integrates with cloud storage |
AWS Snowball | service that transports large amounts of data to and from the cloud |
I’m going to explain the first four in a bit more detail, and just briefly cover the last four.
Amazon Simple Storage Service (S3)
Amazon S3 provides secure, durable, highly scalable object storage at a low cost.
Four common usage patterns for Amazon S3:
- store and distribute static web content and media – great for fast growing websites that have a lot of user-generated content, such as video and photo-sharing sites
- host entire static websites – HTML, CSS, JS, images
- data store for computation and large-scale analytics, such as financial transaction analysis, clickstream analytics, and media transcoding
- backup and archiving of critical data – you can easily move cold data to Amazon Glacier, as we’ll cover below
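The last pattern, moving cold data to Amazon Glacier, is typically set up with a lifecycle rule on the bucket. Here’s a minimal sketch using boto3 (the AWS SDK for Python). The `logs/` prefix, the rule ID, and the 30-day threshold are values I made up for illustration, and `s3_client` is assumed to be something you created yourself with `boto3.client("s3")`.

```python
def glacier_lifecycle_config(prefix, days, rule_id="archive-cold-data"):
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` days old.

    `rule_id` is a hypothetical name; any unique string works.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_lifecycle(s3_client, bucket, prefix="logs/", days=30):
    """Attach the rule to a bucket.

    Assumption: `s3_client` is a boto3 S3 client, and `bucket` already exists.
    """
    config = glacier_lifecycle_config(prefix, days)
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
    return config
```

Keeping the configuration in its own function makes it easy to inspect (or print) the rule before anything is sent to AWS.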
Amazon S3 doesn’t suit all storage situations. In the following table I’ll present some storage needs for which you should consider other AWS storage options:
Storage Need | Solution | AWS Services |
---|---|---|
File system | Amazon S3 uses a flat namespace and isn’t meant to serve as a standalone, POSIX-compliant file system. | EFS |
Structured data with query | Amazon S3 can’t be used as a database or search engine as it doesn’t offer query capabilities to retrieve specific objects. When you use Amazon S3, you need to know the exact bucket name and key for the files you want to retrieve from the service. | DynamoDB, RDS, CloudSearch |
Rapidly changing data | Data that must be updated very frequently might be better served by storage solutions that take into account read and write latencies, such as Amazon EBS volumes, Amazon RDS, Amazon DynamoDB, Amazon EFS, or relational databases running on Amazon EC2. | EBS, EFS, DynamoDB, RDS |
Archival data | Data that requires encrypted archival storage with infrequent read access and a long recovery time objective (RTO) can be stored in Amazon Glacier more cost-effectively. | Glacier |
Dynamic website hosting | Although Amazon S3 is ideal for static content websites, dynamic websites that depend on database interaction or use server-side scripting should be hosted on Amazon EC2 or Amazon EFS | EC2, EFS |
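To make the "flat namespace" row a bit more concrete: the "folders" you see in S3 tools are just key prefixes. The sketch below is a toy in-memory model of a prefix plus delimiter listing, not the S3 API itself, and the keys are invented for the example.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Group flat keys the way a prefix + delimiter listing does.

    Keys directly under `prefix` are returned as objects; deeper keys are
    rolled up into one "common prefix" that merely looks like a sub-folder.
    """
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


# Hypothetical keys: each one is just a string in a single flat namespace.
keys = [
    "index.html",
    "photos/2016/cat.jpg",
    "photos/2016/dog.jpg",
    "photos/readme.txt",
]

print(list_folder(keys))             # (['index.html'], ['photos/'])
print(list_folder(keys, "photos/"))  # (['photos/readme.txt'], ['photos/2016/'])
```

There is no directory object anywhere in this model, which is why S3 on its own can’t stand in for a POSIX file system.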
Amazon S3 stores your data across multiple devices and multiple facilities within your selected geographical region. Error correction is built-in, and there are no single points of failure. Amazon S3 is designed to sustain the concurrent loss of data in two facilities, making it very well suited to serve as the primary data storage for mission-critical data.
An Amazon S3 bucket can store a virtually unlimited number of bytes.
Amazon S3 is highly secure. It provides multiple mechanisms for fine-grained control of access to Amazon S3 resources, and it supports encryption.
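One of those access control mechanisms is the bucket policy, which is a JSON document attached to the bucket. As a rough sketch, this is what a policy that lets anyone read objects under a single prefix could look like. The bucket name, the prefix, and the statement ID are placeholders of mine.

```python
import json


def public_read_policy(bucket, prefix=""):
    """Return a bucket policy (as a JSON string) that allows anyone to
    read objects whose key starts with `prefix`, and nothing else."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadForPrefix",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("my-example-bucket", "public/"))
```

With boto3 you would pass the resulting string as the `Policy` argument of `put_bucket_policy`. Think twice before making anything public, though; for most buckets a narrower `Principal` is the safer choice.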
You can use versioning to preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket.
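Here’s a small sketch of how that could look with boto3. I’m assuming `s3_client` is a client you created with `boto3.client("s3")`; the function names are my own.

```python
def enable_versioning(s3_client, bucket):
    """Turn on versioning for a bucket (assumes a boto3 S3 client)."""
    s3_client.put_bucket_versioning(
        Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
    )


def list_versions(s3_client, bucket, key):
    """Return (version_id, is_latest) pairs for exactly one key.

    `Prefix` also matches longer keys (e.g. "report.csv.bak"), so the
    result is filtered on the exact key. Only the first page of results
    is read here, to keep the sketch short.
    """
    response = s3_client.list_object_versions(Bucket=bucket, Prefix=key)
    return [
        (version["VersionId"], version["IsLatest"])
        for version in response.get("Versions", [])
        if version["Key"] == key
    ]


def get_version(s3_client, bucket, key, version_id):
    """Fetch the bytes of one specific version of an object."""
    response = s3_client.get_object(Bucket=bucket, Key=key, VersionId=version_id)
    return response["Body"].read()
```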
You can also enable access logging, where each access log record provides details about a single access request, such as the requester, bucket name, request time, request action, response status, and the error code if any.
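Turning access logging on could look roughly like this with boto3. The `access-logs/` prefix is an arbitrary choice of mine, `s3_client` is assumed to be a boto3 S3 client, and the target bucket also has to allow S3 to write logs into it, which I’m not showing here.

```python
def enable_access_logging(s3_client, bucket, log_bucket, log_prefix="access-logs/"):
    """Ask S3 to write access log records for `bucket` into `log_bucket`.

    Assumptions: both buckets exist, and `log_bucket` already permits
    log delivery. `log_prefix` is a hypothetical key prefix for the logs.
    """
    status = {
        "LoggingEnabled": {
            "TargetBucket": log_bucket,
            "TargetPrefix": log_prefix,
        }
    }
    s3_client.put_bucket_logging(Bucket=bucket, BucketLoggingStatus=status)
    return status
```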
You can access S3 via its REST API, and there are SDKs for a lot of popular programming languages.
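As a quick illustration of both routes, here’s a sketch in Python. The first two functions go through the SDK (boto3) and assume `s3_client` was created with `boto3.client("s3")`. The third one only builds the virtual-hosted-style REST URL of an object; whether a plain GET on that URL succeeds depends on the object’s permissions. All bucket and key names are placeholders.

```python
from urllib.parse import quote


def upload_text(s3_client, bucket, key, text):
    """Store a string as an object (assumes a boto3 S3 client)."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def download_text(s3_client, bucket, key):
    """Read an object back as a string."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")


def object_url(bucket, key, region):
    """Build the virtual-hosted-style REST URL for an object.

    The key is percent-encoded, but "/" is kept so prefixes stay readable.
    """
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(object_url("my-example-bucket", "photos/my cat.jpg", "eu-west-1"))
# https://my-example-bucket.s3.eu-west-1.amazonaws.com/photos/my%20cat.jpg
```

Note that you always address an object by its exact bucket name and key; there is no way to ask S3 for "the file that contains X".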
As for the price, you pay only for the storage you actually use. For new customers, AWS provides the AWS Free Tier, which includes up to 5 GB of Amazon S3 storage, 20,000 GET requests, 2,000 PUT requests, and 15 GB of data transfer out each month for one year, for free. You can find pricing information at the Amazon S3 pricing page.
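To see what "free" means for your own numbers, you can subtract the allowances from your monthly usage. The sketch below uses only the Free Tier figures quoted above; it deliberately leaves out prices, since those change and differ per region. The usage numbers are made up.

```python
# Monthly Free Tier allowances as quoted in the text above.
FREE_TIER = {
    "storage_gb": 5,
    "get_requests": 20_000,
    "put_requests": 2_000,
    "transfer_out_gb": 15,
}


def billable_usage(usage):
    """Return the part of each monthly usage figure that exceeds the Free Tier."""
    return {
        name: max(0, usage.get(name, 0) - allowance)
        for name, allowance in FREE_TIER.items()
    }


# Hypothetical month: 8 GB stored, 15,000 GETs, 2,500 PUTs, 15 GB transferred out.
example = {
    "storage_gb": 8,
    "get_requests": 15_000,
    "put_requests": 2_500,
    "transfer_out_gb": 15,
}
print(billable_usage(example))
# {'storage_gb': 3, 'get_requests': 0, 'put_requests': 500, 'transfer_out_gb': 0}
```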
Amazon EFS
Amazon Elastic File System (EFS) delivers a simple, scalable, elastic, highly available, and highly durable network file system as a service to EC2 instances.
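In practice, "network file system" means that an EC2 instance mounts EFS over NFS. Here’s a sketch that only builds the mount command, so you can see its parts. The file system ID and mount point are hypothetical, and the NFS options are the ones I’ve seen suggested in the AWS documentation, so treat them as an assumption and double-check them for your setup.

```python
def efs_mount_command(file_system_id, region, mount_point="/mnt/efs"):
    """Build the NFS mount command for an EFS file system.

    Assumptions: the instance has an NFS client installed and can reach a
    mount target of the file system; `mount_point` already exists.
    """
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    # Mount options as suggested by AWS at the time of writing (an assumption).
    options = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
    return ["sudo", "mount", "-t", "nfs4", "-o", options, f"{dns_name}:/", mount_point]


# "fs-12345678" is a made-up file system ID.
print(" ".join(efs_mount_command("fs-12345678", "us-east-1")))
```

Because it’s a regular NFS mount, several EC2 instances can mount the same file system at once.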
Amazon EFS supports highly parallelized workloads and is designed to meet the performance needs of big data and analytics, media processing, content management and web serving.
Due to its nature, you wouldn’t use it for storing archival data, relational database data, or temporary storage.
Amazon EFS is designed to be as highly durable and available as Amazon S3.
You can check out the pricing, or learn more about it here.
Amazon EBS
Amazon Elastic Block Store (Amazon EBS) volumes provide durable block-level storage for use with EC2 instances.
Amazon EBS volumes are network-attached storage that persists independently from the running life of a single EC2 instance. After an EBS volume is attached to an EC2 instance, you can use it as you would a physical hard drive, typically by formatting it with the file system of your choice.
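The attach-then-format flow could be sketched like this. The first function uses boto3 and assumes `ec2_client` came from `boto3.client("ec2")`; the second only builds the shell commands you would run on the instance afterwards. The device names, mount point, and ext4 are my own example choices, and the device name inside the instance can differ from the one you pass when attaching.

```python
def attach_data_volume(ec2_client, volume_id, instance_id, device="/dev/sdf"):
    """Attach an existing EBS volume to an instance (assumes a boto3 EC2 client)."""
    return ec2_client.attach_volume(
        VolumeId=volume_id, InstanceId=instance_id, Device=device
    )


def format_and_mount_commands(device="/dev/xvdf", mount_point="/data", fs_type="ext4"):
    """Return the commands to format a NEW, empty volume and mount it.

    Warning: mkfs wipes whatever is on the device, so only run it once
    on a fresh volume. All default values here are hypothetical.
    """
    return [
        ["sudo", "mkfs", "-t", fs_type, device],
        ["sudo", "mkdir", "-p", mount_point],
        ["sudo", "mount", device, mount_point],
    ]
```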
EBS also provides the ability to create point-in-time snapshots of volumes, which are stored in Amazon S3.
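Creating such a snapshot from code is short. This is a sketch with boto3, assuming `ec2_client` is a client from `boto3.client("ec2")`; the function name and the `wait` flag are my own additions.

```python
def snapshot_volume(ec2_client, volume_id, description, wait=False):
    """Start a point-in-time snapshot of an EBS volume and return its ID.

    With wait=True the call blocks until the snapshot is completed.
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id, Description=description
    )
    snapshot_id = response["SnapshotId"]
    if wait:
        # Poll until the snapshot reaches the "completed" state.
        ec2_client.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])
    return snapshot_id
```

The snapshot captures the volume as it was when the call was made, even though copying the data in the background can take a while.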
Amazon EBS is well-suited for use as the primary storage for a database or file system, or for any application or instance (operating system) that requires direct access to raw block-level storage.
Due to its nature, you wouldn’t use it for temporary storage, multi-instance storage, highly durable storage, static data or web content.
You can find pricing information for Amazon EBS here.
If you want to learn more about managing Amazon EBS volumes, I found this tutorial helpful.
Amazon CloudFront
Amazon CloudFront is a content-delivery web service that speeds up the distribution of your website’s dynamic, static, and streaming content by making it available from a global network of edge locations.
When a user requests content that you’re serving with Amazon CloudFront, the user is routed to the edge location that provides the lowest latency (time delay), so content is delivered with better performance than if the user had accessed the content from a data center farther away.
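Conceptually, the routing decision boils down to "pick the location with the smallest latency for this user". CloudFront does this for you, so the snippet below is only a toy model of the idea, with invented location names and latency numbers.

```python
def pick_edge_location(latencies_ms):
    """Return the name of the edge location with the lowest latency.

    `latencies_ms` maps a location name to the latency (in milliseconds)
    measured from one particular user.
    """
    return min(latencies_ms, key=latencies_ms.get)


# Hypothetical measurements for a user somewhere in central Europe.
measured = {"Frankfurt": 18.0, "London": 25.5, "Virginia": 95.0}
print(pick_edge_location(measured))  # Frankfurt
```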
Amazon CloudFront supports all files that can be served over HTTP. These files include dynamic web pages, such as HTML or PHP pages, and any popular static files that are a part of your web application, such as website images, audio, video, media files or software downloads. For on-demand media files, you can also choose to stream your content using Real-Time Messaging Protocol (RTMP) delivery. Amazon CloudFront also supports delivery of live media over HTTP.
CloudFront is ideal for distribution of frequently accessed static content that benefits from edge delivery, such as popular website images, videos, media files or software downloads.
You can find pricing information here.
Other solutions
Amazon Glacier can reliably store your data for as little as $0.007 per gigabyte per month. Amazon Glacier enables you to offload the administrative burdens of operating and scaling storage to AWS so that you don’t have to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and repair, or time-consuming hardware migrations.
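To get a feel for that price, here’s the arithmetic as a tiny sketch. It uses only the $0.007 per GB per month figure quoted above and ignores retrieval and request fees, so it’s a lower bound rather than a real bill. The archive sizes are made up.

```python
from decimal import Decimal

# The "as little as" price from the text; actual prices vary by region and over time.
GLACIER_PRICE_PER_GB_MONTH = Decimal("0.007")


def glacier_storage_cost(size_gb, months=1):
    """Return the storage-only cost in USD for keeping `size_gb` for `months`."""
    return GLACIER_PRICE_PER_GB_MONTH * Decimal(str(size_gb)) * months


print(glacier_storage_cost(1000))             # 7.000  (1 TB for a month)
print(glacier_storage_cost(1000, months=12))  # 84.000 (1 TB for a year)
```

I’m using `Decimal` instead of floats so that the money math doesn’t pick up rounding noise.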
Amazon EC2 instance store volumes (also called ephemeral drives) provide temporary block-level storage for many EC2 instance types. This storage consists of a preconfigured and pre-attached block of disk storage on the same physical server that hosts the EC2 instance for which the block provides storage.
AWS Storage Gateway connects an on-premises software appliance with cloud-based storage to provide seamless and secure storage integration between an organization’s on-premises IT environment and the AWS storage infrastructure.
AWS Snowball accelerates moving large amounts of data into and out of AWS using secure Snowball appliances. At less than 50 pounds it is light enough for one person to carry. It is entirely self-contained, with a power cord, one RJ45 1 GigE and two SFP+ 10 GigE network connections on the back and an E Ink display and control panel on the front. It is water-resistant and dustproof and serves as its own rugged shipping container.
Conclusion
In this article, we explored some storage options on AWS. We’ve looked at solutions such as EFS, EBS, S3, etc. We’ve covered the pros and cons of each, and I hope that this will help you when deciding which storage to use on AWS.