Hosting private static content in S3 using AWS CloudFront
AWS S3 is a great place to keep the static content for your website because of its high durability and high availability. It is also highly recommended to keep your S3 bucket private, but how do you host static web files without public access? This is where AWS CloudFront (CDN) comes into the picture.
CloudFront is a managed CDN service provided by AWS. It caches your content at edge locations around the world, which speeds things up when your clients browse your website. For example, content in an S3 bucket located in the Singapore region can be cached at an edge location in the US, which reduces latency for clients accessing your website from there.
To access the content in your private S3 bucket through CloudFront, AWS provides a special CloudFront identity called “Origin Access Identity”, or OAI for short. The bucket policy grants read access to this identity, so the files can only be accessed through CloudFront. In this post, I will share how to create a CloudFront distribution with the OAI permission.
If you wish to know more about OAI, you may refer to the AWS documentation.
Create bucket with no public access
Go to the S3 console and click on “Create Bucket”, select the region where you wish to host your S3 bucket and enter the bucket name. By default the bucket blocks all public access; this is AWS trying to reduce the human mistake of accidentally creating a public bucket. The rest of the settings can be left as default.
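If you prefer to script this step instead of using the console, here is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured; the function names are my own for illustration, and the bucket name and region are simply the ones used later in this post.

```python
def build_create_bucket_kwargs(bucket_name, region):
    """Arguments for the S3 create_bucket call in the given region."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default region and must not be sent as a LocationConstraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def build_public_access_block():
    """The four "block all public access" settings, all switched on."""
    return {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }


def create_private_bucket(bucket_name, region):
    """Create the bucket and block all public access (needs AWS credentials)."""
    import boto3  # imported here so the builders above work without boto3

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**build_create_bucket_kwargs(bucket_name, region))
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration=build_public_access_block(),
    )
```

Calling create_private_bucket("david-cf-bucket", "ap-southeast-1") should give roughly the same result as the console steps above.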
Create CloudFront CDN distribution
- Go to the CloudFront console and click on “Create Distribution”.
- Select “Web” as the delivery method.
- On the creation page, select the S3 bucket as the origin domain name, then make sure “Restrict Bucket Access” is set to “Yes” and “Grant Read Permissions on Bucket” is set to “Yes, Update Bucket Policy”.
- For the Viewer Protocol Policy, select “Redirect HTTP to HTTPS”, which automatically redirects HTTP requests to HTTPS. (This step is optional.)
- Then click on “Create Distribution” to save it. It may take several minutes before the CloudFront distribution is ready to serve your content.
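For reference, the console choices above map roughly onto the distribution configuration that the CloudFront API expects. The sketch below only builds that configuration as a Python dictionary. The helper name, the origin ID format and the OAI ID "E2EXAMPLE" are placeholders of mine, and the API may require more fields than shown here depending on its version, so treat it as an illustration rather than a complete configuration.

```python
import time


def build_distribution_config(bucket_name, region, oai_id, caller_reference=None):
    """Sketch of a CloudFront DistributionConfig for a private S3 origin."""
    origin_id = f"S3-{bucket_name}"  # any label that is unique within the distribution
    return {
        # CallerReference must be unique for each create request.
        "CallerReference": caller_reference or str(time.time()),
        "Comment": f"Private S3 content for {bucket_name}",
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [
                {
                    "Id": origin_id,
                    "DomainName": f"{bucket_name}.s3.{region}.amazonaws.com",
                    # "Restrict Bucket Access: Yes" corresponds to attaching an OAI here.
                    "S3OriginConfig": {
                        "OriginAccessIdentity": f"origin-access-identity/cloudfront/{oai_id}"
                    },
                }
            ],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,
            # "Redirect HTTP to HTTPS" in the console.
            "ViewerProtocolPolicy": "redirect-to-https",
            "ForwardedValues": {"QueryString": False, "Cookies": {"Forward": "none"}},
            "MinTTL": 0,
        },
    }
```

A dictionary like this is what you would pass as DistributionConfig to the boto3 CloudFront client's create_distribution call.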
What’s happening under the hood is that CloudFront creates a special identity called an OAI (think of it as a user, although you can’t see it under IAM), and that identity is granted permission to read objects from the bucket. When a public user requests an object, say a PDF file, CloudFront fetches it from the S3 bucket (since CloudFront has read permission), caches it at an edge location and serves it to the user.
In other words, after you click on “Create Distribution”, CloudFront creates a new OAI for you, as well as a bucket policy statement on your bucket that allows the s3:GetObject action from that OAI.
Don’t worry if your bucket already had a policy before you set up CloudFront; the new statement is appended to your existing policy.
If you did not select “Yes, Update Bucket Policy” when creating the distribution, no worries: you can also add the statement to your bucket policy manually.
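If you need to write the policy by hand, the sketch below shows roughly what the statement looks like and how it can be appended to an existing policy without touching the statements that are already there. The function names, the Sid and the OAI ID "E2EXAMPLE" are placeholders for illustration; replace the ID with the one shown in your CloudFront console.

```python
import copy
import json


def build_oai_statement(bucket_name, oai_id):
    """Bucket policy statement that lets one OAI read every object in the bucket."""
    return {
        "Sid": "AllowCloudFrontOAIRead",
        "Effect": "Allow",
        "Principal": {
            "AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {oai_id}"
        },
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket_name}/*",
    }


def append_statement(existing_policy, statement):
    """Return a new policy with the statement appended; the input is not modified."""
    if existing_policy:
        policy = copy.deepcopy(existing_policy)
    else:
        policy = {"Version": "2012-10-17", "Statement": []}
    statements = policy.get("Statement", [])
    # A policy may hold a single statement as a plain object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    policy["Statement"] = statements + [statement]
    return policy


if __name__ == "__main__":
    statement = build_oai_statement("david-cf-bucket", "E2EXAMPLE")
    print(json.dumps(append_statement(None, statement), indent=2))
```

The printed JSON can be pasted into the Bucket Policy editor in the S3 console.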
You can check your OAI in the CloudFront console: just go to Security -> Origin Access Identity. The OAI is really just a special kind of CloudFront user, created specifically so that CloudFront can access the S3 bucket.
Check the bucket policy in S3
Go to the S3 console -> select your bucket -> Permissions -> Bucket Policy, and you should see the policy statement that CloudFront added.
Once the CloudFront distribution is deployed, you can test the setup by uploading a file to S3.
Try to access the file using the S3 link https://david-cf-bucket.s3-ap-southeast-1.amazonaws.com/cloudfront.png (this bucket will be deleted, as it is for demo purposes only). You should get an Access Denied error message.
But if you access the file using the CloudFront domain, you should see the file displayed in the browser.
This shows that we can access the S3 content using CloudFront as the entry point, but we cannot access the file directly using the S3 URL.
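The same check can be done from a script instead of the browser. This is a small sketch using only the Python standard library. The helper name is my own, and the CloudFront domain in the usage part is a made-up placeholder, so substitute the domain name of your own distribution.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for a GET request to the given URL."""
    try:
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the status code is on the exception.
        return err.code


if __name__ == "__main__":
    s3_url = "https://david-cf-bucket.s3-ap-southeast-1.amazonaws.com/cloudfront.png"
    cf_url = "https://d111111abcdef8.cloudfront.net/cloudfront.png"  # placeholder domain
    print("S3 direct: ", fetch_status(s3_url))
    print("CloudFront:", fetch_status(cf_url))
```

With the setup described in this post, I would expect a 403 for the direct S3 URL and a 200 for the CloudFront URL.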
Putting CloudFront in front of our private S3 bucket not only lets us share private content from S3 with the public securely, it can also reduce the cost of serving static content. As we know, S3 charges are based on the number of requests (such as GET and PUT) as well as the bandwidth. With CloudFront as the CDN, we can reduce the number of GET requests that reach S3, because CloudFront caches the content at edge locations closer to the end user.
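To get a feel for the effect, here is a tiny back-of-the-envelope calculation. The request count and the cache hit ratio are made-up numbers for illustration only, not measurements, and the function name is my own.

```python
def origin_get_requests(viewer_requests, cache_hit_ratio):
    """Estimate how many GET requests still reach S3 for a given cache hit ratio."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    served_from_cache = round(viewer_requests * cache_hit_ratio)
    return viewer_requests - served_from_cache


if __name__ == "__main__":
    # Hypothetical traffic: one million viewer requests with 90% served from the edge.
    print(origin_get_requests(1_000_000, 0.9))
```

Under these assumptions only 100,000 of the one million requests would be billed as S3 GET requests; the rest are answered from the edge cache.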
One thing to bear in mind is that it may take a few minutes for changes to CloudFront to take effect. Also, the content cached at the CloudFront edge locations does not refresh automatically when you change something in your S3 bucket. The default TTL of the cache is 24 hours, but you can lower it, or even set it to 0 to effectively stop CloudFront from caching your content.
You can also invalidate content that is already cached by CloudFront, but doing so may cost you money.
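An invalidation can also be requested from a script. The sketch below assumes boto3 is installed and AWS credentials are configured. The function names are my own, and the distribution ID in the usage note is a placeholder.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for a list of object paths."""
    # Invalidation paths must start with a slash, e.g. "/cloudfront.png".
    items = [p if p.startswith("/") else "/" + p for p in paths]
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # CallerReference must be unique for each invalidation request.
        "CallerReference": caller_reference or f"invalidate-{int(time.time())}",
    }


def invalidate(distribution_id, paths):
    """Ask CloudFront to drop the cached copies of the given paths."""
    import boto3  # imported here so the builder above works without boto3

    cloudfront = boto3.client("cloudfront")
    return cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```

For example, invalidate("E1EXAMPLE", ["cloudfront.png"]) would request a fresh copy of that one file from S3 the next time it is accessed.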
Originally published at https://tech.david-cheong.com on May 19, 2020.