How to Prevent Hotlinking by Using AWS WAF, Amazon CloudFront, and Referer Checking
At some point, you might have to deal with hotlinking: when third parties embed content from your websites in their own websites. The third-party website does not incur the cost of hosting the content, which means your website can end up paying for the content other sites use.
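Now, you can use AWS WAF with Amazon CloudFront to help prevent hotlinking by checking the Referer header at the CDN.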
Process overview
You can address hotlinking in a number of ways. For instance, you can validate the Referer header (sent by a browser to indicate to the server which page the visitor was referred from) at your web server (for example, by using the Apache module mod_rewrite), and either issue a redirect back to your site’s main page or return a “403 Forbidden” error to the visitor’s browser.
If you are using a CDN such as CloudFront to speed up your site’s delivery of content, validating the Referer header at the web server becomes less practical. The CDN stores a copy of your content at the edge of its network of servers, so even if your web server validates the original request’s headers (in this case, the Referer), additional requests for that content must be validated by the CDN itself because they are unlikely to hit the origin web server.
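As a rough illustration of the origin-side check described above, the following Python sketch validates a Referer value against an allow-list and returns a status code. The function name check_referer and the ALLOWED_HOSTS set are hypothetical, and a real site would more likely express this in web server configuration (such as mod_rewrite) than in application code.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of hosts that may embed the content.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def check_referer(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return the HTTP status the server would send for a given Referer value."""
    if not referer:
        return 403  # this sketch rejects requests that carry no Referer
    # urlsplit().hostname is lowercased, or None if the value is not a URL.
    host = urlsplit(referer).hostname
    return 200 if host in allowed_hosts else 403


print(check_referer("https://example.com/blog/post"))
print(check_referer("https://example.net/gallery"))
```

Comparing the parsed host name, rather than searching the whole string, avoids accepting a Referer that merely mentions the allowed domain somewhere in its path.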
The following diagram illustrates this process.
As illustrated in the preceding diagram, when (1) a request comes in from an end-user client to (2) an Amazon CloudFront edge location, the edge location attempts to return a cached copy of the file requested. This request, if fulfilled from the cache, is considered a cache hit. In the case of a cache miss, when the content is either not in the edge or is not valid (for instance, if the content is out of date), the (3) request goes back to the origin (for example, the origin could be Amazon S3) for a new copy of the object. In the case of a cache hit, the origin cannot apply any validation to the end user’s request, because the edge server does not need to contact the origin in order to fulfil the end user’s request.
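The following toy model may help show why origin-side validation is bypassed on a cache hit. The Origin and Edge classes are hypothetical stand-ins, not CloudFront behavior in detail: the origin rejects any Referer that is not from example.com, yet a second request with a foreign Referer is still served because the edge answers it from its cache.

```python
class Origin:
    """Hypothetical origin that validates the Referer on every request it sees."""

    def __init__(self, objects):
        self.objects = objects
        self.requests_seen = 0

    def fetch(self, path, headers):
        self.requests_seen += 1
        if not headers.get("Referer", "").startswith("https://example.com/"):
            return None  # the origin would answer 403 Forbidden
        return self.objects[path]


class Edge:
    """Hypothetical edge location with a simple in-memory cache."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path, headers):
        if path in self.cache:
            return self.cache[path], "hit"  # the origin is never contacted
        body = self.origin.fetch(path, headers)
        if body is not None:
            self.cache[path] = body
        return body, "miss"


origin = Origin({"/favicon.ico": b"icon-bytes"})
edge = Edge(origin)
print(edge.get("/favicon.ico", {"Referer": "https://example.com/"}))  # cache miss
print(edge.get("/favicon.ico", {"Referer": "https://example.net/"}))  # cache hit
print(origin.requests_seen)
```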
I will show you how to inspect headers with AWS WAF to block or allow requests at the CDN.
Solution implementation—two approaches
Terms
The following list includes key terms I use in this post:
- AWS WAF configurations consist of a web access control list (web ACL), which is associated with a given CloudFront distribution.
- Each web ACL is a collection of one or more rules, and each rule can have one or more match conditions.
- Match conditions are made up of one or more filters, which inspect components of the request (such as its headers or URI) to match for certain conditions.
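The hierarchy in the list above can be pictured as nested data structures. This is only a mental model written as Python dataclasses; the class and field names are hypothetical and do not correspond to the AWS WAF API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Filter:
    part: str        # part of the request to inspect, e.g. "Header: Referer" or "URI"
    match_type: str  # e.g. "Contains" or "Starts with"
    value: str       # the string to look for


@dataclass
class MatchCondition:
    name: str
    filters: List[Filter] = field(default_factory=list)


@dataclass
class Rule:
    name: str
    conditions: List[MatchCondition] = field(default_factory=list)


@dataclass
class WebACL:
    name: str
    default_action: str
    # Each entry pairs a rule with the action taken when that rule matches.
    rules: List[Tuple[Rule, str]] = field(default_factory=list)


# A web ACL with one rule, one match condition, and one filter (illustrative names).
referer_filter = Filter(part="Header: Referer", match_type="Contains", value="example.com/")
referer_condition = MatchCondition(name="referer-is-example-com", filters=[referer_filter])
allow_rule = Rule(name="allow-own-referer", conditions=[referer_condition])
web_acl = WebACL(name="prevent-hotlinking", default_action="Block", rules=[(allow_rule, "Allow")])
print(web_acl)
```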
AWS WAF setup
I will show two approaches to preventing hotlinking:
- A separate subdomain – All static files (such as images or styling components like CSS) to be protected are placed on a separate subdomain such as static.example.com so that I only need to validate the Referer header.
- The same domain – Static files sit under a folder on the same domain. In this case, I will also extend our example to check for an empty Referer header.
Approach 1: A separate subdomain
In this case, I create an AWS WAF rule set that contains a single rule with a single match condition, which in turn comprises a single filter. The match condition checks the Referer header and verifies that it contains a given value. If the rule is matched, the traffic is allowed. Otherwise, the default rule blocks the traffic. In the following steps, I show how to set this up by using the AWS WAF console.
Step 1: Determine what you need to protect.
Because I have all of my static files on a separate subdomain (static.example.com), accessed only from example.com, I will block hotlinking for any files accessible under static.example.com that do not have a Referer whose domain ends with example.com.
Because AWS WAF web ACLs can be applied only to Amazon CloudFront, be sure that you already have a distribution set up to serve this traffic. In this blog post, I will not cover the creation of CloudFront distributions, but this video covers it in more detail.
Step 2: Create and name a new web ACL.
Because this is the first time I have created a web ACL, I open the AWS WAF console (shown in the following screenshot) and then click Get started.
If you have created a web ACL before, click Create web ACL on the AWS WAF console landing page.
I then provide the name of the web ACL I am creating. At the same time, the page automatically populates an associated Amazon CloudWatch metric name. CloudWatch is a monitoring service that allows you to gather and report on metrics of various services. This CloudWatch metric can be used later to report on how your newly created AWS WAF configuration is being used. After I have supplied the name of the web ACL, I click Next to go to the next page.
Step 3: Create a string match condition on Referer.
In the String match conditions section, I click Create condition. I could use several types of conditions, but for AWS WAF to evaluate the string value of a Referer header, I choose a string match condition (see the following screenshot). This string match condition will inspect the Referer header on web requests for any string containing example.com/, which will allow me to embed this content from any site under my domain (for example, example.com or www.example.com). In this case, I will not allow a blank Referer. I will assume that only our website can embed content under this domain.
If you need to increase security further, you can have additional match conditions for only valid Referer values by using Starts With (be sure to include the protocol, such as http:// or https://). For example, by using a value such as https://example.com/ (with the trailing slash, so that a longer host name beginning with example.com does not match), you could prevent someone from registering stealfromexample.com and using that to hotlink your content, or you could prevent someone from including example.com/ in the path of their own URL.
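To make the difference between the two match types concrete, here is a small sketch that applies a "contains" check and a "starts with" check to three sample Referer values. The helper names and the sample URLs are illustrative, and the checks only approximate how AWS WAF compares strings.

```python
def contains(referer: str, value: str) -> bool:
    """Approximates a 'Contains' string match."""
    return value in referer


def starts_with(referer: str, value: str) -> bool:
    """Approximates a 'Starts with' string match."""
    return referer.startswith(value)


REFERERS = [
    "https://example.com/blog/post",         # legitimate page
    "https://stealfromexample.com/gallery",  # look-alike domain
    "https://example.net/example.com/",      # expected string hidden in the path
]

for referer in REFERERS:
    print(
        referer,
        contains(referer, "example.com/"),
        starts_with(referer, "https://example.com/"),
    )
```

All three values pass the "contains" check, but only the first passes the "starts with" check, which is why the stricter match is the safer choice when the embedding site lives at a single known origin.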
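I have also included a Transformation, which changes the header value to lowercase before it is inspected. This is not required for most modern browsers; however, header values can vary in case and the string match is case sensitive, so the transformation ensures that a value such as Example.com still matches.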
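The effect of the lowercase transformation can be sketched as follows. The function name and its lowercase parameter are hypothetical; the point is only that a case-sensitive comparison misses a mixed-case value unless the value is normalized first.

```python
def referer_contains(header_value: str, target: str, lowercase: bool = False) -> bool:
    """Case-sensitive 'contains' check, optionally lowercasing the header first."""
    if lowercase:
        header_value = header_value.lower()
    return target in header_value


mixed_case = "https://Example.COM/blog/post"
print(referer_contains(mixed_case, "example.com/"))                  # no transformation
print(referer_contains(mixed_case, "example.com/", lowercase=True))  # with transformation
```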
Be sure to click Add another filter after you have entered the configuration information, or your original filter will not be populated. Then click Create.
When the string match condition has been populated, it will appear in the String match conditions section (see the following screenshot). I can now use this string match condition in a rule.
I click Next to go to the Create rules page.
Step 4: Create a new rule with the specified string match condition.
I now need to create a new rule that will filter based on the string match condition I just created.
First, I click Create rule, which allows me to specify a Name for the rule, an associated CloudWatch metric name, and the logic behind how the conditions are applied (as shown in the following screenshot).
After I have specified the conditions, I click Create, and the new rule is added automatically to the web ACL. As shown in the following screenshot, I set this newly created rule to Allow, and the Default Action to Block. I then click Next.
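The resulting Approach 1 web ACL behaves roughly like the function below: one Allow rule on the Referer, and Block as the default action. This is a simplified sketch rather than AWS WAF itself, and the function name and the dictionary representation of headers are assumptions made for illustration.

```python
def evaluate_approach_1(headers: dict) -> str:
    """Return the action a web ACL like the one in Approach 1 would take."""
    # Lowercase transformation applied before the string match.
    referer = headers.get("Referer", "").lower()
    # Rule: Referer contains "example.com/" -> Allow.
    if "example.com/" in referer:
        return "ALLOW"
    # Default action: Block (this also covers a missing or blank Referer).
    return "BLOCK"


print(evaluate_approach_1({"Referer": "https://example.com/"}))
print(evaluate_approach_1({"Referer": "https://example.net/"}))
print(evaluate_approach_1({}))
```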
Step 5: Associate the new web ACL with the relevant CloudFront distribution (and test with cURL).
From the Resource drop-down list on the Choose AWS resource page, I can choose the relevant CloudFront distribution used for my static site delivery, which will allow me to easily associate the newly created AWS WAF web ACL with this distribution.
I click Review and create, which gives me a review page covering all of the details I have covered so far.
I have checked that this is the correct distribution, so I can click Confirm and create. This will begin the process of associating the web ACL with my CloudFront distribution, which will typically take around 10–15 minutes.
The result
Now when I request files without the whitelisted Referer header, the requests are blocked at the CDN. However, valid requests are still allowed through.
When a third party embeds our content (request blocked at the CDN)
» curl -H "Referer: https://example.net/" -I https://static.example.com/favicon.ico
« HTTP/1.1 403 Forbidden
When I embed our content (request allowed through the CDN)
» curl -H "Referer: https://example.com/" -I https://static.example.com/favicon.ico
« HTTP/1.1 200 OK
With Approach 1, I must make the request with a whitelisted Referer header, and in this case, all paths are filtered. In Approach 2, I will allow a blank Referer header, and I also will show how to filter by a given URL path.
Approach 2: All content under the same domain, with filtering by path
In this second approach, I will create an AWS WAF web ACL that contains multiple rules with additional match conditions, which in turn comprise multiple filters. As with the first approach, the match condition looks at the Referer header; however, I now validate it in two ways: first, I validate whether it contains my expected value, and if not, I move on to my second validation, which checks to see whether it has any “URL style” Referer header. This allows me to access the assets directly in a browser when the assets are not otherwise embedded in a website, but still provides protection against hotlinking.
I also validate the path (in this case /wp-content) used in the request, which allows AWS WAF to protect individual folders under a single domain name.
Step 1: Determine what you need to protect.
Unlike the first approach, in which I filtered on everything under a domain, in this second approach I will filter based on the path, /wp-content. This allows me to protect my uploaded content that sits under /wp-content without having to move it to a separate subdomain.
Step 2: Create and name a new web ACL.
As with the previous approach, I create a new web ACL in the AWS WAF console by clicking Create web ACL. I will assume you have already created the web ACL from Approach 1, but if not, check Approach 1’s instructions about this step.
As I did in the first approach, I supply the name of the web ACL I am creating. After I have supplied the name of the web ACL, I click Next to go to the next page.
Step 3: Create string match conditions on the Referer.
For Approach 2, I am assuming that everything exists under a single domain, so rather than using the catch-all example.com/, I choose the more secure https://example.com/, and I mark the header as Starting With this value. Because I am explicitly filtering on one header value, I need to watch out for two things:
- Switching between www.example.com and example.com in my application.
- Switching between https:// and http:// in my application.
If either of these switches occurs, I will see a “403 Forbidden” error returned instead of my embedded files. In this example, all content is delivered directly through https://example.com/.
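The two pitfalls above can be seen by running a strict prefix check against a few Referer variants. The sketch below is illustrative only: the referer_allowed helper is hypothetical, and passing several prefixes merely hints at one possible remedy (accepting more than one expected value) if your application really does serve pages from several origins.

```python
EXPECTED_PREFIX = "https://example.com/"


def referer_allowed(referer: str, prefixes=(EXPECTED_PREFIX,)) -> bool:
    """Approximates a 'Starts with' match against one or more expected prefixes."""
    return referer.startswith(tuple(prefixes))


VARIANTS = [
    "https://example.com/page",      # matches the expected prefix
    "https://www.example.com/page",  # www switch: would be blocked
    "http://example.com/page",       # protocol switch: would be blocked
]

for referer in VARIANTS:
    print(referer, referer_allowed(referer))

# Accepting the www variant as well (an assumption, not part of the walkthrough).
both_hosts = (EXPECTED_PREFIX, "https://www.example.com/")
print(referer_allowed("https://www.example.com/page", both_hosts))
```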
First string match condition
To create these match conditions, I click Create condition next to String match condition, as shown in the following screenshot.
I then configure the new match conditions and filters, as shown in the following screenshot.
Again, remember to click Add another filter before you click Create, or the filter will not be added to the condition.
Second string match condition
After I have created this string match condition for Approach 2, I need to create two more string match conditions: one for the URL path (/wp-content) itself, and one to validate whether there is no Referer, which is useful in scenarios with noncompliant client applications or where you need to directly link (from an email, for instance).
For the URL itself, I want to protect content under /wp-content, so I will create a string match condition to validate that case. I go through the same steps as before. This time, I change the part of the request to filter on to URI, and the value to match to /wp-content, as shown in the following screenshot.
Again, I click Add another filter, and then click Create to create my second string match condition, which is shown in the following screenshot.
Third string match condition
With the first two string match conditions created, I move on to create my final string match condition, which I will use to determine whether the Referer is set or not set.
Again, I click Create condition above our previously created conditions. This time, I will create a filter matching on the Referer header, and match on the presence of ://.
Again, click Add another filter before clicking Create. I then have all three of the string match conditions that I will use, as shown in the following screenshot.
I click Next to go to the Create rules page.
Step 4: Create two rules with the specified string match conditions created in Step 3.
Creating the rules in Approach 2 is more complex than in Approach 1. Now I need to create two rules: one that checks for a valid Referer header, and one that validates requests with no Referer header.
Rule 1: Validate a Referer header.
This first rule matches on the expected value of the Referer header (https://example.com/) and the URL path (/wp-content). First, I click Create rule, which allows me to specify a Name for the rule, an associated CloudWatch metric name, and the logic behind how the conditions are applied (as shown in the following screenshot).
When the rule has been populated as above, I click Create and the rule is automatically added to my web ACL.
Rule 2: Validate requests with no Referer header.
This second rule is similar to the first, and matches when the Referer header includes ://. I use this as a simple way to check whether the Referer header has been set at all. If it has, I choose to block the request; this action is configured when all the rules are created and added to the web ACL.
Again, once the rule has been populated as above, I click Create and the rule is automatically added to my web ACL.
After I have created both of these rules, and the rules have been added to my web ACL, I can take advantage of the AWS WAF ordering capabilities, which control the order in which rules are applied. I have chosen to:
- Match for the path of /wp-content and a Referer that is valid. If so, allow the request.
- Match for a path of /wp-content and a Referer that is invalid. If so, block the request.
- Otherwise, allow all requests by default (for paths that are not /wp-content).
This order of operations results in the following rule configuration.
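The ordering above can be sketched as a small function that checks the rules in sequence and falls back to the default action. It is a simplification, not AWS WAF itself: the function name is hypothetical, the URI condition is assumed to be a "starts with" match on /wp-content, and the Referer is assumed to be lowercased before matching as in Approach 1.

```python
def evaluate_approach_2(path: str, headers: dict) -> str:
    """Return the action a web ACL like the one in Approach 2 would take."""
    referer = headers.get("Referer", "").lower()  # assumed lowercase transformation
    protected = path.startswith("/wp-content")    # assumed "starts with" URI match

    # Rule 1: protected path AND Referer starts with the expected value -> Allow.
    if protected and referer.startswith("https://example.com/"):
        return "ALLOW"
    # Rule 2: protected path AND Referer looks like a URL (contains "://") -> Block.
    if protected and "://" in referer:
        return "BLOCK"
    # Default action: Allow (other paths, or protected path with no Referer).
    return "ALLOW"


image = "/wp-content/uploads/2013/03/shareable-image.jpg"
print(evaluate_approach_2(image, {"Referer": "https://example.net/"}))  # third-party embed
print(evaluate_approach_2(image, {}))                                   # direct link
print(evaluate_approach_2(image, {"Referer": "https://example.com/"}))  # own embed
```

Because Rule 1 is evaluated first, a valid Referer is allowed before Rule 2 has a chance to block it; reversing the order would block every request to /wp-content that carries any URL-style Referer.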
Step 5: Associate the new web ACL with the relevant CloudFront distribution (and test with cURL).
From the Resource drop-down list on the Choose AWS resource page, I can choose the relevant CloudFront distribution used for my static site delivery, which will allow me to easily associate the newly created AWS WAF web ACL with this distribution.
I click Review and create, which gives me a page that shows all of the details I have covered so far in Approach 2.
If the details look right, I click Confirm and create. Again, it will take 10–15 minutes to push changes out.
The result
As with Approach 1, I have filtering at the CDN, but this time the filtering is based on the path and direct linking is allowed (without a Referer header).
Here I use cURL to verify that the new AWS WAF web ACL correctly protects my content. I use the -H argument to send a different Referer header to the CloudFront distribution, which allows me to test as if I am embedding my content in an unauthorized page.
When a third party embeds our content
» curl -H "Referer: https://example.net/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 403 Forbidden
When our content is directly linked (with no Referer)
» curl -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK
When I embed our content
» curl -H "Referer: https://example.com/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK
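If you prefer to script these checks rather than run cURL by hand, something like the following Python sketch could work. It uses only the standard library; the helper names build_head_request and status_for are hypothetical, and the URLs are the placeholder values from this post, so the last line will only return meaningful results against your own distribution.

```python
import urllib.error
import urllib.request


def build_head_request(url, referer=None):
    """Build a HEAD request, optionally carrying a Referer header (like curl -I -H)."""
    headers = {"Referer": referer} if referer else {}
    return urllib.request.Request(url, headers=headers, method="HEAD")


def status_for(url, referer=None, timeout=10):
    """Send the request and return the HTTP status code, including error statuses."""
    try:
        with urllib.request.urlopen(build_head_request(url, referer), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # for example, 403 when the web ACL blocks the request


# Placeholder URL from this post; replace it with a file behind your own distribution.
IMAGE_URL = "https://example.com/wp-content/uploads/2013/03/shareable-image.jpg"
request = build_head_request(IMAGE_URL, "https://example.net/")
print(request.get_method(), request.get_header("Referer"))
# status_for(IMAGE_URL, "https://example.net/") would perform the actual network call.
```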
If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about this solution or its implementation, start a new thread on the AWS WAF forum.