2013-12-07

How to Store Your Media Files in Amazon S3 Bucket

In this article, I will show you how to use Amazon Simple Storage Service (S3) to store your media files in the cloud. S3 is known and widely used for its scalability, reliability, and relatively cheap price. It is free to join and you only pay the hosting and bandwidth costs as you use it. The service is provided by Amazon.com. S3 tends to be attractive for start-up companies looking to minimize costs.

S3 uses a concept of buckets which is like a storage database. Each bucket has its own url. Inside the buckets you have folders and under that you have files. In fact, directories don't actually exist within S3 buckets. The entire file structure is actually just one flat single-level container of files. The illusion of directories is actually created based on having the file names like dirA/dirB/file.

If you want to browse the files in a folder-like structure, you can use Transmit FTP client on Mac OS X. It supports S3 services. Amazon browser-based console also has interface for browsing or uploading files.

OK. Now let's have a look how to set up a Django project which will use S3 for media files.

1. Create a bucket

At first you will need to create a bucket at S3 and make it accessible for all visitors. Login to your Amazon Web Services console. Click on "Services" in the menu, then on the "S3". Click on the button "Create bucket" and enter your bucket name like "mywebsite.media" or even better without dots like "mywebsitemedia". Choose a region there which is the closest to your target audience, for example, if your website is for Europeans, choose "Ireland". Go to the properties of the bucket, expand "Permissions" section, click on the "add bucket policy" button and enter the following:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowPublicRead",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mywebsite.media/*"
        }
    ]
}

2. Install boto and django-storages

Amazon Web Services provide a python library called boto for accessing the API. There is a django app called django-storages which allows to use AWS S3 as the main storage. So your next step is to activate your virtual environment and install latest versions of boto and django-storages.

pip install boto==2.19.0
pip install django-storages==1.1.8

3. Set up django-storages for your project

Add the following to the settings.py:

INSTALLED_APPS = [
    # ...
    'storages'
    # ...
]
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
AWS_S3_SECURE_URLS = False       # use http instead of https
AWS_QUERYSTRING_AUTH = False     # don't add complex authentication-related query parameters for requests
AWS_S3_ACCESS_KEY_ID = '...'     # enter your access key id
AWS_S3_SECRET_ACCESS_KEY = '...' # enter your secret access key
AWS_STORAGE_BUCKET_NAME = 'mywebsite.media'

# the next monkey patch is necessary if you use dots in the bucket name
import ssl
if hasattr(ssl, '_create_unverified_context'):
   ssl._create_default_https_context = ssl._create_unverified_context

4. Create your models with FileField or ImageField

Let's create a Profile model with avatar field.

def upload_avatar_to(instance, filename):
    import os
    from django.utils.timezone import now
    filename_base, filename_ext = os.path.splitext(filename)
    return 'profiles/%s%s' % (
        now().strftime("%Y%m%d%H%M%S"),
        filename_ext.lower(),
    )

class Profile(models.Model):
    # ...
    avatar = models.ImageField(_("Avatar"), upload_to=upload_avatar_to, blank=True)
    # ...

Whenever you save an instance of the Profile with the new avatar picture, avatar will be uploaded to S3 bucket. To show it in a template, you will need something like <img src="{{ profile.avatar.url }}" alt="" /> where the image source will look like "http://mywebsite.media.s3.amazonaws.com/profiles/20140214203012.jpg".

5. Use the storage to manipulate file versions

If you need to create a thumbnail version of your image, it's probably best to overwrite the save method of the model and trigger the generation of the thumbs there. Let's add some methods to the Profile class:

class Profile(models.Model):
    # ...

    def save(self, *args, **kwargs):
        super(Profile, self).save(*args, **kwargs)
        self.create_avatar_thumb()

    def create_avatar_thumb(self):
        import os
        from PIL import Image
        from django.core.files.storage import default_storage as storage
        if not self.avatar:
            return ""
        file_path = self.avatar.name
        filename_base, filename_ext = os.path.splitext(file_path)
        thumb_file_path = "%s_thumb.jpg" % filename_base
        if storage.exists(thumb_file_path):
            return "exists"
        try:
            # resize the original image and return url path of the thumbnail
            f = storage.open(file_path, 'r')
            image = Image.open(f)
            width, height = image.size

            if width > height:
                delta = width - height
                left = int(delta/2)
                upper = 0
                right = height + left
                lower = height
            else:
                delta = height - width
                left = 0
                upper = int(delta/2)
                right = width
                lower = width + upper

            image = image.crop((left, upper, right, lower))
            image = image.resize((50, 50), Image.ANTIALIAS)

            f_thumb = storage.open(thumb_file_path, "w")
            image.save(f_thumb, "JPEG")
            f_thumb.close()
            return "success"
        except:
            return "error"

    def get_avatar_thumb_url(self):
        import os
        from django.core.files.storage import default_storage as storage
        if not self.avatar:
            return ""
        file_path = self.avatar.name
        filename_base, filename_ext = os.path.splitext(file_path)
        thumb_file_path = "%s_thumb.jpg" % filename_base
        if storage.exists(thumb_file_path):
            return storage.url(thumb_file_path)
        return ""

As you might have guessed, the avatar can be placed in the template using something like

{% if profile.get_avatar_thumb_url %}
    <img src="{{ profile.get_avatar_thumb_url }}" alt="" />
{% endif %}

where the image source will look like "http://mywebsite.media.s3.amazonaws.com/profiles/20140214203012_thumb.jpg".

Conclusion

When you have a basic overview about Amazon Simple Storage Service, it is quite easy to use it in Django projects with existing third-party libraries. For flexibility, if you need to modify uploaded files, storage object should be used instead of the default os methods. That way, you can simply switch to the default file storage for local development.

2 comments:

  1. I was wondering if there was a way to have the no https and no auth query params settings apply to just these FileFields and not other uses of S3/boto you might have in your system?

    ReplyDelete
  2. Bluehost is ultimately one of the best web-hosting company for any hosting services you might need.

    ReplyDelete