py-s3headerize

Let’s say you’re hosting a website out of an AWS S3 bucket. And it works great!

But the Cache-Control and Content-Type HTTP headers aren’t quite right.

Sure, S3 can take a good guess at Content-Type, but it’s not perfect. And S3 doesn’t set Cache-Control for you at all.

You could host your website via CloudFront and have a Lambda@Edge function set these values, but I wrote a Python package called py-s3headerize to set the metadata for these headers directly onto the S3 objects.
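To make "directly onto the S3 objects" a bit more concrete: S3 keeps Cache-Control and Content-Type as metadata on each object, and that metadata can't be edited in place. A common way to change it is to copy the object onto itself with the metadata replaced. The sketch below only builds the arguments for boto3's copy_object call. build_copy_args is a hypothetical helper for illustration and is not necessarily how py-s3headerize works internally.

```python
def build_copy_args(bucket, key, cache_control=None, content_type=None,
                    user_metadata=None):
    """Build keyword arguments for the boto3 S3 client's copy_object call.

    Hypothetical helper for illustration; not part of py-s3headerize.
    """
    args = {
        'Bucket': bucket,
        'Key': key,
        # Source and destination are the same object.
        'CopySource': {'Bucket': bucket, 'Key': key},
        # REPLACE tells S3 to use the metadata in this request
        # instead of copying the source object's metadata.
        'MetadataDirective': 'REPLACE',
        # REPLACE also drops user-defined metadata, so pass it back in.
        'Metadata': dict(user_metadata or {}),
    }
    if cache_control is not None:
        args['CacheControl'] = cache_control
    if content_type is not None:
        args['ContentType'] = content_type
    return args


args = build_copy_args('my-bucket', 'index.html',
                       cache_control='max-age=300, public',
                       content_type='text/html')

# With boto3 installed and AWS credentials configured:
#   import boto3
#   boto3.client('s3').copy_object(**args)
```

Because REPLACE discards whatever isn't supplied, a real implementation would probably read the object's current values first (for example with head_object) and carry over anything it doesn't intend to change.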

Installation

pip install s3headerize

Configuration

py-s3headerize needs a configuration file which describes each HTTP header and the values to set.

For example, this configuration will set the Cache-Control header to max-age=300, public on all .html objects:

- header:        Cache-Control
  when:
    - extension: .html
      then:      max-age=300, public

This configuration will set the Cache-Control header to:

  • max-age=300, public on all .html objects.
  • max-age=604800, public on all .css objects.

- header:        Cache-Control
  when:
    - extension: .html
      then:      max-age=300, public
    - extension: .css
      then:      max-age=604800, public

This configuration will set the Cache-Control header to:

  • max-age=300, public on all .html objects.
  • max-age=604800, public on all .css objects.
  • max-age=31536000, public on all other objects.

- header:        Cache-Control
  when:
    - extension: .html
      then:      max-age=300, public
    - extension: .css
      then:      max-age=604800, public
  else:          max-age=31536000, public

The Content-Type header is defined using the same format:

- header:        Cache-Control
  when:
    - extension: .html
      then:      max-age=300, public
    - extension: .css
      then:      max-age=604800, public
  else:          max-age=31536000, public

- header:        Content-Type
  when:
    - extension: .woff2
      then:      font/woff2
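To illustrate how a rule's when and else parts pick a value for a given object key, here is a minimal sketch. resolve_header_value is a hypothetical function written for this example, not the package's own code, and it assumes that the first matching extension wins and that extensions are compared exactly.

```python
import os


def resolve_header_value(rule, key):
    """Return the value a rule assigns to an object key, or None.

    Hypothetical helper; assumes the first matching extension wins.
    """
    extension = os.path.splitext(key)[1]
    for case in rule.get('when', []):
        if case['extension'] == extension:
            return case['then']
    # No extension matched: fall back to 'else' if the rule has one.
    return rule.get('else')


cache_rule = {
    'header': 'Cache-Control',
    'when': [
        {'extension': '.html', 'then': 'max-age=300, public'},
        {'extension': '.css', 'then': 'max-age=604800, public'},
    ],
    'else': 'max-age=31536000, public',
}

content_type_rule = {
    'header': 'Content-Type',
    'when': [
        {'extension': '.woff2', 'then': 'font/woff2'},
    ],
}

for key in ('index.html', 'assets/site.css', 'images/logo.png'):
    print(key, '->', resolve_header_value(cache_rule, key))
```

In this sketch images/logo.png falls through to the else value, while a rule without else (like the Content-Type rule) yields None for anything that doesn't match, meaning "leave this header alone".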

Command-line usage

The following arguments are available:

  • --bucket: Name of the S3 bucket.
  • --header-rules: Path to the configuration file.
  • --dry-run: Optional switch to perform a dry-run.
  • --key-prefix: Optional key prefix for the S3 objects to update.
  • --log-level: Optional log level.

For example, this command will update the metadata on all the objects in the S3 bucket named my-bucket according to the rules described in rules.yaml:

python -m s3headerize --bucket       my-bucket \
                      --header-rules rules.yaml

For reference, I use py-s3headerize from the command line in my Jekyll website pipeline.
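If your pipeline is driven by a Python script rather than a shell, the documented arguments can be assembled like this. build_command is a hypothetical helper, and it assumes --key-prefix takes its value as the next argument, in the same way --bucket does.

```python
import sys


def build_command(bucket, rules_path, dry_run=False, key_prefix=None):
    """Assemble the argument list for `python -m s3headerize`.

    Hypothetical helper; it only uses the arguments listed above.
    """
    command = [sys.executable, '-m', 's3headerize',
               '--bucket', bucket,
               '--header-rules', rules_path]
    if dry_run:
        command.append('--dry-run')
    if key_prefix is not None:
        command.extend(['--key-prefix', key_prefix])
    return command


command = build_command('my-bucket', 'rules.yaml',
                        dry_run=True, key_prefix='blog/')
print(' '.join(command))

# To actually run it (needs s3headerize installed and AWS credentials):
#   import subprocess
#   subprocess.run(command, check=True)
```

Running with dry_run=True first is a cheap way to check a new rules file before touching a live bucket.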

Code usage

The rules must be described as a list of dictionaries following the same schema as the configuration file:

rules = [
    {
        'header': 'Cache-Control',
        'when': [
          {
            'extension': '.html',
            'then': 'max-age=300, public'
          }
        ],
        'else': 'max-age=31536000, public'
    },
    {
        'header': 'Content-Type',
        'when': [
          {
            'extension': '.woff2',
            'then': 'font/woff2'
          }
        ]
    }
]
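When rules are built in code rather than loaded from a file, a typo in a key name is easy to miss. The sketch below is a small sanity check based only on the shape shown above. check_rules is a hypothetical helper, not part of s3headerize, and the package may accept or reject things this check doesn't know about.

```python
def check_rules(rules):
    """Return a list of problems found in a rules list (empty if none).

    Hypothetical helper; it only checks the keys used in this README.
    """
    if not isinstance(rules, list):
        return ['rules must be a list of dictionaries']
    problems = []
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f'rule {index}: expected a dictionary')
            continue
        if not isinstance(rule.get('header'), str):
            problems.append(f"rule {index}: missing 'header'")
        for case in rule.get('when', []):
            # Every 'when' entry needs both an extension and a value.
            if not isinstance(case, dict):
                problems.append(f"rule {index}: 'when' entry is not a dictionary")
                continue
            missing = sorted({'extension', 'then'} - set(case))
            if missing:
                problems.append(f"rule {index}: 'when' entry is missing {missing}")
    return problems


example_rules = [
    {
        'header': 'Cache-Control',
        'when': [{'extension': '.html', 'then': 'max-age=300, public'}],
        'else': 'max-age=31536000, public',
    },
    # Deliberate typo: 'than' instead of 'then'.
    {
        'header': 'Content-Type',
        'when': [{'extension': '.woff2', 'than': 'font/woff2'}],
    },
]

for problem in check_rules(example_rules):
    print(problem)
```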

To update all the objects within a bucket, use the BucketHeaderizer class:

from s3headerize import BucketHeaderizer

rules = [
    {
        'header': 'Cache-Control',
        'when': [
          {
            'extension': '.html',
            'then': 'max-age=300, public'
          }
        ]
    }
]

bucket_headerizer = BucketHeaderizer(header_rules=rules)
bucket_headerizer.update(bucket='my-bucket')

To update just a single object, use the ObjectHeaderizer class:

from s3headerize import ObjectHeaderizer

rules = [
    {
        'header': 'Cache-Control',
        'when': [
          {
            'extension': '.html',
            'then': 'max-age=300, public'
          }
        ]
    }
]

object_headerizer = ObjectHeaderizer(bucket='my-bucket',
                                     header_rules=rules,
                                     key='index.html')
object_headerizer.update()

Report a bug, request a feature or ask a question

Please use the issues page to raise a bug, request a feature or ask a question.

Licence

py-s3headerize is published under the MIT License.
