Managing Amazon S3 with Python

If you're building and maintaining infrastructure for your company or an open-source project, you're probably using Amazon Web Services (AWS).

AWS offers many different services that can be managed and automated in multiple languages; one such language is Python. S3 Buckets are a great example: they are a resource you can wrap into Python packages or classes to help you maintain infrastructure in a standard format.

Python has been a staple for building client- and server-side applications, as well as *aaS offerings. Python is flexible and has huge community support, which has produced a large collection of libraries at our disposal. When it comes to AWS services and Python, I recommend the Boto3 library.

Before we begin, please make sure that you have both Python 3.6.x and Pip (or Pip3) installed. You can verify that you have these installed by running the following commands:

For Python:

python --version

--or--

python3 --version

For pip (Python Package Index):

pip --version

--or--

pip3 --version

These commands will ensure that you can follow along without any issues. If you are using an older version of Python or pip, you may see different results or run into issues.

You can find instructions for installing Python on your operating system on the official Python website. If you are not familiar with pip, you should read the official pip documentation.

Now that we have verified that Python and pip are installed, run the following command to install the Boto3 package:

pip3 install boto3

Congratulations, you now have the Boto3 package installed on your machine. We can now start to interact with the AWS S3 service. Before we do, though, let's make sure we understand the basics of S3.

AWS Components

AWS S3 is an object storage service built around two main components. The first is Buckets, which are containers for data or files. The second is the actual data or files, known as Objects. These files can be of any type, but they are typically binaries (executables, DLLs, etc.), documents, or images.

To use the Boto3 library, let's open up our IDE. I use Visual Studio Code, but you can use any IDE you like. Create a new empty file named AmazonS3.py, and import the Boto3 library by adding the following to the top of the file:

import boto3

Setting Up S3 with Python

I'm assuming that we don't have an Amazon S3 Bucket yet, so we need to create one. Using the Boto3 library, we do this with a few built-in methods. The first method sets us up to use the remaining methods in the library: it creates a client object that we will use to create a new S3 Bucket and to perform any other operations against AWS S3.

To do this, we need an Amazon Web Services account and access keys. Getting these keys is beyond the scope of this post, but we will need the following to access AWS services:

  • AWS Access Key
  • AWS Secret Key
  • AWS Region (e.g. us-east-1)

If your organization uses IAM roles and provides AWS session tokens or some other means of authentication, use those credentials instead. Now that we have our credentials, we can create an S3 client object using the Boto3 library:

# 's3' -> identifies that we are using S3
# region_name='us-east-1' -> the AWS Region we want to use
# aws_access_key_id='AWS_ACCESS_KEY' -> our AWS Access Key
# aws_secret_access_key='AWS_SECRET_KEY' -> our AWS Secret Key

S3_OBJECT = boto3.client(
    's3',
    region_name='us-east-1',
    aws_access_key_id='AWS_ACCESS_KEY',
    aws_secret_access_key='AWS_SECRET_KEY'
)
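As an aside, if you rely on an IAM role, environment variables, or a shared credentials file instead of explicit keys, a minimal sketch (assuming Boto3 can find those credentials on its own) is to omit the keys and let Boto3's default credential chain resolve them:

# Boto3 resolves credentials from the environment, shared config, or an IAM role
S3_OBJECT = boto3.client('s3', region_name='us-east-1')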

We now have a Python object that we can use to call the methods the client makes available; you can find more information about these methods in the Boto3 S3 documentation. We are going to use the create_bucket method to create a new AWS S3 Bucket.

S3_OBJECT.create_bucket(
    Bucket='new-bucket-name'  # bucket names must be globally unique and lowercase
)

This will create a bare-bones S3 Bucket that we can use to store S3 Objects in, but there are additional parameters we could use when creating our Bucket. You have the ability to do the following (a sketch follows this list), and more information can be found in the create_bucket documentation:

  1. Set ACLs on the bucket
  2. Set a location constraint
  3. Grant permissions (full control, read, write, etc.)
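Here is a hedged sketch of those options; the bucket name and Region are placeholders, and the canned ACL value is just one of several supported:

S3_OBJECT.create_bucket(
    Bucket='new-bucket-name',
    ACL='private',  # canned ACL, e.g. 'private' or 'public-read'
    # Required when creating a bucket outside us-east-1; the client's
    # Region should match the LocationConstraint
    CreateBucketConfiguration={'LocationConstraint': 'us-west-2'}
)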

Working with S3 Buckets

With the Boto3 library, we can also retrieve a list of the Buckets already created in our account (or that our account has permission to view) by using the list_buckets method. You can call this method on the same S3 client object we created previously:

S3_OBJECT.list_buckets()

We can either save the output from this method and print it, or use it as a pre-check before creating any new buckets, as in the following snippet:

myBucketName = 'my-bucket-name-01'

# list_buckets() returns a dict; the 'Buckets' key holds a list of dicts
existing = [b['Name'].lower() for b in S3_OBJECT.list_buckets()['Buckets']]

if myBucketName.lower() in existing:
    print('Bucket named %s already exists' % myBucketName)
else:
    print('Bucket named %s DOES NOT exist' % myBucketName)
    # Create the new bucket here

Uploading Files to AWS

Now let's actually upload some files to our AWS S3 Bucket. This time, we will use the upload_file method. With this method, we need to provide the full local path to the file, the S3 Bucket you want to upload the file to, and the name the object should have in S3 (I recommend using the same file name). Here is an example:

import os

myBucketName = 'my-bucket-name-01'
file = '/Users/username/Desktop/my_awesome_file.log'

if os.path.exists(file):
    # Use the file's base name as the object name in S3
    filename = os.path.basename(file)
    S3_OBJECT.upload_file(file, myBucketName, filename)
else:
    raise FileNotFoundError('File %s does not exist' % file)
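As an aside, upload_file also accepts an optional ExtraArgs dictionary for setting object metadata during the upload; here is a small illustrative example (the content type shown is an assumption about the file):

# Illustration only: set a content type on the uploaded object
S3_OBJECT.upload_file(file, myBucketName, filename,
                      ExtraArgs={'ContentType': 'text/plain'})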

Managing Other Aspects of S3

Python and the Boto3 library also allow us to manage all other aspects of our S3 infrastructure. This includes, but is not limited to, the following (one item is sketched after the list):

  • Managing ACLs (Access Control Lists) on both S3 Buckets and Objects (files)
  • Controlling logging on your S3 resources
  • Uploading a static website and hosting it on S3 using BucketWebsite
  • And more!
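As a hedged sketch of the first item, a canned ACL could be applied to an existing bucket like this (the bucket name is a placeholder):

S3_OBJECT.put_bucket_acl(
    Bucket='my-bucket-name-01',
    ACL='public-read'  # or 'private', 'authenticated-read', etc.
)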

Congratulations! You now have an AWS S3 Bucket and Object at your disposal. Now that we understand the basics of using Python to manage our S3 resources, we could wrap these calls into a Python package or class that helps us maintain our infrastructure in a standard format, as sketched below.
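Here is one minimal sketch of such a wrapper; the class name and structure are illustrative, not a prescribed design:

import os
import boto3

class S3Manager:
    # Illustrative wrapper around the S3 operations shown in this post.
    # Relies on Boto3's default credential chain (env vars, IAM role, etc.)
    def __init__(self, region_name='us-east-1'):
        self.client = boto3.client('s3', region_name=region_name)

    def bucket_exists(self, name):
        buckets = self.client.list_buckets()['Buckets']
        return any(b['Name'].lower() == name.lower() for b in buckets)

    def upload(self, path, bucket):
        # Uses the file's base name as the object name in S3
        self.client.upload_file(path, bucket, os.path.basename(path))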

As more infrastructure moves towards the cloud, more organizations may choose to use services from Amazon Web Services. Having the ability to manage our data containers (Buckets) in multiple languages allows for flexibility when it comes to creating tools and managing our resources in general.

