We ran into a weird problem when we tried to stream to an S3 file using boto3, and every Stack Overflow post we found offered wildly inaccurate, generally non-working solutions, so I'm posting this in the hope that it saves someone some time.
The problem: you write an S3 upload in Python, and it fails with the following error:
ValueError: the bucket 'XXX' does not exist, or is forbidden for access (ClientError('An error occurred (AccessDenied) when calling the CreateMultipartUpload operation: Access Denied'))
The error clearly spells out that this is a permission problem, so you spend some time trying to add the proper permissions. You learn that there is no such thing as an s3:CreateMultipartUpload permission; multipart uploads are covered by the normal s3:PutObject permission. So you google some more.
Then you think it’s an ACL permission – nope.
Then you think maybe your encrypted S3 bucket is the problem and you need to add the kms:GenerateDataKey permission? But no: the bucket uses server-side encryption with Amazon S3 managed keys (SSE-S3), which does not require any extra KMS permissions. Another dead end. How did it ever work for other people?
Then you throw every permission that exists at the user, and it is still failing. What gives?
You enable boto3 debug logs with boto3.set_stream_logger(''), but the log looks okay, except that the request comes back from Amazon with a 403 Access Denied.
Then your brilliant colleague Fatih Elmali reads the boto3 code and points out that, regardless of all the examples Amazon has published, the following is not enough:
client = boto3.client('s3', aws_access_key_id=...)
The proper way to set up authentication for a boto3 S3 client is the following:
session = boto3.Session(aws_access_key_id=...)
client = session.client('s3')
This sets up proper session authentication, and streaming to an S3 file object works.