With AWS Signature Version 4 (AWS4) you have the option of uploading the payload in fixed or variable-size chunks.
This chunked upload option, also known as 'Transfer payload in multiple chunks' or the STREAMING-AWS4-HMAC-SHA256-PAYLOAD feature in the Amazon S3 ecosystem, avoids having to read the payload twice (or buffer it in memory) to compute the signature on the client side.
Signing single-chunk vs multiple-chunk payloads
In AWS Signature Version 4, the HTTP Authorization header is the most common method of providing authentication information. There are two supported options in S3:
The first option, 'Transfer payload in a single chunk', is how Ceph RGW S3 currently authenticates requests under AWS4. It supports both signed and unsigned payloads.
In this case, the major client-side drawback of signed payload uploads is the need to read the payload twice or buffer it in memory: once to compute the payload hash that goes into the signature, and once to send the data. For big files this can be inefficient, so you might prefer to upload the data in chunks instead.
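To make the double read concrete, here is a minimal Python sketch of the single-chunk, signed-payload case. The helper name `payload_sha256_hex` and the in-memory payload are assumptions made for illustration; a real client would place the resulting hash in the x-amz-content-sha256 header and in the canonical request before sending the body.

```python
import hashlib
import io


def payload_sha256_hex(stream, block_size=64 * 1024):
    """First pass over the payload: hash it block by block."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


# Hypothetical in-memory payload standing in for a large file on disk.
payload = io.BytesIO(b"example payload" * 1000)

# Pass 1: the whole payload is read just to compute the hash to be signed.
content_sha256 = payload_sha256_hex(payload)

# Pass 2: the payload has to be rewound and read again for the actual upload.
payload.seek(0)
body = payload.read()
```

The alternative to the second pass is keeping the whole payload in memory, which is exactly what becomes costly for big files.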
The second option, 'Transfer payload in multiple chunks (chunked upload)', is the new functionality being added upstream and backported to Jewel.
In this case you break up the payload into fixed or variable-size chunks. By uploading data in chunks, you avoid accessing the payload twice to calculate the signature on the client side.
Signing the chunks
Each chunk signature computation includes the signature of the previous chunk. The seed signature is generated using the request headers only:
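As a rough Python sketch of the seed signature: it is an ordinary AWS4 signature in which the literal string STREAMING-AWS4-HMAC-SHA256-PAYLOAD takes the place of the payload hash, so no payload bytes are needed. The function names, the credentials, the host and the object path below are all hypothetical, and the canonical request is a simplified example rather than a complete client implementation.

```python
import hashlib
import hmac

STREAMING_PAYLOAD = "STREAMING-AWS4-HMAC-SHA256-PAYLOAD"


def _hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service="s3"):
    """AWS4 signing key: a chain of HMACs over date, region and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def seed_signature(signing_key, amz_date, scope, canonical_request):
    """Sign the request headers only; the payload is not hashed here."""
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical credentials and request values, for illustration only.
secret_key = "EXAMPLE-SECRET-KEY"
date_stamp, amz_date, region = "20160101", "20160101T000000Z", "us-east-1"
scope = "/".join([date_stamp, region, "s3", "aws4_request"])

canonical_request = "\n".join([
    "PUT",
    "/examplebucket/example-object",
    "",  # empty canonical query string
    "content-encoding:aws-chunked",
    "host:rgw.example.com",
    "x-amz-content-sha256:" + STREAMING_PAYLOAD,
    "x-amz-date:" + amz_date,
    "x-amz-decoded-content-length:66560",
    "",  # blank line closing the canonical headers
    "content-encoding;host;x-amz-content-sha256;x-amz-date;"
    "x-amz-decoded-content-length",
    STREAMING_PAYLOAD,  # placeholder where the payload hash would normally go
])

signing_key = derive_signing_key(secret_key, date_stamp, region)
seed = seed_signature(signing_key, amz_date, scope, canonical_request)
```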
Bear in mind the seed signature is used in the signature computation of the first chunk only. For each subsequent chunk, you create a chunk signature that includes the signature of the previous chunk.
The string to sign used in each subsequent chunk signature is:
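The layout of that string to sign, as described in the Amazon S3 documentation for chunked uploads, can be sketched in Python as follows. The function names are hypothetical, and the signing key is assumed to be the usual AWS4 signing key derived for the request; the all-zero key used in the example is only a stand-in.

```python
import hashlib
import hmac

# Hex SHA-256 of the empty string, a fixed field in every chunk string to sign.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()


def chunk_string_to_sign(amz_date, scope, previous_signature, chunk_data):
    """Six newline-separated fields, ending with the hash of this chunk's data."""
    return "\n".join([
        "AWS4-HMAC-SHA256-PAYLOAD",
        amz_date,
        scope,
        previous_signature,  # the seed signature for the first chunk
        EMPTY_SHA256,
        hashlib.sha256(chunk_data).hexdigest(),
    ])


def chunk_signature(signing_key, amz_date, scope, previous_signature, chunk_data):
    string_to_sign = chunk_string_to_sign(
        amz_date, scope, previous_signature, chunk_data)
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Hypothetical values, for illustration only.
example_key = b"\x00" * 32          # stand-in for a derived AWS4 signing key
example_date = "20160101T000000Z"
example_scope = "20160101/us-east-1/s3/aws4_request"
example_previous = "0" * 64         # stand-in for the previous signature

example_signature = chunk_signature(
    example_key, example_date, example_scope, example_previous, b"chunk data")
```

Only the current chunk is hashed at each step, which is why the client never needs a second pass over the whole payload.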
Chaining the chunk signatures ensures that the chunks are sent in the correct order.
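The following Python sketch shows the chaining in action, together with the aws-chunked framing each chunk gets on the wire (hex size, chunk signature, data, and a final zero-length chunk). The names `sign_chunk` and `frame_chunks` and all the values are hypothetical; the signing key and seed signature are assumed to have been computed beforehand.

```python
import hashlib
import hmac
import itertools


def sign_chunk(signing_key, amz_date, scope, previous_signature, data):
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256-PAYLOAD", amz_date, scope, previous_signature,
        hashlib.sha256(b"").hexdigest(), hashlib.sha256(data).hexdigest(),
    ])
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def frame_chunks(signing_key, amz_date, scope, seed_signature, chunks):
    """Yield framed chunks; each signature feeds into the next one."""
    previous = seed_signature
    # A zero-length chunk is appended to mark the end of the upload.
    for data in itertools.chain(chunks, [b""]):
        previous = sign_chunk(signing_key, amz_date, scope, previous, data)
        header = "{:x};chunk-signature={}\r\n".format(len(data), previous)
        yield header.encode("ascii") + data + b"\r\n"


# Hypothetical values, for illustration only.
key = b"\x00" * 32                  # stand-in for a derived AWS4 signing key
date = "20160101T000000Z"
scope = "20160101/us-east-1/s3/aws4_request"
seed = "0" * 64                     # stand-in for the seed signature

in_order = list(frame_chunks(key, date, scope, seed, [b"aaaa", b"bb"]))
swapped = list(frame_chunks(key, date, scope, seed, [b"bb", b"aaaa"]))
```

The chunk `b"bb"` gets a different signature in `in_order` than in `swapped`, because the signature it is chained to differs. A receiver that recomputes the chain would therefore reject chunks arriving out of order.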
Uploading chunked payloads with the AWS SDK for Java
Feel free to use this code, based on the UploadObjectSingleOperation example from the AWS SDK for Java.