Skip to content

Add support for etag validation in resumableFileDownload #6042

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 22, 2025

Conversation

RanVaknin
Copy link
Contributor

Motivation and Context

Added support for ETag validation in resumable file downloads. The Transfer Manager now validates both the lastModified timestamp and the ETag when resuming a download. This aims to make sure that if the object has changed on S3 (indicated by a different ETag), the download will restart from the beginning rather than attempting to resume, which could lead to corrupted files.

Modifications

  • Added s3ObjectEtag field to ResumableFileDownload to store the ETag
  • Updated the serialization/deserialization logic in ResumableFileDownloadSerializer to include the ETag values.
  • Updated ResumableRequestConverter to compare the stored ETag with the current S3 object's ETag.
  • Added logging when an ETag mismatch is detected

Testing

  • Tests for different ETag values causing downloads to restart
  • Tests for matching ETag values allowing downloads to
  • Tests for null/missing ETag values
  • Tests for case sensitivity in ETag comparisons
  • Test for logging the hint about download being restarted

@RanVaknin RanVaknin requested a review from a team as a code owner April 16, 2025 21:18
@RanVaknin
Copy link
Contributor Author

Working on adding an integ test.

Copy link

@alextwoods alextwoods left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - looks good overall.

I think could use an integration test case added in S3TransferManagerDownloadPauseResumeIntegrationTest

Copy link
Contributor

@zoewangg zoewangg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to update DefaultFileDownload#doPause to capture etag?

@@ -62,6 +62,11 @@ public static byte[] toJson(ResumableFileDownload download) {
jsonGenerator,
"s3ObjectLastModified");
}
if (download.s3ObjectEtag().isPresent()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add unit tests in ResumableFileDownloadSerializerTest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I added coverage for that new Etag field in the existing test. Do I need to add a standalone test just for that, or is this sufficient?

@RanVaknin RanVaknin force-pushed the rvaknin/S3Transfer-manager-resume-download-add-Etag branch from 9977303 to b64302f Compare April 17, 2025 22:47
@RanVaknin
Copy link
Contributor Author

RanVaknin commented Apr 17, 2025

Don't we need to update DefaultFileDownload#doPause to capture etag?

Yes good call. Actually the integ test I wrote caught that.

Fixed now.

Copy link
Contributor

@zoewangg zoewangg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's kick off integ tests to make sure they are passing if we haven't already

Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

@RanVaknin RanVaknin added this pull request to the merge queue Apr 21, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 22, 2025
@RanVaknin RanVaknin added this pull request to the merge queue Apr 22, 2025
Merged via the queue into master with commit ededdd5 Apr 22, 2025
31 of 33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants