feat(storage/transfermanager): prototype #10045

Open: wants to merge 12 commits into main (base branch)

BrennaEpp
Contributor

No description provided.

@product-auto-label bot added the "api: storage" label (Issues related to the Cloud Storage API) on Apr 25, 2024
Contributor

@tritone left a comment

A few initial comments; overall this looks like a good start.

type Downloader struct {
	client *storage.Client
	config *transferManagerConfig
	work   chan *DownloadObjectInput // Piece of work to be executed.
Contributor

Presumably this should be send-only and output should be receive-only?

Contributor Author

We send and receive on both channels in different places in the downloader. Unidirectional channels could be used in subcomponents, or if we were providing the channel to the user, but I don't see how we could implement the struct fields with unidirectional channels: if we only received from output, who would send us the output (and vice versa for work)?
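
For illustration, here is a minimal sketch of the split described in this thread: the owning struct keeps bidirectional channels, while a subcomponent narrows them in its parameter list. The `job`, `result`, `pool`, `worker`, and `run` names are hypothetical stand-ins and are not taken from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// job and result are hypothetical stand-ins for the PR's input and output types.
type job struct {
	name string
}

type result struct {
	name string
	ok   bool
}

// pool owns both channels, so its fields stay bidirectional: the pool itself
// sends on work (when enqueuing) and receives from output (when collecting).
type pool struct {
	work   chan job
	output chan result
}

// worker only receives work and only sends results, so its parameters can be
// narrowed. A bidirectional channel converts implicitly at the call site.
func worker(work <-chan job, output chan<- result, wg *sync.WaitGroup) {
	defer wg.Done()
	for j := range work {
		output <- result{name: j.name, ok: true}
	}
}

// run enqueues one job per name, processes them with n workers, and returns
// the number of successful results.
func run(names []string, n int) int {
	p := pool{
		work:   make(chan job),
		output: make(chan result, len(names)), // buffered so workers never block on send
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go worker(p.work, p.output, &wg)
	}
	for _, name := range names {
		p.work <- job{name: name}
	}
	close(p.work)
	wg.Wait()
	close(p.output)

	done := 0
	for r := range p.output {
		if r.ok {
			done++
		}
	}
	return done
}

func main() {
	fmt.Println(run([]string{"a", "b", "c"}, 2))
}
```

The compiler then rejects a send on `work` or a receive on `output` inside `worker`, which is the safety the reviewer is asking about, without changing the struct fields.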

	"google.golang.org/api/iterator"
)

// Downloader manages parallel download operations from a Cloud Storage bucket.
Contributor

Technically, the bucket can be specified per object. Let's just say that it manages a set of parallelized downloads.

Contributor Author

Makes sense, that wording is clearer.

storage/transfermanager/downloader.go: 5 resolved threads (4 outdated)
	return crc32cHash.Sum32(), w.Close()
}

type testWriter struct {
Contributor

This doesn't have to be in this PR, but we should add a version of this (well, some kind of DownloaderBuffer that implements WriterAt) to the library.

Contributor Author

Yeah, I'd have to look more into that. This is a very barebones implementation that is likely not at all efficient (and doesn't really work as a WriterAt yet).

Contributor

Yeah, I think that's fine for now.
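
As a rough idea of what such a buffer could look like, here is a minimal in-memory sketch that satisfies `io.WriterAt`. The `downloadBuffer` type and its `Bytes` method are hypothetical; this is neither the PR's `testWriter` nor an existing library type. It assumes the whole object fits in memory and makes no attempt at efficiency.

```go
package main

import (
	"fmt"
	"io"
	"sync"
)

// downloadBuffer is a hypothetical in-memory io.WriterAt. It is safe for
// concurrent use, which matters if several ranges are written in parallel.
type downloadBuffer struct {
	mu  sync.Mutex
	buf []byte
}

// Compile-time check that downloadBuffer implements io.WriterAt.
var _ io.WriterAt = (*downloadBuffer)(nil)

// WriteAt copies p into the buffer starting at off, growing the buffer and
// zero-filling any gap when the write lands past the current end.
func (b *downloadBuffer) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, fmt.Errorf("downloadBuffer: negative offset %d", off)
	}
	b.mu.Lock()
	defer b.mu.Unlock()

	end := int(off) + len(p)
	if end > len(b.buf) {
		b.buf = append(b.buf, make([]byte, end-len(b.buf))...)
	}
	copy(b.buf[off:end], p)
	return len(p), nil
}

// Bytes returns a copy of the data written so far.
func (b *downloadBuffer) Bytes() []byte {
	b.mu.Lock()
	defer b.mu.Unlock()
	return append([]byte(nil), b.buf...)
}

func main() {
	var b downloadBuffer
	// Writes may arrive out of order, as they would with parallel range reads.
	b.WriteAt([]byte("world"), 6)
	b.WriteAt([]byte("hello "), 0)
	fmt.Printf("%q\n", b.Bytes()) // prints "hello world"
}
```

Because each write carries its own offset, the result does not depend on the order in which the ranges complete.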

storage/transfermanager/option.go: 1 resolved thread (outdated)
}

// Start workers in background.
for i := 0; i < d.config.numWorkers; i++ {
Contributor

Presumably we could optimize this by spinning up workers only as needed, when objects are enqueued? It doesn't have to be in this PR, though.

Contributor Author

Sure, though I'm not sure how much of an optimization that would be. I guess it depends on the number of workers.

Contributor

Yeah, that's something we can test out later.

@BrennaEpp marked this pull request as ready for review on May 13, 2024, 23:59
@BrennaEpp requested review from a team as code owners on May 14, 2024, 00:01
@BrennaEpp changed the title from "draft(storage/transfermanager): prototype" to "feat(storage/transfermanager): prototype" on May 15, 2024
2 participants