skypilot-users

What is a potential solution for handling large, frequently changing datasets in computer vision projects?

Georgehwp is looking for a way to handle large datasets in computer vision projects where the data changes frequently, for example through daily appends. Romil Bhardwaj suggests using SkyPilot's built-in bucket mounting feature to stream data directly from the bucket, and notes that this comes with performance tradeoffs compared to using s5cmd. Georgehwp adds that, in his experience, s5cmd was faster than the default usage of the AWS S3 CLI.
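As a rough illustration of the bucket mounting suggestion, the sketch below renders a minimal SkyPilot task file that mounts a bucket in `MOUNT` mode so the training job reads files from it on demand. The bucket name `s3://example-cv-dataset`, the mount path `/data`, and the `train.py` command are hypothetical placeholders, not details from the thread.

```python
def render_mount_task(
    bucket_uri: str,
    mount_path: str = "/data",
    train_cmd: str = "python train.py --data-dir /data",
) -> str:
    """Return the text of a SkyPilot task YAML that mounts a bucket.

    All argument values are illustrative; substitute your own bucket,
    mount point, and training command.
    """
    lines = [
        "file_mounts:",
        f"  {mount_path}:",
        f"    source: {bucket_uri}",
        "    mode: MOUNT",  # stream from the bucket instead of copying it up front
        "",
        "run: |",
        f"  {train_cmd}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical bucket name, used only for demonstration.
    with open("task.yaml", "w") as f:
        f.write(render_mount_task("s3://example-cv-dataset"))
```

The resulting `task.yaml` could then be launched with `sky launch task.yaml`. Because the bucket is mounted rather than copied, files appended to it each day should become visible to later runs without a separate download step, at the cost of the read-performance tradeoff mentioned above.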

Georgehwp

Asked on Feb 19, 2024

  • Consider using SkyPilot's built-in bucket mounting feature to stream data from the bucket.
  • Weigh the performance tradeoffs of mounting against tools like s5cmd.
  • Experiment with s5cmd, which retrieved datasets faster than the default usage of the AWS S3 CLI in Georgehwp's experience.
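
For the s5cmd route, one possible approach with daily appends is to sync only new or changed objects to local disk before training. The following is a minimal sketch, assuming s5cmd is installed and AWS credentials are configured; the bucket prefix and local directory are hypothetical, and the helper names are made up for illustration.

```python
import shutil
import subprocess


def build_s5cmd_sync(bucket_prefix: str, local_dir: str) -> list[str]:
    """Build an `s5cmd sync` command that copies objects under a prefix.

    The trailing wildcard selects every object under the prefix, and the
    trailing slash marks the destination as a directory.
    """
    src = bucket_prefix.rstrip("/") + "/*"
    dst = local_dir.rstrip("/") + "/"
    return ["s5cmd", "sync", src, dst]


def sync_dataset(bucket_prefix: str, local_dir: str) -> None:
    """Run the sync, failing early if s5cmd is not on PATH."""
    if shutil.which("s5cmd") is None:
        raise RuntimeError("s5cmd not found on PATH")
    subprocess.run(build_s5cmd_sync(bucket_prefix, local_dir), check=True)


if __name__ == "__main__":
    # Hypothetical bucket prefix and local path, for demonstration only.
    sync_dataset("s3://example-cv-dataset/images", "/tmp/images")
```

Running the sync again after each daily append should transfer only objects that are missing or changed locally, which keeps repeated runs cheaper than a full re-download.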