skypilot-users
How can I improve the loading speed of torch.load from an S3 bucket mounted via SkyPilot on multiple nodes?
Jason Krone reports that torch.load(checkpoint_path) hangs when loading from an S3 bucket mounted via SkyPilot on multiple nodes. Zongheng Yang suggests using rclone as an alternative to the native MOUNT mode to potentially improve R/W speeds.
Jason Krone
Asked on Mar 22, 2024
- Consider using rclone as an alternative to the native MOUNT mode for better R/W speeds.
- This may improve the loading speed of torch.load from a mounted S3 bucket on multiple nodes.
- While there's no benchmark available yet, other users have reported potential improvements with rclone.
Example:
rclone mount remote:path /path/to/mountpoint --vfs-cache-mode writes
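Since no benchmark is available yet, one way to compare the two mount types yourself is to time a plain sequential read of the same checkpoint file under each mount point. The following is a minimal sketch, not something from the thread: the function name read_throughput_mb_s and the 16 MiB chunk size are illustrative assumptions, and it measures raw file reads only, not torch.load itself.

```python
import sys
import time


def read_throughput_mb_s(path: str, chunk_size: int = 16 * 1024 * 1024) -> float:
    """Sequentially read `path` and return the observed throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a near-zero elapsed time on very small files.
    return (total_bytes / 1e6) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Hypothetical usage: pass the checkpoint path as seen under each mount
    # point, e.g. one path on the native MOUNT and one on the rclone mount.
    for checkpoint_path in sys.argv[1:]:
        mb_s = read_throughput_mb_s(checkpoint_path)
        print(f"{checkpoint_path}: {mb_s:.1f} MB/s")
```

Repeated reads of the same file may be served from a local cache, so the first read on a fresh node is likely the more representative number. Running the script on every node at the same time is closer to the multi-node case described in the question.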
Mar 24, 2024 (edited)