Google's storage cuts AI training time by 23%, lifts read throughput 4.8×
Training a large AI model is expensive. The GPU does the math, but it often waits for data.
That wait — idle compute time, burning budget and clock — has been a persistent drag on AI training efficiency. Google built a solution: Colossus, its distributed storage system, and a faster variant called Rapid Bucket that can deliver more than 15 terabytes per second of bandwidth to a single bucket in a single zone. On April 29, the company published GCSFS hooks that let PyTorch, the dominant machine learning framework, read from and write to that storage layer directly. The result was a 23 percent reduction in end-to-end training time and a 4.8× improvement in read throughput, according to Google's Developers Blog. Blocks and Files reported that a separate Google test found 50 percent less GPU idle time during multi-modal training workloads.
The key engineering move was adding bucket-type auto-detection to GCSFS, the Google Cloud Storage adapter for fsspec — an open-source library that gives Python code a standard interface for reading and writing files regardless of where they live. With the update, any PyTorch script that uses fsspec.open() to read training data will automatically route to Rapid Bucket if the target bucket is configured for it. No code changes, no new imports. The framework figures out which storage path to use.
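The "no code changes" claim is easy to picture with fsspec's in-memory backend standing in for a bucket. In the sketch below, the gs:// URL in the comment is hypothetical; the point is that the open() call a training script already makes is the same one that would route to Rapid Bucket after the update.

```python
import fsspec

# Sketch of the "no code changes" claim, using fsspec's in-memory backend
# in place of a real bucket. In a training job the URL would be something
# like gs://my-rapid-bucket/shard-0000.bin (bucket name hypothetical); with
# the GCSFS update, the identical open() call routes to the Rapid path when
# the target bucket is configured for it.
URL = "memory://shards/shard-0000.bin"

with fsspec.open(URL, "wb") as f:    # stage a fake training shard
    f.write(b"\x00" * 1024)

with fsspec.open(URL, "rb") as f:    # the data-loader side is unchanged
    data = f.read()

print(len(data))  # 1024
```

Swapping the `memory://` prefix for `gs://` is the only edit a real loader would need, which is the whole appeal of routing through fsspec rather than a Google-specific client.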
Rapid Bucket differs from standard Google Cloud Storage buckets in one consequential way: it is zonal rather than regional. A standard bucket replicates data across multiple availability zones in a region; a Rapid Bucket lives in a single zone and co-locates with the GPU compute that needs it. The tradeoff is that a zonal failure can knock out both the compute and the data simultaneously — a concentration risk that does not exist with regional replication. For training workloads, where the data is typically a copy that can be re-staged from a durable regional source, many teams will consider that tradeoff worth the throughput.
The technical mechanism underneath is gRPC bi-directional streaming, which replaces the HTTP REST API calls that standard Cloud Storage operations use. The GitHub documentation for the GCSFS integration describes ZonalFile, the file handler that wraps the gRPC API, as the component that routes read and write calls to the Rapid storage path. The integration is available in GCSFS version 2026.3.0.
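The routing decision itself reduces to a small dispatch on bucket type. The sketch below is purely illustrative: the class and function names are invented for this article and are not GCSFS internals, which live in the ZonalFile handler.

```python
from dataclasses import dataclass

@dataclass
class BucketInfo:
    """Minimal stand-in for bucket metadata (hypothetical, not a GCSFS type)."""
    name: str
    storage_class: str  # "RAPID" marks a zonal Rapid Bucket in this sketch

def pick_transport(bucket: BucketInfo) -> str:
    """Send Rapid buckets down the gRPC streaming path, everything else to REST."""
    if bucket.storage_class == "RAPID":
        return "grpc-bidi-stream"  # ZonalFile-style handler
    return "http-rest"             # standard Cloud Storage path

print(pick_transport(BucketInfo("training-shards", "RAPID")))   # grpc-bidi-stream
print(pick_transport(BucketInfo("model-archive", "STANDARD")))  # http-rest
```

The real library makes this decision once per bucket and caches it, which is why a script using fsspec.open() sees no difference beyond latency.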
The 23 percent training-time figure comes from Google's own benchmarking, run against a standard regional bucket as the baseline. The company has not published the workload characteristics, model architecture, or cluster size for that test, which makes it hard to predict whether the same gains apply to a different training run. What is clear is that the read-throughput improvement — 4.8× for both sequential and random access patterns — applies to the storage layer itself, independent of what the model does with the data once loaded.
The I/O race inside AI infrastructure has been heating up for years. Nvidia's GPU performance gets the headlines, but the pipeline feeding data to those GPUs has been a consistent bottleneck. Cloud providers have been building purpose-built storage to differentiate their GPU offerings, and Google is now making the case that its storage layer is a reason to train on Google Cloud rather than elsewhere. Amazon Web Services has S3 Express One Zone, its own zonal storage product, and Microsoft Azure has equivalent offerings. The competitive pressure on all three is to make storage-to-GPU bandwidth a selling point, not an afterthought.
There is a wrinkle for teams with multi-cloud ambitions. Rapid Bucket is a Google-specific product. The GCSFS hooks that make it work are open-source, but the performance gains require a Google Cloud bucket configured as Rapid, and the gRPC transport is Google's implementation. A team that writes training code to use Rapid Bucket transparently through fsspec is writing Google Cloud-specific code, even if it does not look like it in the script. Moving that workload to AWS or Azure would require changing the storage backend and losing the performance gains.
What to watch next is whether the fsspec ecosystem — which also includes adapters for S3, Azure Blob Storage, Hadoop, and several other systems — adds equivalent high-bandwidth paths for those platforms. If Amazon ships a matching integration for S3 Express One Zone through the same fsspec interface, the multi-cloud training story gets simpler. If it does not, the I/O advantage stays Google-specific, and the storage layer becomes another factor in cloud selection for AI training workloads.
The GCSFS integration is available now. Teams running PyTorch on Google Cloud can test it by upgrading to GCSFS 2026.3.0 and pointing an existing data loader at a Rapid Bucket. Whether the performance gains survive contact with a real training workload — rather than a benchmark — is the open question.