Datasets API Reference

When working with larger amounts of data, you may want to handle uploads, downloads, and other preparation separately from job submission. You can do this using the Datasets API. The Datasets API is also available via the Python SDK, and certain functionality is available through the CLI.

Currently, the Datasets API is limited to columnar data and supports only Parquet files. In the future, we’ll add support for other file formats and more flexible data types, as well as the ability to create datasets from external storage.
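
Because only Parquet files are accepted, it can help to sanity-check a file locally before uploading it. The sketch below is not part of the Datasets API or the SDK; `looks_like_parquet` is a hypothetical helper that uses only the Python standard library. It relies on the Parquet file format's convention that a file begins and ends with the 4-byte marker `PAR1`. It does not validate the schema or footer metadata, and Parquet files written with an encrypted footer use a different marker and would be rejected by this check.

```python
from pathlib import Path

# 4-byte marker found at the start and end of a standard Parquet file.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file begins and ends with the Parquet magic bytes.

    Hypothetical helper for illustration; a cheap pre-upload check only.
    """
    p = Path(path)
    # Smallest plausible file: header magic + 4-byte footer length + footer magic.
    if not p.is_file() or p.stat().st_size < 12:
        return False
    with p.open("rb") as f:
        head = f.read(4)
        f.seek(-4, 2)  # seek to 4 bytes before end of file (2 == os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

A caller might run `looks_like_parquet("my_data.parquet")` (the file name here is a placeholder) and skip the upload when it returns `False`, for example when a CSV file has been given a `.parquet` extension by mistake.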