POST /upload-to-dataset
Uploading a File
curl --request POST \
  --url https://api.sutro.sh/upload-to-dataset \
  --header 'Authorization: Key <your_api_key>' \
  --form 'dataset_id=<string>' \
  --form 'file=@data.parquet'
Using the API directly is not recommended for most users. Instead, we recommend using the Python SDK.
Upload a file to a dataset.

Usage Notes

  • Currently, only parquet files are supported; Snappy compression is supported.
  • You can only upload one file at a time via the API, so uploading multiple files requires multiple requests. The Python SDK supports uploading multiple files at once.
  • All files in a dataset must have the same schema. Files whose schemas do not match will be rejected.
  • File names must be unique. If you upload a file whose name already exists in the dataset, it will be rejected.
  • Upload order is preserved: if you upload multiple files to a dataset, they are added in the order you provide them.
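
The one-file-per-request and ordering rules above can be sketched as a small helper that loops over file paths and uploads them sequentially. This is an illustrative sketch, not the official SDK; the `upload_files_in_order` function and its parameters are hypothetical, while the endpoint URL and `Key` auth scheme come from this page.

```python
import requests

API_URL = "https://api.sutro.sh/upload-to-dataset"

def upload_files_in_order(api_key, dataset_id, paths):
    """Upload Parquet files one request at a time; the dataset preserves
    the order in which the files are uploaded."""
    file_ids = []
    for path in paths:
        with open(path, "rb") as f:
            resp = requests.post(
                API_URL,
                headers={"Authorization": f"Key {api_key}"},
                # dataset_id goes in the form data, alongside the file part.
                data={"dataset_id": dataset_id},
                files={"file": f},
            )
        resp.raise_for_status()
        file_ids.append(resp.json()["file_id"])
    return file_ids
```

Because each file is a separate request, a mid-batch failure leaves earlier files already uploaded; `raise_for_status()` stops the loop at the first error so you can resume from the failed file.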

Request Body

dataset_id
string
required
The ID of the dataset to upload the file to
file
file
required
The file to upload (must be a parquet file)

Headers

Authorization
string
required
Your Sutro API key, using the Key authentication scheme.
Format: Key YOUR_API_KEY
Example: Authorization: Key sk_live_abc123...

Response

Returns the file ID of the uploaded file.
file_id
string
The unique identifier for the uploaded file
{
  "file_id": "file_abc123def456"
}

Code Examples

import requests

# Upload a parquet file as multipart/form-data.
# Note: dataset_id is passed via `data=`, not `json=` — requests ignores the
# `json` argument when `files` is present, so a `json=` payload would be dropped.
with open('data.parquet', 'rb') as file:
    response = requests.post(
        'https://api.sutro.sh/upload-to-dataset',
        headers={
            'Authorization': 'Key YOUR_SUTRO_API_KEY'
        },
        data={
            'dataset_id': 'dataset_12345'
        },
        files={
            'file': file
        }
    )

result = response.json()
if 'file_id' in result:
    print(f"File uploaded successfully: {result['file_id']}")
else:
    print(f"Upload failed: {result.get('error', 'Unknown error')}")

Important Considerations

  • File Format: Only parquet files are currently supported
  • Schema Consistency: All files in a dataset must share the same schema
  • Unique Names: File names must be unique within each dataset
  • Ordering: Files are processed in the order they are uploaded
  • Compression: Snappy compression is supported for parquet files
  • Single File Limit: API supports one file per request (use Python SDK for batch uploads)