Listing Datasets

curl --request POST \
  --url https://api.sutro.sh/list-datasets \
  --header 'Authorization: <authorization>'

{
  "datasets": [
    {
      "dataset_id": "dataset_12345",
      "name": "Training Data Q1 2024",
      "created_at": "2024-01-15T10:30:00Z",
      "file_count": 5,
      "total_size_bytes": 1048576,
      "schema": {
        "fields": [
          {"name": "input", "type": "string"},
          {"name": "output", "type": "string"},
          {"name": "category", "type": "string"}
        ]
      }
    },
    {
      "dataset_id": "dataset_12346",
      "name": "Evaluation Dataset",
      "created_at": "2024-01-20T14:15:00Z",
      "file_count": 2,
      "total_size_bytes": 524288,
      "schema": {
        "fields": [
          {"name": "prompt", "type": "string"},
          {"name": "response", "type": "string"}
        ]
      }
    }
  ]
}

POST /list-datasets

Using the API directly is not recommended for most users. Instead, we recommend using the Python SDK.

Lists all datasets you have access to.

Headers

Authorization (string, required)

Your Sutro API key, sent using the Key authentication scheme.
Format: Key YOUR_API_KEY
Example: Authorization: Key sk_live_abc123...
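As a minimal sketch of the header format described above, the helper below builds the Authorization header from an API key. The function name build_auth_headers and the environment variable SUTRO_API_KEY are assumptions made for illustration; they are not part of the Sutro API or SDK.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return request headers using the 'Key <api_key>' authentication scheme."""
    if not api_key:
        raise ValueError("API key must be a non-empty string")
    return {"Authorization": f"Key {api_key}"}


# SUTRO_API_KEY is a hypothetical environment variable name (an assumption);
# fall back to a placeholder so the snippet runs without configuration.
api_key = os.environ.get("SUTRO_API_KEY", "YOUR_API_KEY")
headers = build_auth_headers(api_key)
print(sorted(headers))
```

Reading the key from the environment keeps it out of source control; the resulting dictionary can be passed as the headers argument of an HTTP client call.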

Response

Returns a JSON object containing a list of datasets.

datasets (array)

A list of datasets you have access to. Each dataset object contains metadata about the dataset, including its dataset_id, name, creation time, file count, total size, and schema.

Code Examples

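import requests

# Replace YOUR_SUTRO_API_KEY with your actual API key
response = requests.post(
    'https://api.sutro.sh/list-datasets',
    headers={
        'Authorization': 'Key YOUR_SUTRO_API_KEY'
    },
    timeout=30,  # avoid waiting indefinitely on network problems
)

# Raise an exception on 4xx/5xx responses instead of parsing an error body
response.raise_for_status()

result = response.json()
datasets = result['datasets']
print(f"Found {len(datasets)} datasets:")

# Print a short summary of each dataset
for dataset in datasets:
    print(f"Dataset ID: {dataset['dataset_id']}")
    print(f"Name: {dataset['name']}")
    print(f"Created: {dataset['created_at']}")
    print(f"Files: {dataset['file_count']}")
    print(f"Size: {dataset['total_size_bytes']} bytes")
    print("---")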
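Raw byte counts such as 1048576 are hard to read at a glance. The following sketch formats total_size_bytes using binary units; format_size is a hypothetical helper written for this example, and the sample values are taken from the example response.

```python
def format_size(num_bytes: int) -> str:
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} {units[-1]}"  # not reached; kept for type checkers


# Sizes taken from the example response
sizes = {"dataset_12345": 1048576, "dataset_12346": 524288}
for dataset_id, size in sizes.items():
    print(f"{dataset_id}: {format_size(size)}")
print(f"Total: {format_size(sum(sizes.values()))}")
```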

Dataset Object Fields

Each dataset in the datasets array contains the following fields:

dataset_id: Unique identifier for the dataset
name: Human-readable name of the dataset
created_at: ISO 8601 timestamp of when the dataset was created (for example, 2024-01-15T10:30:00Z)
file_count: Number of files in the dataset
total_size_bytes: Total size of all files in the dataset (in bytes)
schema: Schema information including field names and types
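The fields listed above can be mapped onto typed objects for easier handling. This is a sketch under the assumption that every field shown in the example response is present; the Dataset and SchemaField classes and the parse_dataset function are hypothetical names, not part of the Sutro SDK.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SchemaField:
    name: str
    type: str


@dataclass(frozen=True)
class Dataset:
    dataset_id: str
    name: str
    created_at: datetime
    file_count: int
    total_size_bytes: int
    fields: tuple[SchemaField, ...]


def parse_dataset(raw: dict) -> Dataset:
    """Convert one entry of the datasets array into a Dataset object."""
    # Swap the trailing "Z" for an explicit UTC offset so that
    # datetime.fromisoformat also works on Python versions before 3.11.
    created = datetime.fromisoformat(raw["created_at"].replace("Z", "+00:00"))
    fields = tuple(SchemaField(f["name"], f["type"]) for f in raw["schema"]["fields"])
    return Dataset(
        dataset_id=raw["dataset_id"],
        name=raw["name"],
        created_at=created,
        file_count=raw["file_count"],
        total_size_bytes=raw["total_size_bytes"],
        fields=fields,
    )


# Sample entry copied from the example response
sample = {
    "dataset_id": "dataset_12345",
    "name": "Training Data Q1 2024",
    "created_at": "2024-01-15T10:30:00Z",
    "file_count": 5,
    "total_size_bytes": 1048576,
    "schema": {"fields": [
        {"name": "input", "type": "string"},
        {"name": "output", "type": "string"},
        {"name": "category", "type": "string"},
    ]},
}
dataset = parse_dataset(sample)
print(dataset.name, dataset.created_at.isoformat(), [f.name for f in dataset.fields])
```

Parsing created_at into a timezone-aware datetime makes it straightforward to compare or sort datasets by creation time.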

Notes

Datasets are returned in reverse chronological order (newest first)
The response includes metadata about each dataset to help with dataset management
Use the dataset_id from this response with other dataset-related endpoints
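A short sketch of how these notes might be applied on the client side: checking the ordering, sorting locally when order matters, and looking up a dataset_id by name to pass to other endpoints. The helper names and the in-memory sample list are assumptions for illustration; no request is made.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse a timestamp such as 2024-01-20T14:15:00Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_newest_first(datasets: list[dict]) -> bool:
    """Return True if the datasets are in reverse chronological order."""
    stamps = [parse_ts(d["created_at"]) for d in datasets]
    return all(a >= b for a, b in zip(stamps, stamps[1:]))


def find_dataset_id(datasets: list[dict], name: str) -> Optional[str]:
    """Return the dataset_id of the first dataset with the given name, if any."""
    for d in datasets:
        if d["name"] == name:
            return d["dataset_id"]
    return None


# Illustrative data using IDs and timestamps from the example response
datasets = [
    {"dataset_id": "dataset_12345", "name": "Training Data Q1 2024",
     "created_at": "2024-01-15T10:30:00Z"},
    {"dataset_id": "dataset_12346", "name": "Evaluation Dataset",
     "created_at": "2024-01-20T14:15:00Z"},
]

# Sort locally rather than relying on server-side ordering
newest_first = sorted(datasets, key=lambda d: parse_ts(d["created_at"]), reverse=True)
print(is_newest_first(newest_first))
print(newest_first[0]["dataset_id"])
print(find_dataset_id(datasets, "Evaluation Dataset"))
```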
