POST /list-datasets
Listing Datasets
curl --request POST \
  --url https://api.sutro.sh/list-datasets \
  --header 'Authorization: <authorization>'
{
  "datasets": [
    {
      "dataset_id": "dataset_12345",
      "name": "Training Data Q1 2024",
      "created_at": "2024-01-15T10:30:00Z",
      "file_count": 5,
      "total_size_bytes": 1048576,
      "schema": {
        "fields": [
          {"name": "input", "type": "string"},
          {"name": "output", "type": "string"},
          {"name": "category", "type": "string"}
        ]
      }
    },
    {
      "dataset_id": "dataset_12346",
      "name": "Evaluation Dataset",
      "created_at": "2024-01-20T14:15:00Z",
      "file_count": 2,
      "total_size_bytes": 524288,
      "schema": {
        "fields": [
          {"name": "prompt", "type": "string"},
          {"name": "response", "type": "string"}
        ]
      }
    }
  ]
}
Using the API directly is not recommended for most users. Instead, we recommend using the Python SDK.
List all datasets.

Headers

Authorization
string
required
Your Sutro API key, sent using the Key authentication scheme.
Format: Key YOUR_API_KEY
Example: Authorization: Key sk_live_abc123...
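
As a minimal sketch of the header format described here, the helper below builds the Authorization header from an API key. The function name build_auth_headers and the environment variable SUTRO_API_KEY are illustrative assumptions, not part of the Sutro API or SDK.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return request headers using the 'Key <api key>' scheme."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Key {api_key}"}


# SUTRO_API_KEY is a hypothetical variable name chosen for this example;
# the placeholder is used when the variable is not set.
api_key = os.environ.get("SUTRO_API_KEY", "YOUR_API_KEY")
headers = build_auth_headers(api_key)
print(sorted(headers))
```

Reading the key from the environment keeps it out of source files; any other secret store would work equally well.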

Response

Returns a JSON object containing a list of datasets.
datasets
array
A list of datasets you have access to. Each dataset object contains metadata about the dataset including dataset_id, name, creation time, and other relevant information.

Code Examples

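import requests

# Send the request; the Authorization header uses the "Key" scheme.
response = requests.post(
    'https://api.sutro.sh/list-datasets',
    headers={
        'Authorization': 'Key YOUR_SUTRO_API_KEY'
    },
    timeout=30,  # avoid waiting indefinitely on network problems
)
# Raise an exception on 4xx/5xx responses instead of parsing an error body.
response.raise_for_status()

result = response.json()
print(f"Found {len(result['datasets'])} datasets:")

# Print the metadata returned for each dataset.
for dataset in result['datasets']:
    print(f"Dataset ID: {dataset['dataset_id']}")
    print(f"Name: {dataset['name']}")
    print(f"Created: {dataset['created_at']}")
    print(f"Files: {dataset['file_count']}")
    print(f"Size: {dataset['total_size_bytes']} bytes")
    print("---")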

Dataset Object Fields

Each dataset in the datasets array contains the following fields:
  • dataset_id: Unique identifier for the dataset
  • name: Human-readable name of the dataset
  • created_at: ISO 8601 timestamp of when the dataset was created (for example, 2024-01-15T10:30:00Z)
  • file_count: Number of files in the dataset
  • total_size_bytes: Total size of all files in the dataset (in bytes)
  • schema: Schema information, given as a fields list in which each entry has a name and a type
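
The fields listed above can be mapped onto a typed object on the client side. The sketch below is one possible approach, not part of the API or the Python SDK: the Dataset class and parse_dataset function are hypothetical names, and the sample dictionary copies one entry from the example response on this page.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dataset:
    dataset_id: str
    name: str
    created_at: datetime
    file_count: int
    total_size_bytes: int
    field_types: dict  # maps schema field name -> type


def parse_dataset(raw: dict) -> Dataset:
    """Convert one entry of the datasets array into a Dataset."""
    schema_fields = raw.get("schema", {}).get("fields", [])
    return Dataset(
        dataset_id=raw["dataset_id"],
        name=raw["name"],
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        created_at=datetime.fromisoformat(raw["created_at"].replace("Z", "+00:00")),
        file_count=raw["file_count"],
        total_size_bytes=raw["total_size_bytes"],
        field_types={f["name"]: f["type"] for f in schema_fields},
    )


# One entry taken from the example response.
sample = {
    "dataset_id": "dataset_12345",
    "name": "Training Data Q1 2024",
    "created_at": "2024-01-15T10:30:00Z",
    "file_count": 5,
    "total_size_bytes": 1048576,
    "schema": {
        "fields": [
            {"name": "input", "type": "string"},
            {"name": "output", "type": "string"},
            {"name": "category", "type": "string"},
        ]
    },
}

dataset = parse_dataset(sample)
size_mib = dataset.total_size_bytes / (1024 * 1024)
print(f"{dataset.name}: {dataset.file_count} files, {size_mib:.1f} MiB")
```

Parsing created_at into a datetime makes later comparisons and sorting straightforward, and flattening schema into a name-to-type mapping simplifies field lookups.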

Notes

  • Datasets are returned in reverse chronological order (newest first)
  • The response includes metadata about each dataset to help with dataset management
  • Use the dataset_id from this response with other dataset-related endpoints
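
Two common follow-up steps are picking the most recent dataset and looking up a dataset_id by name for use with other endpoints. The sketch below shows both on data copied from the example response; newest_first and find_dataset_id are hypothetical helper names. The example response on this page lists the older dataset first, so the sketch sorts by created_at explicitly rather than depending on the order of the response.

```python
from datetime import datetime


def parse_created_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-15T10:30:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def newest_first(datasets: list) -> list:
    """Return a new list sorted by created_at, most recent first."""
    return sorted(
        datasets,
        key=lambda d: parse_created_at(d["created_at"]),
        reverse=True,
    )


def find_dataset_id(datasets: list, name: str):
    """Return the dataset_id of the first dataset with this name, or None."""
    for d in datasets:
        if d["name"] == name:
            return d["dataset_id"]
    return None


# Trimmed entries from the example response.
datasets = [
    {"dataset_id": "dataset_12345", "name": "Training Data Q1 2024",
     "created_at": "2024-01-15T10:30:00Z"},
    {"dataset_id": "dataset_12346", "name": "Evaluation Dataset",
     "created_at": "2024-01-20T14:15:00Z"},
]

ordered = newest_first(datasets)
eval_id = find_dataset_id(datasets, "Evaluation Dataset")
print(ordered[0]["dataset_id"], eval_id)
```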