# Uploading a File

> Upload a file to a dataset

<Warning>Using the API directly is not recommended for most users. Instead, we recommend using the [Python SDK](/python-sdk/setup).</Warning>

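Uploads a single Parquet file to a dataset. The request is a `multipart/form-data` POST to `/upload-to-dataset`, with the dataset ID and the file sent as form fields.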

## Usage Notes

* Only Parquet files are currently supported. Snappy-compressed Parquet files are accepted.
* The API accepts one file per request. To upload multiple files, make one request per file. The Python SDK supports uploading multiple files at once.
* All files in a dataset must share the same schema. A file whose schema does not match is rejected.
* File names must be unique within a dataset. A file whose name already exists in the dataset is rejected.
* Upload order is preserved. If you upload multiple files to a dataset, they are added in the order you upload them.
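
The notes above suggest a simple client-side routine: confirm that file names are unique, then send one request per file in the order you want them stored. The following is a minimal sketch under stated assumptions, not part of the Sutro API or SDK. The helper names `find_duplicate_names` and `upload_files_in_order` are hypothetical, the `post` parameter exists only so the function can be exercised with a stub instead of the network, and the code assumes the dataset's file name is the base name of the uploaded file.

```python
import os
from collections import Counter


def find_duplicate_names(paths):
    """Return the base names that occur more than once in paths."""
    counts = Counter(os.path.basename(p) for p in paths)
    return sorted(name for name, n in counts.items() if n > 1)


def upload_files_in_order(paths, dataset_id, api_key, post=None):
    """Upload each file with its own request, in list order.

    Returns the file IDs in upload order. Raises ValueError on duplicate
    names and RuntimeError on the first failed upload.
    """
    duplicates = find_duplicate_names(paths)
    if duplicates:
        raise ValueError(f"Duplicate file names: {duplicates}")

    if post is None:
        import requests  # imported lazily so a stub can be injected instead
        post = requests.post

    file_ids = []
    for path in paths:
        with open(path, "rb") as fh:
            response = post(
                "https://api.sutro.sh/upload-to-dataset",
                headers={"Authorization": f"Key {api_key}"},
                data={"dataset_id": dataset_id},  # form field, not JSON
                files={"file": fh},
            )
        result = response.json()
        if "file_id" not in result:
            # Stop at the first failure so later files are not added out of order
            raise RuntimeError(f"{path}: {result.get('error', 'Unknown error')}")
        file_ids.append(result["file_id"])
    return file_ids
```

Calling `upload_files_in_order(["part-0.parquet", "part-1.parquet"], "dataset_12345", "YOUR_SUTRO_API_KEY")` would send two requests, one after the other. Stopping at the first error is a design choice for this sketch: it keeps a rejected file from leaving a gap in the intended order.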

## Request Body

<ParamField body="dataset_id" type="string" required>
  The ID of the dataset to upload the file to
</ParamField>

<ParamField body="file" type="file" required>
  The file to upload (must be a parquet file)
</ParamField>

## Headers

<ParamField header="Authorization" type="string" required>
  Your Sutro API key, sent using the `Key` authentication scheme.

  Format: `Key YOUR_API_KEY`

  Example: `Authorization: Key sk_live_abc123...`
</ParamField>
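
To keep the key out of source code, the header can be built from an environment variable. This is a small sketch of that idea. The function name `build_auth_headers` and the variable name `SUTRO_API_KEY` are assumptions made for illustration, not names defined by the API.

```python
import os


def build_auth_headers(api_key=None):
    """Return the Authorization header in the documented `Key YOUR_API_KEY` format."""
    # SUTRO_API_KEY is an assumed variable name, used here only for illustration
    key = api_key if api_key is not None else os.environ.get("SUTRO_API_KEY", "")
    key = key.strip()
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Key {key}"}
```

The returned dictionary can be passed as the `headers` argument of an HTTP client call.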

## Response

Returns the file ID of the uploaded file.

<ResponseField name="file_id" type="string">
  The unique identifier for the uploaded file
</ResponseField>

<ResponseExample>
  ```json Successful Upload theme={null}
  {
    "file_id": "file_abc123def456"
  }
  ```

  ```json Error Response theme={null}
  {
    "error": "File with this name already exists in the dataset"
  }
  ```
</ResponseExample>
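
The two response shapes above can be handled in one place by returning the file ID on success and raising on anything else. The sketch below assumes that only the `file_id` and `error` keys shown in the examples matter. `UploadError` and `extract_file_id` are hypothetical names.

```python
class UploadError(Exception):
    """Raised when a response body does not contain a file_id."""


def extract_file_id(payload):
    """Return the file_id from a parsed JSON response, or raise UploadError."""
    if not isinstance(payload, dict):
        raise UploadError("Unexpected response body")
    file_id = payload.get("file_id")
    if isinstance(file_id, str) and file_id:
        return file_id
    # Fall back to a generic message when no error text is present
    raise UploadError(payload.get("error", "Unknown error"))
```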

## Code Examples

<CodeGroup>
  ```python Python theme={null}
  import requests

  # Upload a parquet file
  with open('data.parquet', 'rb') as file:
      response = requests.post(
          'https://api.sutro.sh/upload-to-dataset',
          headers={
              'Authorization': 'Key YOUR_SUTRO_API_KEY'
          },
          # Send dataset_id as a form field; requests ignores json= when files= is set
          data={
              'dataset_id': 'dataset_12345'
          },
          files={
              'file': file
          }
      )

  result = response.json()
  if 'file_id' in result:
      print(f"File uploaded successfully: {result['file_id']}")
  else:
      print(f"Upload failed: {result.get('error', 'Unknown error')}")
  ```

  ```javascript Node.js theme={null}
  // Requires Node.js 18+ (built-in fetch, FormData, and Blob)
  const fs = require('fs');

  async function uploadFile() {
    // Build the multipart form: dataset_id as a field, the parquet file as a Blob
    const form = new FormData();
    form.append('dataset_id', 'dataset_12345');
    form.append('file', new Blob([fs.readFileSync('data.parquet')]), 'data.parquet');

    // fetch sets the multipart Content-Type header (with boundary) automatically
    const response = await fetch('https://api.sutro.sh/upload-to-dataset', {
      method: 'POST',
      headers: {
        'Authorization': 'Key YOUR_SUTRO_API_KEY'
      },
      body: form
    });

    const result = await response.json();
    if (result.file_id) {
      console.log(`File uploaded successfully: ${result.file_id}`);
    } else {
      console.log(`Upload failed: ${result.error || 'Unknown error'}`);
    }
  }

  uploadFile();
  ```

  ```curl cURL theme={null}
  curl -X POST https://api.sutro.sh/upload-to-dataset \
    -H "Authorization: Key YOUR_SUTRO_API_KEY" \
    -F "dataset_id=dataset_12345" \
    -F "file=@data.parquet"
  ```
</CodeGroup>

## Important Considerations

* **File Format**: Only Parquet files are currently supported
* **Schema Consistency**: All files in a dataset must share the same schema
* **Unique Names**: File names must be unique within each dataset
* **Ordering**: Files are added to the dataset in the order they are uploaded
* **Compression**: Snappy compression is supported for Parquet files
* **Single File Limit**: The API supports one file per request (use the Python SDK for batch uploads)
