Python SDK

The Python SDK provides a Pythonic interface to the Sutro API. For prototyping, the SDK and CLI are often the most convenient way to work with Sutro. See the installation guide to install the SDK.

Basic Methods

Setting your API key

When you initialize the SDK, you can set your API key by calling the set_api_key method. Alternatively, you can set it by running the sutro login command in the CLI.
Set the API key for the Sutro API.
Parameters:
  • api_key (str): The API key to set.
Returns: None

Getting quotas

Get your current quotas.
Returns: list: A list of quotas, one per priority level. Each entry contains row_quota and token_quota.
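As a rough illustration of how the returned list might be consumed, the sketch below picks the first priority level whose quotas can accommodate a job. It assumes each quota is a dict with row_quota and token_quota keys and that list position corresponds to priority level. The sample values and the helper name first_fitting_priority are hypothetical, not SDK output.

```python
from typing import Dict, List, Optional

# Illustrative data only: the shape (a list of dicts, one per priority level)
# is an assumption based on the description above, and the numbers are made up.
sample_quotas: List[Dict[str, int]] = [
    {"row_quota": 1_000, "token_quota": 500_000},
    {"row_quota": 100_000, "token_quota": 50_000_000},
]


def first_fitting_priority(
    quotas: List[Dict[str, int]], rows: int, tokens: int
) -> Optional[int]:
    """Return the index of the first priority level whose quotas cover the job."""
    for priority, quota in enumerate(quotas):
        if rows <= quota["row_quota"] and tokens <= quota["token_quota"]:
            return priority
    return None  # no priority level can accommodate the job


print(first_fitting_priority(sample_quotas, rows=5_000, tokens=1_000_000))
```

A check like this can run before submitting a job, so that an oversized request is caught locally rather than rejected by the API.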

Job Methods

Listing jobs

List all jobs associated with the API key.
Returns: list: A list of job details.

Getting job status

Get the status of a job by its ID.
Parameters:
  • job_id (str): The ID of the job to retrieve the status for.
Returns: dict: The status of the job.
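A common pattern is to poll the status until the job reaches a terminal state. The sketch below is deliberately generic: fetch_status stands in for whichever SDK call returns the status dict, and the "status" key and the terminal state names are assumptions, since the structure of the returned dict is not specified here.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_job(
    fetch_status: Callable[[str], Dict],
    job_id: str,
    terminal_states: Iterable[str] = ("SUCCEEDED", "FAILED", "CANCELLED"),  # assumed names
    poll_interval: float = 5.0,
    max_polls: int = 120,
) -> Dict:
    """Poll fetch_status(job_id) until the job reports a terminal state."""
    terminal = set(terminal_states)
    for attempt in range(max_polls):
        status = fetch_status(job_id)
        # The "status" key is an assumption about the dict's layout.
        if status.get("status") in terminal:
            return status
        if attempt < max_polls - 1:
            time.sleep(poll_interval)  # wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the status call in as a function keeps the helper independent of the SDK, so it can be exercised with a stub before being pointed at real jobs.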

Getting job results

Get the results of a job by its ID.
Parameters:
  • job_id (str): The ID of the job to retrieve the results for.
  • include_inputs (bool, optional): Whether to include the inputs in the results. Defaults to False.
  • include_cumulative_logprobs (bool, optional): Whether to include the cumulative logprobs in the results. Defaults to False.
  • with_original_df (Union[pl.DataFrame, pd.DataFrame], optional): Original DataFrame to join results with. Defaults to None.
  • output_column (str, optional): Name of the column containing results. Defaults to "inference_result".
Returns: Union[pl.DataFrame, pd.DataFrame]: Results as a DataFrame.
  • If with_original_df is provided: Returns the same type as the input DataFrame with results added as a new column
  • If with_original_df is None: Returns a polars DataFrame by default
The DataFrame will contain:
  • inputs column (if include_inputs=True). Each cell contains the input string given to the model.
  • inference_result column (or custom name via output_column)
  • cumulative_logprobs column (if include_cumulative_logprobs=True)
Example:
# Get just the results
results = sutro.get_job_results(job_id)
# Returns: pl.DataFrame with one column 'inference_result'

# Get results with inputs
results = sutro.get_job_results(job_id, include_inputs=True)
# Returns: pl.DataFrame with columns ['inputs', 'inference_result']

# Add results back to original DataFrame
df_with_results = sutro.get_job_results(job_id, with_original_df=original_df)
# Returns: Same type as original_df with 'inference_result' column added.
# Matches the return shape of .infer(...) when stay_attached=True.

Cancelling jobs

Cancel a job by its ID.
Parameters:
  • job_id (str): The ID of the job to cancel.
Returns: dict: The status of the job cancellation.

Dataset Methods

Creating a dataset

Create a new internal dataset.
Returns: dict: A dictionary containing the dataset ID.

Listing all datasets

List all datasets.
Returns: list: A list of dataset IDs.

Listing all files in a dataset

List all files in a dataset.
Parameters:
  • dataset_id (str): The ID of the dataset to list the files in.
Returns: list: A list of file names in the dataset.

Uploading files to a dataset

Upload files to a dataset.
Accepts a dataset ID and one or more file paths. If only a single argument is provided, it is interpreted as the file paths.
Parameters:
  • dataset_id (Union[List[str], str], optional): The ID of the dataset to upload the files to. If not provided, the files will be uploaded to a new dataset.
  • file_paths (Union[List[str], str], optional): A file path or a list of file paths to upload.
Returns: list: A list of file names in the dataset.
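The single-argument rule can be pictured as an argument-normalization step. The sketch below is not the SDK's implementation; normalize_upload_args is a hypothetical helper that mirrors the behavior described above, and "dataset-123" in the usage lines is a made-up ID.

```python
from typing import List, Optional, Tuple, Union

PathArg = Union[List[str], str]


def normalize_upload_args(
    dataset_id: Optional[PathArg] = None,
    file_paths: Optional[PathArg] = None,
) -> Tuple[Optional[str], List[str]]:
    """Return (dataset_id, file_paths) with the single-argument case resolved."""
    if file_paths is None:
        # Only one argument was given: treat it as the file paths, so the
        # upload targets a new dataset (dataset_id becomes None).
        dataset_id, file_paths = None, dataset_id
    if file_paths is None:
        raise ValueError("at least one file path is required")
    if isinstance(file_paths, str):
        file_paths = [file_paths]  # accept a bare string as a one-item list
    if dataset_id is not None and not isinstance(dataset_id, str):
        raise TypeError("dataset_id must be a string when file_paths is also given")
    return dataset_id, list(file_paths)


print(normalize_upload_args("data.csv"))               # single argument: file path only
print(normalize_upload_args("dataset-123", "data.csv"))  # dataset ID plus file path
```

This also suggests why dataset_id is typed as Union[List[str], str]: when it is the only argument, it may actually carry a list of paths.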

Downloading files from a dataset

Download files from a dataset.
Accepts a dataset ID and, optionally, one or more file names. If no file name is provided, all files in the dataset are downloaded.
Parameters:
  • dataset_id (str): The ID of the dataset to download files from.
  • files (Union[List[str], str], optional): The name(s) of the file(s) to download. If not provided, all files in the dataset will be downloaded.
  • output_path (str, optional): The directory to save the downloaded files to. If not provided, the files will be saved to the current working directory.
Returns: None
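The two defaults (all files when files is omitted, the current working directory when output_path is omitted) can be sketched as a small planning step. This is an illustration of the documented defaults rather than SDK code; plan_downloads and the file names are hypothetical, and available stands in for the dataset's file listing.

```python
from pathlib import Path
from typing import List, Optional, Union


def plan_downloads(
    available: List[str],
    files: Optional[Union[List[str], str]] = None,
    output_path: Optional[str] = None,
) -> List[Path]:
    """Return the local paths that a download with these arguments would write."""
    if files is None:
        selected = list(available)  # no names given: take every file
    elif isinstance(files, str):
        selected = [files]  # a single name
    else:
        selected = list(files)
    # Fall back to the current working directory when no output path is given.
    target_dir = Path(output_path) if output_path is not None else Path.cwd()
    return [target_dir / name for name in selected]


print(plan_downloads(["a.parquet", "b.parquet"], files="b.parquet", output_path="out"))
```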