Python SDK#
The Python SDK provides a Pythonic way to interact with the API. In many prototyping scenarios, you may find it most convenient to use the Python SDK and CLI to interact with Sutro.
See the Installation guide to install the SDK.
Basic Methods#
Setting your API key#
When you initialize the SDK, you can set your API key by calling the set_api_key
method, as shown in the sketch below. Alternatively, you can set your API key by running the sutro login command in the CLI.
- set_api_key(self, api_key: str)#
Set the API key for the Sutro API.
- Parameters:
api_key (str): The API key to set.
Returns: None
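A minimal sketch of setting the key in code. The sutro.Sutro client class name is an assumption for illustration; see the Installation guide for the exact import.
>>> import sutro
>>> # Assumed client constructor; replace with the class documented in the Installation guide
>>> client = sutro.Sutro()
>>> client.set_api_key("YOUR_API_KEY")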
Running batch inference#
- infer(self, data, model='llama-3.1-8b', column=None, output_column='inference_result', job_priority=0, output_schema=None, system_prompt=None, sampling_params=None, random_seed_per_input=False, dry_run=False, stay_attached=None, truncate_rows=False)#
Run LLM inference on a large list, table, dataframe, or file.
- Parameters:
data (Union[List, pd.DataFrame, pl.DataFrame, str]): The data to run inference on.
model (str, optional): The model to use for inference. Default is “llama-3.1-8b”.
column (str, optional): The column name to use for inference. Required if data is a DataFrame or file path.
output_column (str, optional): The column name to store the inference results in if input is a DataFrame. Defaults to “inference_result”.
job_priority (int, optional): The priority of the job. Default is 0.
output_schema (Union[Dict[str, Any], BaseModel], optional): A structured schema for the output. Can be either a dictionary representing a JSON schema or a pydantic BaseModel. Defaults to None.
system_prompt (str, optional): A system prompt to add to all inputs. This allows you to define the behavior of the model. Defaults to None.
sampling_params (dict, optional): A dictionary of sampling parameters to use for the inference. Defaults to None, which uses the default sampling parameters.
random_seed_per_input (bool, optional): If True, a random seed will be generated for each input. This is useful for diversity in outputs. Defaults to False.
dry_run (bool, optional): If True, return cost estimates instead of running inference. Default is False.
stay_attached (bool, optional): If True, the SDK will stay attached to the job and update you on the status and results as they become available. Default behavior is True for priority 0 jobs, and False for priority 1 jobs.
truncate_rows (bool, optional): If True, any rows that have a token count exceeding the context window length of the selected model will be truncated to the max length that will fit within the context window. Defaults to False.
Returns: Union[List, pd.DataFrame, pl.DataFrame, str]: The results of the inference or job ID.
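A minimal usage sketch built from the documented parameters. The prompts, data, and column names are illustrative, and the client setup follows the assumption shown under Setting your API key.
>>> # Run inference over a plain list of inputs
>>> reviews = ["Great product", "Terrible support", "Okay overall"]
>>> results = client.infer(
...     reviews,
...     model="llama-3.1-8b",
...     system_prompt="Classify the sentiment of this review as positive, negative, or neutral.",
... )
>>> # For a DataFrame or file input, name the column to read from and the column to write results to
>>> labeled_df = client.infer(df, column="review_text", output_column="sentiment")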
Monitoring job status#
- attach(self, job_id: str)#
Attach to an existing job and stream its progress in real time. This is equivalent to setting stay_attached=True when calling infer().
This method connects to a running job and displays live progress updates, including the number of rows processed and token statistics. It shows a progress bar with real-time updates until the job completes.
- Parameters:
job_id (str): The ID of the job to attach to
Returns: None
- Job Status Behavior:
RUNNING: Streams progress updates with a live progress bar and job statistics
SUCCEEDED: Notifies that the job has already completed and suggests using sutro jobs results
FAILED: Displays a failure message and exits
CANCELLED: Displays a cancellation message and exits
- Example:
>>> # Attach to a running job to monitor its progress
>>> client.attach("job_12345")
>>> # Progress bar will display:
>>> # Progress: 45%|████████████████ | 450/1000 [00:32<00:45]
>>> # Input tokens processed: 12500, Tokens generated: 8300, Total tokens/s: 325.4
Note: This method is ideal for monitoring long-running jobs interactively. For programmatic use cases where you don’t want live progress updates, use the simpler await_job_completion() instead.
- await_job_completion(self, job_id: str, timeout: int | None = 7200) → list | None#
When deployed as part of a pipeline (Dagster, Airflow, etc.), you might not be interested in seeing the job's progress as it happens. await_job_completion is best for this use case, and should only be used when you are not using the stay_attached parameter of infer() or the attach() function.
Waits for a job to complete and returns its results upon successful completion.
This method polls the job status every 5 seconds (and prints it out) until the job completes, fails, is cancelled, or the timeout is reached.
- Parameters:
job_id (str): The ID of the job to await.
timeout (Optional[int]): Maximum time in seconds to wait for job completion. Defaults to 7200 (2 hours).
Returns: list | None: The results of the job if it completes successfully, or None if the job fails, is cancelled, or encounters an error.
- Job Status Outcomes:
SUCCEEDED: Returns the job results
FAILED: Returns None
CANCELLED: Returns None
- Example:
>>> results = client.await_job_completion("job_12345", timeout=3600)
>>> # Job status is RUNNING for job-f9102252-ae2f-4d61-a879-a657e314f2e0
>>> if results:
...     print(f"Job completed with {len(results)} results")
Getting quotas#
- get_quotas(self)#
Get your current quotas.
Returns: list: A list of quotas, one for each priority level. Each entry contains row_quota and token_quota.
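A minimal sketch of checking quotas (client setup as assumed above):
>>> quotas = client.get_quotas()
>>> for quota in quotas:
...     print(quota)  # each entry includes row_quota and token_quota for a priority level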
Job Methods#
Listing jobs#
- list_jobs(self)#
List all jobs associated with the API key.
Returns: list: A list of job details.
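A minimal sketch of listing jobs:
>>> jobs = client.list_jobs()
>>> print(f"{len(jobs)} jobs found")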
Getting job status#
- get_job_status(self, job_id: str)#
Get the status of a job by its ID.
- Parameters:
job_id (str): The ID of the job to retrieve the status for.
Returns: dict: The status of the job.
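A minimal sketch of checking a job's status (the job ID is illustrative):
>>> status = client.get_job_status("job_12345")
>>> print(status)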
Getting job results#
- get_job_results(self, job_id: str, include_inputs: bool = False, include_cumulative_logprobs: bool = False)#
Get the results of a job by its ID.
- Parameters:
job_id (str): The ID of the job to retrieve the results for.
include_inputs (bool, optional): Whether to include the inputs in the results. Defaults to False.
include_cumulative_logprobs (bool, optional): Whether to include the cumulative logprobs in the results. Defaults to False.
Returns: Union[List, Dict]: The results of the job. If include_inputs is True, the results will be a dictionary with inputs and outputs keys. If include_inputs is False, the results will be a list of outputs, in the same order as the inputs.
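A minimal sketch of retrieving results with and without inputs (the job ID is illustrative):
>>> # Outputs only, in the same order as the inputs
>>> outputs = client.get_job_results("job_12345")
>>> # Pair each output with its input
>>> paired = client.get_job_results("job_12345", include_inputs=True)
>>> print(paired["inputs"][0], "->", paired["outputs"][0])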
Cancelling jobs#
- cancel_job(self, job_id: str)#
Cancel a job by its ID.
- Parameters:
job_id (str): The ID of the job to cancel.
Returns: dict: The status of the job cancellation.
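A minimal sketch of cancelling a job (the job ID is illustrative):
>>> cancellation = client.cancel_job("job_12345")
>>> print(cancellation)  # status of the cancellation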
Dataset Methods#
Creating a dataset#
- create_dataset(self)#
Create a new internal dataset.
Returns: dict: A dictionary containing the dataset ID.
Listing all datasets#
- list_datasets(self)#
List all datasets.
Returns: list: A list of dataset IDs.
Listing all files in a dataset#
- list_dataset_files(self, dataset_id: str)#
List all files in a dataset.
- Parameters:
dataset_id (str): The ID of the dataset to list the files in.
Returns: list: A list of file names in the dataset.
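A minimal sketch of listing datasets and the files in one of them:
>>> dataset_ids = client.list_datasets()
>>> files = client.list_dataset_files(dataset_ids[0])
>>> print(files)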
Uploading files to a dataset#
- upload_to_dataset(self, dataset_id: List[str] | str = None, file_paths: List[str] | str = None)#
Upload files to a dataset.
This method uploads files to a dataset. Accepts a dataset ID and file paths. If only a single parameter is provided, it will be interpreted as the file paths.
- Parameters:
dataset_id (Union[List[str], str], optional): The ID of the dataset to upload the files to. If not provided, the files will be uploaded to a new dataset.
file_paths (Union[List[str], str], optional): A list of file paths to upload.
Returns: list: A list of file names in the dataset.
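A minimal sketch of both calling forms (the dataset ID and file paths are illustrative):
>>> # Upload to an existing dataset
>>> files = client.upload_to_dataset("dataset_abc123", ["reviews_1.parquet", "reviews_2.parquet"])
>>> # Single-argument form: the argument is treated as file paths and a new dataset is created
>>> files = client.upload_to_dataset("reviews_1.parquet")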
Downloading files from a dataset#
- download_from_dataset(self, dataset_id: str, files: List[str] | str = None, output_path: str = None)#
Download files from a dataset.
This method downloads files from a dataset. Accepts a dataset ID and one or more file names. If no file names are provided, all files in the dataset will be downloaded.
- Parameters:
dataset_id (str): The ID of the dataset to download the file from.
files (Union[List[str], str], optional): The name(s) of the file(s) to download. If not provided, all files in the dataset will be downloaded.
output_path (str, optional): The directory to save the downloaded files to. If not provided, the files will be saved to the current working directory.
Returns: None
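A minimal sketch of downloading files (the dataset ID, file name, and output path are illustrative):
>>> # Download a single file into a local directory
>>> client.download_from_dataset("dataset_abc123", files="results.parquet", output_path="./downloads")
>>> # Omit files to download every file in the dataset
>>> client.download_from_dataset("dataset_abc123")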