Sutro Functions

Summary

Sutro Functions are task-specific AI systems, primarily classifiers and extractors, that are aligned with your decision preferences and inexpensive to run. Using Sutro Functions replaces most of the brittle work that usually shows up around prompt engineering, model selection, and manual schema wiring. You describe the task, review uncertain cases, encode justifications, and iterate until the function behaves the way your team wants.

Why Sutro Functions?

When you use AI for a repeated business task, the default move is often to reach for a large general-purpose model and keep layering prompt edits on top. That usually produces a system that is expensive, inconsistent, and hard to maintain. It's also often a painful, time-consuming game of whack-a-mole that is not robust against new inputs. A Sutro Function is better suited for this kind of work when you need a task-specific system that is:
  • Highly accurate and aligned with your organization’s preferences
  • Consistent across repeated inputs
  • Cheap enough to run at production scale
  • Easy to redeploy as the task evolves

How does it work?

Start by uploading an unlabeled dataset, choosing a task type, and writing a short task definition. Sutro then iterates on the function by surfacing examples where the best answer is unclear or where your preferences are still ambiguous. You label those cases, explain the decision when needed, and the system uses that feedback to refine the deployed behavior. You can also review high-confidence outputs and correct them if the model is confidently wrong. After a few iterations, the function usually converges on a representation that is much more stable than a one-off prompt. That function can then be deployed to Sutro’s production runtime and updated later as the task changes.
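
The step of surfacing unclear examples can be illustrated with a small sketch. This page does not describe how Sutro actually selects cases for review, so the code below shows one common technique, margin-based uncertainty sampling, purely as an assumption. The function names, the labels, and the probability values are all hypothetical.

```python
from typing import Dict, List, Tuple


def margin(probs: Dict[str, float]) -> float:
    """Gap between the two highest label probabilities.

    A small gap means the top two labels are nearly tied, so the case is unclear.
    """
    ranked = sorted(probs.values(), reverse=True)
    if len(ranked) < 2:
        return ranked[0] if ranked else 0.0
    return ranked[0] - ranked[1]


def select_for_review(
    predictions: List[Tuple[str, Dict[str, float]]], k: int
) -> List[str]:
    """Return the k inputs with the smallest margin, most ambiguous first."""
    ranked = sorted(predictions, key=lambda item: margin(item[1]))
    return [text for text, _ in ranked[:k]]


# Hypothetical support-routing predictions: (input text, label probabilities).
predictions = [
    ("Refund not received after 10 days",
     {"billing": 0.55, "shipping": 0.40, "other": 0.05}),
    ("How do I reset my password?",
     {"account": 0.97, "billing": 0.02, "other": 0.01}),
    ("Package arrived damaged, charge me again?",
     {"shipping": 0.48, "billing": 0.46, "other": 0.06}),
]

# The two near-tied cases are surfaced; the confident one is left out.
to_review = select_for_review(predictions, k=2)
print(to_review)
```

In a loop like the one described above, the labels and justifications you give for the surfaced cases would feed the next iteration. High-confidence outputs still deserve an occasional spot check, since a large margin does not guarantee a correct answer.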

What can I do with a Sutro Function?

Today, production execution supports text-based classification and structured extraction. That covers a wide range of high-value enterprise and research workloads, including:
  • Lead scoring
  • Support routing and triage
  • Document categorization
  • Address normalization
  • Web-page extraction
  • Fraud, scam, or spam detection
  • Semantic tagging for analytics
  • Data quality filtering
  • Product catalog taxonomies
  • Merchant categorization
  • Model and query routing
  • Call transcript analysis
  • Invoice extraction
  • Legal contract extraction
These tasks are often subjective. A function is only useful if it reflects your team’s actual decision preferences, not just a generic model prior.
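
To make the two supported task types concrete, the sketch below shows what a classification result and a structured extraction result might look like as plain Python types. The class names, fields, and sample values are illustrative assumptions, not Sutro's actual output schema.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ClassificationResult:
    """Hypothetical shape for a classification task: one label per input."""
    label: str


@dataclass
class InvoiceExtraction:
    """Hypothetical shape for a structured extraction task: typed fields per input."""
    vendor: str
    invoice_number: str
    total: float
    currency: str
    due_date: Optional[str] = None  # left as None when the document has no due date


# Made-up example values, for illustration only.
routing = ClassificationResult(label="billing")
invoice = InvoiceExtraction(
    vendor="Acme Corp",
    invoice_number="INV-0042",
    total=1250.0,
    currency="USD",
)

print(asdict(routing))
print(asdict(invoice))
```

Classification returns a single decision from a fixed label set, while extraction returns several typed fields, some of which may be absent in a given input.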

Running a Function

Once a Function is deployed, invoke it by name through any of the public entry points: the Python SDK, the Functions API, or the Batch API.

How can I get started?

Sutro Functions is in research preview. If you want access, a walkthrough, or design-partner support, email team@sutro.sh.

FAQ

If you already have labeled data, even better. Existing labels can pre-populate iterations or act as correlative references, but we still recommend providing justifications where the task is subjective so the function learns your preferences instead of only your labels.
A Sutro Function is a small, preference-aligned AI system that can be invoked by name through Sutro’s runtime.
To run a deployed function, invoke it by name through the Python SDK, the Functions API, or the Batch API.
Both. The interface matters, but the bigger point is that Sutro Functions treats preference alignment as a first-class part of building the model instead of an afterthought bolted onto a generic prompt.
Sutro Functions are meant to be materially cheaper to run than large general-purpose models for repeated tasks. The real benchmark is total task value: higher alignment, lower variance, and lower cost at the same time.
Sutro Functions is still in active development, but the approach is already strong on tasks where a crisp definition, realistic sample distribution, and user preference encoding matter more than raw frontier-model breadth.
The goal is to make building a function much faster than hand-rolling prompts and post-processing, while producing something more stable in production.
Keep tasks as narrow as possible. If the real business problem is complex, it is often better to decompose it into several smaller functions than to force one function to do everything.
A bigger model is not enough on its own, because subjective tasks usually fail on preference alignment, not raw capability. Bigger models do not automatically learn your business rules, your edge cases, or your tolerance for false positives and false negatives.
Sutro Functions are suited for unstructured input data (text, images, etc.) and do not require upfront data labeling.
You can return to a deployed function at any time, refine it with new examples, and redeploy.