🐍 Python Client API Reference¶

EGPClient ¶

EGPClient(api_key=None, account_id=None, endpoint_url=None, config_generator=None, log_curl_commands=None)

The SGP client object. This is the main entry point for interacting with SGP.

From this client you can access "collections" which interact with various SGP components. Each collection will have various methods that interact with the API. Some collections may have sub-collections to signify a hierarchical relationship between the entities they represent.

For users within strict firewall environments, the client can be configured to use a proxy via the config_generator argument. Here is an example of how to do Kerberos Authentication through a proxy.

import httpx
from requests_kerberos import HTTPKerberosAuth
from urllib3.util import parse_url  # used below to extract the proxy host

from scale_egp.sdk.client import EGPClient, EGPClientConfig, EGPClientConfigGenerator

class KerberosProxyConfigGenerator(EGPClientConfigGenerator):

    def __init__(self, proxy_url: str):
        self._proxy_url = proxy_url

    def generate(self) -> EGPClientConfig:
        return EGPClientConfig(proxies={
            "http://": httpx.Proxy(url=self._proxy_url, headers=self._get_proxy_headers()),
            "https://": httpx.Proxy(url=self._proxy_url, headers=self._get_proxy_headers()),
        })

    def _get_proxy_headers(self) -> httpx.Headers:
        auth = HTTPKerberosAuth()
        negotiate_details = auth.generate_request_header(None, parse_url(self._proxy_url).host, is_preemptive=True)
        return httpx.Headers({"Proxy-Authorization": negotiate_details}, encoding="utf-8")

client = EGPClient(
    api_key="<API_KEY>",
    config_generator=KerberosProxyConfigGenerator("http://proxy.example.com:3128")
)

Parameters:

Name	Type	Description	Default
`api_key`	`str`	The SGP API key to use. If not provided, the `EGP_API_KEY` environment variable will be used. Enterprise customers of SGP should use the API key provided to them by their Scale account manager.	`None`
`account_id`	`str`	The SGP account ID to use. If not provided, the `ACCOUNT_ID` environment variable will be used.	`None`
`endpoint_url`	`str`	The SGP endpoint URL to use. If not provided, the `EGP_ENDPOINT_URL` environment variable will be used. If that is not set, the default SGP endpoint URL `https://api.egp.scale.com` will be used. Enterprise customers of SGP should use the endpoint URL provided by their Scale account manager.	`None`
`config_generator`	`Optional[EGPClientConfigGenerator]`	An instance of EGPClientConfigGenerator, which must implement a generate function that returns an EGPClientConfig object. The client config will be used to inject httpx.Client arguments on demand per request. This is useful for dynamically setting proxies, timeouts, etc.	`None`
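
The fallback order described for `endpoint_url` can be sketched as a small pure function. This is only an illustration of the documented lookup order, not the SDK's actual implementation; `resolve_endpoint_url` and its `env` parameter are hypothetical names introduced here.

```python
import os
from typing import Mapping, Optional

DEFAULT_ENDPOINT_URL = "https://api.egp.scale.com"  # default named in the table above


def resolve_endpoint_url(
    endpoint_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if env is None else env
    # 1. An explicit argument wins.
    if endpoint_url:
        return endpoint_url
    # 2. Otherwise fall back to the EGP_ENDPOINT_URL environment variable.
    # 3. Otherwise use the default SGP endpoint.
    return env.get("EGP_ENDPOINT_URL") or DEFAULT_ENDPOINT_URL


print(resolve_endpoint_url("https://sgp.internal.example.com"))
```

Passing `env` explicitly is just a convenience for testing the sketch; in normal use the function reads `os.environ`.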

application_specs ¶

application_specs()

Returns the Application Spec Collection.

Use this collection to create and manage Application Specs. These are specifications for the AI application you are building. They contain information about the AI application, such as its name and description. Associating your Evaluations with an Application Spec lets you group evaluations by application.

Returns:

Type	Description
`ApplicationSpecCollection`	The Application Spec Collection.

chunks ¶

chunks()

Returns the Chunk Collection.

Use this collection to create and manage Chunks.

Returns:

Type	Description
`ChunkCollection`	The Chunk Collection.

completions ¶

completions()

Returns the Completion Collection.

Use this collection to make requests to an LLM to generate a completion.

Returns:

Type	Description
`CompletionCollection`	The Completion Collection.

evaluation_configs ¶

evaluation_configs()

Returns the Evaluation Config Collection.

Use this collection to manage Evaluation Configurations. Evaluation Configurations are used to define the parameters of an evaluation.

Returns:

Type	Description
`EvaluationConfigCollection`	The Evaluation Config Collection.

evaluation_datasets ¶

evaluation_datasets()

Returns the Evaluation Dataset Collection.

Use this collection to create and manage Evaluation Datasets or Test Cases within them.

Returns:

Type	Description
`EvaluationDatasetCollection`	The Evaluation Dataset Collection.

evaluations ¶

evaluations()

Returns the Evaluation Collection.

Use this collection to create and manage Evaluations and Test Case Results.

Evaluations are used to evaluate the performance of AI applications. Users are expected to follow this procedure to perform an evaluation:

1. Select an Evaluation Dataset.
2. Iterate through the dataset's Test Cases:
   - For each test case, the user uses their AI application to generate output data from the test case's input prompt.
   - The user then submits this data as a batch of Test Case Results associated with an Evaluation.
3. Annotators asynchronously log into the SGP annotation platform to annotate the submitted Test Case Results. The annotations are used to evaluate the performance of the AI application.
4. The submitting user checks back on their Test Case Results to see whether the result field has been populated. If so, the evaluation is complete and the user can use the annotation data to evaluate the performance of their AI application.
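
The client-side part of the procedure above can be sketched locally without the SDK. Everything in this block is an assumption made for illustration: `DraftTestCaseResult`, `generate_results`, and `evaluation_complete` are hypothetical names, and the mapping of test case ID to input prompt stands in for real Test Case objects. Only the idea that a populated `result` field signals completion comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class DraftTestCaseResult:
    """Hypothetical local record; not an SDK type."""
    test_case_id: str
    output: str
    result: Optional[Dict] = None  # filled in once annotation finishes


def generate_results(
    test_cases: Dict[str, str],         # test case ID -> input prompt
    application: Callable[[str], str],  # your AI application
) -> List[DraftTestCaseResult]:
    # Run the application on every input prompt and collect one batch.
    return [
        DraftTestCaseResult(test_case_id=tc_id, output=application(prompt))
        for tc_id, prompt in test_cases.items()
    ]


def evaluation_complete(results: List[DraftTestCaseResult]) -> bool:
    # Complete only when there are results and every result field is populated.
    return bool(results) and all(r.result is not None for r in results)


batch = generate_results({"tc-1": "What is SGP?"}, application=str.upper)
print(evaluation_complete(batch))  # False until annotations arrive
```

In a real workflow the batch would be submitted through the Evaluation Collection and the check would be run against the Test Case Results fetched back from SGP.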

Returns:

Type	Description
`EvaluationCollection`	The Evaluation Collection.

knowledge_base_data_sources ¶

knowledge_base_data_sources()

Returns the Knowledge Base Data Source Collection.

Use this collection to create and manage Knowledge Base Data Sources.

Returns:

Type	Description
`KnowledgeBaseDataSourceCollection`	The Knowledge Base Data Source Collection.

knowledge_bases ¶

knowledge_bases()

Returns the Knowledge Base Collection.

Use this collection to create and manage Knowledge Bases.

Returns:

Type	Description
`KnowledgeBaseCollection`	The Knowledge Base Collection.

model_groups ¶

model_groups()

Returns the Model Group Collection.

Use this collection to create and manage Model Groups.

Returns:

Type	Description
`ModelGroupCollection`	The Model Group Collection.

model_templates ¶

model_templates()

Returns the Model Template Collection.

Use this collection to create and manage Model Templates.

In order to prevent any user from creating any arbitrary model, users with more advanced permissions can create Model Templates. Models can only be created from Model Templates. This allows power users to create a set of approved models that other users can derive from.

When the model is instantiated from a model template, the settings from the template are referenced to reserve the required computing resources, pull the correct docker image, etc.

Returns:

Type	Description
`ModelTemplateCollection`	The Model Template Collection.

models ¶

models()

Returns the Model Collection.

Use this collection to create and manage Models.

In generative AI applications, many types of models are useful. For example, embedding models translate natural language into queryable vector representations, reranking models re-rank a vector database's query results based on some other criteria, and LLMs generate text from a prompt.

This collection allows you to create, deploy, and manage any custom model you choose if none of the built-in models fit your use case.

Returns:

Type	Description
`ModelInstanceCollection`	The Model Collection.

question_sets ¶

question_sets()

Returns the Question Set Collection.

Use this collection to create and manage Question Sets.

Returns:

Type	Description
`QuestionSetCollection`	The Question Set Collection.

questions ¶

questions()

Returns the Question Collection.

Use this collection to create and manage Questions.

Returns:

Type	Description
`QuestionCollection`	The Question Collection.

users ¶

users()

Returns the Users Collection.

Use this collection to get information about the currently authenticated user or to get information about other users.

Returns:

Type	Description
`UsersCollection`	The Users Collection.

KnowledgeBaseCollection ¶

KnowledgeBaseCollection(api_client)

artifacts ¶

artifacts()

Returns a KnowledgeBaseArtifactsCollection object for artifacts associated with a knowledge base.

Returns:

Type	Description
`KnowledgeBaseArtifactsCollection`	A KnowledgeBaseArtifactsCollection object.

chunks ¶

chunks()

Returns a KnowledgeBaseChunksCollection object for chunks associated with a knowledge base.

Returns:

Type	Description
`KnowledgeBaseChunksCollection`	A KnowledgeBaseChunksCollection object.

create ¶

create(name, embedding_model_name=None, model_deployment_id=None, metadata=None, account_id=None)

Create a new Knowledge Base. Must pass either embedding_model_name or model_deployment_id.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the Knowledge Base.	required
`embedding_model_name`	`Optional[EmbeddingModelName]`	The name of the embedding model to use for the Knowledge Base.	`None`
`model_deployment_id`	`Optional[str]`	ID for an EmbeddingConfigModelsAPI config.	`None`
`metadata`	`Optional[Dict[str, Any]]`	The metadata of the Knowledge Base.	`None`
`account_id`	`Optional[str]`	The ID of the account to create this Knowledge Base for.	`None`

Returns:

Type	Description
`KnowledgeBase`	The newly created Knowledge Base.
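
The "must pass either" rule for `create` can be checked before making the call. The helper below is a hypothetical pre-flight check, not part of the SDK, and the SDK may validate differently. It only enforces that at least one of the two arguments is present, because the text above does not say whether passing both is allowed.

```python
from typing import Optional


def check_knowledge_base_args(
    embedding_model_name: Optional[str] = None,
    model_deployment_id: Optional[str] = None,
) -> None:
    """Hypothetical pre-flight check; the SDK may validate differently."""
    if embedding_model_name is None and model_deployment_id is None:
        raise ValueError(
            "Pass either embedding_model_name or model_deployment_id."
        )


# Passes silently because one of the two arguments is provided.
check_knowledge_base_args(model_deployment_id="<MODEL_DEPLOYMENT_ID>")
```

After the check passes, the call itself would take the form `client.knowledge_bases().create(name=..., model_deployment_id=...)`, using the signature shown above.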

delete ¶

delete(id)

Delete a Knowledge Base by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Knowledge Base.	required

get ¶

get(id)

Get a Knowledge Base by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Knowledge Base.	required

Returns:

Type	Description
`KnowledgeBase`	The Knowledge Base.

list ¶

list()

List all Knowledge Bases.

Returns:

Type	Description
`List[KnowledgeBase]`	A list of Knowledge Bases.

upload_schedules ¶

upload_schedules()

Returns a KnowledgeBaseUploadScheduleCollection object for upload schedules associated with a knowledge base.

Returns:

Type	Description
`KnowledgeBaseUploadScheduleCollection`	A KnowledgeBaseUploadScheduleCollection object.

uploads ¶

uploads()

Returns a KnowledgeBaseUploadsCollection object for uploads associated with a knowledge base.

Returns:

Type	Description
`KnowledgeBaseUploadsCollection`	A KnowledgeBaseUploadsCollection object.

KnowledgeBaseUploadsCollection ¶

KnowledgeBaseUploadsCollection(api_client)

cancel ¶

cancel(knowledge_base, id)

Cancel an upload.

Parameters:

Name	Type	Description	Default
`knowledge_base`	`KnowledgeBase`	The Knowledge Base the upload was created for.	required
`id`	`str`	The ID of the upload to cancel.	required

Returns:

Type	Description
`bool`	True if the upload was canceled, False otherwise.

create_local_upload ¶

create_local_upload(knowledge_base, data_source_config, chunks)

Create a new local upload.

Parameters:

Name	Type	Description	Default
`knowledge_base`	`KnowledgeBase`	The Knowledge Base to upload data to.	required
`data_source_config`	`LocalChunksSourceConfig`	The data source config.	required
`chunks`	`List[ChunkToUpload]`	The chunks to upload.	required

Returns:

Type	Description
`KnowledgeBaseUpload`	The newly created local upload.

create_remote_upload ¶

create_remote_upload(knowledge_base, data_source_config, data_source_auth_config, chunking_strategy_config)

Create a new remote upload.

Parameters:

Name	Type	Description	Default
`knowledge_base`	`KnowledgeBase`	The Knowledge Base to upload data to.	required
`data_source_config`	`RemoteDataSourceConfig`	The data source config.	required
`data_source_auth_config`	`Optional[DataSourceAuthConfig]`	The data source auth config.	required
`chunking_strategy_config`	`ChunkingStrategyConfig`	The chunking strategy config.	required

Returns:

Type	Description
`KnowledgeBaseUpload`	The newly created remote upload.

get ¶

get(id, knowledge_base)

Get a Knowledge Base Upload by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Knowledge Base Upload.	required
`knowledge_base`	`KnowledgeBase`	The Knowledge Base the upload was created for.	required

Returns:

Type	Description
`KnowledgeBaseUpload`	The Knowledge Base Upload.

list ¶

list(knowledge_base)

List all Knowledge Base Uploads.

Returns:

Type	Description
`List[KnowledgeBaseUpload]`	A list of Knowledge Base Uploads.

KnowledgeBaseArtifactsCollection ¶

KnowledgeBaseArtifactsCollection(api_client)

get ¶

get(id, knowledge_base, status_filter=ChunkUploadStatus.COMPLETED.value)

Get a Knowledge Base Artifact by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Knowledge Base Artifact.	required
`knowledge_base`	`KnowledgeBase`	The Knowledge Base the artifact was created for.	required
`status_filter`	`Optional[ChunkUploadStatus]`	Return only artifacts with the given status.	`ChunkUploadStatus.COMPLETED.value`

Returns:

Type	Description
`KnowledgeBaseArtifact`	The Knowledge Base Artifact.

list ¶

list(knowledge_base)

List all Knowledge Base Artifacts.

Returns:

Type	Description
`List[KnowledgeBaseArtifact]`	A list of Knowledge Base Artifacts.

ChunkCollection ¶

ChunkCollection(api_client)

rank ¶

rank(query, relevant_chunks, rank_strategy, top_k=None, account_id=None)

Re-rank a list of chunks against a query.

Parameters:

Name	Type	Description	Default
`query`	`str`	Natural language query to re-rank chunks against. If a vector store query was originally used to retrieve these chunks, please use the same query for this ranking.	required
`relevant_chunks`	`List[Chunk]`	List of chunks to rank.	required
`rank_strategy`	`Union[CrossEncoderRankStrategy, RougeRankStrategy, ModelRankStrategy]`	The ranking strategy to use. Rank strategies determine how the ranking is done. They consist of the ranking method name and any additional params needed to compute the ranking. So far, only the `cross_encoder` rank strategy is supported. We plan to support more rank strategies soon.	required
`top_k`	`Optional[int]`	Number of chunks to return. Must be greater than 0 if specified. If not specified, all chunks will be returned.	`None`

Returns:

Type	Description
`List[Chunk]`	An ordered list of the re-ranked chunks.
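
The `top_k` rule in the table above can be modeled locally. This is a hypothetical sketch of the documented behavior (all chunks when `top_k` is omitted, an error for non-positive values), not the service's implementation; `apply_top_k` is a name introduced here, and raising `ValueError` is an assumption about how an invalid value is rejected.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def apply_top_k(ranked: Sequence[T], top_k: Optional[int] = None) -> List[T]:
    """Hypothetical local model of the documented top_k rule."""
    if top_k is None:
        return list(ranked)      # not specified: return every chunk
    if top_k <= 0:
        raise ValueError("top_k must be greater than 0 if specified")
    return list(ranked[:top_k])  # keep the k highest-ranked chunks


print(apply_top_k(["c1", "c2", "c3"], top_k=2))  # ['c1', 'c2']
```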

synthesize ¶

synthesize(query, chunks)

Synthesize a natural language response from a list of chunks.

Parameters:

Name	Type	Description	Default
`query`	`str`	Natural language query to synthesize response from.	required
`chunks`	`List[Chunk]`	List of chunks to synthesize response from.	required

Returns:

Type	Description
`str`	A natural language response synthesized from the list of chunks.
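
`rank` and `synthesize` are natural to chain: re-rank the retrieved chunks, then synthesize an answer from the best ones. The function below is a minimal sketch that relies only on the two signatures documented above. `rank_then_synthesize` is a hypothetical helper, and the arguments are typed as `Any` so the block does not depend on SDK imports.

```python
from typing import Any, List, Optional


def rank_then_synthesize(
    chunks_collection: Any,   # e.g. the object returned by client.chunks()
    query: str,
    retrieved_chunks: List[Any],
    rank_strategy: Any,       # e.g. a CrossEncoderRankStrategy instance
    top_k: Optional[int] = None,
) -> str:
    # Re-rank with the same query that was used for retrieval, as the
    # `query` parameter description recommends.
    ranked = chunks_collection.rank(
        query=query,
        relevant_chunks=retrieved_chunks,
        rank_strategy=rank_strategy,
        top_k=top_k,
    )
    # Synthesize a natural language answer from the re-ranked chunks.
    return chunks_collection.synthesize(query=query, chunks=ranked)
```

Keeping the collection as a parameter makes the helper easy to exercise with a stand-in object before pointing it at a live client.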

CompletionCollection ¶

CompletionCollection(api_client)

create ¶

create(model, prompt, account_id, images=None, model_parameters=None)

Create a new LLM Completion.

Parameters:

Name	Type	Description	Default
`model`	`Union[Literal['gpt-4', 'gpt-4-0613', 'gpt-4-32k', 'gpt-4-32k-0613', 'gpt-4-vision-preview', 'gpt-4o', 'gpt-3.5-turbo', 'gpt-3.5-turbo-0613', 'gpt-3.5-turbo-16k', 'gpt-3.5-turbo-16k-0613', 'text-davinci-003', 'text-davinci-002', 'text-curie-001', 'text-babbage-001', 'text-ada-001', 'claude-instant-1', 'claude-instant-1.1', 'claude-2', 'claude-2.0', 'llama-7b', 'llama-2-7b', 'llama-2-7b-chat', 'llama-2-13b', 'llama-2-13b-chat', 'llama-2-70b', 'llama-2-70b-chat', 'falcon-7b', 'falcon-7b-instruct', 'falcon-40b', 'falcon-40b-instruct', 'mpt-7b', 'mpt-7b-instruct', 'flan-t5-xxl', 'mistral-7b', 'mistral-7b-instruct', 'mixtral-8x7b', 'mixtral-8x7b-instruct', 'llm-jp-13b-instruct-full', 'llm-jp-13b-instruct-full-dolly', 'zephyr-7b-alpha', 'zephyr-7b-beta', 'codellama-7b', 'codellama-7b-instruct', 'codellama-13b', 'codellama-13b-instruct', 'codellama-34b', 'codellama-34b-instruct', 'codellama-70b', 'codellama-70b-instruct', 'gemini-pro', 'gemini-1.5-pro-preview-0409'], str]`	The model to use for the completion.	required
`prompt`	`str`	The prompt to use for the completion.	required
`model_parameters`	`Optional[ModelParameters]`	The parameters to use for the model.	`None`

Returns:

Type	Description
`Completion`	The newly created Completion.

stream ¶

stream(model, prompt, account_id, images=None, model_parameters=None)

Stream LLM Completions.

Returns:

Type	Description
`Iterable[Completion]`	An iterable of Completions, yielded as they are streamed.

ModelTemplateCollection ¶

ModelTemplateCollection(api_client)

Collection class for SGP Model Templates.

create ¶

create(name, endpoint_type, model_type, vendor_configuration, model_creation_parameters_schema=None, model_request_parameters_schema=None, account_id=None)

Create a new SGP Model Template.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the Model Template.	required
`endpoint_type`	`ModelEndpointType`	The type of model this template will create. See Model Types and Schemas.	required
`model_type`	`ModelType`	The type of the Model Template.	required
`vendor_configuration`	`ModelVendorConfiguration`	The vendor configuration of the Model Template.	required
`model_creation_parameters_schema`	`Optional[ParameterSchema]`	The model creation parameters schema of the Model Template.	`None`
`model_request_parameters_schema`	`Optional[ParameterSchema]`	The model request parameters schema of the Model Template.	`None`
`account_id`	`Optional[str]`	The account ID of the Model Template.	`None`

Returns:

Type	Description
`ModelTemplate`	The created Model Template.

delete ¶

delete(id)

Delete a Model Template by ID.

Returns:

Type	Description
`bool`	True if the Model Template was successfully deleted.

get ¶

get(id)

Get a Model Template by ID.

Returns:

Type	Description
`ModelTemplate`	The Model Template.

list ¶

list()

List all Model Templates that the user has access to.

Returns:

Type	Description
`List[ModelTemplate]`	A list of Model Templates that the user has access to.

update ¶

update(id, *, name, endpoint_type=None, model_type=None, vendor_configuration=None, model_creation_parameters_schema=None, model_request_parameters_schema=None)

Update a Model Template by ID.

Returns:

Type	Description
`ModelTemplate`	The updated Model Template.

ModelInstanceCollection ¶

ModelInstanceCollection(api_client)

Collection class for SGP Models.

create ¶

create(name, model_type, model_group_id=None, model_vendor=None, model_template_id=None, base_model_id=None, base_model_metadata=None, account_id=None, model_card=None, training_data_card=None)

Create a new SGP Model.

Returns:

Type	Description
`ModelInstance`	The created Model.

delete ¶

delete(id)

Delete a model by ID.

Returns:

Type	Description
`bool`	True if the model was successfully deleted.

deployments ¶

deployments()

Returns a ModelDeploymentCollection for deployments associated with this model.

get ¶

get(id)

Get a Model by ID.

Returns:

Type	Description
`ModelInstance`	The Model.

list ¶

list()

List all models that the user has access to.

Returns:

Type	Description
`List[ModelInstance]`	A list of models.

update ¶

update(id, *, name=None, model_template_id=None, base_model_id=None, model_creation_parameters=None)

Update a Model by ID.

Returns:

Type	Description
`ModelInstance`	The updated Model.

ModelDeploymentCollection ¶

ModelDeploymentCollection(api_client)

create ¶

create(model, name, model_creation_parameters=None, vendor_configuration=None, account_id=None)

Create a new ModelDeployment.

Parameters:

Name	Type	Description	Default
`model`	`ModelInstance`	The Model to associate the ModelDeployment with.	required

Returns:

Type	Description
`ModelDeployment`	The newly created ModelDeployment.

delete ¶

delete(id, model)

Delete a ModelDeployment.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the ModelDeployment.	required
`model`	`ModelInstance`	The Model the ModelDeployment is associated with.	required

execute ¶

execute(id, model, request, timeout=None)

Execute the specified model deployment with the given request.

Returns:

Type	Description
`BaseModelResponse`	The model deployment's response.

get ¶

get(id, model)

Get a ModelDeployment by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the ModelDeployment.	required
`model`	`ModelInstance`	The Model the ModelDeployment is associated with.	required

Returns:

Type	Description
`ModelDeployment`	The ModelDeployment.

list ¶

list(model)

List the ModelDeployments of a Model.

Parameters:

Name	Type	Description	Default
`model`	`ModelInstance`	The Model whose ModelDeployments are listed.	required

Returns:

Type	Description
`List[ModelDeployment]`	A list of ModelDeployments.

ApplicationSpecCollection ¶

ApplicationSpecCollection(api_client)

create ¶

create(name, description, account_id=None)

Create a new Application Spec.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the Application Spec.	required
`description`	`str`	The description of the Application Spec.	required
`account_id`	`Optional[str]`	The ID of the account to create this Application Spec for.	`None`

Returns:

Type	Description
`ApplicationSpec`	The newly created Application Spec.

delete ¶

delete(id)

Delete an Application Spec by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Application Spec.	required

get ¶

get(id)

Get an Application Spec by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Application Spec.	required

Returns:

Type	Description
`ApplicationSpec`	The Application Spec.

list ¶

list()

List all Application Specs.

Returns:

Type	Description
`List[ApplicationSpec]`	A list of Application Specs.

update ¶

update(id, name=None, description=None)

Update an Application Spec by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Application Spec.	required
`name`	`Optional[str]`	The name of the Application Spec.	`None`
`description`	`Optional[str]`	The description of the Application Spec.	`None`

Returns:

Type	Description
`ApplicationSpec`	The updated Application Spec.

QuestionCollection ¶

QuestionCollection(api_client)

create ¶

create(type, title, prompt, choices, multi=None, dropdown=False, required=False, conditions=None, account_id=None)

Create a new Question.

Parameters:

Name	Type	Description	Default
`type`	`QuestionType`	The type of the Question.	required
`title`	`str`	The title of the Question.	required
`prompt`	`str`	The prompt of the Question.	required
`account_id`	`Optional[str]`	The ID of the account to create this Question for.	`None`
`choices`	`Optional[List[CategoricalChoice]]`	The choices of the Question.	required
`multi`	`Optional[bool]`	Whether the question is multi-select	`None`
`dropdown`	`Optional[bool]`	Whether the question is to be displayed as a dropdown	`False`
`required`	`Optional[bool]`	Whether the question is required	`False`
`conditions`	`Optional[List[dict]]`	The conditions for the question	`None`

Returns: The newly created Question.

get ¶

get(id)

Get the details of a Question.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Question.	required

Returns:

Type	Description
`Question`	The Question.

list ¶

list()

List Questions.

Returns:

Type	Description
`List[Question]`	A list of Questions.

QuestionSetCollection ¶

QuestionSetCollection(api_client)

create ¶

create(name, questions, account_id=None)

Create a new Question Set.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the Question Set.	required
`questions`	`List[Question]`	The questions in this Question Set.	required
`account_id`	`Optional[str]`	The ID of the account to create this Question Set for.	`None`

Returns: The newly created Question Set.

get ¶

get(id)

Get the details of a Question Set.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Question Set.	required

Returns:

Type	Description
`QuestionSet`	The details of the Question Set.

list ¶

list()

List Question Sets.

Returns:

Type	Description
`List[QuestionSet]`	A list of Question Sets.

EvaluationConfigCollection ¶

EvaluationConfigCollection(api_client)

create ¶

create(evaluation_type, question_set, account_id=None)

Create a new Evaluation Configuration.

Parameters:

Name	Type	Description	Default
`evaluation_type`	`EvaluationType`	The type of Evaluation. Only `HUMAN` is supported.	required
`question_set`	`QuestionSet`	The Question Set to associate with the Evaluation.	required
`account_id`	`Optional[str]`	The ID of the account to create this Evaluation Configuration for.	`None`

Returns:

Type	Description
`EvaluationConfig`	The newly created Evaluation Configuration.

delete ¶

delete(id)

Delete an Evaluation Configuration.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Evaluation Configuration.	required

get ¶

get(id)

Get the details of an Evaluation Configuration.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Evaluation Configuration.	required

Returns:

Type	Description
`EvaluationConfig`	The Evaluation Configuration.

list ¶

list()

List Evaluation Configurations.

Returns:

Type	Description
`List[EvaluationConfig]`	A list of Evaluation Configurations.

EvaluationDatasetCollection ¶

EvaluationDatasetCollection(api_client)

add_test_cases ¶

add_test_cases(evaluation_dataset, test_cases_data, update_dataset_version=True)

Add new test cases to an existing dataset.

Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.

The schema_types currently supported and their corresponding fields are:

Schema Types:

GENERATION

Field	Type	Default
`input`	`str`	required
`expected_output`	`str`	`None`
`expected_extra_info`	`Dict[str, Any]`	`None`

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to add test cases to.	required
`test_cases_data`	`List[Union[GenerationTestCaseData]]`	The test cases to add.	required
`update_dataset_version`	`bool`	Whether to update the dataset version after adding the test cases.	`True`

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.
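
Before adding test cases, it can help to check each record against the GENERATION fields listed above. The function below is a hypothetical local check written for illustration; it is not an SDK function, and it assumes each test case is available as a plain dict keyed by the field names in the table.

```python
from typing import Any, Dict, List


def generation_schema_errors(record: Dict[str, Any]) -> List[str]:
    """Hypothetical check of one test case against the GENERATION fields."""
    errors = []
    # `input` is the only required field.
    if not isinstance(record.get("input"), str):
        errors.append("input is required and must be a str")
    # The two optional fields may be absent or None.
    expected_output = record.get("expected_output")
    if expected_output is not None and not isinstance(expected_output, str):
        errors.append("expected_output must be a str or None")
    extra_info = record.get("expected_extra_info")
    if extra_info is not None and not isinstance(extra_info, dict):
        errors.append("expected_extra_info must be a dict or None")
    return errors


print(generation_schema_errors({"input": "What is 2 + 2?", "expected_output": "4"}))  # []
```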

add_test_cases_from_file ¶

add_test_cases_from_file(evaluation_dataset, filepath, update_dataset_version=True)

Add new test cases to an existing dataset from a JSONL file.

Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.

The schema_types currently supported and their corresponding fields are:

Schema Types:

GENERATION

Field	Type	Default
`input`	`str`	required
`expected_output`	`str`	`None`
`expected_extra_info`	`Dict[str, Any]`	`None`

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to add test cases to.	required
`filepath`	`str`	The path to the JSONL file.	required
`update_dataset_version`	`bool`	Whether to update the dataset version after adding the test cases.	`True`

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.

create ¶

create(name, schema_type, account_id=None)

Create a new empty dataset.

Most users already have a list of test cases they want to create a dataset from. In that case, use the create_from_file method instead.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the dataset.	required
`schema_type`	`TestCaseSchemaType`	The schema type of the dataset.	required
`account_id`	`Optional[str]`	The ID of the account to create this dataset for.	`None`

Returns:

Type	Description
`EvaluationDataset`	The newly created dataset.

create_from_file ¶

create_from_file(name, schema_type, filepath)

Create a new dataset that is seeded with test cases from a JSONL file.

Each line of the JSONL file must be a JSON object whose keys match the fields of the specified schema type.

The schema_types currently supported and their corresponding fields are:

Schema Types:

GENERATION

Field	Type	Default
`input`	`str`	required
`expected_output`	`str`	`None`
`expected_extra_info`	`Dict[str, Any]`	`None`

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the dataset.	required
`schema_type`	`TestCaseSchemaType`	The schema type of the dataset.	required
`filepath`	`str`	The path to the JSONL file.	required

Returns:

Type	Description
`EvaluationDataset`	The newly created dataset.
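
A JSONL file suitable for the GENERATION schema can be produced with the standard library alone. The sketch below writes one JSON object per line using the field names from the table above; the sample prompts, the file name, and the temporary directory are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

test_cases = [
    {"input": "What is the capital of France?", "expected_output": "Paris"},
    {"input": "Summarize the attached policy.", "expected_extra_info": {"source": "policy.pdf"}},
]

# JSONL: one JSON object per line, keyed by the GENERATION schema fields.
filepath = Path(tempfile.mkdtemp()) / "test_cases.jsonl"
with filepath.open("w", encoding="utf-8") as f:
    for case in test_cases:
        f.write(json.dumps(case) + "\n")

print(filepath.read_text(encoding="utf-8"))
```

The resulting path can then be passed as the `filepath` argument of `create_from_file`, together with a dataset name and the GENERATION schema type.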

delete ¶

delete(id)

Delete an existing dataset.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the dataset.	required

Returns:

Type	Description
`bool`	True if the dataset was deleted, False otherwise.

delete_test_cases ¶

delete_test_cases(evaluation_dataset, test_case_ids, update_dataset_version=True)

Delete test cases from an existing dataset.

Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to delete test cases from.	required
`test_case_ids`	`List[str]`	The IDs of the test cases to delete.	required
`update_dataset_version`	`bool`	Whether to update the dataset version after deleting the test cases.	`True`

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.

get ¶

get(id)

Get an existing dataset by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the dataset.	required

Returns:

Type	Description
`EvaluationDataset`	The dataset.

list ¶

list()

List all datasets.

Returns:

Type	Description
`List[EvaluationDataset]`	The datasets.

modify_test_cases ¶

modify_test_cases(evaluation_dataset, modified_test_cases, update_dataset_version=True)

Modify test cases in an existing dataset.

Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to modify test cases in.	required
`modified_test_cases`	`List[TestCase]`	The modified test cases.	required
`update_dataset_version`	`bool`	Whether to update the dataset version after modifying the test cases.	`True`

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.

overwrite_from_file ¶

overwrite_from_file(evaluation_dataset, schema_type, filepath)

Overwrite all test cases in an existing dataset from a JSONL file.

Each line of the JSONL file must be a JSON object whose keys match the fields of the specified schema type.

The schema_types currently supported and their corresponding fields are:

Schema Types:

GENERATION

Field	Type	Default
`input`	`str`	required
`expected_output`	`str`	`None`
`expected_extra_info`	`Dict[str, Any]`	`None`

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to overwrite.	required
`schema_type`	`TestCaseSchemaType`	The schema type of the dataset.	required
`filepath`	`str`	The path to the JSONL file.	required

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.

test_cases ¶

test_cases()

Returns a TestCaseCollection object for test cases associated with the current Evaluation Dataset.

update ¶

update(id, name=None, schema_type=None)

Update the attributes of an existing dataset.

Important: This method will NOT update the version of the dataset. It will only update the attributes of the dataset. If you want to snapshot the current state of the dataset under an incremented version number, you should use the update_dataset_version method instead.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the dataset.	required
`name`	`Optional[str]`	The name of the dataset.	`None`
`schema_type`	`Optional[TestCaseSchemaType]`	The schema type of the dataset.	`None`

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.

update_dataset_version ¶

update_dataset_version(evaluation_dataset)

Update the version of an existing dataset.

This method will snapshot the current state of the dataset under an incremented version number.

Warning: By default, the add_test_cases, delete_test_cases, and modify_test_cases methods will automatically update the dataset version for you. However, if you want to batch up multiple modifications to a dataset and snapshot them all at once, you can set update_dataset_version=False on those methods and then call this method manually afterward.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to update the version of.	required

Returns:

Type	Description
`EvaluationDataset`	The updated dataset.
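The batching pattern described in the warning above might look roughly like the following sketch. The function name `modify_in_batches_then_snapshot` is hypothetical, and `datasets` stands for whatever object exposes the dataset methods documented here. The `modified_test_cases` and `update_dataset_version` keywords come from the `modify_test_cases` parameter table; the `evaluation_dataset=` keyword on `modify_test_cases` is an assumption.

```python
from typing import Any, Iterable, List


def modify_in_batches_then_snapshot(
    datasets: Any,
    evaluation_dataset: Any,
    batches: Iterable[List[Any]],
) -> Any:
    """Apply several modify_test_cases calls, then snapshot the dataset once.

    Hypothetical helper: `datasets` is duck-typed, so any object with
    modify_test_cases and update_dataset_version methods will work.
    """
    for batch in batches:
        datasets.modify_test_cases(
            evaluation_dataset=evaluation_dataset,  # assumed keyword name
            modified_test_cases=batch,
            update_dataset_version=False,  # defer the version bump
        )
    # A single snapshot that covers every batch applied above.
    return datasets.update_dataset_version(evaluation_dataset)
```

Deferring the snapshot this way yields one new dataset version for the whole set of changes instead of one version per call.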

TestCaseCollection ¶

TestCaseCollection(api_client)

create ¶

create(evaluation_dataset, schema_type, test_case_data, test_case_metadata=None, chat_history=None)

Create a new test case.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to create the test case in.	required
`schema_type`	`TestCaseSchemaType`	The schema type of the test case.	required
`test_case_data`	`Union[GenerationTestCaseData]`	The test case data.	required

Returns:

Type	Description
`TestCase`	The newly created test case.

create_batch ¶

create_batch(evaluation_dataset, test_cases)

Create multiple new test cases.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to create the test cases in.	required
`test_cases`	`List[TestCaseRequest]`	The test cases to create.	required

Returns:

Type	Description
`List[TestCase]`	The newly created test cases.

delete ¶

delete(id, evaluation_dataset)

Delete an existing test case.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the test case.	required
`evaluation_dataset`	`EvaluationDataset`	The dataset to delete the test case from.	required

Returns:

Type	Description
`bool`	True if the test case was deleted successfully, False otherwise.
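Because `delete` reports success through its boolean return value, callers may want to check it rather than assume the deletion happened. The sketch below shows one way to do that for several IDs; `delete_many` is a hypothetical helper, and `test_cases` stands for a `TestCaseCollection`-like object.

```python
from typing import Any, Iterable, List


def delete_many(test_cases: Any, evaluation_dataset: Any, ids: Iterable[str]) -> List[str]:
    """Delete each test case and return the IDs for which delete() returned False.

    Hypothetical helper: arguments are passed positionally in the documented
    order, delete(id, evaluation_dataset).
    """
    failed: List[str] = []
    for test_case_id in ids:
        if not test_cases.delete(test_case_id, evaluation_dataset):
            failed.append(test_case_id)
    return failed
```

An empty return value means every deletion reported success.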

get ¶

get(id, evaluation_dataset)

Get an existing test case by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the test case.	required
`evaluation_dataset`	`EvaluationDataset`	The dataset to get the test case from.	required

Returns:

Type	Description
`TestCase`	The test case.

iter ¶

iter(evaluation_dataset)

Iterate over all test cases in a dataset.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to iterate over test cases from.	required

Returns:

Type	Description
`Iterable[TestCase]`	The test cases.
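Since `iter` returns an iterable rather than a list, it can be consumed incrementally. The following sketch takes only the first few test cases with `itertools.islice`. The helper name `peek_test_cases` is hypothetical, and whether the SDK fetches results lazily behind this iterable is an assumption, not something this reference states.

```python
from itertools import islice
from typing import Any, List


def peek_test_cases(test_cases: Any, evaluation_dataset: Any, limit: int = 5) -> List[Any]:
    """Return at most `limit` test cases without exhausting the iterable.

    Hypothetical helper: `test_cases` is any object with an
    iter(evaluation_dataset) method, as documented above.
    """
    return list(islice(test_cases.iter(evaluation_dataset), limit))
```

When every test case is needed at once, `list(evaluation_dataset)` on the same collection returns them as a `List[TestCase]`.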

list ¶

list(evaluation_dataset)

List all test cases in a dataset.

Parameters:

Name	Type	Description	Default
`evaluation_dataset`	`EvaluationDataset`	The dataset to list test cases from.	required

Returns:

Type	Description
`List[TestCase]`	The test cases.

update ¶

update(test_case_id, evaluation_dataset, schema_type=None, test_case_data=None, test_case_metadata=None, chat_history=None)

Update an existing test case.

Parameters:

Name	Type	Description	Default
`test_case_id`	`str`	The ID of the test case.	required
`evaluation_dataset`	`EvaluationDataset`	The dataset to update the test case in.	required
`schema_type`	`Optional[TestCaseSchemaType]`	The schema type of the test case.	`None`
`test_case_data`	`Optional[Union[GenerationTestCaseData]]`	The test case data.	`None`

Returns:

Type	Description
`TestCase`	The updated test case.

EvaluationCollection ¶

EvaluationCollection(api_client)

create ¶

create(name, description, application_spec, evaluation_config, tags=None, account_id=None)

Create a new Evaluation.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the Evaluation.	required
`description`	`str`	The description of the Evaluation.	required
`application_spec`	`ApplicationSpec`	The Application Spec to associate the Evaluation with.	required
`evaluation_config`	`EvaluationConfig`	The configuration for the Evaluation.	required
`tags`	`Optional[Dict[str, Any]]`	Optional key-value pairs to associate with the Evaluation.	`None`
`account_id`	`Optional[str]`	The ID of the account to create this Evaluation for.	`None`

Returns:

Type	Description
`Evaluation`	The newly created Evaluation.

delete ¶

delete(id)

Delete an Evaluation.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Evaluation.	required

get ¶

get(id)

Get an Evaluation by ID.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Evaluation.	required

Returns:

Type	Description
`Evaluation`	The Evaluation.

list ¶

list()

List Evaluations.

Returns:

Type	Description
`List[Evaluation]`	A list of Evaluations.
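`list()` takes no filter arguments, so any filtering has to happen on the client side. The sketch below selects Evaluations by name. `find_evaluations_by_name` is a hypothetical helper, and it assumes each returned `Evaluation` exposes a `name` attribute mirroring the `name` argument of `create`.

```python
from typing import Any, List


def find_evaluations_by_name(evaluations: Any, name: str) -> List[Any]:
    """Return every Evaluation from list() whose `name` attribute equals `name`.

    Hypothetical helper: the `name` attribute is an assumption; objects
    without it are skipped rather than raising AttributeError.
    """
    return [e for e in evaluations.list() if getattr(e, "name", None) == name]
```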

test_case_results ¶

test_case_results()

Returns a TestCaseResultCollection for test case results associated with this evaluation.

update ¶

update(id, name=None, description=None, evaluation_config=None, tags=None)

Update an Evaluation.

Parameters:

Name	Type	Description	Default
`id`	`str`	The ID of the Evaluation.	required
`name`	`Optional[str]`	The name of the Evaluation.	`None`
`description`	`Optional[str]`	The description of the Evaluation.	`None`
`evaluation_config`	`Optional[EvaluationConfig]`	The configuration for the Evaluation.	`None`
`tags`	`Optional[Dict[str, Any]]`	Optional key-value pairs to associate with the Evaluation.	`None`

Returns:

Type	Description
`Evaluation`	The updated Evaluation.