🐍 Python Client API Reference¶
EGPClient ¶
EGPClient(api_key=None, account_id=None, endpoint_url=None, config_generator=None, log_curl_commands=None)
The SGP client object. This is the main entry point for interacting with SGP.
From this client you can access "collections" which interact with various SGP components. Each collection will have various methods that interact with the API. Some collections may have sub-collections to signify a hierarchical relationship between the entities they represent.
For users within strict firewall environments, the client can be configured to use a proxy via the config_generator argument. Here is an example of how to do Kerberos Authentication through a proxy.
import httpx
from requests_kerberos import HTTPKerberosAuth
# parse_url is used below but was never imported; urllib3's helper is assumed here.
from urllib3.util import parse_url
from scale_egp.sdk.client import EGPClient, EGPClientConfig, EGPClientConfigGenerator


class KerberosProxyConfigGenerator(EGPClientConfigGenerator):
    def __init__(self, proxy_url: str):
        self._proxy_url = proxy_url

    def generate(self) -> EGPClientConfig:
        # Route both plain and TLS traffic through the same authenticated proxy.
        return EGPClientConfig(proxies={
            "http://": httpx.Proxy(url=self._proxy_url, headers=self._get_proxy_headers()),
            "https://": httpx.Proxy(url=self._proxy_url, headers=self._get_proxy_headers()),
        })

    def _get_proxy_headers(self) -> httpx.Headers:
        # Build a Negotiate token for the proxy host and send it preemptively.
        auth = HTTPKerberosAuth()
        negotiate_details = auth.generate_request_header(None, parse_url(self._proxy_url).host, is_preemptive=True)
        return httpx.Headers({"Proxy-Authorization": negotiate_details}, encoding="utf-8")


client = EGPClient(
    api_key="<API_KEY>",
    config_generator=KerberosProxyConfigGenerator("http://proxy.example.com:3128")
)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
api_key |
str
|
The SGP API key to use. If not provided, the |
None
|
account_id |
str
|
The SGP account ID to use. If not provided, the |
None
|
endpoint_url |
str
|
The SGP endpoint URL to use. If not provided, the |
None
|
config_generator |
Optional[EGPClientConfigGenerator]
|
An instance of EGPClientConfigGenerator, which must implement a generate function that returns an EGPClientConfig object. The client config will be used to inject httpx.Client arguments on demand per request. This is useful for dynamically setting proxies, timeouts, etc. |
None
|
application_specs ¶
Returns the Application Spec Collection.
Use this collection to create and manage Application Specs. These are specifications for the AI application you are building. They contain information about the AI application such as its name and description. Associating your Evaluations with an Application Spec lets you group evaluations by application.
Returns:
Type | Description |
---|---|
ApplicationSpecCollection
|
The Application Spec Collection. |
chunks ¶
Returns the Chunk Collection.
Use this collection to create and manage Chunks.
Returns:
Type | Description |
---|---|
ChunkCollection
|
The Chunk Collection. |
completions ¶
Returns the Completion Collection.
Use this collection if you want to make a request to an LLM to generate a completion.
Returns:
Type | Description |
---|---|
CompletionCollection
|
The Completion Collection. |
evaluation_configs ¶
Returns the Evaluation Config Collection.
Use this collection to manage Evaluation Configurations. Evaluation Configurations are used to define the parameters of an evaluation.
Returns:
Type | Description |
---|---|
EvaluationConfigCollection
|
The Evaluation Config Collection. |
evaluation_datasets ¶
Returns the Evaluation Dataset Collection.
Use this collection to create and manage Evaluation Datasets or Test Cases within them.
Returns:
Type | Description |
---|---|
EvaluationDatasetCollection
|
The Evaluation Dataset Collection. |
evaluations ¶
Returns the Evaluation Collection.
Use this collection to create and manage Evaluations and Test Case Results.
Evaluations are used to evaluate the performance of AI applications. Users are expected to follow this procedure to perform an evaluation:
- Select an Evaluation Dataset.
- Iterate through the dataset's Test Cases:
- For each test case, the user uses their AI application to generate output data from the test case's input prompt.
- The user then submits this data as a batch of Test Case Results associated with an Evaluation.
- Annotators asynchronously log into the SGP annotation platform to annotate the submitted Test Case Results. The annotations are used to evaluate the performance of the AI application.
- The submitting user checks back on their Test Case Results to see whether the result field was populated. If so, the evaluation is complete and the user can use the annotation data to evaluate the performance of their AI application.
Returns:
Type | Description |
---|---|
EvaluationCollection
|
The Evaluation Collection. |
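The first steps of the procedure above can be sketched as a small helper. This is a minimal sketch rather than the SDK's prescribed workflow: it assumes collections are reached through zero-argument accessor calls on the client (client.evaluation_datasets(), then .test_cases()), and get_prompt and generate are hypothetical callables you supply, since the attribute layout of a test case and your application's entry point are not specified here. Submitting the collected outputs as Test Case Results is left out because that collection's methods are not documented in this section.

```python
from typing import Any, Callable, List, Tuple


def generate_outputs(
    client: Any,
    evaluation_dataset: Any,
    get_prompt: Callable[[Any], str],
    generate: Callable[[str], str],
) -> List[Tuple[Any, str]]:
    """Run the application over every test case and pair each case with its output."""
    # Assumption: collections are exposed as accessor methods on the client.
    test_cases = client.evaluation_datasets().test_cases()
    outputs = []
    # iter(evaluation_dataset=...) is the documented way to walk a dataset's test cases.
    for test_case in test_cases.iter(evaluation_dataset=evaluation_dataset):
        prompt = get_prompt(test_case)  # hypothetical: extracts the input prompt
        outputs.append((test_case, generate(prompt)))
    return outputs
```

The returned pairs hold everything needed to build a batch of Test Case Results for an Evaluation.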
knowledge_base_data_sources ¶
Returns the Knowledge Base Data Source Collection.
Use this collection to create and manage Knowledge Base Data Sources.
Returns:
Type | Description |
---|---|
KnowledgeBaseDataSourceCollection
|
The Knowledge Base Data Source Collection. |
knowledge_bases ¶
Returns the Knowledge Base Collection.
Use this collection to create and manage Knowledge Bases.
Returns:
Type | Description |
---|---|
KnowledgeBaseCollection
|
The Knowledge Base Collection. |
model_groups ¶
Returns the Model Group Collection.
Use this collection to create and manage Model Groups.
Returns:
Type | Description |
---|---|
ModelGroupCollection
|
The Model Group Collection. |
model_templates ¶
Returns the Model Template Collection.
Use this collection to create and manage Model Templates.
In order to prevent any user from creating any arbitrary model, users with more advanced permissions can create Model Templates. Models can only be created from Model Templates. This allows power users to create a set of approved models that other users can derive from.
When the model is instantiated from a model template, the settings from the template are referenced to reserve the required computing resources, pull the correct docker image, etc.
Returns:
Type | Description |
---|---|
ModelTemplateCollection
|
The Model Template Collection. |
models ¶
Returns the Model Collection.
Use this collection to create and manage Models.
In generative AI applications, there are many types of models that are useful. For example, embedding models are useful for translating natural language into query-able vector representations, reranking models are useful when a vector database's query results need to be re-ranked based on some other criteria, and LLMs are useful for generating text from a prompt.
This collection allows you to create, deploy, and manage any custom model you choose if none of the built-in models fit your use case.
Returns:
Type | Description |
---|---|
ModelInstanceCollection
|
The Model Collection. |
question_sets ¶
Returns the Question Set Collection.
Use this collection to create and manage Question Sets.
Returns:
Type | Description |
---|---|
QuestionSetCollection
|
The Question Set Collection. |
questions ¶
Returns the Question Collection.
Use this collection to create and manage Questions.
Returns:
Type | Description |
---|---|
QuestionCollection
|
The Question Collection. |
users ¶
Returns the Users Collection.
Use this collection to get information about the currently authenticated user or to get information about other users.
Returns:
Type | Description |
---|---|
UsersCollection
|
The Users Collection. |
KnowledgeBaseCollection ¶
artifacts ¶
Returns a KnowledgeBaseArtifactsCollection object for artifacts associated with a knowledge base.
Returns:
Type | Description |
---|---|
KnowledgeBaseArtifactsCollection
|
A KnowledgeBaseArtifactsCollection object. |
chunks ¶
Returns a KnowledgeBaseChunksCollection object for chunks associated with a knowledge base.
Returns:
Type | Description |
---|---|
KnowledgeBaseChunksCollection
|
A KnowledgeBaseChunksCollection object. |
create ¶
Create a new Knowledge Base. Must pass either embedding_model_name or model_deployment_id.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the Knowledge Base. |
required |
embedding_model_name |
Optional[EmbeddingModelName]
|
The name of the embedding model to use for the Knowledge Base. |
None
|
model_deployment_id |
Optional[str]
|
ID for an EmbeddingConfigModelsAPI config. |
None
|
metadata |
Optional[Dict[str, Any]]
|
The metadata of the Knowledge Base. |
None
|
account_id |
Optional[str]
|
The ID of the account to create this Knowledge Base for. |
None
|
Returns:
Type | Description |
---|---|
KnowledgeBase
|
The newly created Knowledge Base. |
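The requirement to pass either embedding_model_name or model_deployment_id can be checked before calling create. The helper below is a hypothetical convenience, not part of the SDK. It only enforces that at least one of the two is given, because the behavior when both are supplied is not stated here.

```python
from typing import Any, Dict, Optional


def knowledge_base_create_kwargs(
    name: str,
    embedding_model_name: Optional[Any] = None,
    model_deployment_id: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for KnowledgeBaseCollection.create."""
    if embedding_model_name is None and model_deployment_id is None:
        raise ValueError("pass embedding_model_name or model_deployment_id")
    kwargs: Dict[str, Any] = {"name": name}
    # Only forward the optional arguments that were actually provided.
    if embedding_model_name is not None:
        kwargs["embedding_model_name"] = embedding_model_name
    if model_deployment_id is not None:
        kwargs["model_deployment_id"] = model_deployment_id
    if metadata is not None:
        kwargs["metadata"] = metadata
    return kwargs
```

The resulting dictionary could then be unpacked into the collection's create method, for example create(**kwargs).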
delete ¶
Delete a Knowledge Base by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Knowledge Base. |
required |
get ¶
Get a Knowledge Base by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Knowledge Base. |
required |
Returns:
Type | Description |
---|---|
KnowledgeBase
|
The Knowledge Base. |
list ¶
upload_schedules ¶
Returns a KnowledgeBaseUploadScheduleCollection object for upload schedules associated with a knowledge base.
Returns:
Type | Description |
---|---|
KnowledgeBaseUploadScheduleCollection
|
A KnowledgeBaseUploadScheduleCollection object. |
uploads ¶
Returns a KnowledgeBaseUploadsCollection object for uploads associated with a knowledge base.
Returns:
Type | Description |
---|---|
KnowledgeBaseUploadsCollection
|
A KnowledgeBaseUploadsCollection object. |
KnowledgeBaseUploadsCollection ¶
cancel ¶
Cancel an upload.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
knowledge_base |
KnowledgeBase
|
The Knowledge Base the upload was created for. |
required |
id |
str
|
The ID of the upload to cancel. |
required |
Returns:
Type | Description |
---|---|
bool
|
True if the upload was canceled, False otherwise. |
create_local_upload ¶
Create a new local upload.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
knowledge_base |
KnowledgeBase
|
The Knowledge Base to upload data to. |
required |
data_source_config |
LocalChunksSourceConfig
|
The data source config. |
required |
chunks |
List[ChunkToUpload]
|
The chunks to upload. |
required |
Returns:
Type | Description |
---|---|
KnowledgeBaseUpload
|
The newly created local upload. |
create_remote_upload ¶
create_remote_upload(knowledge_base, data_source_config, data_source_auth_config, chunking_strategy_config)
Create a new remote upload.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
knowledge_base |
KnowledgeBase
|
The Knowledge Base to upload data to. |
required |
data_source_config |
RemoteDataSourceConfig
|
The data source config. |
required |
data_source_auth_config |
Optional[DataSourceAuthConfig]
|
The data source auth config. |
required |
chunking_strategy_config |
ChunkingStrategyConfig
|
The chunking strategy config. |
required |
Returns:
Type | Description |
---|---|
KnowledgeBaseUpload
|
The newly created remote upload. |
get ¶
Get a Knowledge Base Upload by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Knowledge Base Upload. |
required |
knowledge_base |
KnowledgeBase
|
The Knowledge Base the upload was created for. |
required |
Returns:
Type | Description |
---|---|
KnowledgeBaseUpload
|
The Knowledge Base Upload. |
list ¶
List all Knowledge Base Uploads.
Returns:
Type | Description |
---|---|
List[KnowledgeBaseUpload]
|
A list of Knowledge Base Uploads. |
KnowledgeBaseArtifactsCollection ¶
get ¶
Get a Knowledge Base Artifact by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Knowledge Base Artifact. |
required |
knowledge_base |
KnowledgeBase
|
The Knowledge Base the artifact was created for. |
required |
status_filter |
Optional[ChunkUploadStatus]
|
Return only artifacts with the given status. |
value
|
Returns:
Type | Description |
---|---|
KnowledgeBaseArtifact
|
The Knowledge Base Artifact. |
list ¶
List all Knowledge Base Artifacts.
Returns:
Type | Description |
---|---|
List[KnowledgeBaseArtifact]
|
A list of Knowledge Base Artifacts. |
ChunkCollection ¶
rank ¶
Re-rank a list of chunks against a query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query |
str
|
Natural language query to re-rank chunks against. If a vector store query was originally used to retrieve these chunks, please use the same query for this ranking. |
required |
relevant_chunks |
List[Chunk]
|
List of chunks to rank. |
required |
rank_strategy |
Union[CrossEncoderRankStrategy, RougeRankStrategy, ModelRankStrategy]
|
The ranking strategy to use. Rank strategies determine how the ranking is done. They consist of the ranking method name and the additional params needed to compute the ranking. So far, only the |
required |
top_k |
Optional[int]
|
Number of chunks to return. Must be greater than 0 if specified. If not specified, all chunks will be returned. |
None
|
Returns:
Type | Description |
---|---|
List[Chunk]
|
An ordered list of the re-ranked chunks. |
synthesize ¶
Synthesize a natural language response from a list of chunks.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query |
str
|
Natural language query to synthesize response from. |
required |
chunks |
List[Chunk]
|
List of chunks to synthesize response from. |
required |
Returns:
Type | Description |
---|---|
str
|
A natural language response synthesized from the list of chunks. |
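The two methods above are naturally used together: re-rank the retrieved chunks, then synthesize an answer from the best ones. The following is a minimal sketch under that assumption. The chunks argument stands for a ChunkCollection object, and the helper name is hypothetical; only the documented keyword arguments of rank and synthesize are used.

```python
from typing import Any, List, Optional


def answer_from_chunks(
    chunks: Any,
    query: str,
    retrieved_chunks: List[Any],
    rank_strategy: Any,
    top_k: Optional[int] = None,
) -> str:
    """Re-rank retrieved chunks against the query, then synthesize a response."""
    # Use the same query that originally retrieved the chunks, as the docs advise.
    ranked = chunks.rank(
        query=query,
        relevant_chunks=retrieved_chunks,
        rank_strategy=rank_strategy,
        top_k=top_k,
    )
    # Synthesize only from the re-ranked (and possibly truncated) list.
    return chunks.synthesize(query=query, chunks=ranked)
```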
CompletionCollection ¶
create ¶
Create a new LLM Completion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model |
Union[Literal['gpt-4', 'gpt-4-0613', 'gpt-4-32k', 'gpt-4-32k-0613', 'gpt-4-vision-preview', 'gpt-4o', 'gpt-3.5-turbo', 'gpt-3.5-turbo-0613', 'gpt-3.5-turbo-16k', 'gpt-3.5-turbo-16k-0613', 'text-davinci-003', 'text-davinci-002', 'text-curie-001', 'text-babbage-001', 'text-ada-001', 'claude-instant-1', 'claude-instant-1.1', 'claude-2', 'claude-2.0', 'llama-7b', 'llama-2-7b', 'llama-2-7b-chat', 'llama-2-13b', 'llama-2-13b-chat', 'llama-2-70b', 'llama-2-70b-chat', 'falcon-7b', 'falcon-7b-instruct', 'falcon-40b', 'falcon-40b-instruct', 'mpt-7b', 'mpt-7b-instruct', 'flan-t5-xxl', 'mistral-7b', 'mistral-7b-instruct', 'mixtral-8x7b', 'mixtral-8x7b-instruct', 'llm-jp-13b-instruct-full', 'llm-jp-13b-instruct-full-dolly', 'zephyr-7b-alpha', 'zephyr-7b-beta', 'codellama-7b', 'codellama-7b-instruct', 'codellama-13b', 'codellama-13b-instruct', 'codellama-34b', 'codellama-34b-instruct', 'codellama-70b', 'codellama-70b-instruct', 'gemini-pro', 'gemini-1.5-pro-preview-0409'], str]
|
The model to use for the completion. |
required |
prompt |
str
|
The prompt to use for the completion. |
required |
model_parameters |
Optional[ModelParameters]
|
The parameters to use for the model. |
None
|
Returns:
Type | Description |
---|---|
Completion
|
The newly created Completion. |
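A call to create might look like the sketch below. It assumes the collection is reached through a client.completions() accessor call, which is not confirmed in this section. The default model "gpt-4" is taken from the list of accepted names above, and the wrapper function itself is hypothetical.

```python
from typing import Any, Optional


def create_completion(
    client: Any,
    prompt: str,
    model: str = "gpt-4",
    model_parameters: Optional[Any] = None,
) -> Any:
    """Request a single completion for the given prompt."""
    # Assumption: completions() returns the CompletionCollection.
    return client.completions().create(
        model=model,
        prompt=prompt,
        model_parameters=model_parameters,
    )
```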
stream ¶
Stream LLM Completions.
Returns:
Type | Description |
---|---|
Iterable[Completion]
|
The newly created Completion. |
ModelTemplateCollection ¶
Collection class for SGP Model Templates.
create ¶
create(name, endpoint_type, model_type, vendor_configuration, model_creation_parameters_schema=None, model_request_parameters_schema=None, account_id=None)
Create a new SGP Model Template.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the Model Template. |
required |
endpoint_type |
ModelEndpointType
|
The type of model this template will create. See Model Types and Schemas |
required |
model_type |
ModelType
|
The type of the Model Template. |
required |
vendor_configuration |
ModelVendorConfiguration
|
The vendor configuration of the Model Template. |
required |
model_creation_parameters_schema |
Optional[ParameterSchema]
|
The model creation parameters schema of the Model Template. |
None
|
model_request_parameters_schema |
Optional[ParameterSchema]
|
The model request parameters schema of the Model Template. |
None
|
account_id |
Optional[str]
|
The account ID of the Model Template. |
None
|
Returns:
Type | Description |
---|---|
ModelTemplate
|
The created Model Template. |
delete ¶
Delete a Model Template by ID.
Returns:
Type | Description |
---|---|
bool
|
True if the Model Template was successfully deleted. |
get ¶
list ¶
List all Model Templates that the user has access to.
Returns:
Type | Description |
---|---|
List[ModelTemplate]
|
A list of Model Templates that the user has access to. |
update ¶
ModelInstanceCollection ¶
Collection class for SGP Models.
create ¶
delete ¶
Delete a model by ID.
Returns:
Type | Description |
---|---|
bool
|
True if the model was successfully deleted. |
deployments ¶
Returns a ModelDeploymentCollection for deployments associated with this model.
list ¶
List all models that the user has access to.
Returns:
Type | Description |
---|---|
List[ModelInstance]
|
A list of models. |
update ¶
ModelDeploymentCollection ¶
create ¶
Create a new ModelDeployment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model |
ModelInstance
|
The Model to associate the ModelDeployment with. |
required |
Returns:
Type | Description |
---|---|
ModelDeployment
|
The newly created ModelDeployment. |
delete ¶
Delete a ModelDeployment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the ModelDeployment. |
required |
model |
ModelInstance
|
The Model the ModelDeployment is associated with. |
required |
execute ¶
Execute the specified model deployment with the given request.
Returns:
Type | Description |
---|---|
BaseModelResponse
|
The model deployment's response. |
get ¶
Get a ModelDeployment by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the ModelDeployment. |
required |
model |
ModelInstance
|
The Model the ModelDeployment is associated with. |
required |
Returns:
Type | Description |
---|---|
ModelDeployment
|
The ModelDeployment. |
list ¶
List ModelDeployments.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model |
ModelInstance
|
The Model whose ModelDeployments should be listed. |
required |
Returns:
Type | Description |
---|---|
List[ModelDeployment]
|
A list of ModelDeployments. |
ApplicationSpecCollection ¶
create ¶
Create a new Application Spec.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the Application Spec. |
required |
description |
str
|
The description of the Application Spec. |
required |
account_id |
Optional[str]
|
The ID of the account to create this Application Spec for. |
None
|
Returns:
Type | Description |
---|---|
ApplicationSpec
|
The newly created Application Spec. |
delete ¶
Delete an Application Spec by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Application Spec. |
required |
get ¶
Get an Application Spec by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Application Spec. |
required |
Returns:
Type | Description |
---|---|
ApplicationSpec
|
The Application Spec. |
list ¶
List all Application Specs.
Returns:
Type | Description |
---|---|
List[ApplicationSpec]
|
A list of Application Specs. |
update ¶
Update an Application Spec by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Application Spec. |
required |
name |
Optional[str]
|
The name of the Application Spec. |
None
|
description |
Optional[str]
|
The description of the Application Spec. |
None
|
Returns:
Type | Description |
---|---|
ApplicationSpec
|
The updated Application Spec. |
QuestionCollection ¶
create ¶
create(type, title, prompt, choices, multi=None, dropdown=False, required=False, conditions=None, account_id=None)
Create a new Question.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
type |
QuestionType
|
The type of the Question. |
required |
title |
str
|
The title of the Question. |
required |
prompt |
str
|
The prompt of the Question. |
required |
account_id |
Optional[str]
|
The ID of the account to create this Question for. |
None
|
choices |
Optional[List[CategoricalChoice]]
|
The choices of the Question. |
required |
multi |
Optional[bool]
|
Whether the question is multi-select. |
None
|
dropdown |
Optional[bool]
|
Whether the question is to be displayed as a dropdown. |
False
|
required |
Optional[bool]
|
Whether the question is required. |
False
|
conditions |
Optional[List[dict]]
|
The conditions for the question. |
None
|
Returns: The newly created Question.
QuestionSetCollection ¶
create ¶
Create a new Question Set.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the Question Set. |
required |
questions |
List[Question]
|
The questions in this Question Set. |
required |
account_id |
Optional[str]
|
The ID of the account to create this Question Set for. |
None
|
Returns: The newly created Question Set.
get ¶
Get the details of a Question Set.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Question Set. |
required |
Returns:
Type | Description |
---|---|
QuestionSet
|
The details of the Question Set. |
list ¶
EvaluationConfigCollection ¶
create ¶
Create a new Evaluation Configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_type |
EvaluationType
|
The type of Evaluation. Only |
required |
question_set |
QuestionSet
|
The Question Set to associate with the Evaluation. |
required |
account_id |
Optional[str]
|
The ID of the account to create this Evaluation Configuration for. |
None
|
Returns:
Type | Description |
---|---|
EvaluationConfig
|
The newly created Evaluation Configuration. |
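Because an Evaluation Configuration needs a Question Set, the two create calls are usually chained. The sketch below shows that chain using only the documented keyword arguments. It assumes the collections are reached through client.question_sets() and client.evaluation_configs() accessor calls, and the helper name is hypothetical.

```python
from typing import Any, List


def create_evaluation_config(
    client: Any,
    evaluation_type: Any,
    question_set_name: str,
    questions: List[Any],
) -> Any:
    """Create a Question Set, then an Evaluation Configuration that uses it."""
    # Assumption: collections are exposed as accessor methods on the client.
    question_set = client.question_sets().create(
        name=question_set_name,
        questions=questions,
    )
    return client.evaluation_configs().create(
        evaluation_type=evaluation_type,
        question_set=question_set,
    )
```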
delete ¶
Delete an Evaluation Configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Evaluation Configuration. |
required |
get ¶
Get the details of an Evaluation Configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Evaluation Configuration. |
required |
Returns:
Type | Description |
---|---|
EvaluationConfig
|
The Evaluation Configuration. |
list ¶
List Evaluation Configurations.
Returns:
Type | Description |
---|---|
List[EvaluationConfig]
|
A list of Evaluation Configurations. |
EvaluationDatasetCollection ¶
add_test_cases ¶
Add new test cases to an existing dataset.
Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.
The schema_types currently supported and their corresponding fields are:
Schema Types:
GENERATION
Field | Type | Default |
---|---|---|
input | str | required |
expected_output | str | None |
expected_extra_info | Dict[str, Any] | None |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to add test cases to. |
required |
test_cases_data |
List[Union[GenerationTestCaseData]]
|
The test cases to add. |
required |
update_dataset_version |
bool
|
Whether to update the dataset version after adding the test cases. Defaults to True. |
True
|
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
add_test_cases_from_file ¶
Add new test cases to an existing dataset from a JSONL file.
Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.
The schema_types currently supported and their corresponding fields are:
Schema Types:
GENERATION
Field | Type | Default |
---|---|---|
input | str | required |
expected_output | str | None |
expected_extra_info | Dict[str, Any] | None |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to add test cases to. |
required |
filepath |
str
|
The path to the JSONL file. |
required |
update_dataset_version |
bool
|
Whether to update the dataset version after adding the test cases. |
True
|
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
create ¶
Create a new empty dataset.
Since most users already have a list of test cases they want to create a dataset from, they should generally use the create_from_file method instead.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the dataset. |
required |
schema_type |
TestCaseSchemaType
|
The schema type of the dataset. |
required |
account_id |
Optional[str]
|
The ID of the account to create this dataset for. |
None
|
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The newly created dataset. |
create_from_file ¶
Create a new dataset that is seeded with test cases from a JSONL file.
The keys of each JSON object in the JSONL file must match the fields of the specified schema type.
The schema_types currently supported and their corresponding fields are:
Schema Types:
GENERATION
Field | Type | Default |
---|---|---|
input | str | required |
expected_output | str | None |
expected_extra_info | Dict[str, Any] | None |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the dataset. |
required |
schema_type |
TestCaseSchemaType
|
The schema type of the dataset. |
required |
filepath |
str
|
The path to the JSONL file. |
required |
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The newly created dataset. |
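A JSONL file for the GENERATION schema can be produced with the standard library alone. The helper below is a hypothetical sketch, not part of the SDK: it writes one JSON object per line and rejects rows that lack the required input field or carry fields outside the three listed above.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable

# Field names taken from the GENERATION schema table.
GENERATION_FIELDS = {"input", "expected_output", "expected_extra_info"}


def write_generation_test_cases(path: Path, rows: Iterable[Dict[str, Any]]) -> int:
    """Write test cases as JSONL and return how many were written."""
    count = 0
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            if "input" not in row:
                raise ValueError("every test case needs an 'input' field")
            unknown = set(row) - GENERATION_FIELDS
            if unknown:
                raise ValueError(f"unexpected fields: {sorted(unknown)}")
            # One JSON object per line is what makes the file JSONL.
            handle.write(json.dumps(row) + "\n")
            count += 1
    return count
```

The path written here is what would be passed as filepath to create_from_file, together with the dataset name and schema type.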
delete ¶
Delete an existing dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the dataset. |
required |
Returns:
Type | Description |
---|---|
bool
|
True if the dataset was deleted, False otherwise. |
delete_test_cases ¶
Delete test cases from an existing dataset.
Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to delete test cases from. |
required |
test_case_ids |
List[str]
|
The IDs of the test cases to delete. |
required |
update_dataset_version |
bool
|
Whether to update the dataset version after deleting the test cases. |
True
|
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
get ¶
Get an existing dataset by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the dataset. |
required |
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The dataset. |
modify_test_cases ¶
Modify test cases in an existing dataset.
Unless you want to batch up multiple modifications to a dataset and snapshot them all at once, you should leave update_dataset_version=True. See the docs for the update_dataset_version method for more details.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to modify test cases in. |
required |
modified_test_cases |
List[TestCase]
|
The modified test cases. |
required |
update_dataset_version |
bool
|
Whether to update the dataset version after modifying the test cases. |
True
|
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
overwrite_from_file ¶
Overwrite all test cases in an existing dataset from a JSONL file.
The keys of each JSON object in the JSONL file must match the fields of the specified schema type.
The schema_types currently supported and their corresponding fields are:
Schema Types:
GENERATION
Field | Type | Default |
---|---|---|
input | str | required |
expected_output | str | None |
expected_extra_info | Dict[str, Any] | None |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to overwrite. |
required |
schema_type |
TestCaseSchemaType
|
The schema type of the dataset. |
required |
filepath |
str
|
The path to the JSONL file. |
required |
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
test_cases ¶
Returns a TestCaseCollection object for test cases associated with the current Evaluation Dataset.
update ¶
Update the attributes of an existing dataset.
Important: This method will NOT update the version of the dataset. It will only update the attributes of the dataset. If you want to snapshot the current state of the dataset under an incremented version number, you should use the update_dataset_version method instead.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the dataset. |
required |
name |
Optional[str]
|
The name of the dataset. |
None
|
schema_type |
Optional[TestCaseSchemaType]
|
The schema type of the dataset. |
None
|
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
update_dataset_version ¶
Update the version of an existing dataset.
This method will snapshot the current state of the dataset under an incremented version number.
Warning: By default, the add_test_cases, delete_test_cases, and modify_test_cases methods will automatically update the dataset version for you. However, if you want to batch up multiple modifications to a dataset and snapshot them all at once, you can set update_dataset_version=False on those methods and then call this method manually afterward.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to update the version of. |
required |
Returns:
Type | Description |
---|---|
EvaluationDataset
|
The updated dataset. |
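The batching pattern described in the warning above can be sketched as follows. The datasets argument stands for an EvaluationDatasetCollection object and the helper name is hypothetical. Each modification is made with update_dataset_version=False, and a single snapshot is taken at the end.

```python
from typing import Any, List


def apply_changes_in_one_version(
    datasets: Any,
    evaluation_dataset: Any,
    new_test_cases: List[Any],
    stale_test_case_ids: List[str],
) -> Any:
    """Add and delete test cases, then snapshot the dataset exactly once."""
    # Suppress the automatic version bump on each individual modification.
    datasets.add_test_cases(
        evaluation_dataset=evaluation_dataset,
        test_cases_data=new_test_cases,
        update_dataset_version=False,
    )
    datasets.delete_test_cases(
        evaluation_dataset=evaluation_dataset,
        test_case_ids=stale_test_case_ids,
        update_dataset_version=False,
    )
    # One explicit snapshot covers both changes.
    return datasets.update_dataset_version(evaluation_dataset=evaluation_dataset)
```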
TestCaseCollection ¶
create ¶
Create a new test case.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to create the test case in. |
required |
schema_type |
TestCaseSchemaType
|
The schema type of the test case. |
required |
test_case_data |
Union[GenerationTestCaseData]
|
The test case data. |
required |
Returns:
Type | Description |
---|---|
TestCase
|
The newly created test case. |
create_batch ¶
Create multiple new test cases.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to create the test cases in. |
required |
test_cases |
List[TestCaseRequest]
|
The test cases to create. |
required |
Returns:
Type | Description |
---|---|
List[TestCase]
|
The newly created test cases. |
delete ¶
Delete an existing test case.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the test case. |
required |
evaluation_dataset |
EvaluationDataset
|
The dataset to delete the test case from. |
required |
Returns:
Type | Description |
---|---|
bool
|
True if the test case was deleted successfully, False otherwise. |
get ¶
Get an existing test case by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the test case. |
required |
evaluation_dataset |
EvaluationDataset
|
The dataset to get the test case from. |
required |
Returns:
Type | Description |
---|---|
TestCase
|
The test case. |
iter ¶
Iterate over all test cases in a dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to iterate over test cases from. |
required |
Returns:
Type | Description |
---|---|
Iterable[TestCase]
|
The test cases. |
list ¶
List all test cases in a dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
evaluation_dataset |
EvaluationDataset
|
The dataset to list test cases from. |
required |
Returns:
Type | Description |
---|---|
List[TestCase]
|
The test cases. |
update ¶
update(test_case_id, evaluation_dataset, schema_type=None, test_case_data=None, test_case_metadata=None, chat_history=None)
Update an existing test case.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
test_case_id |
str
|
The ID of the test case. |
required |
evaluation_dataset |
EvaluationDataset
|
The dataset to update the test case in. |
required |
schema_type |
Optional[TestCaseSchemaType]
|
The schema type of the test case. |
None
|
test_case_data |
Optional[Union[GenerationTestCaseData]]
|
The test case data. |
None
|
Returns:
Type | Description |
---|---|
TestCase
|
The updated test case. |
EvaluationCollection ¶
create ¶
Create a new Evaluation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name |
str
|
The name of the Evaluation. |
required |
description |
str
|
The description of the Evaluation. |
required |
application_spec |
ApplicationSpec
|
The Application Spec to associate the Evaluation with. |
required |
evaluation_config |
EvaluationConfig
|
The configuration for the Evaluation. |
required |
tags |
Optional[Dict[str, Any]]
|
Optional key, value pairs to associate with the Evaluation. |
None
|
account_id |
Optional[str]
|
The ID of the account to create this Evaluation for. |
None
|
Returns:
Type | Description |
---|---|
Evaluation
|
The newly created Evaluation. |
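Creating an Evaluation requires an Application Spec and an Evaluation Configuration to exist first. The sketch below creates the Application Spec and then the Evaluation, using only the documented keyword arguments. It assumes the collections are reached through client.application_specs() and client.evaluations() accessor calls, and the helper name is hypothetical.

```python
from typing import Any, Dict, Optional


def start_evaluation(
    client: Any,
    name: str,
    description: str,
    application_name: str,
    application_description: str,
    evaluation_config: Any,
    tags: Optional[Dict[str, Any]] = None,
) -> Any:
    """Create an Application Spec and an Evaluation associated with it."""
    # Assumption: collections are exposed as accessor methods on the client.
    application_spec = client.application_specs().create(
        name=application_name,
        description=application_description,
    )
    return client.evaluations().create(
        name=name,
        description=description,
        application_spec=application_spec,
        evaluation_config=evaluation_config,
        tags=tags,
    )
```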
delete ¶
Delete an Evaluation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Evaluation. |
required |
get ¶
Get an Evaluation by ID.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Evaluation. |
required |
Returns:
Type | Description |
---|---|
Evaluation
|
The Evaluation. |
test_case_results ¶
Returns a TestCaseResultCollection for test case results associated with this evaluation.
update ¶
Update an Evaluation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id |
str
|
The ID of the Evaluation. |
required |
name |
Optional[str]
|
The name of the Evaluation. |
None
|
description |
Optional[str]
|
The description of the Evaluation. |
None
|
evaluation_config |
Optional[EvaluationConfig]
|
The configuration for the Evaluation. |
None
|
tags |
Optional[Dict[str, Any]]
|
Optional key, value pairs to associate with the Evaluation. |
None
|
Returns:
Type | Description |
---|---|
Evaluation
|
The updated Evaluation. |