Launch Client¶
LaunchClient ¶
LaunchClient(api_key: str, endpoint: Optional[str] = None, self_hosted: bool = False, use_path_with_custom_endpoint: bool = False)
Scale Launch Python Client.
Initializes a Scale Launch Client.
Parameters:

Name | Type | Description | Default
---|---|---|---
api_key | str | Your Scale API key | required
endpoint | Optional[str] | The Scale Launch Endpoint (this should not need to be changed) | None
self_hosted | bool | True iff you are connecting to a self-hosted Scale Launch | False
use_path_with_custom_endpoint | bool | True iff you are not using the default Scale Launch endpoint but your endpoint has path routing (to SCALE_LAUNCH_VX_PATH) set up | False
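Below is a minimal usage sketch. It assumes the client class is importable as launch.LaunchClient (the package's usual import path); the API key and hostname values are placeholders.

```python
from launch import LaunchClient

# Hosted Scale Launch: only an API key is needed.
client = LaunchClient(api_key="YOUR_SCALE_API_KEY")

# Self-hosted Scale Launch: point the client at your own deployment.
self_hosted_client = LaunchClient(
    api_key="YOUR_INTERNAL_API_KEY",
    endpoint="https://launch.internal.example.com",
    self_hosted=True,
)
```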
batch_async_request ¶
batch_async_request(*, model_bundle: Union[ModelBundle, str], urls: Optional[List[str]] = None, inputs: Optional[List[Dict[str, Any]]] = None, batch_url_file_location: Optional[str] = None, serialization_format: str = 'JSON', labels: Optional[Dict[str, str]] = None, cpus: Optional[int] = None, memory: Optional[str] = None, gpus: Optional[int] = None, gpu_type: Optional[str] = None, storage: Optional[str] = None, max_workers: Optional[int] = None, per_worker: Optional[int] = None, timeout_seconds: Optional[float] = None) -> Dict[str, Any]
Sends a batch inference request using a given bundle. Returns a key that can be used to retrieve the results of inference at a later time.
Exactly one of urls or inputs must be passed in.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle | Union[ModelBundle, str] | The bundle or the name of the bundle to use for inference. | required
urls | Optional[List[str]] | A list of URLs, each pointing to a file containing model input. These must be accessible by Scale Launch, so each URL needs to be either public or a signed URL. | None
inputs | Optional[List[Dict[str, Any]]] | A list of model inputs. If provided, the inputs are uploaded and passed to Launch. | None
batch_url_file_location | Optional[str] | In self-hosted mode, the input to the batch job will be uploaded to this location if provided. Otherwise, one will be determined from bundle_location_fn() | None
serialization_format | str | Serialization format of output, either 'PICKLE' or 'JSON'. 'PICKLE' corresponds to pickling results + returning | 'JSON'
labels | Optional[Dict[str, str]] | An optional dictionary of key/value pairs to associate with this endpoint. | None
cpus | Optional[int] | Number of cpus each worker should get, e.g. 1, 2, etc. This must be greater than or equal to 1. | None
memory | Optional[str] | Amount of memory each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of memory. | None
storage | Optional[str] | Amount of local ephemeral storage each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of storage. | None
gpus | Optional[int] | Number of gpus each worker should get, e.g. 0, 1, etc. | None
max_workers | Optional[int] | The maximum number of workers. Must be greater than or equal to 0, as well as greater than or equal to | None
per_worker | Optional[int] | The maximum number of concurrent requests that an individual worker can service. Launch automatically scales the number of workers for the endpoint so that each worker is processing | None
gpu_type | Optional[str] | If specifying a non-zero number of gpus, this controls the type of gpu requested. Here are the supported values: | None
timeout_seconds | Optional[float] | The maximum amount of time (in seconds) that the batch job can take. If not specified, the server defaults to 12 hours. This includes the time required to build the endpoint and the total time required for all the individual tasks. | None

Returns:

Type | Description
---|---
Dict[str, Any] | A dictionary that contains
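A hedged end-to-end sketch of submitting a batch job and polling for its result, using the client created earlier. The bundle name, input URL, and resource values are placeholders, and the job_id and status keys of the returned dictionaries are assumed here since their descriptions are truncated above.

```python
import time

response = client.batch_async_request(
    model_bundle="my-bundle",                           # placeholder bundle name
    urls=["https://example.com/inputs/input-0.json"],   # placeholder input URL
    serialization_format="JSON",
    labels={"team": "ml-infra"},
    cpus=1,
    memory="2Gi",
    max_workers=2,
    per_worker=1,
)

job_id = response["job_id"]  # assumed key in the returned dictionary
while True:
    result = client.get_batch_async_response(job_id)
    if result.get("status") not in ("PENDING", "RUNNING"):  # assumed status values
        break
    time.sleep(30)
```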
cancel_fine_tune ¶
Cancel a fine-tune
Parameters:

Name | Type | Description | Default
---|---|---|---
fine_tune_id | str | ID of the fine-tune | required

Returns:

Name | Type | Description
---|---|---
CancelFineTuneResponse | CancelFineTuneResponse | whether the cancellation was successful
clone_model_bundle_with_changes ¶
clone_model_bundle_with_changes(model_bundle: Union[ModelBundle, str], app_config: Optional[Dict] = None) -> ModelBundle
Warning
This method is deprecated. Use
clone_model_bundle_with_changes_v2
instead.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle | Union[ModelBundle, str] | The existing bundle or its ID. | required
app_config | Optional[Dict] | The new bundle's app config; if not passed in, the new bundle's | None

Returns:

Type | Description
---|---
ModelBundle | A
clone_model_bundle_with_changes_v2 ¶
clone_model_bundle_with_changes_v2(original_model_bundle_id: str, new_app_config: Optional[Dict[str, Any]] = None) -> CreateModelBundleV2Response
Clone a model bundle with an optional new app_config.

Parameters:

Name | Type | Description | Default
---|---|---|---
original_model_bundle_id | str | The ID of the model bundle you want to clone. | required
new_app_config | Optional[Dict[str, Any]] | A dictionary of new app config values to use for the cloned model. | None

Returns:

Type | Description
---|---
CreateModelBundleV2Response | An object containing the following keys:
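A short sketch, using a placeholder bundle ID and an illustrative app_config key:

```python
clone_response = client.clone_model_bundle_with_changes_v2(
    original_model_bundle_id="bun_1234567890",   # placeholder bundle ID
    new_app_config={"threshold": 0.75},          # illustrative config value
)
print(clone_response)
```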
completions_stream ¶
completions_stream(endpoint_name: str, prompt: str, max_new_tokens: int, temperature: float, stop_sequences: Optional[List[str]] = None, return_token_log_probs: Optional[bool] = False, timeout: float = DEFAULT_LLM_COMPLETIONS_TIMEOUT) -> Iterable[CompletionStreamV1Response]
Run prompt completion on an LLM endpoint in streaming fashion. Will fail if endpoint does not support streaming.
Parameters:

Name | Type | Description | Default
---|---|---|---
endpoint_name | str | The name of the LLM endpoint to make the request to | required
prompt | str | The prompt to send to the endpoint | required
max_new_tokens | int | The maximum number of tokens to generate for each prompt | required
temperature | float | The temperature to use for sampling | required
stop_sequences | Optional[List[str]] | List of sequences to stop the completion at | None
return_token_log_probs | Optional[bool] | Whether to return the log probabilities of the tokens | False

Returns:

Type | Description
---|---
Iterable[CompletionStreamV1Response] | Iterable responses for prompt completion
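A minimal streaming sketch against an illustrative endpoint name; the exact field layout of each CompletionStreamV1Response is not assumed here.

```python
stream = client.completions_stream(
    endpoint_name="llama-2-7b-endpoint",   # illustrative endpoint name
    prompt="Write a haiku about autumn.",
    max_new_tokens=64,
    temperature=0.7,
)
for chunk in stream:
    # Each chunk is a CompletionStreamV1Response; inspect it to see the
    # generated text and (optionally) token log probabilities.
    print(chunk)
```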
completions_sync ¶
completions_sync(endpoint_name: str, prompt: str, max_new_tokens: int, temperature: float, stop_sequences: Optional[List[str]] = None, return_token_log_probs: Optional[bool] = False, timeout: float = DEFAULT_LLM_COMPLETIONS_TIMEOUT) -> CompletionSyncV1Response
Run prompt completion on a sync LLM endpoint. Will fail if the endpoint is not sync.
Parameters:

Name | Type | Description | Default
---|---|---|---
endpoint_name | str | The name of the LLM endpoint to make the request to | required
prompt | str | The completion prompt to send to the endpoint | required
max_new_tokens | int | The maximum number of tokens to generate for each prompt | required
temperature | float | The temperature to use for sampling | required
stop_sequences | Optional[List[str]] | List of sequences to stop the completion at | None
return_token_log_probs | Optional[bool] | Whether to return the log probabilities of the tokens | False

Returns:

Type | Description
---|---
CompletionSyncV1Response | Response for prompt completion
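And the synchronous equivalent, again with an illustrative endpoint name:

```python
response = client.completions_sync(
    endpoint_name="llama-2-7b-endpoint",   # illustrative endpoint name
    prompt="Summarize the plot of Hamlet in one sentence.",
    max_new_tokens=128,
    temperature=0.2,
    stop_sequences=["\n\n"],
)
print(response)  # CompletionSyncV1Response; inspect it for the generated text
```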
create_docker_image_batch_job ¶
create_docker_image_batch_job(*, labels: Dict[str, str], docker_image_batch_job_bundle: Optional[Union[str, DockerImageBatchJobBundleResponse]] = None, docker_image_batch_job_bundle_name: Optional[str] = None, job_config: Optional[Dict[str, Any]] = None, cpus: Optional[int] = None, memory: Optional[str] = None, gpus: Optional[int] = None, gpu_type: Optional[str] = None, storage: Optional[str] = None)
For self-hosted mode only.

Parameters:

docker_image_batch_job_bundle: Specifies the docker image bundle to use for the batch job. Either the string ID of a docker image bundle, or a DockerImageBatchJobBundleResponse object. Only one of docker_image_batch_job_bundle and docker_image_batch_job_bundle_name can be specified.

docker_image_batch_job_bundle_name: The name of a batch job bundle. If specified, Launch will use the most recent bundle with that name owned by the current user. Only one of docker_image_batch_job_bundle and docker_image_batch_job_bundle_name can be specified.

labels: Kubernetes labels that are present on the batch job.

job_config: A JSON-serializable Python object that will get passed to the batch job, specifically as the contents of a file mounted at mount_location inside the bundle. You can call Python's json.load() on the file to retrieve the contents.

cpus: Optional override for the number of cpus to give to your job. Either the default must be specified in the bundle, or this must be specified.

memory: Optional override for the amount of memory to give to your job. Either the default must be specified in the bundle, or this must be specified.

gpus: Optional number of gpus to give to the bundle. If not specified in the bundle or here, will be interpreted as 0 gpus.

gpu_type: Optional type of gpu. If the final number of gpus is positive, must be specified either in the bundle or here.

storage: Optional reserved amount of disk to give to your batch job. If not specified, your job may be evicted if it is using too much disk.
create_docker_image_batch_job_bundle ¶
create_docker_image_batch_job_bundle(*, name: str, image_repository: str, image_tag: str, command: List[str], env: Optional[Dict[str, str]] = None, mount_location: Optional[str] = None, cpus: Optional[int] = None, memory: Optional[str] = None, gpus: Optional[int] = None, gpu_type: Optional[str] = None, storage: Optional[str] = None) -> CreateDockerImageBatchJobBundleResponse
For self-hosted mode only.

Creates a Docker Image Batch Job Bundle.

Parameters:

Name | Type | Description | Default
---|---|---|---
name | str | A user-defined name for the bundle. Does not need to be unique. | required
image_repository | str | The (short) repository of your image. For example, if your image is located at 123456789012.dkr.ecr.us-west-2.amazonaws.com/repo:tag, and your version of Launch is configured to look at 123456789012.dkr.ecr.us-west-2.amazonaws.com for Docker images, you would pass the value | required
image_tag | str | The tag of your image inside of the repo. In the example above, you would pass the value | required
command | List[str] | The command to run inside the docker image. | required
env | Optional[Dict[str, str]] | A dictionary of environment variables to inject into your docker image. | None
mount_location | Optional[str] | A location in the filesystem where you would like a JSON-formatted file, controllable at runtime, to be mounted. This allows behavior to be specified at runtime. (Specifically, the contents of this file can be read via | None
cpus | Optional[int] | Optional default value for the number of cpus to give the job. | None
memory | Optional[str] | Optional default value for the amount of memory to give the job. | None
gpus | Optional[int] | Optional default value for the number of gpus to give the job. | None
gpu_type | Optional[str] | Optional default value for the type of gpu to give the job. | None
storage | Optional[str] | Optional default value for the amount of disk to give the job. | None
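A hedged sketch that creates a bundle and then launches a job from it. The repository, tag, command, and config values are placeholders; the image must live in the registry your Launch deployment is configured to pull from.

```python
bundle = client.create_docker_image_batch_job_bundle(
    name="nightly-report",                 # placeholder bundle name
    image_repository="my-batch-jobs",      # placeholder repository
    image_tag="v1.0.0",                    # placeholder tag
    command=["python", "run_report.py"],   # placeholder entrypoint
    env={"LOG_LEVEL": "INFO"},
    mount_location="/app/config.json",
    cpus=2,
    memory="4Gi",
)

# Launch a job from the most recent bundle with that name. job_config is
# written to the file mounted at mount_location inside the container.
client.create_docker_image_batch_job(
    labels={"team": "analytics"},
    docker_image_batch_job_bundle_name="nightly-report",
    job_config={"report_date": "2023-01-01"},
)
```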
create_fine_tune ¶
create_fine_tune(model: str, training_file: str, validation_file: Optional[str] = None, fine_tuning_method: Optional[str] = None, hyperparameters: Optional[Dict[str, str]] = None, wandb_config: Optional[Dict[str, Any]] = None, suffix: str = None) -> CreateFineTuneResponse
Create a fine-tune
Parameters:

Name | Type | Description | Default
---|---|---|---
model | str | Identifier of base model to train from. | required
training_file | str | Path to file of training dataset. Dataset must be a csv with columns 'prompt' and 'response'. | required
validation_file | Optional[str] | Path to file of validation dataset. Has the same format as training_file. If not provided, we will generate a split from the training dataset. | None
fine_tuning_method | Optional[str] | Fine-tuning method. Currently unused, but when different techniques are implemented we will expose this field. | None
hyperparameters | Optional[Dict[str, str]] | Hyperparameters to pass in to the training job. | None
wandb_config | Optional[Dict[str, Any]] | Configuration for Weights and Biases. To enable set | None
suffix | str | Optional user-provided identifier suffix for the fine-tuned model. | None

Returns:

Name | Type | Description
---|---|---
CreateFineTuneResponse | CreateFineTuneResponse | ID of the created fine-tune
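A hedged lifecycle sketch: create a fine-tune, check its status and events, and cancel it. The model identifier, dataset locations, and the .id attribute on the response are illustrative assumptions.

```python
create_response = client.create_fine_tune(
    model="llama-2-7b",                        # illustrative base model identifier
    training_file="s3://my-bucket/train.csv",  # placeholder dataset locations
    validation_file="s3://my-bucket/val.csv",
    hyperparameters={"epochs": "3"},
    suffix="support-bot",
)
fine_tune_id = create_response.id              # assumed attribute name

print(client.get_fine_tune(fine_tune_id))          # current status
print(client.get_fine_tune_events(fine_tune_id))   # training events so far

client.cancel_fine_tune(fine_tune_id)              # stop the job if needed
```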
create_llm_model_endpoint ¶
create_llm_model_endpoint(endpoint_name: str, model_name: str, inference_framework_image_tag: str, source: LLMSource = LLMSource.HUGGING_FACE, inference_framework: LLMInferenceFramework = LLMInferenceFramework.DEEPSPEED, num_shards: int = 4, quantize: Optional[Quantization] = None, checkpoint_path: Optional[str] = None, cpus: int = 32, memory: str = '192Gi', storage: Optional[str] = None, gpus: int = 4, min_workers: int = 0, max_workers: int = 1, per_worker: int = 10, gpu_type: Optional[str] = 'nvidia-ampere-a10', endpoint_type: str = 'sync', high_priority: Optional[bool] = False, post_inference_hooks: Optional[List[PostInferenceHooks]] = None, default_callback_url: Optional[str] = None, default_callback_auth_kind: Optional[Literal['basic', 'mtls']] = None, default_callback_auth_username: Optional[str] = None, default_callback_auth_password: Optional[str] = None, default_callback_auth_cert: Optional[str] = None, default_callback_auth_key: Optional[str] = None, public_inference: Optional[bool] = None, update_if_exists: bool = False, labels: Optional[Dict[str, str]] = None)
Creates and registers a model endpoint in Scale Launch. The returned object is an instance of type Endpoint, which is a base class of either SyncEndpoint or AsyncEndpoint. This is the object to which you send inference requests.
Parameters:

Name | Type | Description | Default
---|---|---|---
endpoint_name | str | The name of the model endpoint you want to create. The name must be unique across all endpoints that you own. | required
model_name | str | Name for the LLM. List can be found at (TODO: add list of supported models) | required
inference_framework_image_tag | str | Image tag for the inference framework. (TODO: use latest image tag when unspecified) | required
source | LLMSource | Source of the LLM. Currently only HuggingFace is supported. | HUGGING_FACE
inference_framework | LLMInferenceFramework | Inference framework for the LLM. Currently only DeepSpeed is supported. | DEEPSPEED
num_shards | int | Number of shards for the LLM. When bigger than 1, the LLM will be sharded to multiple GPUs. The number of GPUs must be larger than num_shards. | 4
quantize | Optional[Quantization] | Quantization method for the LLM. Only affects behavior for text-generation-inference models. | None
checkpoint_path | Optional[str] | Path to the checkpoint to load the model from. Only affects behavior for text-generation-inference models. | None
cpus | int | Number of cpus each worker should get, e.g. 1, 2, etc. This must be greater than or equal to 1. | 32
memory | str | Amount of memory each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of memory. | '192Gi'
storage | Optional[str] | Amount of local ephemeral storage each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of storage. | None
gpus | int | Number of gpus each worker should get, e.g. 0, 1, etc. | 4
min_workers | int | The minimum number of workers. Must be greater than or equal to 0. This should be determined by computing the minimum throughput of your workload and dividing it by the throughput of a single worker. This field must be at least | 0
max_workers | int | The maximum number of workers. Must be greater than or equal to 0, as well as greater than or equal to | 1
per_worker | int | The maximum number of concurrent requests that an individual worker can service. Launch automatically scales the number of workers for the endpoint so that each worker is processing Here is our recommendation for computing | 10
gpu_type | Optional[str] | If specifying a non-zero number of gpus, this controls the type of gpu requested. Here are the supported values: | 'nvidia-ampere-a10'
endpoint_type | str | Either | 'sync'
high_priority | Optional[bool] | Either | False
post_inference_hooks | Optional[List[PostInferenceHooks]] | List of hooks to trigger after inference tasks are served. | None
default_callback_url | Optional[str] | The default callback url to use for async endpoints. This can be overridden in the task parameters for each individual task. post_inference_hooks must contain "callback" for the callback to be triggered. | None
default_callback_auth_kind | Optional[Literal['basic', 'mtls']] | The default callback auth kind to use for async endpoints. Either "basic" or "mtls". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_username | Optional[str] | The default callback auth username to use. This only applies if default_callback_auth_kind is "basic". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_password | Optional[str] | The default callback auth password to use. This only applies if default_callback_auth_kind is "basic". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_cert | Optional[str] | The default callback auth cert to use. This only applies if default_callback_auth_kind is "mtls". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_key | Optional[str] | The default callback auth key to use. This only applies if default_callback_auth_kind is "mtls". This can be overridden in the task parameters for each individual task. | None
public_inference | Optional[bool] | If | None
update_if_exists | bool | If | False
labels | Optional[Dict[str, str]] | An optional dictionary of key/value pairs to associate with this endpoint. | None

Returns:

Type | Description
---|---
Endpoint | An Endpoint object that can be used to make requests to the endpoint.
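A hedged sketch of creating a small sync LLM endpoint. The endpoint name, model name, and image tag are illustrative; consult your deployment for the supported model names and tags.

```python
client.create_llm_model_endpoint(
    endpoint_name="llama-2-7b-endpoint",     # illustrative endpoint name
    model_name="llama-2-7b",                 # illustrative model name
    inference_framework_image_tag="latest",  # illustrative image tag
    num_shards=1,
    cpus=8,
    memory="24Gi",
    storage="40Gi",
    gpus=1,
    gpu_type="nvidia-ampere-a10",
    min_workers=0,
    max_workers=1,
    per_worker=10,
    endpoint_type="sync",
    labels={"team": "nlp"},
)
```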
create_model_bundle ¶
create_model_bundle(model_bundle_name: str, env_params: Dict[str, str], *, load_predict_fn: Optional[Callable[[LaunchModel_T], Callable[[Any], Any]]] = None, predict_fn_or_cls: Optional[Callable[[Any], Any]] = None, requirements: Optional[List[str]] = None, model: Optional[LaunchModel_T] = None, load_model_fn: Optional[Callable[[], LaunchModel_T]] = None, app_config: Optional[Union[Dict[str, Any], str]] = None, globals_copy: Optional[Dict[str, Any]] = None, request_schema: Optional[Type[BaseModel]] = None, response_schema: Optional[Type[BaseModel]] = None) -> ModelBundle
Warning
This method is deprecated. Use
create_model_bundle_from_callable_v2
instead.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to create. The name must be unique across all bundles that you own. | required
predict_fn_or_cls | Optional[Callable[[Any], Any]] |  | None
model | Optional[LaunchModel_T] | Typically a trained Neural Network, e.g. a Pytorch module. Exactly one of | None
load_model_fn | Optional[Callable[[], LaunchModel_T]] | A function that, when run, loads a model. This function is essentially a deferred wrapper around the Exactly one of | None
load_predict_fn | Optional[Callable[[LaunchModel_T], Callable[[Any], Any]]] | Function that, when called with a model, returns a function that carries out inference. If Otherwise, if In both cases, | None
requirements | Optional[List[str]] | A list of python package requirements, where each list element is of the form If you do not pass in a value for | None
app_config | Optional[Union[Dict[str, Any], str]] | Either a Dictionary that represents a YAML file contents or a local path to a YAML file. | None
env_params | Dict[str, str] | A dictionary that dictates environment information, e.g. the use of pytorch or tensorflow, which base image tag to use, etc. Specifically, the dictionary should contain the following keys: | required
globals_copy | Optional[Dict[str, Any]] | Dictionary of the global symbol table. Normally provided by | None
request_schema | Optional[Type[BaseModel]] | A pydantic model that represents the request schema for the model bundle. This is used to validate the request body for the model bundle's endpoint. | None
response_schema | Optional[Type[BaseModel]] | A pydantic model that represents the response schema for the model bundle. This is used to validate the response for the model bundle's endpoint. Note: If request_schema is specified, then response_schema must also be specified. | None
create_model_bundle_from_callable_v2 ¶
create_model_bundle_from_callable_v2(*, model_bundle_name: str, load_predict_fn: Callable[[LaunchModel_T], Callable[[Any], Any]], load_model_fn: Callable[[], LaunchModel_T], request_schema: Type[BaseModel], response_schema: Type[BaseModel], requirements: Optional[List[str]] = None, pytorch_image_tag: Optional[str] = None, tensorflow_version: Optional[str] = None, custom_base_image_repository: Optional[str] = None, custom_base_image_tag: Optional[str] = None, app_config: Optional[Union[Dict[str, Any], str]] = None, metadata: Optional[Dict[str, Any]] = None) -> CreateModelBundleV2Response
Uploads and registers a model bundle to Scale Launch.
Parameters:
Returns:

Type | Description
---|---
CreateModelBundleV2Response | An object containing the following keys:
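A hedged end-to-end sketch of registering a callable bundle. The bundle name, schemas, toy model, requirements, and pytorch_image_tag value are all illustrative, and the exact shape of the inference arguments depends on how your request schema is serialized.

```python
from pydantic import BaseModel


class DoubleRequest(BaseModel):
    x: float


class DoubleResponse(BaseModel):
    y: float


def load_model() -> float:
    # Stand-in for loading a real model (e.g. a trained torch module).
    return 2.0


def load_predict_fn(model: float):
    # Returns the function that actually serves inference requests.
    def predict(x: float) -> dict:
        return {"y": model * x}
    return predict


bundle_response = client.create_model_bundle_from_callable_v2(
    model_bundle_name="doubler-bundle",                  # illustrative bundle name
    load_predict_fn=load_predict_fn,
    load_model_fn=load_model,
    request_schema=DoubleRequest,
    response_schema=DoubleResponse,
    requirements=["pydantic==1.10.12"],                  # illustrative requirements
    pytorch_image_tag="1.12.1-cuda11.3-cudnn8-runtime",  # illustrative image tag
)
```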
create_model_bundle_from_dirs ¶
create_model_bundle_from_dirs(*, model_bundle_name: str, base_paths: List[str], requirements_path: str, env_params: Dict[str, str], load_predict_fn_module_path: str, load_model_fn_module_path: str, app_config: Optional[Union[Dict[str, Any], str]] = None, request_schema: Optional[Type[BaseModel]] = None, response_schema: Optional[Type[BaseModel]] = None) -> ModelBundle
Warning
This method is deprecated. Use
create_model_bundle_from_dirs_v2
instead.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to create. The name must be unique across all bundles that you own. | required
base_paths | List[str] | The paths on the local filesystem where the bundle code lives. | required
requirements_path | str | A path on the local filesystem where a | required
env_params | Dict[str, str] | A dictionary that dictates environment information, e.g. the use of pytorch or tensorflow, which base image tag to use, etc. Specifically, the dictionary should contain the following keys: Example: | required
load_predict_fn_module_path | str | A python module path for a function that, when called with the output of load_model_fn_module_path, returns a function that carries out inference. | required
load_model_fn_module_path | str | A python module path for a function that returns a model. The output feeds into the function located at load_predict_fn_module_path. | required
app_config | Optional[Union[Dict[str, Any], str]] | Either a Dictionary that represents a YAML file contents or a local path to a YAML file. | None
request_schema | Optional[Type[BaseModel]] | A pydantic model that represents the request schema for the model bundle. This is used to validate the request body for the model bundle's endpoint. | None
response_schema | Optional[Type[BaseModel]] | A pydantic model that represents the response schema for the model bundle. This is used to validate the response for the model bundle's endpoint. Note: If request_schema is specified, then response_schema must also be specified. | None
create_model_bundle_from_dirs_v2 ¶
create_model_bundle_from_dirs_v2(*, model_bundle_name: str, base_paths: List[str], load_predict_fn_module_path: str, load_model_fn_module_path: str, request_schema: Type[BaseModel], response_schema: Type[BaseModel], requirements_path: Optional[str] = None, pytorch_image_tag: Optional[str] = None, tensorflow_version: Optional[str] = None, custom_base_image_repository: Optional[str] = None, custom_base_image_tag: Optional[str] = None, app_config: Optional[Dict[str, Any]] = None, metadata: Optional[Dict[str, Any]] = None) -> CreateModelBundleV2Response
Packages up code from one or more local filesystem folders and uploads them as a bundle to Scale Launch. In this mode, a bundle is just local code instead of a serialized object.
For example, if you have a directory structure like so, and your current working directory is my_root:

my_root/
    my_module1/
        __init__.py
        ...files and directories
        my_inference_file.py
    my_module2/
        __init__.py
        ...files and directories

then calling create_model_bundle_from_dirs_v2 with base_paths=["my_module1", "my_module2"] essentially creates a zip file without the root directory, e.g.:

my_module1/
    __init__.py
    ...files and directories
    my_inference_file.py
my_module2/
    __init__.py
    ...files and directories

and these contents will be unzipped relative to the server-side application root. Bear these points in mind when referencing Python module paths for this bundle. For instance, if my_inference_file.py has def f(...) as the desired inference loading function, then the load_predict_fn_module_path argument should be my_module1.my_inference_file.f.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to create. | required
base_paths | List[str] | A list of paths to directories that will be zipped up and uploaded as a bundle. Each path must be relative to the current working directory. | required
load_predict_fn_module_path | str | The Python module path to the function that will be used to load the model for inference. This function should take in a path to a model directory, and return a model object. The model object should be pickleable. | required
load_model_fn_module_path | str | The Python module path to the function that will be used to load the model for training. This function should take in a path to a model directory, and return a model object. The model object should be pickleable. | required
request_schema | Type[BaseModel] | A Pydantic model that defines the request schema for the bundle. | required
response_schema | Type[BaseModel] | A Pydantic model that defines the response schema for the bundle. | required
requirements_path | Optional[str] | Path to a requirements.txt file that will be used to install dependencies for the bundle. This file must be relative to the current working directory. | None
pytorch_image_tag | Optional[str] | The image tag for the PyTorch image that will be used to run the bundle. Exactly one of | None
tensorflow_version | Optional[str] | The version of TensorFlow that will be used to run the bundle. If not specified, the default version will be used. Exactly one of | None
custom_base_image_repository | Optional[str] | The repository for a custom base image that will be used to run the bundle. If not specified, the default base image will be used. Exactly one of | None
custom_base_image_tag | Optional[str] | The tag for a custom base image that will be used to run the bundle. Must be specified if | None
app_config | Optional[Dict[str, Any]] | An optional dictionary of configuration values that will be passed to the bundle when it is run. These values can be accessed by the bundle via the | None
metadata | Optional[Dict[str, Any]] | Metadata to record with the bundle. | None

Returns:

Type | Description
---|---
CreateModelBundleV2Response | An object containing the following keys:
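A hedged sketch using the directory layout above. The function names inside my_inference_file.py (f and a hypothetical load_model helper), the schemas, and the image tag are illustrative.

```python
from pydantic import BaseModel


class MyRequest(BaseModel):
    x: float


class MyResponse(BaseModel):
    y: float


bundle_response = client.create_model_bundle_from_dirs_v2(
    model_bundle_name="dirs-bundle",                          # illustrative name
    base_paths=["my_module1", "my_module2"],
    load_predict_fn_module_path="my_module1.my_inference_file.f",
    load_model_fn_module_path="my_module1.my_inference_file.load_model",  # hypothetical helper
    request_schema=MyRequest,
    response_schema=MyResponse,
    requirements_path="requirements.txt",
    pytorch_image_tag="1.12.1-cuda11.3-cudnn8-runtime",       # illustrative tag
)
```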
create_model_bundle_from_runnable_image_v2 ¶
create_model_bundle_from_runnable_image_v2(*, model_bundle_name: str, request_schema: Type[BaseModel], response_schema: Type[BaseModel], repository: str, tag: str, command: List[str], healthcheck_route: Optional[str] = None, predict_route: Optional[str] = None, env: Dict[str, str], readiness_initial_delay_seconds: int, metadata: Optional[Dict[str, Any]] = None) -> CreateModelBundleV2Response
Create a model bundle from a runnable image. The specified command must start a process that will listen for requests on port 5005 using HTTP. Inference requests must be served at the POST /predict route, while the GET /readyz route is a healthcheck.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to create. | required
request_schema | Type[BaseModel] | A Pydantic model that defines the request schema for the bundle. | required
response_schema | Type[BaseModel] | A Pydantic model that defines the response schema for the bundle. | required
repository | str | The name of the Docker repository for the runnable image. | required
tag | str | The tag for the runnable image. | required
command | List[str] | The command that will be used to start the process that listens for requests. | required
predict_route | Optional[str] | The endpoint route on the runnable image that will be called. | None
healthcheck_route | Optional[str] | The healthcheck endpoint route on the runnable image. | None
env | Dict[str, str] | A dictionary of environment variables that will be passed to the bundle when it is run. | required
readiness_initial_delay_seconds | int | The number of seconds to wait for the HTTP server to become ready and successfully respond on its healthcheck. | required
metadata | Optional[Dict[str, Any]] | Metadata to record with the bundle. | None

Returns:

Type | Description
---|---
CreateModelBundleV2Response | An object containing the following keys:
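A hedged sketch of registering a runnable-image bundle. The repository, tag, command, and schemas are placeholders; the container itself is expected to serve HTTP on port 5005 with the routes described above.

```python
from pydantic import BaseModel


class ClassifyRequest(BaseModel):
    text: str


class ClassifyResponse(BaseModel):
    label: str


bundle_response = client.create_model_bundle_from_runnable_image_v2(
    model_bundle_name="classifier-runnable",   # illustrative bundle name
    request_schema=ClassifyRequest,
    response_schema=ClassifyResponse,
    repository="my-classifier",                # placeholder repository
    tag="v2",                                  # placeholder tag
    command=["python", "-m", "my_server", "--port", "5005"],  # hypothetical server module
    env={"MODEL_NAME": "classifier"},
    readiness_initial_delay_seconds=30,
)
```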
create_model_bundle_from_streaming_enhanced_runnable_image_v2 ¶
create_model_bundle_from_streaming_enhanced_runnable_image_v2(*, model_bundle_name: str, request_schema: Type[BaseModel], response_schema: Type[BaseModel], repository: str, tag: str, command: Optional[List[str]] = None, healthcheck_route: Optional[str] = None, predict_route: Optional[str] = None, streaming_command: List[str], streaming_predict_route: Optional[str] = None, env: Dict[str, str], readiness_initial_delay_seconds: int, metadata: Optional[Dict[str, Any]] = None) -> CreateModelBundleV2Response
Create a model bundle from a runnable image. The specified command must start a process that will listen for requests on port 5005 using HTTP. Inference requests must be served at the POST /predict route, while the GET /readyz route is a healthcheck.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to create. | required
request_schema | Type[BaseModel] | A Pydantic model that defines the request schema for the bundle. | required
response_schema | Type[BaseModel] | A Pydantic model that defines the response schema for the bundle. | required
repository | str | The name of the Docker repository for the runnable image. | required
tag | str | The tag for the runnable image. | required
command | Optional[List[str]] | The command that will be used to start the process that listens for requests if this bundle is used as a SYNC or ASYNC endpoint. | None
healthcheck_route | Optional[str] | The healthcheck endpoint route on the runnable image. | None
predict_route | Optional[str] | The endpoint route on the runnable image that will be called if this bundle is used as a SYNC or ASYNC endpoint. | None
streaming_command | List[str] | The command that will be used to start the process that listens for requests if this bundle is used as a STREAMING endpoint. | required
streaming_predict_route | Optional[str] | The endpoint route on the runnable image that will be called if this bundle is used as a STREAMING endpoint. | None
env | Dict[str, str] | A dictionary of environment variables that will be passed to the bundle when it is run. | required
readiness_initial_delay_seconds | int | The number of seconds to wait for the HTTP server to become ready and successfully respond on its healthcheck. | required
metadata | Optional[Dict[str, Any]] | Metadata to record with the bundle. | None

Returns:

Type | Description
---|---
CreateModelBundleV2Response | An object containing the following keys:
create_model_bundle_from_triton_enhanced_runnable_image_v2 ¶
create_model_bundle_from_triton_enhanced_runnable_image_v2(*, model_bundle_name: str, request_schema: Type[BaseModel], response_schema: Type[BaseModel], repository: str, tag: str, command: List[str], healthcheck_route: Optional[str] = None, predict_route: Optional[str] = None, env: Dict[str, str], readiness_initial_delay_seconds: int, triton_model_repository: str, triton_model_replicas: Optional[Dict[str, str]] = None, triton_num_cpu: float, triton_commit_tag: str, triton_storage: Optional[str] = None, triton_memory: Optional[str] = None, triton_readiness_initial_delay_seconds: int, metadata: Optional[Dict[str, Any]] = None) -> CreateModelBundleV2Response
Create a model bundle from a runnable image and a tritonserver image. Same requirements as create_model_bundle_from_runnable_image_v2, with additional constraints necessary for configuring tritonserver's execution.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to create. | required
request_schema | Type[BaseModel] | A Pydantic model that defines the request schema for the bundle. | required
response_schema | Type[BaseModel] | A Pydantic model that defines the response schema for the bundle. | required
repository | str | The name of the Docker repository for the runnable image. | required
tag | str | The tag for the runnable image. | required
command | List[str] | The command that will be used to start the process that listens for requests. | required
predict_route | Optional[str] | The endpoint route on the runnable image that will be called. | None
healthcheck_route | Optional[str] | The healthcheck endpoint route on the runnable image. | None
env | Dict[str, str] | A dictionary of environment variables that will be passed to the bundle when it is run. | required
readiness_initial_delay_seconds | int | The number of seconds to wait for the HTTP server to become ready and successfully respond on its healthcheck. | required
triton_model_repository | str | The S3 prefix that contains the contents of the model repository, formatted according to https://github.com/triton-inference-server/server/blob/main/docs/user_guide/model_repository.md | required
triton_model_replicas | Optional[Dict[str, str]] | If supplied, the name and number of replicas to make for each model. | None
triton_num_cpu | float | Number of CPUs, fractional, to allocate to tritonserver. | required
triton_commit_tag | str | The image tag of the specific tritonserver version. | required
triton_storage | Optional[str] | Amount of storage space to allocate for the tritonserver container. | None
triton_memory | Optional[str] | Amount of memory to allocate for the tritonserver container. | None
triton_readiness_initial_delay_seconds | int | Like readiness_initial_delay_seconds, but for tritonserver's own healthcheck. | required
metadata | Optional[Dict[str, Any]] | Metadata to record with the bundle. | None

Returns:

Type | Description
---|---
CreateModelBundleV2Response | An object containing the following keys:
create_model_endpoint ¶
create_model_endpoint(*, endpoint_name: str, model_bundle: Union[ModelBundle, str], cpus: int = 3, memory: str = '8Gi', storage: str = '16Gi', gpus: int = 0, min_workers: int = 1, max_workers: int = 1, per_worker: int = 10, gpu_type: Optional[str] = None, endpoint_type: str = 'sync', high_priority: Optional[bool] = False, post_inference_hooks: Optional[List[PostInferenceHooks]] = None, default_callback_url: Optional[str] = None, default_callback_auth_kind: Optional[Literal['basic', 'mtls']] = None, default_callback_auth_username: Optional[str] = None, default_callback_auth_password: Optional[str] = None, default_callback_auth_cert: Optional[str] = None, default_callback_auth_key: Optional[str] = None, public_inference: Optional[bool] = None, update_if_exists: bool = False, labels: Optional[Dict[str, str]] = None) -> Optional[Endpoint]
Creates and registers a model endpoint in Scale Launch. The returned object is an instance of type Endpoint, which is a base class of either SyncEndpoint or AsyncEndpoint. This is the object to which you send inference requests.
Parameters:

Name | Type | Description | Default
---|---|---|---
endpoint_name | str | The name of the model endpoint you want to create. The name must be unique across all endpoints that you own. | required
model_bundle | Union[ModelBundle, str] | The | required
cpus | int | Number of cpus each worker should get, e.g. 1, 2, etc. This must be greater than or equal to 1. | 3
memory | str | Amount of memory each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of memory. | '8Gi'
storage | str | Amount of local ephemeral storage each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of storage. | '16Gi'
gpus | int | Number of gpus each worker should get, e.g. 0, 1, etc. | 0
min_workers | int | The minimum number of workers. Must be greater than or equal to 0. This should be determined by computing the minimum throughput of your workload and dividing it by the throughput of a single worker. This field must be at least | 1
max_workers | int | The maximum number of workers. Must be greater than or equal to 0, as well as greater than or equal to | 1
per_worker | int | The maximum number of concurrent requests that an individual worker can service. Launch automatically scales the number of workers for the endpoint so that each worker is processing Here is our recommendation for computing | 10
gpu_type | Optional[str] | If specifying a non-zero number of gpus, this controls the type of gpu requested. Here are the supported values: | None
endpoint_type | str | Either | 'sync'
high_priority | Optional[bool] | Either | False
post_inference_hooks | Optional[List[PostInferenceHooks]] | List of hooks to trigger after inference tasks are served. | None
default_callback_url | Optional[str] | The default callback url to use for async endpoints. This can be overridden in the task parameters for each individual task. post_inference_hooks must contain "callback" for the callback to be triggered. | None
default_callback_auth_kind | Optional[Literal['basic', 'mtls']] | The default callback auth kind to use for async endpoints. Either "basic" or "mtls". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_username | Optional[str] | The default callback auth username to use. This only applies if default_callback_auth_kind is "basic". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_password | Optional[str] | The default callback auth password to use. This only applies if default_callback_auth_kind is "basic". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_cert | Optional[str] | The default callback auth cert to use. This only applies if default_callback_auth_kind is "mtls". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_key | Optional[str] | The default callback auth key to use. This only applies if default_callback_auth_kind is "mtls". This can be overridden in the task parameters for each individual task. | None
public_inference | Optional[bool] | If | None
update_if_exists | bool | If | False
labels | Optional[Dict[str, str]] | An optional dictionary of key/value pairs to associate with this endpoint. | None

Returns:

Type | Description
---|---
Optional[Endpoint] | An Endpoint object that can be used to make requests to the endpoint.
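A hedged sketch that deploys the callable bundle from an earlier sketch as an async CPU endpoint; the names and resource values are illustrative.

```python
endpoint = client.create_model_endpoint(
    endpoint_name="doubler-endpoint",    # illustrative endpoint name
    model_bundle="doubler-bundle",       # bundle name from the earlier sketch
    cpus=1,
    memory="2Gi",
    storage="4Gi",
    gpus=0,
    min_workers=1,
    max_workers=2,
    per_worker=10,
    endpoint_type="async",
    labels={"team": "ml-infra"},
)
```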
delete_file ¶
Delete a file
Parameters:

Name | Type | Description | Default
---|---|---|---
file_id | str | ID of the file | required

Returns:

Name | Type | Description
---|---|---
DeleteFileResponse | DeleteFileResponse | whether the deletion was successful
delete_llm_model_endpoint ¶
Deletes an LLM model endpoint.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_endpoint_name | str | The name of the model endpoint to delete. | required
delete_model_endpoint ¶
Deletes a model endpoint.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_endpoint |  | A | required
edit_model_endpoint ¶
edit_model_endpoint(*, model_endpoint: Union[ModelEndpoint, str], model_bundle: Optional[Union[ModelBundle, str]] = None, cpus: Optional[float] = None, memory: Optional[str] = None, storage: Optional[str] = None, gpus: Optional[int] = None, min_workers: Optional[int] = None, max_workers: Optional[int] = None, per_worker: Optional[int] = None, gpu_type: Optional[str] = None, high_priority: Optional[bool] = None, post_inference_hooks: Optional[List[PostInferenceHooks]] = None, default_callback_url: Optional[str] = None, default_callback_auth_kind: Optional[Literal['basic', 'mtls']] = None, default_callback_auth_username: Optional[str] = None, default_callback_auth_password: Optional[str] = None, default_callback_auth_cert: Optional[str] = None, default_callback_auth_key: Optional[str] = None, public_inference: Optional[bool] = None) -> None
Edits an existing model endpoint. The following fields cannot be edited on an existing endpoint:

- The endpoint's name.
- The endpoint's type (i.e. you cannot go from a SyncEndpoint to an AsyncEndpoint or vice versa).
Parameters:

Name | Type | Description | Default
---|---|---|---
model_endpoint | Union[ModelEndpoint, str] | The model endpoint (or its name) you want to edit. The name must be unique across all endpoints that you own. | required
model_bundle | Optional[Union[ModelBundle, str]] | The | None
cpus | Optional[float] | Number of cpus each worker should get, e.g. 1, 2, etc. This must be greater than or equal to 1. | None
memory | Optional[str] | Amount of memory each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of memory. | None
storage | Optional[str] | Amount of local ephemeral storage each worker should get, e.g. "4Gi", "512Mi", etc. This must be a positive amount of storage. | None
gpus | Optional[int] | Number of gpus each worker should get, e.g. 0, 1, etc. | None
min_workers | Optional[int] | The minimum number of workers. Must be greater than or equal to 0. | None
max_workers | Optional[int] | The maximum number of workers. Must be greater than or equal to 0, as well as greater than or equal to | None
per_worker | Optional[int] | The maximum number of concurrent requests that an individual worker can service. Launch automatically scales the number of workers for the endpoint so that each worker is processing | None
gpu_type | Optional[str] | If specifying a non-zero number of gpus, this controls the type of gpu requested. Here are the supported values: | None
high_priority | Optional[bool] | Either | None
post_inference_hooks | Optional[List[PostInferenceHooks]] | List of hooks to trigger after inference tasks are served. | None
default_callback_url | Optional[str] | The default callback url to use for async endpoints. This can be overridden in the task parameters for each individual task. post_inference_hooks must contain "callback" for the callback to be triggered. | None
default_callback_auth_kind | Optional[Literal['basic', 'mtls']] | The default callback auth kind to use for async endpoints. Either "basic" or "mtls". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_username | Optional[str] | The default callback auth username to use. This only applies if default_callback_auth_kind is "basic". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_password | Optional[str] | The default callback auth password to use. This only applies if default_callback_auth_kind is "basic". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_cert | Optional[str] | The default callback auth cert to use. This only applies if default_callback_auth_kind is "mtls". This can be overridden in the task parameters for each individual task. | None
default_callback_auth_key | Optional[str] | The default callback auth key to use. This only applies if default_callback_auth_kind is "mtls". This can be overridden in the task parameters for each individual task. | None
public_inference | Optional[bool] | If | None
get_batch_async_response ¶
Gets inference results from a previously created batch job.
Parameters:

Name | Type | Description | Default
---|---|---|---
batch_job_id | str | An id representing the batch task job. This id is in the response from calling | required

Returns:

Type | Description
---|---
Dict[str, Any] | A dictionary that contains the following fields:
get_docker_image_batch_job ¶
For self-hosted mode only. Gets information about a batch job given a batch job id.
get_docker_image_batch_job_bundle ¶
get_docker_image_batch_job_bundle(docker_image_batch_job_bundle_id: str) -> DockerImageBatchJobBundleResponse
For self-hosted mode only. Gets information for a single batch job bundle with a given id.
get_file ¶
Get metadata about a file
Parameters:

Name | Type | Description | Default
---|---|---|---
file_id | str | ID of the file | required

Returns:

Name | Type | Description
---|---|---
GetFileResponse | GetFileResponse | ID, filename, and size of the requested file
get_file_content ¶
Get a file's content
Parameters:

Name | Type | Description | Default
---|---|---|---
file_id | str | ID of the file | required

Returns:

Name | Type | Description
---|---|---
GetFileContentResponse | GetFileContentResponse | ID and content of the requested file
get_fine_tune ¶
Get status of a fine-tune
Parameters:

Name | Type | Description | Default
---|---|---|---
fine_tune_id | str | ID of the fine-tune | required

Returns:

Name | Type | Description
---|---|---
GetFineTuneResponse | GetFineTuneResponse | ID and status of the requested fine-tune
get_fine_tune_events ¶
Get list of fine-tune events
Parameters:

Name | Type | Description | Default
---|---|---|---
fine_tune_id | str | ID of the fine-tune | required

Returns:

Name | Type | Description
---|---|---
GetFineTuneEventsResponse | GetFineTuneEventsResponse | a list of all the events of the fine-tune
get_latest_docker_image_batch_job_bundle ¶
For self-hosted mode only. Gets information for the latest batch job bundle with a given name.
get_latest_model_bundle_v2 ¶
Get the latest version of a model bundle.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_name | str | The name of the model bundle you want to get. | required

Returns:

Type | Description
---|---
ModelBundleV2Response | An object containing the following keys:
get_llm_model_endpoint ¶
get_llm_model_endpoint(endpoint_name: str) -> Optional[Union[AsyncEndpoint, SyncEndpoint, StreamingEndpoint]]
Gets a model endpoint associated with a name that the user has access to.
Parameters:

Name | Type | Description | Default
---|---|---|---
endpoint_name | str | The name of the endpoint to retrieve. | required
get_model_bundle ¶
Returns a model bundle specified by bundle_name that the user owns.

Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle | Union[ModelBundle, str] | The bundle or its name. | required

Returns:

Type | Description
---|---
ModelBundle | A
get_model_bundle_v2 ¶
Get a model bundle.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_bundle_id | str | The ID of the model bundle you want to get. | required

Returns:

Type | Description
---|---
ModelBundleV2Response | An object containing the following fields:
get_model_endpoint ¶
Gets a model endpoint associated with a name.
Parameters:

Name | Type | Description | Default
---|---|---|---
endpoint_name | str | The name of the endpoint to retrieve. | required
list_docker_image_batch_job_bundles ¶
list_docker_image_batch_job_bundles(bundle_name: Optional[str] = None, order_by: Optional[Literal['newest', 'oldest']] = None) -> ListDockerImageBatchJobBundleResponse
For self-hosted mode only. Gets information for multiple bundles.

Parameters:

Name | Type | Description | Default
---|---|---|---
bundle_name | Optional[str] | The name of the bundles to retrieve. If not specified, this will retrieve all | None
order_by | Optional[Literal['newest', 'oldest']] | Either "newest", "oldest", or not specified. Specify to sort by newest/oldest. | None
list_files ¶
List files
Returns:

Name | Type | Description
---|---|---
ListFilesResponse | ListFilesResponse | list of all files (ID, filename, and size)
list_fine_tunes ¶
List fine-tunes
Returns:

Name | Type | Description
---|---|---
ListFineTunesResponse | ListFineTunesResponse | list of all fine-tunes and their statuses
list_llm_model_endpoints ¶
Lists all LLM model endpoints that the user has access to.
Returns:

Type | Description
---|---
List[Endpoint] | A list of
list_model_bundles ¶
Returns a list of model bundles that the user owns.
Returns:

Type | Description
---|---
List[ModelBundle] | A list of ModelBundle objects
list_model_bundles_v2 ¶
List all model bundles.
Returns:

Type | Description
---|---
ListModelBundlesV2Response | An object containing the following keys:
list_model_endpoints ¶
Lists all model endpoints that the user owns.
Returns:

Type | Description
---|---
List[Endpoint] | A list of
model_download ¶
Download a fine-tuned model.

Parameters:

Name | Type | Description | Default
---|---|---|---
model_name | str | name of the model to download | required
download_format | str | format of the model to download | 'hugging_face'

Returns:

Name | Type | Description
---|---|---
ModelDownloadResponse | ModelDownloadResponse | dictionary with file names and urls to download the model
read_endpoint_creation_logs ¶
Retrieves the logs for the creation of the endpoint.
Parameters:

Name | Type | Description | Default
---|---|---|---
model_endpoint | Union[ModelEndpoint, str] | The endpoint or its name. | required
register_batch_csv_location_fn ¶
For self-hosted mode only. Registers a function that gives a location for batch CSV inputs. Should give different locations each time. This function is called as batch_csv_location_fn(), and should return a batch_csv_url that upload_batch_csv_fn can take.
Strictly, batch_csv_location_fn() does not need to return a str. The only requirement is that if batch_csv_location_fn returns a value of type T, then upload_batch_csv_fn() takes in an object of type T as its second argument (i.e. batch_csv_url).
Parameters:

Name | Type | Description | Default
---|---|---|---
batch_csv_location_fn | Callable[[], str] | Function that generates batch_csv_urls for upload_batch_csv_fn. | required
register_bundle_location_fn ¶
For self-hosted mode only. Registers a function that gives a location for a model bundle. Should give different locations each time. This function is called as bundle_location_fn(), and should return a bundle_url that register_upload_bundle_fn can take.

Strictly, bundle_location_fn() does not need to return a str. The only requirement is that if bundle_location_fn returns a value of type T, then upload_bundle_fn() takes in an object of type T as its second argument (i.e. bundle_url).

Parameters:

Name | Type | Description | Default
---|---|---|---
bundle_location_fn | Callable[[], str] | Function that generates bundle_urls for upload_bundle_fn. | required
register_upload_batch_csv_fn ¶
For self-hosted mode only. Registers a function that handles batch text upload. This function is called as upload_batch_csv_fn(csv_text, csv_url). This function should directly write the contents of csv_text as a text string into csv_url.

Parameters:

Name | Type | Description | Default
---|---|---|---
upload_batch_csv_fn | Callable[[str, str], None] | Function that takes in a csv text (string type), and uploads that text to an appropriate location. Only needed for self-hosted mode. | required
register_upload_bundle_fn ¶
For self-hosted mode only. Registers a function that handles model bundle upload. This function is called as upload_bundle_fn(serialized_bundle, bundle_url). This function should directly write the contents of serialized_bundle as a binary string into bundle_url.

See register_bundle_location_fn for more notes on the signature of upload_bundle_fn.

Parameters:

Name | Type | Description | Default
---|---|---|---
upload_bundle_fn | Callable[[str, str], None] | Function that takes in a serialized bundle (bytes type), and uploads that bundle to an appropriate location. Only needed for self-hosted mode. | required
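A minimal self-hosted registration sketch using the local filesystem; real deployments would typically write to object storage instead, and the directory used here is purely illustrative.

```python
import os
import uuid

def bundle_location_fn() -> str:
    # Return a fresh location on every call.
    return os.path.join("/tmp/launch-bundles", str(uuid.uuid4()))

def upload_bundle_fn(serialized_bundle, bundle_url: str) -> None:
    # Write the serialized bundle bytes to the location chosen above.
    os.makedirs(os.path.dirname(bundle_url), exist_ok=True)
    with open(bundle_url, "wb") as f:
        f.write(serialized_bundle)

client.register_bundle_location_fn(bundle_location_fn)
client.register_upload_bundle_fn(upload_bundle_fn)
```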
update_docker_image_batch_job ¶
For self-hosted mode only. Updates a batch job by id. Use this if you want to cancel/delete a batch job.
upload_file ¶
Upload a file
Parameters:

Name | Type | Description | Default
---|---|---|---
file_path | str | Path to a local file to upload. | required

Returns:

Name | Type | Description
---|---|---
UploadFileResponse | UploadFileResponse | ID of the created file
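A short sketch of the file lifecycle; the local path and the .id attribute on the response are illustrative assumptions.

```python
upload_response = client.upload_file("./train.csv")  # placeholder local path
file_id = upload_response.id                         # assumed attribute name

print(client.get_file(file_id))           # metadata: ID, filename, size
print(client.get_file_content(file_id))   # raw file content
client.delete_file(file_id)               # remove the file when done
```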