Overview
Creating deployments on Launch generally involves three steps:
- Create and upload a ModelBundle. Pass your trained model as well as pre-/post-processing code to the Scale Launch Python client, and we’ll create a model bundle based on the code and store it in our Bundle Store.
- Create a ModelEndpoint. Pass a ModelBundle as well as infrastructure settings such as the desired number of GPUs to our client. This provisions resources on Scale’s cluster dedicated to your ModelEndpoint.
- Make requests to the ModelEndpoint. You can make requests through the Python client, or make HTTP requests directly to Scale.
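
The three steps above can be sketched in Python as follows. This is a toy, in-memory illustration of the workflow only: `ToyClient`, `ToyModelBundle`, `ToyModelEndpoint`, and their method signatures are hypothetical stand-ins invented for this sketch, not the Scale Launch client API, and nothing here contacts Scale. Refer to the client reference for the real calls and parameters.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToyModelBundle:
    """Hypothetical stand-in for a ModelBundle: a model plus pre-/post-processing code."""
    name: str
    model: Any
    predict_fn: Callable[[Any, Dict[str, Any]], Dict[str, Any]]


@dataclass
class ToyModelEndpoint:
    """Hypothetical stand-in for a ModelEndpoint: a bundle plus infrastructure settings."""
    name: str
    bundle: ToyModelBundle
    gpus: int = 0

    def predict(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        # Step 3: a request is routed to the bundle's prediction code.
        return self.bundle.predict_fn(self.bundle.model, payload)


class ToyClient:
    """Hypothetical in-memory client; the real client talks to Scale's services."""

    def __init__(self) -> None:
        self.bundle_store: Dict[str, ToyModelBundle] = {}
        self.endpoints: Dict[str, ToyModelEndpoint] = {}

    def create_model_bundle(self, name, model, predict_fn) -> ToyModelBundle:
        # Step 1: package the model and its code, then keep it in the "Bundle Store".
        bundle = ToyModelBundle(name=name, model=model, predict_fn=predict_fn)
        self.bundle_store[name] = bundle
        return bundle

    def create_model_endpoint(self, name, bundle, gpus=0) -> ToyModelEndpoint:
        # Step 2: attach infrastructure settings to a previously stored bundle.
        if bundle.name not in self.bundle_store:
            raise KeyError(f"unknown bundle: {bundle.name}")
        endpoint = ToyModelEndpoint(name=name, bundle=bundle, gpus=gpus)
        self.endpoints[name] = endpoint
        return endpoint


def predict_fn(model, payload):
    features = [float(x) for x in payload["features"]]  # pre-processing
    score = model(features)                             # model inference
    return {"score": round(score, 3)}                   # post-processing


client = ToyClient()
bundle = client.create_model_bundle(
    "toy-bundle",
    model=lambda xs: sum(xs) / len(xs),  # a trivial "trained model": the mean
    predict_fn=predict_fn,
)
endpoint = client.create_model_endpoint("toy-endpoint", bundle, gpus=0)
response = endpoint.predict({"features": ["1", "2", "3"]})
print(response)
```

The sketch keeps the same separation the steps describe: the bundle holds the model and its code, the endpoint adds infrastructure settings on top of a bundle, and requests go only to the endpoint.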