Deployment on Triton Inference Server
The Triton Model Navigator provides an API for working with the Triton model repository. Currently, we support adding your own model or a pre-selected model from a Navigator Package.
The API exposes only the options that are valid for the given model type and performs only offline validation of the provided configuration. The result is a model, together with its configuration, created inside the provided model repository path.
Adding your model to the Triton model repository
When you work with an already exported model, you can provide the path to where the model is located. You can then use one of the specialized APIs, which guide you through the deployment options available for the selected model type.
Example of deploying a TensorRT model:
import model_navigator as nav

nav.triton.model_repository.add_model(
    model_repository_path="/path/to/triton/model/repository",
    model_path="/path/to/model/plan/file",
    model_name="NameOfModel",
    config=nav.triton.TensorRTModelConfig(
        max_batch_size=256,
        optimization=nav.triton.CUDAGraphOptimization(),
        response_cache=True,
    ),
)
The model directory, containing the model file and its configuration, will be created inside model_repository_path. You can find more about the function in the adding model section.
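After adding a model, it can be useful to confirm what was written to disk. The following is a minimal sketch, not part of the Model Navigator API, that assumes the standard Triton repository layout: a config.pbtxt file inside the model directory, next to numerically named version directories that hold the model file. The helper name describe_model_entry is hypothetical and only the Python standard library is used.

```python
from pathlib import Path


def describe_model_entry(model_repository_path, model_name):
    """Summarize one model directory in a Triton model repository.

    Assumes the standard Triton layout (an assumption, not checked
    against Model Navigator itself):
        <repository>/<model_name>/config.pbtxt
        <repository>/<model_name>/<numeric version>/<model file>
    """
    model_dir = Path(model_repository_path) / model_name
    versions = {}
    if model_dir.is_dir():
        for child in sorted(model_dir.iterdir()):
            # Version directories are the numerically named subdirectories.
            if child.is_dir() and child.name.isdigit():
                versions[child.name] = sorted(p.name for p in child.iterdir())
    return {
        "has_config": (model_dir / "config.pbtxt").is_file(),
        "versions": versions,
    }
```

Calling describe_model_entry("/path/to/triton/model/repository", "NameOfModel") returns a dictionary that reports whether a configuration file is present and which files each version directory contains. A model that is missing from the repository yields has_config set to False and an empty versions mapping.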
Adding model from package to the Triton model repository
When you want to deploy a model from a package created during the optimize process, you can use:
import model_navigator as nav

# `package` is the Navigator Package created during the optimize process.
nav.triton.model_repository.add_model_from_package(
    model_repository_path="/path/to/triton/model/repository",
    model_name="NameOfModel",
    package=package,
)
The model is selected automatically based on the profiling results. The default selection criteria can be adjusted by changing the strategy argument. You can find more about the function in the adding model section.
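To make the idea of a selection strategy concrete, here is a purely illustrative sketch of choosing one candidate from a set of profiling records. It does not use or reproduce Model Navigator's selection logic: the function select_runtime, the record fields, the strategy names, and the numbers are all hypothetical.

```python
def select_runtime(results, strategy="max_throughput"):
    """Pick one profiling record according to a simple selection rule.

    Illustrative only: the record fields and strategy names are
    hypothetical and are not Model Navigator's API.
    """
    if not results:
        raise ValueError("no profiling results to select from")
    if strategy == "max_throughput":
        return max(results, key=lambda r: r["throughput"])
    if strategy == "min_latency":
        return min(results, key=lambda r: r["latency_ms"])
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical profiling records; the numbers are made up for illustration.
results = [
    {"runtime": "runtime_a", "throughput": 1200.0, "latency_ms": 8.5},
    {"runtime": "runtime_b", "throughput": 900.0, "latency_ms": 4.0},
]
```

With these records, the throughput rule picks runtime_a and the latency rule picks runtime_b, which shows why changing the strategy can change which model ends up in the repository.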
Using Triton Model Analyzer
A model added to the Triton Inference Server can be further optimized in the target environment using the Triton Model Analyzer.
Please follow the documentation to learn more about how to use the Triton Model Analyzer.