Triton Model Navigator
The Triton Model Navigator automates the process of moving a model from source to deployment on Triton Inference Server. The tool validates possible export and conversion paths to serializable formats like TensorRT and selects the most promising format for production deployment.
How it works?
The Triton Model Navigator is designed to provide a single entry point for each supported framework. Usage is as simple as calling a dedicated `optimize` function, which starts the search for the best possible deployment by going through a broad spectrum of model conversions. Internally, `optimize` performs model export, conversion, correctness testing, and performance profiling, and saves all generated artifacts in the `navigator_workspace`, which is represented by a returned `package` object.
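As a minimal sketch of the PyTorch path (the exact entry point and arguments may differ between versions of the library; the toy model and dataloader below are placeholders):

```python
import torch
import model_navigator as nav

# Toy model and sample dataloader used to drive export, conversion,
# correctness testing, and profiling.
model = torch.nn.Linear(10, 10).eval()
dataloader = [torch.randn(1, 10) for _ in range(10)]

# Run the optimize pipeline; the returned package object describes
# the artifacts stored in the navigator workspace.
package = nav.torch.optimize(model=model, dataloader=dataloader)
```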
The result of the `optimize` process can be saved as a portable Navigator Package with the `save` function. Saved packages contain only the base model formats along with the best format selected based on latency and throughput. The package can be reused to recreate the process on the same or different hardware. The configuration and execution status are saved in the `status.yaml` file located inside the workspace and the Navigator Package.
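A short sketch of saving the package produced above (the `nav.package.save` call and the output file name are assumptions and may vary by version):

```python
# Persist the package (base formats plus the best selected format)
# as a portable archive that can be restored on other hardware.
nav.package.save(package=package, path="linear.nav")
```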
Finally, the Navigator Package can be used for model deployment on Triton Inference Server. A dedicated API helps obtain all necessary parameters and create the `model_repository`, or retrieve the optimized model for inference in a Python environment.
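For example, a deployment sketch under the assumption that the Triton helper is exposed as `nav.triton.model_repository.add_model_from_package` (the helper name, paths, and model name here are illustrative, not confirmed):

```python
import pathlib

# Generate a model_repository entry on disk from the optimized package,
# ready to be served by Triton Inference Server.
nav.triton.model_repository.add_model_from_package(
    model_repository_path=pathlib.Path("model_repository"),
    model_name="linear",
    package=package,
)
```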