# Navigator Package
A model graph or checkpoint alone is not enough for a successful deployment. When deploying a model for inference, you need to know the model's input and output definitions, the maximal batch size that can be used for inference, and more. For that purpose we have created the Navigator Package: an artifact containing the serialized model, the model metadata, and optimization details.

The Navigator Package is the recommended way to share an optimized model for deployment on PyTriton or the Triton Inference Server, or to re-run the `optimize` method on different hardware.
## Save
The package created during `optimize` can be saved as a Zip file using the API method:
The `save` method collects the generated models from the workspace, selecting:

- base formats - the first available serialization formats that export the model from its source
- max throughput format - the model that achieved the highest throughput during profiling
- min latency format - the model that achieved the lowest latency during profiling
Additionally, the package contains:
- a status file with optimization details
- logs from the `optimize` execution
- a reproduction script for each model format
- input and output data samples in the form of NumPy files

Read more in the `save` method API specification.
## Load
A package saved to a file can be loaded for further processing:
Once the package is loaded, you can obtain the desired information or use it to `optimize` or `profile` the package. Read more in the `load` method API specification.
## Optimize
The loaded package object can be used to re-run the optimize process. In contrast to the framework-dedicated API, the package optimize process starts from the serialized models inside the package and reproduces the available optimization paths. This can be used to reproduce the process on different hardware without access to the model sources.
The optimization from a package can be run using:
At the end of the process, new optimized models are generated. Please note that the workspace is overridden in this step. Read more in the `optimize` method API specification.
## Profile
The `optimize` process uses a single sample from the dataloader for profiling. The process focuses on selecting the best model format, which requires an unequivocal sample for performance comparison.

In some cases you may want to profile the models on a different dataset. For that purpose, Model Navigator exposes an API for profiling every sample in the dataset for each model:
```python
import torch
import model_navigator as nav

profiling_results = nav.package.profile(
    package=package,
    dataloader=[torch.randn(1, 3, 256, 256), torch.randn(1, 3, 512, 512)],
)
```
The results contain profiling information per model and per sample. You can use them to perform the desired analysis. Read more in the `profile` method API specification.