# Navigator Package
A model graph or checkpoint alone is not enough for a successful deployment. When deploying a model for inference, you need to know the model's input and output definitions, the maximal batch size that can be used for inference, and more. For that purpose we have created the Navigator Package: an artifact containing the serialized model, the model metadata, and optimization details.

The Navigator Package is the recommended way to share an optimized model for deployment on PyTriton or the Triton Inference Server, or to re-run the `optimize` method on different hardware.
## Save
The package created during `optimize` can be saved as a Zip file using the API method:
The `save` method collects the generated models from the workspace, selecting:

- base formats - the first available serialization formats that export the model from its source
- max throughput format - the model that achieved the highest throughput during profiling
- min latency format - the model that achieved the lowest latency during profiling
Additionally, the package contains:
- a status file with optimization details
- logs from the `optimize` execution
- a reproduction script for each model format
- input and output data samples in the form of NumPy files

Read more in the `save` method API specification.
## Load
A package saved to a file can be loaded for further processing:
Once the package is loaded, you can obtain the desired information or use it to `optimize` or `profile` the package. Read more in the `load` method API specification.
## Optimize
The loaded package object can be used to re-run the optimize process. In contrast to the framework-dedicated API, the package optimize process starts from the serialized models inside the package and reproduces the available optimization paths. This can be used to reproduce the process on different hardware without access to the model sources.
The optimization from a package can be run using:
At the end of the process, new optimized models are generated. Please note that the workspace is overridden in this step. Read more in the `optimize` method API specification.
## Profile
The `optimize` process uses a single sample from the dataloader for profiling. The process focuses on selecting the best model format, which requires an unequivocal sample for performance comparison.

In some cases you may want to profile the models on a different dataset. For that purpose, Model Navigator exposes an API for profiling every sample in the dataset for each model:
```python
import torch
import model_navigator as nav

profiling_results = nav.package.profile(
    package=package,
    dataloader=[torch.randn(1, 3, 256, 256), torch.randn(1, 3, 512, 512)],
)
```
The results contain profiling information per model and per sample. You can use them to perform the desired analysis. Read more in the `profile` method API specification.