## Quick Start
Using Model Navigator is as simple as calling `optimize` with a `model` and a `dataloader`:

```python
import torch
import model_navigator as nav

package = nav.torch.optimize(
    model=torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_resnet50", pretrained=True).eval(),
    dataloader=[torch.randn(1, 3, 256, 256) for _ in range(10)],
)
```

The `optimize` function saves all the artifacts it generates in the `navigator_workspace` directory.

Note: The `dataloader` is used to determine the minimum and maximum shapes of the model inputs during conversions. Model Navigator takes a single sample from the `dataloader` and repeats it to generate synthetic batches for profiling. Correctness tests are run on a subset of the `dataloader` samples, while verification tests are executed on the entire `dataloader`.
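Because the `dataloader` defines the shape bounds, samples of different sizes can be mixed to widen the profiled range. The sketch below is illustrative rather than part of the quick start; the particular batch sizes are an assumption, not an API requirement.

```python
import torch
import model_navigator as nav

# Illustrative only: mixing batch sizes (an assumption, not an API
# requirement) lets the minimum and maximum input shapes be derived
# from the samples themselves.
dataloader = [torch.randn(batch_size, 3, 256, 256) for batch_size in (1, 2, 4, 8)]

package = nav.torch.optimize(
    model=torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_resnet50", pretrained=True).eval(),
    dataloader=dataloader,
)
```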
The code snippet below demonstrates how to use the `PyTritonAdapter` to retrieve the `runner` and the other information needed for deployment. The `runner` is an abstraction that connects the model checkpoint with its runtime, making inference straightforward. The snippet then starts a PyTriton server with the retrieved configuration.
```python
import model_navigator as nav
from pytriton.decorators import batch
from pytriton.triton import Triton

pytriton_adapter = nav.pytriton.PyTritonAdapter(package=package, strategy=nav.MaxThroughputStrategy())
runner = pytriton_adapter.runner

runner.activate()


@batch
def infer_func(**inputs):
    return runner.infer(inputs)


# Bind the runner to a PyTriton server and start serving the model.
with Triton() as triton:
    triton.bind(
        model_name="resnet50",
        infer_func=infer_func,
        inputs=pytriton_adapter.inputs,
        outputs=pytriton_adapter.outputs,
        config=pytriton_adapter.config,
    )
    triton.serve()
```
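Once `triton.serve()` is running, the model can be queried from a separate process, for example with PyTriton's `ModelClient`. This is a minimal sketch: the server address, input shape, and dtype are assumptions matching the ResNet-50 example above.

```python
import numpy as np
from pytriton.client import ModelClient

# Assumptions: the server listens on localhost:8000 and the model
# expects float32 images of shape (batch, 3, 256, 256), matching the
# dataloader used during optimization.
with ModelClient("localhost:8000", "resnet50") as client:
    images = np.random.randn(2, 3, 256, 256).astype(np.float32)
    outputs = client.infer_batch(images)
```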
Alternatively, Model Navigator can generate a `model_repository` that can be served with the Triton Inference Server:
```python
import pathlib

import model_navigator as nav

nav.triton.model_repository.add_model_from_package(
    model_repository_path=pathlib.Path("model_repository"),
    model_name="resnet50",
    package=package,
    strategy=nav.MaxThroughputStrategy(),
)
```
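The generated repository can then be served with the Triton Inference Server binary, for example by running `tritonserver --model-repository=model_repository` (assuming `tritonserver` is available, e.g., inside an NVIDIA Triton container).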
For more information on additional frameworks and the `optimize` function parameters, please refer to the API documentation and examples.