## Quick Start
Using Model Navigator is as simple as calling `optimize` with a `model` and a `dataloader`:

```python
import torch
import model_navigator as nav

package = nav.torch.optimize(
    model=torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_resnet50", pretrained=True).eval(),
    dataloader=[torch.randn(1, 3, 256, 256) for _ in range(10)],
)
```

The `optimize` function saves all the artifacts it generates in the `navigator_workspace` directory.

Note: The `dataloader` is used to determine the minimum and maximum shapes of the model inputs during conversions. Model Navigator takes a single sample from the `dataloader` and repeats it to generate synthetic batches for profiling. Correctness tests are run on a subset of the `dataloader` samples, while verification tests are executed on the entire `dataloader`.
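Because the `dataloader` defines the shape bounds, samples of different sizes can be mixed to widen the profiled range. The sketch below is illustrative rather than part of the quick start; the particular batch sizes are an assumption, not an API requirement.

```python
import torch
import model_navigator as nav

# Illustrative only: mixing batch sizes (an assumption, not an API
# requirement) lets the minimum and maximum input shapes be derived
# from the samples themselves.
dataloader = [torch.randn(batch_size, 3, 256, 256) for batch_size in (1, 2, 4, 8)]

package = nav.torch.optimize(
    model=torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_resnet50", pretrained=True).eval(),
    dataloader=dataloader,
)
```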
The code snippet below demonstrates how to use the `PyTritonAdapter` to retrieve the `runner` and the other information needed for deployment. The `runner` is an abstraction that connects the model checkpoint with its runtime, making inference straightforward. The snippet then starts a PyTriton server with the retrieved configuration.
```python
import model_navigator as nav
from pytriton.decorators import batch
from pytriton.triton import Triton

pytriton_adapter = nav.pytriton.PyTritonAdapter(package=package, strategy=nav.MaxThroughputStrategy())
runner = pytriton_adapter.runner

runner.activate()


@batch
def infer_func(**inputs):
    return runner.infer(inputs)


# Bind the runner to a PyTriton server and start serving the model.
with Triton() as triton:
    triton.bind(
        model_name="resnet50",
        infer_func=infer_func,
        inputs=pytriton_adapter.inputs,
        outputs=pytriton_adapter.outputs,
        config=pytriton_adapter.config,
    )
    triton.serve()
```
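Once `triton.serve()` is running, the model can be queried from a separate process, for example with PyTriton's `ModelClient`. This is a minimal sketch: the server address, input shape, and dtype are assumptions matching the ResNet-50 example above.

```python
import numpy as np
from pytriton.client import ModelClient

# Assumptions: the server listens on localhost:8000 and the model
# expects float32 images of shape (batch, 3, 256, 256), matching the
# dataloader used during optimization.
with ModelClient("localhost:8000", "resnet50") as client:
    images = np.random.randn(2, 3, 256, 256).astype(np.float32)
    outputs = client.infer_batch(images)
```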
Alternatively, Model Navigator can generate a `model_repository` that can be served with the Triton Inference Server:
```python
import pathlib

import model_navigator as nav

nav.triton.model_repository.add_model_from_package(
    model_repository_path=pathlib.Path("model_repository"),
    model_name="resnet50",
    package=package,
    strategy=nav.MaxThroughputStrategy(),
)
```
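The generated repository can then be served with the Triton Inference Server binary, for example by running `tritonserver --model-repository=model_repository` (assuming `tritonserver` is available, e.g., inside an NVIDIA Triton container).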
For more information on additional frameworks and the `optimize` function parameters, please refer to the API documentation and examples.