Config

Classes, enums and types used to configure Model Navigator.

model_navigator.api.config

Definition of enums and classes representing configuration for Model Navigator.

CustomConfig

Bases: abc.ABC

Base class used for custom configs. Input for Model Navigator optimize method.

defaults()

Update parameters to defaults.

Source code in model_navigator/api/config.py
def defaults(self) -> None:
    """Update parameters to defaults."""
    return None

from_dict(config_dict) classmethod

Instantiate CustomConfig from a dictionary.

Source code in model_navigator/api/config.py
@classmethod
def from_dict(cls, config_dict: Dict[str, Any]) -> "CustomConfig":
    """Instantiate CustomConfig from a dictionary."""
    return cls(**config_dict)

name() abstractmethod classmethod

Name of the CustomConfig.

Source code in model_navigator/api/config.py
@classmethod
@abc.abstractmethod
def name(cls) -> str:
    """Name of the CustomConfig."""
    raise NotImplementedError()
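
A minimal sketch of how a subclass might satisfy this interface. The `CustomConfig` class below is a stand-in that mirrors the three methods shown above, so the example runs without Model Navigator installed; `MyBackendConfig` and its fields are hypothetical and not part of the library.

```python
import abc
from dataclasses import dataclass
from typing import Any, Dict


class CustomConfig(abc.ABC):
    """Stand-in mirroring the documented base class (not the library class)."""

    @classmethod
    @abc.abstractmethod
    def name(cls) -> str:
        raise NotImplementedError()

    @classmethod
    def from_dict(cls, config_dict: Dict[str, Any]) -> "CustomConfig":
        # Dictionary keys are passed straight to the constructor.
        return cls(**config_dict)

    def defaults(self) -> None:
        return None


@dataclass
class MyBackendConfig(CustomConfig):
    """Hypothetical config with two made-up parameters."""

    threads: int = 1
    verbose: bool = False

    @classmethod
    def name(cls) -> str:
        return "MyBackend"

    def defaults(self) -> None:
        # Reset every parameter to its default value.
        self.threads = 1
        self.verbose = False


config = MyBackendConfig.from_dict({"threads": 4})
config_name = MyBackendConfig.name()
```

Because `from_dict` forwards the dictionary as keyword arguments, any key that is not a constructor parameter raises a `TypeError` in this sketch.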

CustomConfigForFormat

Bases: DataObject, CustomConfig

Abstract base class used for custom configs representing particular format.

format: Format abstractmethod property

Format represented by CustomConfig.

DeviceKind

Bases: Enum

Supported types of devices.

Parameters:

Name Type Description Default
CPU str

Select CPU device.

required
GPU str

Select GPU with CUDA support.

required

Format

Bases: Enum

All model formats supported by Model Navigator 'optimize' function.

Parameters:

Name Type Description Default
PYTHON str

Format indicating any model defined in Python.

required
TORCH str

Format indicating PyTorch model.

required
TENSORFLOW str

Format indicating TensorFlow model.

required
JAX str

Format indicating JAX model.

required
TORCHSCRIPT str

Format indicating TorchScript model.

required
TF_SAVEDMODEL str

Format indicating TensorFlow SavedModel.

required
TF_TRT str

Format indicating TensorFlow TensorRT model.

required
TORCH_TRT str

Format indicating PyTorch TensorRT model.

required
ONNX str

Format indicating ONNX model.

required
TENSORRT str

Format indicating TensorRT model.

required

JitType

Bases: Enum

TorchScript export parameter.

Used for selecting the type of TorchScript export.

Parameters:

Name Type Description Default
TRACE str

Use tracing during export.

required
SCRIPT str

Use scripting during export.

required

MeasurementMode

Bases: Enum

Profiler measurement mode.

Parameters:

Name Type Description Default
TIME_WINDOWS str

Run measurement windows with a fixed time length.

required
COUNT_WINDOWS str

Run measurement windows with a fixed number of requests.

required

OnnxConfig dataclass

Bases: CustomConfigForFormat

ONNX custom config used for ONNX export and conversion.

Parameters:

Name Type Description Default
opset Optional[int]

ONNX opset used for conversion.

DEFAULT_ONNX_OPSET
dynamic_axes Optional[Dict[str, Union[Dict[int, str], List[int]]]]

Dynamic axes for ONNX conversion.

None
onnx_extended_conversion bool

Enables additional conversions from TorchScript to ONNX.

False

format: Format property

Returns Format.ONNX.

Returns:

Type Description
Format

Format.ONNX

name() classmethod

Name of the config.

Source code in model_navigator/api/config.py
@classmethod
def name(cls) -> str:
    """Name of the config."""
    return "Onnx"

ProfilerConfig dataclass

Bases: DataObject

Profiler configuration.

For each batch size, the profiler runs measurements in windows. Depending on the measurement mode, each window has a fixed time length (MeasurementMode.TIME_WINDOWS) or a fixed number of requests (MeasurementMode.COUNT_WINDOWS). Batch sizes are profiled in ascending order.

Profiler will run multiple trials and will stop when the measurements are stable (within stability_percentage from the mean) within three consecutive windows. If the measurements are not stable after max_trials trials, the profiler will stop with an error. Profiler will also stop profiling when the throughput does not increase at least by throughput_cutoff_threshold.
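
The stability criterion described above can be sketched as follows. This is one plausible reading of the rule, not the profiler's actual implementation: the helper name `is_stable`, the `count` argument, and the choice to compute the mean over only the last three windows are all assumptions.

```python
from statistics import mean
from typing import Sequence


def is_stable(
    window_values: Sequence[float],
    stability_percentage: float = 10.0,
    count: int = 3,
) -> bool:
    """Check whether the last `count` windows stay close to their own mean."""
    if len(window_values) < count:
        return False  # not enough windows measured yet

    last = window_values[-count:]
    avg = mean(last)
    if avg == 0:
        return False  # a percentage of a zero mean is not meaningful

    # Allowed absolute deviation from the mean.
    tolerance = abs(avg) * stability_percentage / 100.0
    return all(abs(value - avg) <= tolerance for value in last)


steady = is_stable([100.0, 104.0, 98.0])
noisy = is_stable([100.0, 150.0, 98.0])
too_few = is_stable([100.0, 101.0])
```

Under this reading, a profiling loop would keep adding windows until `is_stable` returns True or the number of windows reaches `max_trials`.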

Parameters:

Name Type Description Default
run_profiling bool

If True, run profiling, otherwise skip profiling.

True
batch_sizes Optional[List[Union[int, None]]]

List of batch sizes to profile. None means that the model does not support batching.

None
measurement_mode MeasurementMode

Measurement mode.

MeasurementMode.COUNT_WINDOWS
measurement_interval Optional[float]

Measurement interval in milliseconds. Used only in MeasurementMode.TIME_WINDOWS mode.

5000
measurement_request_count Optional[int]

Number of requests to measure in each window. Used only in MeasurementMode.COUNT_WINDOWS mode.

50
stability_percentage float

Allowed percentage of variation from the mean in three consecutive windows.

10.0
max_trials int

Maximum number of window trials.

10
throughput_cutoff_threshold float

Minimum throughput increase to continue profiling.

DEFAULT_PROFILING_THROUGHPUT_CUTOFF_THRESHOLD

from_dict(profiler_config_dict) classmethod

Instantiate ProfilerConfig class from a dictionary.

Parameters:

Name Type Description Default
profiler_config_dict Mapping

Data dictionary.

required

Returns:

Type Description
ProfilerConfig

ProfilerConfig

Source code in model_navigator/api/config.py
@classmethod
def from_dict(cls, profiler_config_dict: Mapping) -> "ProfilerConfig":
    """Instantiate ProfilerConfig class from a dictionary.

    Args:
        profiler_config_dict (Mapping): Data dictionary.

    Returns:
        ProfilerConfig
    """
    return cls(
        run_profiling=profiler_config_dict.get("run_profiling", True),
        batch_sizes=profiler_config_dict.get("batch_sizes"),
        measurement_interval=profiler_config_dict.get("measurement_interval"),
        measurement_mode=MeasurementMode(
            profiler_config_dict.get("measurement_mode", MeasurementMode.TIME_WINDOWS)
        ),
        measurement_request_count=profiler_config_dict.get("measurement_request_count"),
        stability_percentage=profiler_config_dict.get("stability_percentage", 10.0),
        max_trials=profiler_config_dict.get("max_trials", 10),
        throughput_cutoff_threshold=profiler_config_dict.get("throughput_cutoff_threshold", -2),
    )

ShapeTuple dataclass

Bases: DataObject

Represents a set of shapes for a single binding in a profile.

Each element of the tuple represents a shape for a single dimension of the binding.

Parameters:

Name Type Description Default
min Tuple[int]

The minimum shape that the profile will support.

required
opt Tuple[int]

The shape for which TensorRT will optimize the engine.

required
max Tuple[int]

The maximum shape that the profile will support.

required

__iter__()

Iterate over shapes.

Source code in model_navigator/api/config.py
def __iter__(self):
    """Iterate over shapes."""
    yield from [self.min, self.opt, self.max]

__repr__()

Representation.

Source code in model_navigator/api/config.py
def __repr__(self):
    """Representation."""
    return type(self).__name__ + self.__str__()

__str__()

String representation.

Source code in model_navigator/api/config.py
def __str__(self):
    """String representation."""
    return f"(min={self.min}, opt={self.opt}, max={self.max})"
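
A short sketch of how the three methods above behave together. The class is a stand-in that copies the listings shown here (the library version derives from `DataObject`, which is omitted), and the shape values are made up for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(repr=False)
class ShapeTuple:
    """Stand-in copying the methods listed above."""

    min: Tuple[int, ...]
    opt: Tuple[int, ...]
    max: Tuple[int, ...]

    def __iter__(self):
        yield from [self.min, self.opt, self.max]

    def __repr__(self):
        return type(self).__name__ + self.__str__()

    def __str__(self):
        return f"(min={self.min}, opt={self.opt}, max={self.max})"


shapes = ShapeTuple(min=(1, 3, 224, 224), opt=(8, 3, 224, 224), max=(16, 3, 224, 224))

# __iter__ makes tuple-style unpacking possible.
smallest, preferred, largest = shapes
```

Here `str(shapes)` gives the parenthesized form and `repr(shapes)` prefixes it with the class name.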

SizedIterable

Bases: Protocol

Protocol representing sized iterable. Used by dataloader.

__iter__()

Magic method __iter__.

Returns:

Type Description
Iterator

Iterator to next item.

Source code in model_navigator/api/config.py
def __iter__(self) -> Iterator:
    """Magic method __iter__.

    Returns:
        Iterator to next item.
    """
    ...

__len__()

Magic method __len__.

Returns:

Type Description
int

Length of the sized iterable.

Source code in model_navigator/api/config.py
def __len__(self) -> int:
    """Magic method __len__.

    Returns:
        Length of the sized iterable.
    """
    ...

TensorFlowConfig dataclass

Bases: CustomConfigForFormat

TensorFlow custom config used for SavedModel export.

Parameters:

Name Type Description Default
jit_compile Tuple[Optional[bool], ...]

Enable or disable the jit_compile flag of the tf.function wrapper around the JAX inference function.

(None,)
enable_xla Tuple[Optional[bool], ...]

Enable or disable the enable_xla flag of the jax2tf converter.

(None,)

format: Format property

Returns Format.TF_SAVEDMODEL.

Returns:

Type Description
Format

Format.TF_SAVEDMODEL

defaults()

Update parameters to defaults.

Source code in model_navigator/api/config.py
def defaults(self) -> None:
    """Update parameters to defaults."""
    self.jit_compile = (None,)
    self.enable_xla = (None,)

name() classmethod

Name of the config.

Source code in model_navigator/api/config.py
@classmethod
def name(cls) -> str:
    """Name of the config."""
    return "TensorFlow"

TensorFlowTensorRTConfig dataclass

Bases: CustomConfigForFormat

TensorFlow TensorRT custom config used for TensorRT SavedModel export.

Parameters:

Name Type Description Default
precision Union[Union[str, TensorRTPrecision], Tuple[Union[str, TensorRTPrecision], ...]]

TensorRT precision.

DEFAULT_TENSORRT_PRECISION
max_workspace_size Optional[int]

Max workspace size used by converter.

DEFAULT_MAX_WORKSPACE_SIZE
minimum_segment_size int

Min size of subgraph.

DEFAULT_MIN_SEGMENT_SIZE
trt_profile Optional[TensorRTProfile]

TensorRT profile.

None

format: Format property

Returns Format.TF_TRT.

Returns:

Type Description
Format

Format.TF_TRT

__post_init__()

Parse dataclass enums.

Source code in model_navigator/api/config.py
def __post_init__(self) -> None:
    """Parse dataclass enums."""
    precision = (self.precision,) if not isinstance(self.precision, (list, tuple)) else self.precision
    self.precision = tuple(TensorRTPrecision(p) for p in precision)

defaults()

Update parameters to defaults.

Source code in model_navigator/api/config.py
def defaults(self) -> None:
    """Update parameters to defaults."""
    self.precision = tuple(TensorRTPrecision(p) for p in DEFAULT_TENSORRT_PRECISION)
    self.max_workspace_size = DEFAULT_MAX_WORKSPACE_SIZE
    self.minimum_segment_size = DEFAULT_MIN_SEGMENT_SIZE
    self.trt_profile = None

from_dict(config_dict) classmethod

Instantiate TensorFlowTensorRTConfig from a dictionary.

Source code in model_navigator/api/config.py
@classmethod
def from_dict(cls, config_dict: Dict[str, Any]) -> "TensorFlowTensorRTConfig":
    """Instantiate TensorFlowTensorRTConfig from a dictionary."""
    if config_dict.get("trt_profile") is not None and not isinstance(config_dict["trt_profile"], TensorRTProfile):
        config_dict["trt_profile"] = TensorRTProfile.from_dict(config_dict["trt_profile"])
    return cls(**config_dict)

name() classmethod

Name of the config.

Source code in model_navigator/api/config.py
@classmethod
def name(cls) -> str:
    """Name of the config."""
    return "TensorFlowTensorRT"

TensorRTCompatibilityLevel

Bases: Enum

Compatibility level for TensorRT.

Parameters:

Name Type Description Default
AMPERE_PLUS str

Support the Ampere architecture and newer.

required

TensorRTConfig dataclass

Bases: CustomConfigForFormat

TensorRT custom config used for TensorRT conversion.

Parameters:

Name Type Description Default
precision Union[Union[str, TensorRTPrecision], Tuple[Union[str, TensorRTPrecision], ...]]

TensorRT precision.

DEFAULT_TENSORRT_PRECISION
max_workspace_size Optional[int]

Max workspace size used by converter.

DEFAULT_MAX_WORKSPACE_SIZE
trt_profile Optional[TensorRTProfile]

TensorRT profile.

None
optimization_level Optional[int]

Optimization level for TensorRT conversion. Allowed values are from 0 to 5. According to the TensorRT API documentation, the default is 3.

None

format: Format property

Returns Format.TENSORRT.

Returns:

Type Description
Format

Format.TENSORRT

__post_init__()

Parse dataclass enums.

Source code in model_navigator/api/config.py
def __post_init__(self) -> None:
    """Parse dataclass enums."""
    self.precision_mode = TensorRTPrecisionMode(self.precision_mode)
    precision = (self.precision,) if not isinstance(self.precision, (list, tuple)) else self.precision
    self.precision = tuple(TensorRTPrecision(p) for p in precision)

    if self.optimization_level is not None and (self.optimization_level < 0 or self.optimization_level > 5):
        raise ModelNavigatorConfigurationError(
            f"TensorRT `optimization_level` must be between 0 and 5. Provided value: {self.optimization_level}."
        )
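
The two steps in this method, normalizing `precision` to a tuple of enum members and range-checking `optimization_level`, can be sketched in isolation. The `Precision` enum is a stand-in whose string values are assumptions, the helper names are hypothetical, and `ValueError` replaces `ModelNavigatorConfigurationError` so the example runs without the library.

```python
from enum import Enum
from typing import Optional, Tuple


class Precision(Enum):
    """Stand-in for TensorRTPrecision; the string values are assumptions."""

    INT8 = "int8"
    FP16 = "fp16"
    FP32 = "fp32"


def normalize_precision(precision) -> Tuple[Precision, ...]:
    """Accept a single value or a list/tuple and return a tuple of enum members."""
    items = precision if isinstance(precision, (list, tuple)) else (precision,)
    # Enum lookup accepts both raw values and existing members.
    return tuple(Precision(item) for item in items)


def check_optimization_level(level: Optional[int]) -> None:
    """Raise if the level is set and falls outside the inclusive range 0..5."""
    if level is not None and (level < 0 or level > 5):
        raise ValueError(f"`optimization_level` must be between 0 and 5. Provided value: {level}.")


single = normalize_precision("fp16")
mixed = normalize_precision(["fp32", Precision.INT8])
check_optimization_level(None)  # None means "not set" and passes
check_optimization_level(3)
```

Normalizing to a tuple lets the rest of the code iterate over precisions without caring whether the caller passed one value or several.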

defaults()

Update parameters to defaults.

Source code in model_navigator/api/config.py
def defaults(self) -> None:
    """Update parameters to defaults."""
    self.precision = tuple(TensorRTPrecision(p) for p in DEFAULT_TENSORRT_PRECISION)
    self.precision_mode = DEFAULT_TENSORRT_PRECISION_MODE
    self.trt_profile = None
    self.max_workspace_size = DEFAULT_MAX_WORKSPACE_SIZE
    self.optimization_level = None
    self.compatibility_level = None

from_dict(config_dict) classmethod

Instantiate TensorRTConfig from a dictionary.

Source code in model_navigator/api/config.py
@classmethod
def from_dict(cls, config_dict: Dict[str, Any]) -> "TensorRTConfig":
    """Instantiate TensorRTConfig from a dictionary."""
    if config_dict.get("trt_profile") is not None and not isinstance(config_dict["trt_profile"], TensorRTProfile):
        config_dict["trt_profile"] = TensorRTProfile.from_dict(config_dict["trt_profile"])
    return cls(**config_dict)

name() classmethod

Name of the config.

Source code in model_navigator/api/config.py
@classmethod
def name(cls) -> str:
    """Name of the config."""
    return "TensorRT"

TensorRTPrecision

Bases: Enum

Precisions supported during TensorRT conversions.

Parameters:

Name Type Description Default
INT8 str

8-bit integer precision.

required
FP16 str

16-bit floating point precision.

required
FP32 str

32-bit floating point precision.

required

TensorRTPrecisionMode

Bases: Enum

Precision modes for TensorRT conversions.

Parameters:

Name Type Description Default
HIERARCHY str

Use TensorRT precision hierarchy starting from highest to lowest.

required
SINGLE str

Use single precision.

required
MIXED str

Use mixed precision.

required

TensorRTProfile

Bases: Dict[str, ShapeTuple]

Single optimization profile that can be used to build an engine.

More specifically, it is a Dict[str, ShapeTuple] which maps binding names to a set of min/opt/max shapes.

__getitem__(key)

Retrieves the shapes registered for a given input name.

Returns:

Name Type Description
ShapeTuple

A named tuple including min, opt, and max members for the shapes corresponding to the input.

Source code in model_navigator/api/config.py
def __getitem__(self, key):
    """Retrieves the shapes registered for a given input name.

    Returns:
        ShapeTuple:
                A named tuple including ``min``, ``opt``, and ``max`` members for the shapes
                corresponding to the input.
    """
    if key not in self:
        LOGGER.error(f"Binding: {key} does not have shapes set in this profile")
    return super().__getitem__(key)

__repr__()

Representation.

Source code in model_navigator/api/config.py
def __repr__(self):
    """Representation."""
    ret = "TensorRTProfile()"
    for name, (min, opt, max) in self.items():
        ret += f".add('{name}', min={min}, opt={opt}, max={max})"
    return ret

__str__()

String representation.

Source code in model_navigator/api/config.py
def __str__(self):
    """String representation."""
    elems = []
    for name, (min, opt, max) in self.items():
        elems.append(f"{name} [min={min}, opt={opt}, max={max}]")

    sep = ",\n "
    return "{" + sep.join(elems) + "}"

add(name, min, opt, max)

A convenience function to add shapes for a single binding.

Parameters:

Name Type Description Default
name str

The name of the binding.

required
min Tuple[int]

The minimum shape that the profile will support.

required
opt Tuple[int]

The shape for which TensorRT will optimize the engine.

required
max Tuple[int]

The maximum shape that the profile will support.

required

Returns:

Name Type Description
Profile

self, which allows this function to be easily chained to add multiple bindings, e.g., TensorRTProfile().add(...).add(...)

Source code in model_navigator/api/config.py
def add(self, name, min, opt, max):
    """A convenience function to add shapes for a single binding.

    Args:
        name (str): The name of the binding.
        min (Tuple[int]): The minimum shape that the profile will support.
        opt (Tuple[int]): The shape for which TensorRT will optimize the engine.
        max (Tuple[int]): The maximum shape that the profile will support.

    Returns:
        Profile:
            self, which allows this function to be easily chained to add multiple bindings,
            e.g., TensorRTProfile().add(...).add(...)
    """
    self[name] = ShapeTuple(min, opt, max)
    return self

from_dict(profile_dict) classmethod

Create a TensorRTProfile from a dictionary.

Parameters:

Name Type Description Default
profile_dict Dict[str, Dict[str, Tuple[int, ...]]]

A dictionary mapping binding names to a dictionary containing min, opt, and max keys.

required

Returns:

Name Type Description
TensorRTProfile

A TensorRTProfile object.

Source code in model_navigator/api/config.py
@classmethod
def from_dict(cls, profile_dict: Dict[str, Dict[str, Tuple[int, ...]]]):
    """Create a TensorRTProfile from a dictionary.

    Args:
        profile_dict (Dict[str, Dict[str, Tuple[int, ...]]]):
            A dictionary mapping binding names to a dictionary containing ``min``, ``opt``, and
            ``max`` keys.

    Returns:
        TensorRTProfile:
            A TensorRTProfile object.
    """
    return cls({name: ShapeTuple(**shapes) for name, shapes in profile_dict.items()})
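
A sketch showing that chained `add` calls and `from_dict` build the same profile. Both classes are stand-ins copied from the listings above (subclassing plain `dict` instead of `Dict[str, ShapeTuple]`), and the binding names and shapes are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ShapeTuple:
    """Stand-in for the documented ShapeTuple dataclass."""

    min: Tuple[int, ...]
    opt: Tuple[int, ...]
    max: Tuple[int, ...]

    def __iter__(self):
        yield from [self.min, self.opt, self.max]


class TensorRTProfile(dict):
    """Stand-in mapping binding names to ShapeTuple objects."""

    def add(self, name, min, opt, max):
        self[name] = ShapeTuple(min, opt, max)
        return self  # returning self enables chaining

    @classmethod
    def from_dict(cls, profile_dict):
        return cls({name: ShapeTuple(**shapes) for name, shapes in profile_dict.items()})


# Build the profile with chained calls.
chained = (
    TensorRTProfile()
    .add("input_ids", min=(1, 16), opt=(8, 128), max=(16, 512))
    .add("attention_mask", min=(1, 16), opt=(8, 128), max=(16, 512))
)

# Build the same profile from a plain dictionary.
loaded = TensorRTProfile.from_dict(
    {
        "input_ids": {"min": (1, 16), "opt": (8, 128), "max": (16, 512)},
        "attention_mask": {"min": (1, 16), "opt": (8, 128), "max": (16, 512)},
    }
)

min_shape, opt_shape, max_shape = chained["input_ids"]
```

The dictionary form is convenient when a profile is stored in a serialized config, which is presumably why the `from_dict` methods of the TensorRT-related configs convert a plain `trt_profile` dictionary this way.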

TorchConfig dataclass

Bases: CustomConfigForFormat

Torch custom config used for TorchScript export.

Parameters:

Name Type Description Default
jit_type Union[Union[str, JitType], Tuple[Union[str, JitType], ...]]

Type of TorchScript export.

(JitType.SCRIPT, JitType.TRACE)
strict bool

Enable or disable the strict flag for the tracer used in TorchScript export.

True

format: Format property

Returns Format.TORCHSCRIPT.

Returns:

Type Description
Format

Format.TORCHSCRIPT

__post_init__()

Parse dataclass enums.

Source code in model_navigator/api/config.py
def __post_init__(self) -> None:
    """Parse dataclass enums."""
    jit_type = (self.jit_type,) if not isinstance(self.jit_type, (list, tuple)) else self.jit_type
    self.jit_type = tuple(JitType(j) for j in jit_type)

defaults()

Update parameters to defaults.

Source code in model_navigator/api/config.py
def defaults(self) -> None:
    """Update parameters to defaults."""
    self.jit_type = (JitType.SCRIPT, JitType.TRACE)
    self.strict = True

name() classmethod

Name of the config.

Source code in model_navigator/api/config.py
@classmethod
def name(cls) -> str:
    """Name of the config."""
    return "Torch"

TorchTensorRTConfig dataclass

Bases: CustomConfigForFormat

Torch custom config used for TensorRT TorchScript conversion.

Parameters:

Name Type Description Default
precision Union[Union[str, TensorRTPrecision], Tuple[Union[str, TensorRTPrecision], ...]]

TensorRT precision.

DEFAULT_TENSORRT_PRECISION
max_workspace_size Optional[int]

Max workspace size used by converter.

DEFAULT_MAX_WORKSPACE_SIZE
trt_profile Optional[TensorRTProfile]

TensorRT profile.

None

format: Format property

Returns Format.TORCH_TRT.

Returns:

Type Description
Format

Format.TORCH_TRT

__post_init__()

Parse dataclass enums.

Source code in model_navigator/api/config.py
def __post_init__(self) -> None:
    """Parse dataclass enums."""
    precision = (self.precision,) if not isinstance(self.precision, (list, tuple)) else self.precision
    self.precision = tuple(TensorRTPrecision(p) for p in precision)
    self.precision_mode = TensorRTPrecisionMode(self.precision_mode)

defaults()

Update parameters to defaults.

Source code in model_navigator/api/config.py
def defaults(self) -> None:
    """Update parameters to defaults."""
    self.precision = tuple(TensorRTPrecision(p) for p in DEFAULT_TENSORRT_PRECISION)
    self.precision_mode = DEFAULT_TENSORRT_PRECISION_MODE
    self.trt_profile = None
    self.max_workspace_size = DEFAULT_MAX_WORKSPACE_SIZE

from_dict(config_dict) classmethod

Instantiate TorchTensorRTConfig from a dictionary.

Source code in model_navigator/api/config.py
@classmethod
def from_dict(cls, config_dict: Dict[str, Any]) -> "TorchTensorRTConfig":
    """Instantiate TorchTensorRTConfig from a dictionary."""
    if config_dict.get("trt_profile") is not None and not isinstance(config_dict["trt_profile"], TensorRTProfile):
        config_dict["trt_profile"] = TensorRTProfile.from_dict(config_dict["trt_profile"])
    return cls(**config_dict)

name() classmethod

Name of the config.

Source code in model_navigator/api/config.py
@classmethod
def name(cls) -> str:
    """Name of the config."""
    return "TorchTensorRT"

map_custom_configs(custom_configs)

Map custom configs from list to dictionary.

Parameters:

Name Type Description Default
custom_configs Optional[Sequence[CustomConfig]]

List of custom configs passed to API method

required

Returns:

Type Description
Dict

Dictionary mapping each config's name to the config object.

Source code in model_navigator/api/config.py
def map_custom_configs(custom_configs: Optional[Sequence[CustomConfig]]) -> Dict:
    """Map custom configs from list to dictionary.

    Args:
        custom_configs: List of custom configs passed to API method

    Returns:
        Mapped configs to dictionary
    """
    if not custom_configs:
        return {}

    return {config.name(): config for config in custom_configs}
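
A sketch of the mapping behavior, using hypothetical stand-in config classes that only provide `name()`. The function body is copied from the listing above; the opset values are illustrative.

```python
from typing import Dict, Optional, Sequence


class OnnxLikeConfig:
    """Hypothetical stand-in; only name() matters for the mapping."""

    def __init__(self, opset: int):
        self.opset = opset

    @classmethod
    def name(cls) -> str:
        return "Onnx"


class TorchLikeConfig:
    """Second hypothetical stand-in with a different name."""

    @classmethod
    def name(cls) -> str:
        return "Torch"


def map_custom_configs(custom_configs: Optional[Sequence]) -> Dict:
    if not custom_configs:
        return {}
    return {config.name(): config for config in custom_configs}


first = OnnxLikeConfig(opset=13)
second = OnnxLikeConfig(opset=17)
mapped = map_custom_configs([first, TorchLikeConfig(), second])
empty = map_custom_configs(None)
```

Because the result is keyed by `name()`, two configs of the same class collapse into one entry and the later one wins, so in this sketch `mapped["Onnx"]` is `second`.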

model_navigator.api.MaxThroughputAndMinLatencyStrategy

Bases: RuntimeSearchStrategy

Get runtime with the highest throughput and the lowest latency.

model_navigator.api.MaxThroughputStrategy

Bases: RuntimeSearchStrategy

Get runtime with the highest throughput.

model_navigator.api.MinLatencyStrategy

Bases: RuntimeSearchStrategy

Get runtime with the lowest latency.