Config
Classes, enums and types used to configure Model Navigator.
model_navigator.api.config
Definition of enums and classes representing configuration for Model Navigator.
CustomConfig
Base class used for custom configs. Input for the Model Navigator optimize method.
defaults()
from_dict(config_dict)
classmethod
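To illustrate the documented contract (a defaults() method that resets parameters and a from_dict classmethod that builds an instance from a dictionary), here is a minimal pure-Python sketch. The class name MyConfig and its fields are hypothetical, not part of the Model Navigator API:

```python
from dataclasses import dataclass, fields

# Hypothetical stand-in for a CustomConfig subclass, illustrating the
# documented contract: defaults() resets parameters, from_dict() builds
# an instance from a dictionary (unknown keys are ignored).
@dataclass
class MyConfig:
    opset: int = 13
    verbose: bool = False

    def defaults(self) -> None:
        # Reset every field to its declared default value.
        for f in fields(self):
            setattr(self, f.name, f.default)

    @classmethod
    def from_dict(cls, config_dict: dict) -> "MyConfig":
        # Keep only keys that match declared fields.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in config_dict.items() if k in known})

config = MyConfig.from_dict({"opset": 17, "verbose": True, "ignored": 1})
```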
CustomConfigForFormat
Bases: DataObject, CustomConfig
Abstract base class used for custom configs representing particular format.
format: Format
abstractmethod
property
Format represented by CustomConfig.
DeviceKind
Format
Bases: Enum
All model formats supported by Model Navigator 'optimize' function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
PYTHON | str | Format indicating any model defined in Python. | required |
TORCH | str | Format indicating PyTorch model. | required |
TENSORFLOW | str | Format indicating TensorFlow model. | required |
JAX | str | Format indicating JAX model. | required |
TORCHSCRIPT | str | Format indicating TorchScript model. | required |
TF_SAVEDMODEL | str | Format indicating TensorFlow SavedModel. | required |
TF_TRT | str | Format indicating TensorFlow TensorRT model. | required |
TORCH_TRT | str | Format indicating PyTorch TensorRT model. | required |
ONNX | str | Format indicating ONNX model. | required |
TENSORRT | str | Format indicating TensorRT model. | required |
JitType
MeasurementMode
OnnxConfig
dataclass
Bases: CustomConfigForFormat
ONNX custom config used for ONNX export and conversion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
opset | Optional[int] | ONNX opset used for conversion. | DEFAULT_ONNX_OPSET |
dynamic_axes | Optional[Dict[str, Union[Dict[int, str], List[int]]]] | Dynamic axes for ONNX conversion. | None |
onnx_extended_conversion | bool | Enables additional conversions from TorchScript to ONNX. | False |
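Per the documented type, each input or output name in dynamic_axes maps either to a dict from axis index to a symbolic name, or to a plain list of dynamic axis indices. A sketch of a valid value (the tensor names "input" and "output" are illustrative, not part of the API):

```python
from typing import Dict, List, Union

# A dynamic_axes value matching the documented type:
# Dict[str, Union[Dict[int, str], List[int]]]
# The tensor names "input" and "output" are made up for illustration.
dynamic_axes: Dict[str, Union[Dict[int, str], List[int]]] = {
    "input": {0: "batch", 2: "height", 3: "width"},  # named dynamic axes
    "output": [0],  # axis 0 is dynamic, naming left to the exporter
}
```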
ProfilerConfig
dataclass
Bases: DataObject
Profiler configuration.
For each batch size the profiler will run measurements in windows. Depending on the measurement mode, each window will have a fixed time length (MeasurementMode.TIME_WINDOWS) or a fixed number of requests (MeasurementMode.COUNT_WINDOWS). Batch sizes are profiled in ascending order.
The profiler will run multiple trials and will stop when the measurements are stable (within stability_percentage from the mean) within three consecutive windows. If the measurements are not stable after max_trials trials, the profiler will stop with an error.
The profiler will also stop profiling when the throughput does not increase by at least throughput_cutoff_threshold.
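The stability rule above can be sketched as a pure-Python check (an illustration of the criterion, not the library's implementation): a measurement run counts as stable when each of the last three window measurements lies within stability_percentage of their mean.

```python
from statistics import mean

# Sketch of the documented stability criterion (not the library's code):
# the last three consecutive windows must each lie within
# stability_percentage of their mean.
def is_stable(window_measurements, stability_percentage=10.0):
    if len(window_measurements) < 3:
        return False
    last_three = window_measurements[-3:]
    avg = mean(last_three)
    tolerance = avg * stability_percentage / 100.0
    return all(abs(m - avg) <= tolerance for m in last_three)
```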
Parameters:
Name | Type | Description | Default |
---|---|---|---|
run_profiling | bool | If True, run profiling, otherwise skip profiling. | True |
batch_sizes | Optional[List[Union[int, None]]] | List of batch sizes to profile. None means that the model does not support batching. | None |
measurement_mode | MeasurementMode | Measurement mode. | MeasurementMode.COUNT_WINDOWS |
measurement_interval | Optional[float] | Measurement interval in milliseconds. Used only in MeasurementMode.TIME_WINDOWS mode. | 5000 |
measurement_request_count | Optional[int] | Number of requests to measure in each window. Used only in MeasurementMode.COUNT_WINDOWS mode. | 50 |
stability_percentage | float | Allowed percentage of variation from the mean in three consecutive windows. | 10.0 |
max_trials | int | Maximum number of window trials. | 10 |
throughput_cutoff_threshold | float | Minimum throughput increase to continue profiling. | DEFAULT_PROFILING_THROUGHPUT_CUTOFF_THRESHOLD |
from_dict(profiler_config_dict)
classmethod
Instantiate ProfilerConfig class from a dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profiler_config_dict | Mapping | Data dictionary. | required |
Returns:
Type | Description |
---|---|
ProfilerConfig | ProfilerConfig |
Source code in model_navigator/api/config.py
ShapeTuple
dataclass
Bases: DataObject
Represents a set of shapes for a single binding in a profile.
Each element of the tuple represents a shape for a single dimension of the binding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
min | Tuple[int] | The minimum shape that the profile will support. | required |
opt | Tuple[int] | The shape for which TensorRT will optimize the engine. | required |
max | Tuple[int] | The maximum shape that the profile will support. | required |
__iter__()
__repr__()
SizedIterable
TensorFlowConfig
dataclass
Bases: CustomConfigForFormat
TensorFlow custom config used for SavedModel export.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
jit_compile | Tuple[Optional[bool], ...] | Enable or disable the jit_compile flag for the tf.function wrapper of the JAX infer function. | (None,) |
enable_xla | Tuple[Optional[bool], ...] | Enable or disable the enable_xla flag for the jax2tf converter. | (None,) |
format: Format
property
defaults()
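The tuple-valued fields suggest that each entry is a separate export variant to try; combining the jit_compile and enable_xla options then yields a grid of variants. A pure-Python sketch of that expansion (an assumption about how the options combine, not the library's search code):

```python
from itertools import product

# jit_compile and enable_xla each hold a tuple of options; taking the
# Cartesian product enumerates the candidate export variants.
# This expansion is illustrative, not the library's implementation.
jit_compile_options = (True, False)
enable_xla_options = (None, True)
variants = list(product(jit_compile_options, enable_xla_options))
```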
TensorFlowTensorRTConfig
dataclass
Bases: CustomConfigForFormat
TensorFlow TensorRT custom config used for TensorRT SavedModel export.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
precision | Union[Union[str, TensorRTPrecision], Tuple[Union[str, TensorRTPrecision], ...]] | TensorRT precision. | DEFAULT_TENSORRT_PRECISION |
max_workspace_size | Optional[int] | Max workspace size used by the converter. | DEFAULT_MAX_WORKSPACE_SIZE |
minimum_segment_size | int | Min size of a subgraph. | DEFAULT_MIN_SEGMENT_SIZE |
trt_profile | Optional[TensorRTProfile] | TensorRT profile. | None |
format: Format
property
__post_init__()
Parse dataclass enums.
defaults()
Update parameters to defaults.
Source code in model_navigator/api/config.py
from_dict(config_dict)
classmethod
Instantiate TensorFlowTensorRTConfig from a dictionary.
Source code in model_navigator/api/config.py
TensorRTCompatibilityLevel
TensorRTConfig
dataclass
Bases: CustomConfigForFormat
TensorRT custom config used for TensorRT conversion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
precision | Union[Union[str, TensorRTPrecision], Tuple[Union[str, TensorRTPrecision], ...]] | TensorRT precision. | DEFAULT_TENSORRT_PRECISION |
max_workspace_size | Optional[int] | Max workspace size used by the converter. | DEFAULT_MAX_WORKSPACE_SIZE |
trt_profile | Optional[TensorRTProfile] | TensorRT profile. | None |
optimization_level | Optional[int] | Optimization level for TensorRT conversion. Allowed values are from 0 to 5; the default is 3, per the TensorRT API documentation. | None |
format: Format
property
__post_init__()
Parse dataclass enums.
Source code in model_navigator/api/config.py
defaults()
Update parameters to defaults.
Source code in model_navigator/api/config.py
from_dict(config_dict)
classmethod
Instantiate TensorRTConfig from a dictionary.
Source code in model_navigator/api/config.py
TensorRTPrecision
TensorRTPrecisionMode
TensorRTProfile
Bases: Dict[str, ShapeTuple]
Single optimization profile that can be used to build an engine.
More specifically, it is a Dict[str, ShapeTuple] that maps binding names to a set of min/opt/max shapes.
__getitem__(key)
Retrieves the shapes registered for a given input name.
Returns:
Name | Type | Description |
---|---|---|
ShapeTuple | | |
Source code in model_navigator/api/config.py
__repr__()
__str__()
String representation.
add(name, min, opt, max)
A convenience function to add shapes for a single binding.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The name of the binding. | required |
min | Tuple[int] | The minimum shape that the profile will support. | required |
opt | Tuple[int] | The shape for which TensorRT will optimize the engine. | required |
max | Tuple[int] | The maximum shape that the profile will support. | required |
Returns:
Name | Type | Description |
---|---|---|
Profile | | self, which allows this function to be easily chained to add multiple bindings, e.g., TensorRTProfile().add(...).add(...) |
Source code in model_navigator/api/config.py
from_dict(profile_dict)
classmethod
Create a TensorRTProfile from a dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile_dict | Dict[str, Dict[str, Tuple[int, ...]]] | A dictionary mapping binding names to a dictionary containing the min, opt, and max shapes. | required |
Returns:
Name | Type | Description |
---|---|---|
TensorRTProfile | | A TensorRTProfile object. |
Source code in model_navigator/api/config.py
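The behavior described above can be mirrored in a few lines of plain Python: a dict subclass with a chainable add() and a from_dict() classmethod. This is a sketch for illustration only; the real implementation lives in model_navigator/api/config.py.

```python
from collections import namedtuple

# Illustrative mirror of the documented behavior, not the library's code.
ShapeTuple = namedtuple("ShapeTuple", ["min", "opt", "max"])

class Profile(dict):
    def add(self, name, min, opt, max):
        # Register min/opt/max shapes for one binding; return self so
        # calls can be chained: Profile().add(...).add(...)
        self[name] = ShapeTuple(min, opt, max)
        return self

    @classmethod
    def from_dict(cls, profile_dict):
        # profile_dict maps binding names to {"min": ..., "opt": ..., "max": ...}
        return cls({
            name: ShapeTuple(d["min"], d["opt"], d["max"])
            for name, d in profile_dict.items()
        })

profile = Profile().add(
    "input", (1, 3, 224, 224), (8, 3, 224, 224), (16, 3, 224, 224)
)
```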
TorchConfig
dataclass
Bases: CustomConfigForFormat
Torch custom config used for TorchScript export.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
jit_type | Union[Union[str, JitType], Tuple[Union[str, JitType], ...]] | Type of TorchScript export. | (JitType.SCRIPT, JitType.TRACE) |
strict | bool | Enable or disable the strict flag for the tracer used in TorchScript export. | True |
format: Format
property
__post_init__()
defaults()
TorchTensorRTConfig
dataclass
Bases: CustomConfigForFormat
Torch custom config used for TensorRT TorchScript conversion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
precision | Union[Union[str, TensorRTPrecision], Tuple[Union[str, TensorRTPrecision], ...]] | TensorRT precision. | DEFAULT_TENSORRT_PRECISION |
max_workspace_size | Optional[int] | Max workspace size used by the converter. | DEFAULT_MAX_WORKSPACE_SIZE |
trt_profile | Optional[TensorRTProfile] | TensorRT profile. | None |
format: Format
property
__post_init__()
Parse dataclass enums.
Source code in model_navigator/api/config.py
defaults()
Update parameters to defaults.
Source code in model_navigator/api/config.py
from_dict(config_dict)
classmethod
Instantiate TorchTensorRTConfig from a dictionary.
Source code in model_navigator/api/config.py
map_custom_configs(custom_configs)
Map custom configs from list to dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
custom_configs | Optional[Sequence[CustomConfig]] | List of custom configs passed to the API method. | required |
Returns:
Type | Description |
---|---|
Dict | Custom configs mapped to a dictionary. |
Source code in model_navigator/api/config.py
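The mapping can be pictured as keying each config object by an identifier. Keying by class name below is an assumption for illustration; this page does not specify the key the library actually uses, and the config classes here are empty stand-ins.

```python
from typing import Dict, Optional, Sequence

# Empty stand-ins for custom config classes, for illustration only.
class OnnxConfig: ...
class TorchConfig: ...

# Sketch of mapping a list of configs to a dictionary. The choice of
# class name as the key is an assumption, not the documented behavior.
def map_custom_configs(
    custom_configs: Optional[Sequence[object]],
) -> Dict[str, object]:
    if not custom_configs:
        return {}
    return {type(config).__name__: config for config in custom_configs}

mapped = map_custom_configs([OnnxConfig(), TorchConfig()])
```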
model_navigator.api.MaxThroughputAndMinLatencyStrategy
Bases: RuntimeSearchStrategy
Get runtime with the highest throughput and the lowest latency.
model_navigator.api.MaxThroughputStrategy
Bases: RuntimeSearchStrategy
Get runtime with the highest throughput.
model_navigator.api.MinLatencyStrategy
Bases: RuntimeSearchStrategy
Get runtime with the lowest latency.