# Specialized Configs for Triton Backends

The Python API provides specialized configuration classes that expose only the options valid for a given model type.
## model_navigator.api.triton.ONNXModelConfig

`dataclass`, bases: `BaseSpecializedModelConfig`

Specialized model config for models served with the ONNX backend.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `platform` | `Optional[Platform]` | Override the `backend` parameter with a platform. Possible option: `Platform.ONNXRuntimeONNX` | `None` |
| `optimization` | `Optional[ONNXOptimization]` | Possible optimization for ONNX models | `None` |

`backend: Backend` *(property)*
:   Define the backend value for the config.

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/onnx_model_config.py`
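A minimal usage sketch, assuming the class is importable from `model_navigator.api.triton` as the qualified name above suggests and that all base-class fields have defaults:

```python
from model_navigator.api.triton import ONNXModelConfig, Platform

# Platform.ONNXRuntimeONNX is the only platform override this config
# accepts; __post_init__ validates the value at construction time.
config = ONNXModelConfig(platform=Platform.ONNXRuntimeONNX)

# The backend property is fixed by the specialized config itself.
print(config.backend)
```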
## model_navigator.api.triton.ONNXOptimization

`dataclass`

ONNX possible optimizations.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `accelerator` | `Union[OpenVINOAccelerator, TensorRTAccelerator]` | Execution accelerator for the model | *required* |

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/onnx_model_config.py`
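A hedged sketch of wiring an optimization into the ONNX config; `TensorRTAccelerator` is taken from the type hint above, and its default construction is an assumption:

```python
from model_navigator.api.triton import (
    ONNXModelConfig,
    ONNXOptimization,
    TensorRTAccelerator,
)

# `accelerator` is the only (and required) field: here the ONNX model is
# executed through the TensorRT execution accelerator.
optimization = ONNXOptimization(accelerator=TensorRTAccelerator())
config = ONNXModelConfig(optimization=optimization)
```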
## model_navigator.api.triton.PythonModelConfig

`dataclass`, bases: `BaseSpecializedModelConfig`

Specialized model config for models served with the Python backend.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `Sequence[InputTensorSpec]` | Required definition of model inputs | `field(default_factory=lambda: [])` |
| `outputs` | `Sequence[OutputTensorSpec]` | Required definition of model outputs | `field(default_factory=lambda: [])` |

`backend: Backend` *(property)*
:   Define the backend value for the config.

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/python_model_config.py`
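A sketch under the assumption that `InputTensorSpec` and `OutputTensorSpec` take `name`, `shape`, and `dtype` fields (their exact signatures are not shown here). Although both sequences default to empty, the Python backend cannot infer tensor signatures from the model file, so they must be defined explicitly:

```python
import numpy as np

from model_navigator.api.triton import (
    InputTensorSpec,
    OutputTensorSpec,
    PythonModelConfig,
)

# -1 marks a dynamic dimension; the spec field names are assumptions.
config = PythonModelConfig(
    inputs=[InputTensorSpec(name="INPUT__0", shape=(-1,), dtype=np.dtype("float32"))],
    outputs=[OutputTensorSpec(name="OUTPUT__0", shape=(-1,), dtype=np.dtype("float32"))],
)
```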
## model_navigator.api.triton.PyTorchModelConfig

`dataclass`, bases: `BaseSpecializedModelConfig`

Specialized model config for models served with the PyTorch backend.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `platform` | `Optional[Platform]` | Override the `backend` parameter with a platform. Possible option: `Platform.PyTorchLibtorch` | `None` |
| `inputs` | `Sequence[InputTensorSpec]` | Required definition of model inputs | `field(default_factory=lambda: [])` |
| `outputs` | `Sequence[OutputTensorSpec]` | Required definition of model outputs | `field(default_factory=lambda: [])` |

`backend: Backend` *(property)*
:   Define the backend value for the config.

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/pytorch_model_config.py`
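A hedged sketch, again assuming `name`/`shape`/`dtype` fields on the tensor specs; the platform override is optional, but TorchScript models carry no I/O metadata, so inputs and outputs are required here as well:

```python
import numpy as np

from model_navigator.api.triton import (
    InputTensorSpec,
    OutputTensorSpec,
    Platform,
    PyTorchModelConfig,
)

config = PyTorchModelConfig(
    platform=Platform.PyTorchLibtorch,
    inputs=[InputTensorSpec(name="INPUT__0", shape=(-1, 3), dtype=np.dtype("float32"))],
    outputs=[OutputTensorSpec(name="OUTPUT__0", shape=(-1, 2), dtype=np.dtype("float32"))],
)
```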
## model_navigator.api.triton.TensorFlowModelConfig

`dataclass`, bases: `BaseSpecializedModelConfig`

Specialized model config for models served with the TensorFlow backend.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `platform` | `Optional[Platform]` | Override the `backend` parameter with a platform. Possible options: `Platform.TensorFlowSavedModel`, `Platform.TensorFlowGraphDef` | `None` |
| `optimization` | `Optional[TensorFlowOptimization]` | Possible optimization for TensorFlow models | `None` |

`backend: Backend` *(property)*
:   Define the backend value for the config.

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/tensorflow_model_config.py`
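A minimal sketch, assuming the import path matches the qualified name above; either of the two listed platforms may override the backend value:

```python
from model_navigator.api.triton import Platform, TensorFlowModelConfig

# SavedModel and GraphDef are the two valid platform overrides for the
# TensorFlow backend.
config = TensorFlowModelConfig(platform=Platform.TensorFlowSavedModel)
```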
## model_navigator.api.triton.TensorFlowOptimization

`dataclass`

TensorFlow possible optimizations.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `accelerator` | `Union[AutoMixedPrecisionAccelerator, GPUIOAccelerator, TensorRTAccelerator]` | Execution accelerator for the model | *required* |

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/tensorflow_model_config.py`
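A hedged sketch; `AutoMixedPrecisionAccelerator` is taken from the type hint above, and its default construction is an assumption:

```python
from model_navigator.api.triton import (
    AutoMixedPrecisionAccelerator,
    TensorFlowModelConfig,
    TensorFlowOptimization,
)

# Automatic mixed precision is one of the three accelerators the
# TensorFlow optimization accepts; GPUIOAccelerator and
# TensorRTAccelerator are the alternatives.
optimization = TensorFlowOptimization(accelerator=AutoMixedPrecisionAccelerator())
config = TensorFlowModelConfig(optimization=optimization)
```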
## model_navigator.api.triton.TensorRTModelConfig

`dataclass`, bases: `BaseSpecializedModelConfig`

Specialized model config for models served with the TensorRT platform.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `platform` | `Optional[Platform]` | Override the `backend` parameter with a platform. Possible option: `Platform.TensorRTPlan` | `None` |
| `optimization` | `Optional[TensorRTOptimization]` | Possible optimization for TensorRT models | `None` |

`backend: Backend` *(property)*
:   Define the backend value for the config.

`__post_init__()`
:   Validate the configuration for early error handling.

Source code in `model_navigator/triton/specialized_configs/tensorrt_model_config.py`
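A minimal sketch, assuming the import path matches the qualified name above:

```python
from model_navigator.api.triton import Platform, TensorRTModelConfig

# A TensorRT plan maps to exactly one platform value.
config = TensorRTModelConfig(platform=Platform.TensorRTPlan)
```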
## model_navigator.api.triton.TensorRTOptimization

`dataclass`

TensorRT possible optimizations.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `cuda_graphs` | `bool` | Use the CUDA graphs API to capture model operations and execute them more efficiently | `False` |
| `gather_kernel_buffer_threshold` | `Optional[int]` | The backend may use a gather kernel to gather input data if the device has direct access to the source buffer and the destination buffer | `None` |
| `eager_batching` | `bool` | Start preparing the next batch before the model instance is ready for the next inference | `False` |

`__post_init__()`
:   Validate the configuration for early error handling.
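The three fields above can be sketched in use as follows, assuming the import path matches the qualified names in this section:

```python
from model_navigator.api.triton import TensorRTModelConfig, TensorRTOptimization

# All fields are optional flags tuning how the TensorRT backend
# schedules execution; __post_init__ validates the combination.
optimization = TensorRTOptimization(
    cuda_graphs=True,      # capture model ops with the CUDA graphs API
    eager_batching=True,   # prepare the next batch before the instance is free
)
config = TensorRTModelConfig(optimization=optimization)
```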