Accelerators
model_navigator.api.triton.AutoMixedPrecisionAccelerator
dataclass
Auto-mixed-precision accelerator for TensorFlow. Enables automatic FP16 precision.
Currently empty - no arguments required.
model_navigator.api.triton.GPUIOAccelerator
dataclass
GPU IO accelerator for TensorFlow.
Currently empty - no arguments required.
model_navigator.api.triton.OpenVINOAccelerator
dataclass
OpenVINO optimization.
Currently empty - no arguments required.
model_navigator.api.triton.TensorRTAccelerator
dataclass
TensorRT accelerator configuration.
Read more in the Triton Inference Server model configuration documentation.
Parameters:

Name | Type | Description | Default
---|---|---|---
`precision` | `TensorRTOptPrecision` | The precision used for optimization. | `TensorRTOptPrecision.FP32`
`max_workspace_size` | `Optional[int]` | The maximum GPU memory the model can use temporarily during execution. | `None`
`max_cached_engines` | `Optional[int]` | The maximum number of cached TensorRT engines in dynamic TensorRT ops. | `None`
`minimum_segment_size` | `Optional[int]` | The smallest model subgraph that will be considered for optimization by TensorRT. | `None`