Accelerators
model_navigator.api.triton.AutoMixedPrecisionAccelerator (dataclass)

Auto-mixed-precision accelerator for TensorFlow. Enables automatic FP16 precision.
Currently empty - no arguments required.
model_navigator.api.triton.GPUIOAccelerator (dataclass)

GPU IO accelerator for TensorFlow.
Currently empty - no arguments required.
model_navigator.api.triton.OpenVINOAccelerator (dataclass)

OpenVINO optimization.
Currently empty - no arguments required.
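Since these accelerators carry no fields, each is constructed without arguments. A minimal self-contained sketch illustrates this - the classes below are hypothetical stand-ins mirroring the documented dataclasses, not imports from the installed package:

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the empty accelerator dataclasses documented
# above; the real classes live in model_navigator.api.triton.
@dataclass
class AutoMixedPrecisionAccelerator:
    """Auto-mixed-precision accelerator for TensorFlow (no fields)."""


@dataclass
class GPUIOAccelerator:
    """GPU IO accelerator for TensorFlow (no fields)."""


@dataclass
class OpenVINOAccelerator:
    """OpenVINO accelerator (no fields)."""


# Each accelerator takes no constructor arguments.
accelerators = [
    AutoMixedPrecisionAccelerator(),
    GPUIOAccelerator(),
    OpenVINOAccelerator(),
]
```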
model_navigator.api.triton.TensorRTAccelerator (dataclass)

TensorRTAccelerator(precision=TensorRTOptPrecision.FP32, max_workspace_size=None, max_cached_engines=None, minimum_segment_size=None)

TensorRT accelerator configuration.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- precision (TensorRTOptPrecision, default: FP32) – The precision used for optimization.
- max_workspace_size (Optional[int], default: None) – The maximum GPU memory the model can use temporarily during execution.
- max_cached_engines (Optional[int], default: None) – The maximum number of cached TensorRT engines in dynamic TensorRT ops.
- minimum_segment_size (Optional[int], default: None) – The smallest model subgraph that will be considered for optimization by TensorRT.
model_navigator.api.triton.TensorRTOptPrecision

Bases: Enum

Precisions allowed for TensorRT optimization.
Members:
- FP16 – fp16 precision
- FP32 – fp32 precision
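Putting the pieces together, the TensorRTAccelerator signature shown above maps onto a plain dataclass plus an enum. The sketch below is a self-contained stand-in for illustration, not the installed model_navigator package; in particular, the string values of the enum members are an assumption:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TensorRTOptPrecision(Enum):
    """Precisions allowed for TensorRT optimization (member values assumed)."""
    FP16 = "fp16"
    FP32 = "fp32"


@dataclass
class TensorRTAccelerator:
    """TensorRT accelerator configuration, mirroring the documented signature."""
    precision: TensorRTOptPrecision = TensorRTOptPrecision.FP32
    max_workspace_size: Optional[int] = None    # max temporary GPU memory, in bytes
    max_cached_engines: Optional[int] = None    # cached engines in dynamic TensorRT ops
    minimum_segment_size: Optional[int] = None  # smallest subgraph TensorRT will optimize


# Defaults match the documented signature; override fields as needed.
default = TensorRTAccelerator()
tuned = TensorRTAccelerator(
    precision=TensorRTOptPrecision.FP16,
    max_workspace_size=2 << 30,  # 2 GiB
)
```

Because all fields default to the values listed above, a bare `TensorRTAccelerator()` selects FP32 precision with no workspace, cache, or segment-size limits.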