Dynamic Batcher
model_navigator.api.triton.DynamicBatcher (dataclass)

DynamicBatcher(max_queue_delay_microseconds=0, preferred_batch_size=None, preserve_ordering=False, priority_levels=0, default_priority_level=0, default_queue_policy=None, priority_queue_policy=None)

Dynamic batching configuration.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- max_queue_delay_microseconds (int, default: 0) – The maximum time, in microseconds, a request may be delayed in the scheduling queue while waiting for additional requests to batch with.
- preferred_batch_size (Optional[list], default: None) – Preferred batch sizes for dynamic batching.
- preserve_ordering (bool, default: False) – Whether the dynamic batcher should preserve the ordering of responses to match the order in which requests were received by the scheduler.
- priority_levels (int, default: 0) – The number of priority levels to enable for the model.
- default_priority_level (int, default: 0) – The priority level used for requests that do not specify a priority.
- default_queue_policy (Optional[QueuePolicy], default: None) – The default queue policy applied to requests.
- priority_queue_policy (Optional[Dict[int, QueuePolicy]], default: None) – Queue policies per priority level, keyed by level.
 
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
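The concrete checks live in the library source referenced above. As a rough illustration only, a hand-rolled stand-in built on stdlib dataclasses (not the real model_navigator class; the actual validation rules may differ) could perform early validation like this:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DynamicBatcherSketch:
    """Stand-in mirroring a subset of the documented DynamicBatcher fields."""

    max_queue_delay_microseconds: int = 0
    preferred_batch_size: Optional[list] = None
    preserve_ordering: bool = False
    priority_levels: int = 0
    default_priority_level: int = 0

    def __post_init__(self):
        # Illustrative early checks; the real __post_init__ may validate
        # different invariants.
        if self.max_queue_delay_microseconds < 0:
            raise ValueError("max_queue_delay_microseconds must be non-negative.")
        if self.priority_levels and self.default_priority_level > self.priority_levels:
            raise ValueError("default_priority_level must not exceed priority_levels.")
```

Validating in `__post_init__` surfaces configuration mistakes at construction time, before the config is serialized into a Triton model repository.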
            
model_navigator.api.triton.QueuePolicy (dataclass)

QueuePolicy(timeout_action=TimeoutAction.REJECT, default_timeout_microseconds=0, allow_timeout_override=False, max_queue_size=0)

Model queue policy configuration.
Used for the default_queue_policy and priority_queue_policy fields in the DynamicBatcher configuration.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- timeout_action (TimeoutAction, default: REJECT) – The action applied to a timed-out request.
- default_timeout_microseconds (int, default: 0) – The default timeout, in microseconds, applied to every request.
- allow_timeout_override (bool, default: False) – Whether an individual request can override the default timeout value.
- max_queue_size (int, default: 0) – The maximum queue size for holding requests.
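To show how per-level policies fit together, here is a self-contained sketch using stdlib stand-ins for QueuePolicy and TimeoutAction (field names and defaults taken from the signatures documented here; the dict shape matches the Dict[int, QueuePolicy] expected by priority_queue_policy, while the specific level numbers and timeout values are made up for illustration):

```python
from dataclasses import dataclass
from enum import Enum


class TimeoutActionSketch(Enum):
    # Stand-in for model_navigator.api.triton.TimeoutAction
    REJECT = "REJECT"
    DELAY = "DELAY"


@dataclass
class QueuePolicySketch:
    # Stand-in mirroring the documented QueuePolicy fields and defaults
    timeout_action: TimeoutActionSketch = TimeoutActionSketch.REJECT
    default_timeout_microseconds: int = 0
    allow_timeout_override: bool = False
    max_queue_size: int = 0


# A per-priority-level policy map in the Dict[int, QueuePolicy] shape:
# level 1 requests are delayed on timeout rather than rejected,
# level 2 requests fall back to a bounded queue.
priority_queue_policy = {
    1: QueuePolicySketch(
        timeout_action=TimeoutActionSketch.DELAY,
        default_timeout_microseconds=100_000,
    ),
    2: QueuePolicySketch(max_queue_size=64),
}
```

With the real library, the same dict (built from the actual QueuePolicy and TimeoutAction classes) would be passed as the priority_queue_policy argument of DynamicBatcher.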
 
model_navigator.api.triton.TimeoutAction

Bases: Enum

Timeout action definition for the timeout_action field of QueuePolicy.
Read more in the Triton Inference Server model configuration documentation.
Members:
- REJECT – "REJECT"
- DELAY – "DELAY"
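The member values match the enum documented above; the dispatch below is a hypothetical illustration of the intended semantics (under REJECT a timed-out request is rejected, under DELAY it remains queued), not code from the library:

```python
from enum import Enum


class TimeoutAction(Enum):
    # Mirrors the documented member values.
    REJECT = "REJECT"
    DELAY = "DELAY"


def on_timeout(action: TimeoutAction) -> str:
    # Illustrative dispatch only: shows which outcome each member selects.
    if action is TimeoutAction.REJECT:
        return "request rejected"
    return "request kept in queue"
```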