Sequence Batcher

model_navigator.api.triton.SequenceBatcher

dataclass

SequenceBatcher(strategy=None, max_sequence_idle_microseconds=None, control_inputs=[], states=[])

Sequence batching configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- strategy (Optional[Union[SequenceBatcherStrategyDirect, SequenceBatcherStrategyOldest]], default: None) – The strategy used by the sequence batcher.
- max_sequence_idle_microseconds (Optional[int], default: None) – The maximum time, in microseconds, that a sequence is allowed to be idle before it is aborted.
- control_inputs (List[SequenceBatcherControlInput], default: []) – The model input(s) that the server should use to communicate sequence start, stop, ready, and similar control values to the model.
- states (List[SequenceBatcherState], default: []) – The optional state that can be stored in Triton for performing inference requests on a sequence.
 
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
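For orientation, a minimal sketch of constructing this configuration. The import path mirrors the class path above; the timeout values are illustrative only.

```python
from model_navigator.api.triton import (
    SequenceBatcher,
    SequenceBatcherStrategyDirect,
)

# Direct scheduling, waiting at most 100 ms to fill a batch slot;
# sequences idle for more than 5 s are aborted.
batcher = SequenceBatcher(
    strategy=SequenceBatcherStrategyDirect(max_queue_delay_microseconds=100_000),
    max_sequence_idle_microseconds=5_000_000,
)
```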
            
model_navigator.api.triton.SequenceBatcherControl

dataclass

SequenceBatcherControl(kind, dtype=None, int32_false_true=[], fp32_false_true=[], bool_false_true=[])

Sequence Batching control configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- kind (SequenceBatcherControlKind) – The kind of this control.
- dtype (Optional[Union[dtype, Type[dtype]]], default: None) – The control's datatype.
- int32_false_true (List[int], default: []) – The control's true and false setting is indicated by setting a value in an int32 tensor.
- fp32_false_true (List[float], default: []) – The control's true and false setting is indicated by setting a value in a fp32 tensor.
- bool_false_true (List[bool], default: []) – The control's true and false setting is indicated by setting a value in a bool tensor.
 
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
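A sketch of two typical controls. The two-element false/true convention and the CORRID datatype requirement follow the Triton model configuration linked above; the exact rules enforced by __post_init__ are an assumption here.

```python
import numpy as np

from model_navigator.api.triton import (
    SequenceBatcherControl,
    SequenceBatcherControlKind,
)

# Sequence-start flag delivered as an int32 tensor: 0 means false, 1 means true.
start = SequenceBatcherControl(
    kind=SequenceBatcherControlKind.CONTROL_SEQUENCE_START,
    int32_false_true=[0, 1],
)

# Correlation-ID control: only the datatype needs to be configured.
corrid = SequenceBatcherControl(
    kind=SequenceBatcherControlKind.CONTROL_SEQUENCE_CORRID,
    dtype=np.dtype("uint64"),
)
```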
            
model_navigator.api.triton.SequenceBatcherControlInput

dataclass

SequenceBatcherControlInput(input_name, controls)

Sequence Batching control input configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- input_name (str) – The name of the model input.
- controls (List[SequenceBatcherControl]) – List of control value(s) that should be communicated to the model using this model input.
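For example, a control input binds one model input name to the controls Triton should feed through it. The tensor name "READY" is hypothetical:

```python
from model_navigator.api.triton import (
    SequenceBatcherControl,
    SequenceBatcherControlInput,
    SequenceBatcherControlKind,
)

# Deliver the ready flag through the model input "READY" as an fp32 tensor.
ready_input = SequenceBatcherControlInput(
    input_name="READY",  # hypothetical model input name
    controls=[
        SequenceBatcherControl(
            kind=SequenceBatcherControlKind.CONTROL_SEQUENCE_READY,
            fp32_false_true=[0.0, 1.0],
        )
    ],
)
```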
 
model_navigator.api.triton.SequenceBatcherControlKind

Bases: Enum

Sequence Batching control options.

Read more in the Triton Inference Server model configuration.

Members:

- CONTROL_SEQUENCE_START – "CONTROL_SEQUENCE_START"
- CONTROL_SEQUENCE_READY – "CONTROL_SEQUENCE_READY"
- CONTROL_SEQUENCE_END – "CONTROL_SEQUENCE_END"
- CONTROL_SEQUENCE_CORRID – "CONTROL_SEQUENCE_CORRID"
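Each member's value is the corresponding string from the Triton model configuration, so standard Enum round-tripping by value works:

```python
from model_navigator.api.triton import SequenceBatcherControlKind

# Look up a member from its configuration string and back again.
kind = SequenceBatcherControlKind("CONTROL_SEQUENCE_END")
assert kind is SequenceBatcherControlKind.CONTROL_SEQUENCE_END
assert kind.value == "CONTROL_SEQUENCE_END"
```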
 
model_navigator.api.triton.SequenceBatcherInitialState

dataclass

SequenceBatcherInitialState(name, shape, dtype=None, zero_data=None, data_file=None)

Sequence Batching initial state configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- name (str) – The name of the state.
- shape (Tuple[int, ...]) – The shape of the state tensor, not including the batch dimension.
- dtype (Optional[Union[dtype, Type[dtype]]], default: None) – The data-type of the state.
- zero_data (Optional[bool], default: None) – The identifier for using zeros as initial state data.
- data_file (Optional[Path], default: None) – The file whose content will be used as the initial data for the state in row-major order.
 
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
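A sketch of a zero-initialized state. Per the Triton docs, the initial data comes either from zero_data or from data_file, so setting exactly one of them is assumed here; the tensor name and shape are illustrative.

```python
import numpy as np

from model_navigator.api.triton import SequenceBatcherInitialState

# A 1 x 128 state tensor (batch dimension excluded) initialized to zeros.
initial_state = SequenceBatcherInitialState(
    name="hidden_init",  # illustrative name
    shape=(1, 128),
    dtype=np.dtype("float32"),
    zero_data=True,
)
```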
            
model_navigator.api.triton.SequenceBatcherState

dataclass

SequenceBatcherState(input_name, output_name, dtype, shape, initial_states=[])

Sequence Batching state configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- input_name (str) – The name of the model state input.
- output_name (str) – The name of the model state output.
- dtype (Union[dtype, Type[dtype]]) – The data-type of the state.
- shape (Tuple[int, ...]) – The shape of the state tensor.
- initial_states (List[SequenceBatcherInitialState], default: []) – The optional field to specify the list of initial states for the model.
 
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
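Putting it together, a sketch of an implicit state that Triton carries from one request of a sequence to the next (tensor names and shapes are illustrative):

```python
import numpy as np

from model_navigator.api.triton import (
    SequenceBatcherInitialState,
    SequenceBatcherState,
)

# Triton reads HIDDEN_OUT after each request and feeds it back as HIDDEN_IN.
state = SequenceBatcherState(
    input_name="HIDDEN_IN",
    output_name="HIDDEN_OUT",
    dtype=np.dtype("float32"),
    shape=(1, 128),
    initial_states=[
        SequenceBatcherInitialState(
            name="hidden_init",
            shape=(1, 128),
            dtype=np.dtype("float32"),
            zero_data=True,
        )
    ],
)
```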
            
          
model_navigator.api.triton.SequenceBatcherStrategyDirect

dataclass

SequenceBatcherStrategyDirect(max_queue_delay_microseconds=0, minimum_slot_utilization=0.0)

Sequence Batching strategy direct configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- max_queue_delay_microseconds (int, default: 0) – The maximum time, in microseconds, a candidate request will be delayed in the sequence batch scheduling queue to wait for additional requests for batching.
- minimum_slot_utilization (float, default: 0.0) – The minimum slot utilization that must be satisfied to execute the batch before 'max_queue_delay_microseconds' expires.
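For instance, to wait at most 1 ms for a batch but dispatch earlier once half of the batch slots are occupied (values illustrative):

```python
from model_navigator.api.triton import SequenceBatcherStrategyDirect

strategy = SequenceBatcherStrategyDirect(
    max_queue_delay_microseconds=1_000,  # 1 ms upper bound on queueing delay
    minimum_slot_utilization=0.5,        # fire once 50% of slots are filled
)
```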
 
model_navigator.api.triton.SequenceBatcherStrategyOldest

dataclass

SequenceBatcherStrategyOldest(max_candidate_sequences, preferred_batch_size=[], max_queue_delay_microseconds=0)

Sequence Batching strategy oldest configuration.

Read more in the Triton Inference Server model configuration.
Parameters:

- max_candidate_sequences (int) – Maximum number of candidate sequences that the batcher maintains.
- preferred_batch_size (List[int], default: []) – Preferred batch sizes for dynamic batching of candidate sequences.
- max_queue_delay_microseconds (int, default: 0) – The maximum time, in microseconds, a candidate request will be delayed in the dynamic batch scheduling queue to wait for additional requests for batching.
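A closing sketch that plugs the oldest-first strategy into a SequenceBatcher (all values illustrative):

```python
from model_navigator.api.triton import (
    SequenceBatcher,
    SequenceBatcherStrategyOldest,
)

# Keep up to 16 candidate sequences; prefer batches of 4 or 8, delaying a
# request at most 100 us to reach a preferred size.
batcher = SequenceBatcher(
    strategy=SequenceBatcherStrategyOldest(
        max_candidate_sequences=16,
        preferred_batch_size=[4, 8],
        max_queue_delay_microseconds=100,
    ),
)
```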