Sequence Batcher
model_navigator.api.triton.SequenceBatcher
dataclass
Sequence batching configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `strategy` | `Optional[Union[SequenceBatcherStrategyDirect, SequenceBatcherStrategyOldest]]` | The strategy used by the sequence batcher. | `None` |
| `max_sequence_idle_microseconds` | `Optional[int]` | The maximum time, in microseconds, that a sequence may be idle before it is aborted. | `None` |
| `control_inputs` | `List[SequenceBatcherControlInput]` | The model input(s) that the server uses to communicate sequence start, stop, ready, and similar control values to the model. | `dataclasses.field(default_factory=lambda: [])` |
| `states` | `List[SequenceBatcherState]` | Optional state that can be stored in Triton for performing inference requests on a sequence. | `dataclasses.field(default_factory=lambda: [])` |
__post_init__()
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
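The early-validation pattern behind `__post_init__()` can be sketched as follows. This is a minimal, self-contained stand-in that mirrors the documented fields, not the actual `model_navigator` implementation; the specific check on `max_sequence_idle_microseconds` is an illustrative assumption:

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class SequenceBatcherSketch:
    """Stand-in mirroring the documented SequenceBatcher fields."""

    strategy: Optional[Any] = None
    max_sequence_idle_microseconds: Optional[int] = None
    control_inputs: List[Any] = field(default_factory=list)
    states: List[Any] = field(default_factory=list)

    def __post_init__(self):
        # Fail at construction time rather than at model-load time.
        if (self.max_sequence_idle_microseconds is not None
                and self.max_sequence_idle_microseconds <= 0):
            raise ValueError("max_sequence_idle_microseconds must be positive.")


# Abort sequences that stay idle for more than 5 seconds.
batcher = SequenceBatcherSketch(max_sequence_idle_microseconds=5_000_000)
```

Validating in `__post_init__` means a misconfigured batcher raises immediately at construction, long before the configuration is serialized for Triton.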
model_navigator.api.triton.SequenceBatcherControl
dataclass
Sequence Batching control configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `kind` | `SequenceBatcherControlKind` | The kind of this control. | required |
| `dtype` | `Optional[Union[np.dtype, Type[np.dtype]]]` | The control's datatype. | `None` |
| `int32_false_true` | `List[int]` | The control's true and false setting is indicated by a value in an `int32` tensor. | `dataclasses.field(default_factory=lambda: [])` |
| `fp32_false_true` | `List[float]` | The control's true and false setting is indicated by a value in an `fp32` tensor. | `dataclasses.field(default_factory=lambda: [])` |
| `bool_false_true` | `List[bool]` | The control's true and false setting is indicated by a value in a `bool` tensor. | `dataclasses.field(default_factory=lambda: [])` |
__post_init__()
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
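The relationship between the three `*_false_true` lists can be sketched with a self-contained stand-in (the `dtype` field is omitted for brevity). The rule enforced here, exactly one `[false_value, true_value]` pair across the three lists, is an illustrative assumption, not a statement of the library's actual validation:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequenceBatcherControlSketch:
    """Stand-in mirroring the documented SequenceBatcherControl fields."""

    kind: str
    int32_false_true: List[int] = field(default_factory=list)
    fp32_false_true: List[float] = field(default_factory=list)
    bool_false_true: List[bool] = field(default_factory=list)

    def __post_init__(self):
        # Assumed rule: exactly one [false_value, true_value] pair is given.
        populated = [v for v in (self.int32_false_true,
                                 self.fp32_false_true,
                                 self.bool_false_true) if v]
        if len(populated) != 1 or len(populated[0]) != 2:
            raise ValueError("Provide exactly one [false, true] value pair.")


# A start control signalled through an int32 tensor: 0 = false, 1 = true.
start = SequenceBatcherControlSketch(kind="CONTROL_SEQUENCE_START",
                                     int32_false_true=[0, 1])
```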
model_navigator.api.triton.SequenceBatcherControlInput
dataclass
Sequence Batching control input configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_name` | `str` | The name of the model input. | required |
| `controls` | `List[SequenceBatcherControl]` | The control value(s) that should be communicated to the model using this model input. | required |
model_navigator.api.triton.SequenceBatcherControlKind
Sequence Batching control options.
Read more in the Triton Inference Server model configuration.

Members:

| Name | Value |
|---|---|
| `CONTROL_SEQUENCE_START` | `"CONTROL_SEQUENCE_START"` |
| `CONTROL_SEQUENCE_READY` | `"CONTROL_SEQUENCE_READY"` |
| `CONTROL_SEQUENCE_END` | `"CONTROL_SEQUENCE_END"` |
| `CONTROL_SEQUENCE_CORRID` | `"CONTROL_SEQUENCE_CORRID"` |
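The member set above can be sketched as a plain Python `Enum`. This stand-in mirrors the documented names and string values and is not the actual `model_navigator` class:

```python
from enum import Enum


class SequenceBatcherControlKindSketch(Enum):
    """Stand-in mirroring the documented control kinds."""

    CONTROL_SEQUENCE_START = "CONTROL_SEQUENCE_START"
    CONTROL_SEQUENCE_READY = "CONTROL_SEQUENCE_READY"
    CONTROL_SEQUENCE_END = "CONTROL_SEQUENCE_END"
    CONTROL_SEQUENCE_CORRID = "CONTROL_SEQUENCE_CORRID"


# Each member's value is the literal string Triton expects in the config.
kind = SequenceBatcherControlKindSketch.CONTROL_SEQUENCE_START
```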
model_navigator.api.triton.SequenceBatcherInitialState
dataclass
Sequence Batching initial state configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | The name of the initial state. | required |
| `shape` | `Tuple[int, ...]` | The shape of the state tensor, not including the batch dimension. | required |
| `dtype` | `Optional[Union[np.dtype, Type[np.dtype]]]` | The data type of the state. | `None` |
| `zero_data` | `Optional[bool]` | Whether to use zeros as the initial state data. | `None` |
| `data_file` | `Optional[str]` | The file whose content will be used as the initial state data, in row-major order. | `None` |
__post_init__()
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
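Since `zero_data` and `data_file` describe two alternative sources of the same initial data, a natural validation is that exactly one of them is set. The sketch below is a self-contained stand-in mirroring the documented fields; treating the two sources as mutually exclusive is an assumption about the validation, not a quote of the library's code:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InitialStateSketch:
    """Stand-in mirroring the documented SequenceBatcherInitialState fields."""

    name: str
    shape: Tuple[int, ...]
    dtype: Optional[object] = None
    zero_data: Optional[bool] = None
    data_file: Optional[str] = None

    def __post_init__(self):
        # Assumed rule: the initial data comes from exactly one source.
        if (self.zero_data is None) == (self.data_file is None):
            raise ValueError("Set exactly one of zero_data or data_file.")


# A hidden state of shape (1, 128), initialized with zeros.
state = InitialStateSketch(name="hidden", shape=(1, 128), zero_data=True)
```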
model_navigator.api.triton.SequenceBatcherState
dataclass
Sequence Batching state configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_name` | `str` | The name of the model state input. | required |
| `output_name` | `str` | The name of the model state output. | required |
| `dtype` | `Union[np.dtype, Type[np.dtype]]` | The data type of the state. | required |
| `shape` | `Tuple[int, ...]` | The shape of the state tensor. | required |
| `initial_states` | `List[SequenceBatcherInitialState]` | Optional list of initial states for the model. | `dataclasses.field(default_factory=lambda: [])` |
__post_init__()
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
model_navigator.api.triton.SequenceBatcherStrategyDirect
dataclass
Sequence Batching strategy direct configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `max_queue_delay_microseconds` | `int` | The maximum time, in microseconds, that a candidate request is delayed in the sequence batch scheduling queue while waiting for additional requests to batch. | `0` |
| `minimum_slot_utilization` | `float` | The minimum slot utilization that must be satisfied to execute the batch before `max_queue_delay_microseconds` expires. | `0.0` |
model_navigator.api.triton.SequenceBatcherStrategyOldest
dataclass
Sequence Batching strategy oldest configuration.
Read more in the Triton Inference Server model configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `max_candidate_sequences` | `int` | The maximum number of candidate sequences that the batcher maintains. | required |
| `preferred_batch_size` | `List[int]` | Preferred batch sizes for dynamic batching of candidate sequences. | `dataclasses.field(default_factory=lambda: [])` |
| `max_queue_delay_microseconds` | `int` | The maximum time, in microseconds, that a candidate request is delayed in the dynamic batch scheduling queue while waiting for additional requests to batch. | `0` |
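For orientation, these dataclasses correspond to the `sequence_batching` section of a Triton `config.pbtxt`. A hand-written equivalent of an Oldest-strategy configuration with one start control might look like the fragment below; all values are illustrative, not defaults:

```proto
sequence_batching {
  max_sequence_idle_microseconds: 5000000
  oldest {
    max_candidate_sequences: 4
    preferred_batch_size: [ 2, 4 ]
    max_queue_delay_microseconds: 100
  }
  control_input [
    {
      name: "START"
      control [
        {
          kind: CONTROL_SEQUENCE_START
          int32_false_true: [ 0, 1 ]
        }
      ]
    }
  ]
}
```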