
Sequence Batcher

model_navigator.api.triton.SequenceBatcher dataclass

Sequence batching configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • strategy (Optional[Union[SequenceBatcherStrategyDirect, SequenceBatcherStrategyOldest]]) –

    The scheduling strategy for the sequence batcher; one of the strategy dataclasses documented below (see the validation in __post_init__).

__post_init__

__post_init__()

Validate the configuration for early error handling.

Source code in model_navigator/triton/specialized_configs/common.py
def __post_init__(self):
    """Validate the configuration for early error handling."""
    if self.strategy and (
        not isinstance(self.strategy, SequenceBatcherStrategyDirect)
        and not isinstance(self.strategy, SequenceBatcherStrategyOldest)
    ):
        raise ModelNavigatorWrongParameterError("Unsupported strategy type provided.")
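
A minimal construction sketch (illustrative only; it assumes the strategy keyword argument validated above and uses the SequenceBatcherStrategyOldest dataclass documented further below, with placeholder values):

from model_navigator.api.triton import SequenceBatcher, SequenceBatcherStrategyOldest

# Sequence batcher scheduling the oldest candidate sequences first.
sequence_batcher = SequenceBatcher(
    strategy=SequenceBatcherStrategyOldest(
        max_candidate_sequences=4,
        preferred_batch_size=[2, 4],
        max_queue_delay_microseconds=100,
    ),
)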

model_navigator.api.triton.SequenceBatcherControl dataclass

Sequence Batching control configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • kind (SequenceBatcherControlKind) –

    The kind of this control.

  • dtype (Optional[Union[np.dtype, Type[np.dtype]]]) –

    The control's datatype.

  • int32_false_true (List[int]) –

    The false and true values for the control when it is communicated to the model through an int32 tensor, given as a two-element list: [false_value, true_value].

  • fp32_false_true (List[float]) –

    The false and true values for the control when it is communicated to the model through an fp32 tensor, given as a two-element list: [false_value, true_value].

  • bool_false_true (List[bool]) –

    The false and true values for the control when it is communicated to the model through a bool tensor, given as a two-element list: [false_value, true_value].

__post_init__

__post_init__()

Validate the configuration for early error handling.

Source code in model_navigator/triton/specialized_configs/common.py
def __post_init__(self):
    """Validate the configuration for early error handling."""
    if self.kind == SequenceBatcherControlKind.CONTROL_SEQUENCE_CORRID and self.dtype is None:
        raise ModelNavigatorWrongParameterError(f"The {self.kind} control type requires `dtype` to be specified.")

    if self.kind == SequenceBatcherControlKind.CONTROL_SEQUENCE_CORRID and any(
        [self.int32_false_true, self.fp32_false_true, self.bool_false_true]
    ):
        raise ModelNavigatorWrongParameterError(
            f"The {self.kind} control type requires `dtype` to be specified only."
        )

    controls = [
        SequenceBatcherControlKind.CONTROL_SEQUENCE_START,
        SequenceBatcherControlKind.CONTROL_SEQUENCE_END,
        SequenceBatcherControlKind.CONTROL_SEQUENCE_READY,
    ]

    if self.kind in controls and self.dtype:
        raise ModelNavigatorWrongParameterError(f"The {self.kind} control does not support `dtype` parameter.")

    if self.kind in controls and not (self.int32_false_true or self.fp32_false_true or self.bool_false_true):
        raise ModelNavigatorWrongParameterError(
            f"The {self.kind} control type requires one of: "
            "`int32_false_true`, `fp32_false_true`, `bool_false_true` to be specified."
        )

    if self.int32_false_true and len(self.int32_false_true) != 2:
        raise ModelNavigatorWrongParameterError(
            "The `int32_false_true` field should be two element list with false and true values. Example: [0 , 1]"
        )

    if self.fp32_false_true and len(self.fp32_false_true) != 2:
        raise ModelNavigatorWrongParameterError(
            "The `fp32_false_true` field should be two element list with false and true values. Example: [0 , 1]"
        )

    if self.bool_false_true and len(self.bool_false_true) != 2:
        raise ModelNavigatorWrongParameterError(
            "The `bool_false_true` field should be two element list with false and true values. "
            "Example: [False, True]"
        )

    if self.dtype:
        self.dtype = cast_dtype(dtype=self.dtype)

    expect_type("dtype", self.dtype, np.dtype, optional=True)

model_navigator.api.triton.SequenceBatcherControlInput dataclass

Sequence Batching control input configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • input_name (str) –

    The name of the model input.

  • controls (List[SequenceBatcherControl]) –

    List of control value(s) that should be communicated to the model using this model input.
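
A brief sketch (illustrative only; the input name "START" is a placeholder chosen to match the control it carries):

from model_navigator.api.triton import (
    SequenceBatcherControl,
    SequenceBatcherControlInput,
    SequenceBatcherControlKind,
)

# Deliver the sequence-start signal to the model through the "START" input tensor.
control_input = SequenceBatcherControlInput(
    input_name="START",
    controls=[
        SequenceBatcherControl(
            kind=SequenceBatcherControlKind.CONTROL_SEQUENCE_START,
            int32_false_true=[0, 1],
        ),
    ],
)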

model_navigator.api.triton.SequenceBatcherControlKind

Bases: enum.Enum

Sequence Batching control options.

Read more in the Triton Inference Server model configuration.

Members:

  • CONTROL_SEQUENCE_START

    "CONTROL_SEQUENCE_START"

  • CONTROL_SEQUENCE_READY

    "CONTROL_SEQUENCE_READY"

  • CONTROL_SEQUENCE_END

    "CONTROL_SEQUENCE_END"

  • CONTROL_SEQUENCE_CORRID

    "CONTROL_SEQUENCE_CORRID"

model_navigator.api.triton.SequenceBatcherInitialState dataclass

Sequence Batching initial state configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • name (str) –

    The name of the initial state.

  • shape (Tuple[int, ...]) –

    The shape of the state tensor, not including the batch dimension.

  • dtype (Optional[Union[np.dtype, Type[np.dtype]]]) –

    The data-type of the state.

  • zero_data (Optional[bool]) –

    Flag indicating that zeros should be used as the initial state data.

  • data_file (Optional[str]) –

    The file whose content will be used as the initial data for the state in row-major order.

__post_init__

__post_init__()

Validate the configuration for early error handling.

Source code in model_navigator/triton/specialized_configs/common.py
def __post_init__(self):
    """Validate the configuration for early error handling."""
    if not self.zero_data and not self.data_file:
        raise ModelNavigatorWrongParameterError("zero_data or data_file has to be defined. None was provided.")

    if self.zero_data and self.data_file:
        raise ModelNavigatorWrongParameterError("zero_data or data_file has to be defined. Both were provided.")

    if self.dtype:
        self.dtype = cast_dtype(dtype=self.dtype)

    expect_type("name", self.name, str)
    expect_type("shape", self.shape, tuple)
    expect_type("dtype", self.dtype, np.dtype, optional=True)
    is_shape_correct("shape", self.shape)
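
A brief sketch (illustrative only; the tensor name, shape, and dtype are placeholders):

import numpy as np

from model_navigator.api.triton import SequenceBatcherInitialState

# Zero-filled initial state; exactly one of `zero_data` or `data_file` may be set.
initial_state = SequenceBatcherInitialState(
    name="hidden_state_init",
    shape=(128,),
    dtype=np.float32,
    zero_data=True,
)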

model_navigator.api.triton.SequenceBatcherState dataclass

Sequence Batching state configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • input_name (str) –

    The name of the model state input.

  • output_name (str) –

    The name of the model state output.

  • dtype (Union[np.dtype, Type[np.dtype]]) –

    The data-type of the state.

  • shape (Tuple[int, ...]) –

    The shape of the state tensor.

  • initial_states (List[SequenceBatcherInitialState]) –

    The optional field to specify the list of initial states for the model.

__post_init__

__post_init__()

Validate the configuration for early error handling.

Source code in model_navigator/triton/specialized_configs/common.py
def __post_init__(self):
    """Validate the configuration for early error handling."""
    self.dtype = cast_dtype(dtype=self.dtype)

    expect_type("shape", self.shape, tuple)
    expect_type("dtype", self.dtype, np.dtype, optional=True)
    is_shape_correct("shape", self.shape)
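
A brief sketch (illustrative only; tensor names, shape, and dtype are placeholders):

import numpy as np

from model_navigator.api.triton import SequenceBatcherInitialState, SequenceBatcherState

# State carried between requests of a sequence, zero-initialized on the first request.
state = SequenceBatcherState(
    input_name="HIDDEN_STATE_IN",
    output_name="HIDDEN_STATE_OUT",
    dtype=np.float32,
    shape=(128,),
    initial_states=[
        SequenceBatcherInitialState(
            name="hidden_state_init",
            shape=(128,),
            dtype=np.float32,
            zero_data=True,
        ),
    ],
)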

model_navigator.api.triton.SequenceBatcherStrategyDirect dataclass

Sequence Batching strategy direct configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • max_queue_delay_microseconds (int) –

    The maximum time, in microseconds, a candidate request will be delayed in the sequence batch scheduling queue to wait for additional requests for batching.

  • minimum_slot_utilization (float) –

    The minimum slot utilization that must be satisfied to execute the batch before 'max_queue_delay_microseconds' expires.
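
A brief sketch (illustrative only; the values are placeholders):

from model_navigator.api.triton import SequenceBatcherStrategyDirect

# Delay a batch by up to 100 microseconds unless at least half of the
# batch slots are already occupied by candidate requests.
direct_strategy = SequenceBatcherStrategyDirect(
    max_queue_delay_microseconds=100,
    minimum_slot_utilization=0.5,
)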

model_navigator.api.triton.SequenceBatcherStrategyOldest dataclass

Sequence Batching strategy oldest configuration.

Read more in the Triton Inference Server model configuration.

Parameters:

  • max_candidate_sequences (int) –

    Maximum number of candidate sequences that the batcher maintains.

  • preferred_batch_size (List[int]) –

    Preferred batch sizes for dynamic batching of candidate sequences.

  • max_queue_delay_microseconds (int) –

    The maximum time, in microseconds, a candidate request will be delayed in the dynamic batch scheduling queue to wait for additional requests for batching.
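
A brief sketch (illustrative only; the values are placeholders). Either strategy object is passed to SequenceBatcher through its strategy parameter, as in the sketch near the top of this page:

from model_navigator.api.triton import SequenceBatcherStrategyOldest

# Keep up to 8 candidate sequences and dynamically batch them,
# preferring batches of 4 or 8, with at most a 100 microsecond delay.
oldest_strategy = SequenceBatcherStrategyOldest(
    max_candidate_sequences=8,
    preferred_batch_size=[4, 8],
    max_queue_delay_microseconds=100,
)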