
Dynamic Batcher

model_navigator.api.triton.DynamicBatcher dataclass

Dynamic batching configuration.

Read more in the Triton Inference Server model configuration documentation.

Parameters:

  • max_queue_delay_microseconds (int, default: 0 ) –

    The maximum time, in microseconds, a request will be delayed in the scheduling queue to wait for additional requests for batching.

  • preferred_batch_size (Optional[list], default: None ) –

    Preferred batch sizes for dynamic batching.

  • preserve_ordering

    Whether the dynamic batcher preserves the ordering of responses so that they match the order in which the scheduler received the requests.

  • priority_levels (int, default: 0 ) –

    The number of priority levels to be enabled for the model.

  • default_priority_level (int, default: 0 ) –

    The priority level used for requests that don't specify their priority.

  • default_queue_policy (Optional[QueuePolicy], default: None ) –

    The default queue policy used for requests.

  • priority_queue_policy (Optional[Dict[int, QueuePolicy]], default: None ) –

    The queue policy for each priority level, keyed by priority level.
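As a minimal sketch of how the batching fields above fit together, the snippet below builds a configuration that waits briefly for more requests and prefers two batch sizes. The values 100, 4, and 8 are illustrative assumptions, not recommendations. The import path comes from the class heading above; the fallback class is a hypothetical stand-in that mirrors only the two fields used, so the example also runs where the package is not installed.

```python
from dataclasses import dataclass
from typing import Optional

try:
    # Import path taken from the class heading above.
    from model_navigator.api.triton import DynamicBatcher
except ImportError:
    # Hypothetical stand-in so the sketch runs without the package installed.
    # It mirrors only the two fields used below, with their documented defaults.
    @dataclass
    class DynamicBatcher:
        max_queue_delay_microseconds: int = 0
        preferred_batch_size: Optional[list] = None

# Wait up to 100 microseconds for additional requests, preferring batches of 4 or 8.
batcher = DynamicBatcher(
    max_queue_delay_microseconds=100,
    preferred_batch_size=[4, 8],
)
print(batcher.max_queue_delay_microseconds, batcher.preferred_batch_size)
```

Fields that are not passed keep the defaults listed above.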

model_navigator.api.triton.QueuePolicy dataclass

Model queue policy configuration.

Used for default_queue_policy and priority_queue_policy fields in DynamicBatcher configuration.

Read more in the Triton Inference Server model configuration documentation.

Parameters:

  • timeout_action (TimeoutAction, default: REJECT ) –

    The action applied to a timed-out request.

  • default_timeout_microseconds (int, default: 0 ) –

    The default timeout for every request, in microseconds.

  • allow_timeout_override (bool, default: False ) –

    Whether an individual request can override the default timeout value.

  • max_queue_size (int, default: 0 ) –

    The maximum queue size for holding requests.
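The following sketch shows one way a queue policy might be built and then arranged per priority level. All numeric values are illustrative assumptions. The fallback classes are hypothetical stand-ins that copy the fields and defaults documented on this page, so the example runs without the package; with the package installed, the real classes are used instead.

```python
from dataclasses import dataclass
from enum import Enum

try:
    # Import path taken from the class headings on this page.
    from model_navigator.api.triton import QueuePolicy, TimeoutAction
except ImportError:
    # Hypothetical stand-ins mirroring the documented fields and defaults.
    class TimeoutAction(Enum):
        REJECT = "REJECT"
        DELAY = "DELAY"

    @dataclass
    class QueuePolicy:
        timeout_action: TimeoutAction = TimeoutAction.REJECT
        default_timeout_microseconds: int = 0
        allow_timeout_override: bool = False
        max_queue_size: int = 0

# Illustrative values: a 5 ms default timeout that a request may override,
# the DELAY action on timeout, and a queue that holds at most 64 requests.
policy = QueuePolicy(
    timeout_action=TimeoutAction.DELAY,
    default_timeout_microseconds=5_000,
    allow_timeout_override=True,
    max_queue_size=64,
)

# A mapping shaped like DynamicBatcher.priority_queue_policy:
# priority level -> QueuePolicy. Level 2 keeps every default.
per_priority = {1: policy, 2: QueuePolicy()}
```

The same `policy` object could instead be passed as `default_queue_policy` when a single policy should apply to requests at every priority level.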

model_navigator.api.triton.TimeoutAction

Bases: Enum

Timeout action definition for timeout_action QueuePolicy field.

Read more in the Triton Inference Server model configuration documentation.

Members:

  • REJECT

    "REJECT"

  • DELAY

    "DELAY"
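Because the class derives from `Enum` and each member's value is the string shown above, a member can be looked up from a string, for example when the action is read from a settings file. This is a small sketch relying only on standard `Enum` behavior; the fallback class is a hypothetical stand-in with the two documented members, used when the package is not installed.

```python
from enum import Enum

try:
    # Import path taken from the class heading above.
    from model_navigator.api.triton import TimeoutAction
except ImportError:
    # Hypothetical stand-in with the two documented members.
    class TimeoutAction(Enum):
        REJECT = "REJECT"
        DELAY = "DELAY"

# Look a member up by value (call syntax) or by name (subscript syntax).
by_value = TimeoutAction("DELAY")
by_name = TimeoutAction["REJECT"]

# List the member names that are available.
names = sorted(member.name for member in TimeoutAction)
```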