Model Instance Group
model_navigator.api.triton.InstanceGroup
dataclass
Configuration of a model instance group.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- kind (Optional[DeviceKind]) – Kind of this instance group.
- count (Optional[int]) – For a group assigned to a GPU, the number of instances created for each GPU listed in 'gpus'; for a group assigned to the CPU, the total number of instances created.
- name (Optional[str]) – Optional name of this group of instances.
- gpus (List[int]) – GPU(s) on which the instances should be available.
- passive (bool) – Whether this instance group is passive; if True, the instances will not be added to the scheduler and will not receive inference requests from it.
- host_policy (Optional[str]) – Name of the host policy that the instances should be associated with.
- profile (List[str]) – For TensorRT models containing multiple optimization profiles, the set of optimization profiles available to this instance group.
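As an illustration, here is a minimal sketch of constructing instance groups from the fields documented above. The group name "resnet_gpu" is hypothetical, and attaching the groups to a model configuration is left to the surrounding deployment code:

```python
from model_navigator.api.triton import DeviceKind, InstanceGroup

# Two instances on each of GPUs 0 and 1 (four instances in total).
gpu_group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=2,
    gpus=[0, 1],
    name="resnet_gpu",  # hypothetical name, for illustration only
)

# A passive group: loaded on GPU 0 but not added to the scheduler,
# so it is not handed inference requests.
standby_group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=1,
    gpus=[0],
    passive=True,
)
```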
model_navigator.api.triton.DeviceKind
Device kind for model deployment.
Read more in the Triton Inference Server model configuration documentation.
Values:
- KIND_AUTO – "KIND_AUTO"
- KIND_CPU – "KIND_CPU"
- KIND_GPU – "KIND_GPU"