
Model Instance Group

model_navigator.api.triton.InstanceGroup dataclass

Configuration for model instance group.

Read more in the Triton Inference Server model configuration documentation.

Parameters:

  • kind (Optional[DeviceKind], default: None ) –

    Kind of this instance group.

  • count (Optional[int], default: None ) –

For a group assigned to GPU, the number of instances created for each GPU listed in 'gpus'. For a group assigned to CPU, the number of instances created.

  • name (Optional[str], default: None ) –

    Optional name of this group of instances.

  • gpus (List[int], default: [] ) –

    GPU(s) where instances should be available.

  • passive (bool, default: False ) –

    Whether the instances within this instance group are passive. If True, the instances are not added to the scheduler and do not receive inference requests from it.

  • host_policy (Optional[str], default: None ) –

    The name of the host policy that the instances should be associated with.

  • profile (List[str], default: [] ) –

    For TensorRT models containing multiple optimization profiles, this parameter specifies the set of optimization profiles available to this instance group.
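
A minimal sketch of constructing an instance group from the parameters above, assuming the classes are importable from model_navigator.api.triton as documented:

```python
from model_navigator.api.triton import DeviceKind, InstanceGroup

# Two model instances on each of GPUs 0 and 1 (four instances in total).
group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=2,
    gpus=[0, 1],
)
```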

model_navigator.api.triton.DeviceKind

Bases: Enum

Device kind for model deployment.

Read more in the Triton Inference Server model configuration documentation.

Members:

  • KIND_AUTO

    "KIND_AUTO"

  • KIND_CPU

    "KIND_CPU"

  • KIND_GPU

    "KIND_GPU"