Model Instance Group
model_navigator.triton.InstanceGroup
dataclass
InstanceGroup(kind=None, count=None, name=None, gpus=[], passive=False, host_policy=None, profile=[])
Configuration for model instance group.
Read more in the Triton Inference Server model configuration documentation.
Parameters:

- kind (Optional[DeviceKind], default: None) – Kind of this instance group.
- count (Optional[int], default: None) – For a group assigned to GPU, the number of instances created for each GPU listed in 'gpus'. For a group assigned to CPU, the number of instances created.
- name (Optional[str], default: None) – Optional name of this group of instances.
- gpus (List[int], default: []) – GPU(s) where instances should be available.
- passive (bool, default: False) – Whether the instances in this group are passive. If True, the instances are not added to the scheduler and do not accept inference requests from it.
- host_policy (Optional[str], default: None) – The name of the host policy that the instances should be associated with.
- profile (List[str], default: []) – For TensorRT models containing multiple optimization profiles, this parameter specifies the set of optimization profiles available to this instance group.
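A minimal sketch of constructing instance groups with this dataclass; the group name and the counts are illustrative assumptions, while the parameters themselves are as documented above:

```python
from model_navigator.triton import DeviceKind, InstanceGroup

# Two instances per GPU on GPUs 0 and 1 (four instances in total).
gpu_group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=2,
    gpus=[0, 1],
    name="gpu_group",  # illustrative name, not required
)

# A CPU-only group with four instances.
cpu_group = InstanceGroup(
    kind=DeviceKind.KIND_CPU,
    count=4,
)
```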
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
model_navigator.triton.DeviceKind
Bases: Enum
Device kind for model deployment.
Read more in the Triton Inference Server model configuration documentation.
Members:

- KIND_AUTO – "KIND_AUTO" (Triton selects the device automatically: GPU if one is available, otherwise CPU)
- KIND_CPU – "KIND_CPU" (instances run on CPU)
- KIND_GPU – "KIND_GPU" (instances run on GPU)