Model Instance Group
model_navigator.triton.InstanceGroup
dataclass
InstanceGroup(kind=None, count=None, name=None, gpus=[], passive=False, host_policy=None, profile=[])
Configuration for model instance group.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- kind (Optional[DeviceKind], default: None) – Kind of this instance group.
- count (Optional[int], default: None) – For a group assigned to GPU, the number of instances created for each GPU listed in 'gpus'. For a group assigned to CPU, the total number of instances created.
- name (Optional[str], default: None) – Optional name of this group of instances.
- gpus (List[int], default: empty list) – GPU(s) where instances should be available.
- passive (bool, default: False) – Whether the instances in this group are passive. With the default of False, the instances accept inference requests from the scheduler; if True, they are not added to the scheduler.
- host_policy (Optional[str], default: None) – Name of the host policy that the instances in this group are associated with.
- profile (List[str], default: empty list) – For TensorRT models containing multiple optimization profiles, the set of optimization profiles available to this instance group.
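The parameters above can be illustrated with a minimal sketch. To stay runnable without the package installed, it defines local stand-ins for DeviceKind and InstanceGroup that mirror only the documented fields and defaults; they are assumptions for illustration, not the library's implementation, and they perform no validation. With the real package, the same keyword arguments would be passed to model_navigator.triton.InstanceGroup.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DeviceKind(Enum):
    """Local stand-in mirroring the documented DeviceKind members."""

    KIND_AUTO = "KIND_AUTO"
    KIND_CPU = "KIND_CPU"
    KIND_GPU = "KIND_GPU"


@dataclass
class InstanceGroup:
    """Local stand-in with the documented fields and defaults (no validation)."""

    kind: Optional[DeviceKind] = None
    count: Optional[int] = None
    name: Optional[str] = None
    gpus: List[int] = field(default_factory=list)
    passive: bool = False
    host_policy: Optional[str] = None
    profile: List[str] = field(default_factory=list)


# GPU group: `count` applies per listed GPU, so 2 instances on each of GPUs 0 and 1.
gpu_group = InstanceGroup(kind=DeviceKind.KIND_GPU, count=2, gpus=[0, 1])
total_gpu_instances = gpu_group.count * len(gpu_group.gpus)

# CPU group: `count` is the total number of instances.
cpu_group = InstanceGroup(kind=DeviceKind.KIND_CPU, count=4, name="cpu_workers")

# List defaults come from a factory, so every instance gets its own empty list.
first, second = InstanceGroup(), InstanceGroup()
first.gpus.append(0)

print(total_gpu_instances)
print(cpu_group.gpus, second.gpus)
```

The factory-based defaults for gpus and profile mean that mutating the list on one group never leaks into another group created with the defaults.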
__post_init__
Validate the configuration for early error handling.
Source code in model_navigator/triton/specialized_configs/common.py
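The page does not list which checks __post_init__ performs, so the following is only a sketch of the general pattern: a dataclass validating its own fields at construction time so that a bad configuration fails early. The class name InstanceGroupSketch, both rules, and the use of ValueError are assumptions made for illustration; the library may check different conditions and raise its own exception type.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InstanceGroupSketch:
    """Hypothetical stand-in; the checks below are assumed, not the library's."""

    count: Optional[int] = None
    gpus: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed rule: an explicitly given count must be at least 1.
        if self.count is not None and self.count < 1:
            raise ValueError("`count` must be greater than or equal to 1")
        # Assumed rule: GPU device indices cannot be negative.
        if any(gpu < 0 for gpu in self.gpus):
            raise ValueError("`gpus` must contain non-negative device indices")


# A valid configuration constructs normally.
valid_group = InstanceGroupSketch(count=2, gpus=[0])

# An invalid one fails at construction time, before any deployment step runs.
error_message = None
try:
    InstanceGroupSketch(count=0)
except ValueError as err:
    error_message = str(err)

print(error_message)
```

Because dataclasses call __post_init__ at the end of the generated __init__, the error surfaces at the line that builds the configuration rather than later, when the configuration is consumed.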
model_navigator.triton.DeviceKind
Bases: Enum
Device kind for model deployment.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- KIND_AUTO – "KIND_AUTO"
- KIND_CPU – "KIND_CPU"
- KIND_GPU – "KIND_GPU"
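Since each member's value equals its name, a member can be looked up either by value or by name. The sketch below uses a local stand-in enum with the members listed above so that it runs without the package installed; the stand-in is an assumption for illustration, and with the real package the same operations would apply to model_navigator.triton.DeviceKind.

```python
from enum import Enum


class DeviceKind(Enum):
    """Local stand-in mirroring the documented members and values."""

    KIND_AUTO = "KIND_AUTO"
    KIND_CPU = "KIND_CPU"
    KIND_GPU = "KIND_GPU"


# Look a member up by value, for example when parsing a string from a config file.
by_value = DeviceKind("KIND_GPU")

# Look a member up by name; this is equivalent here because names and values match.
by_name = DeviceKind["KIND_GPU"]

all_values = [member.value for member in DeviceKind]

print(by_value is by_name)
print(all_values)
```

Passing a string that is not a member value, such as "KIND_TPU", raises ValueError from the standard Enum lookup.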