Model Instance Group
`model_navigator.api.triton.InstanceGroup`
dataclass
Configuration for a model instance group.
Read more in the Triton Inference Server model configuration documentation.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`kind` | `Optional[DeviceKind]` | Kind of this instance group. | `None` |
`count` | `Optional[int]` | For a group assigned to GPU, the number of instances created for each GPU listed in `gpus`. For a group assigned to CPU, the number of instances created. | `None` |
`name` | `Optional[str]` | Optional name of this group of instances. | `None` |
`gpus` | `List[int]` | GPU(s) where instances should be available. | `dataclasses.field(default_factory=lambda: [])` |
`passive` | `bool` | Whether the instances within this instance group will accept inference requests from the scheduler. | `False` |
`host_policy` | `Optional[str]` | The host policy name that the instance should be associated with. | `None` |
`profile` | `List[str]` | For TensorRT models containing multiple optimization profiles, the set of optimization profiles available to this instance group. | `dataclasses.field(default_factory=lambda: [])` |
`__post_init__()`
Validate the configuration for early error handling.
Source code in `model_navigator/triton/specialized_configs/common.py`
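
As an illustration, a minimal sketch of building instance groups from the fields listed above (`DeviceKind` is documented below; attaching the groups to a full Triton model configuration is outside this snippet and may differ in your setup):

```python
from model_navigator.api.triton import DeviceKind, InstanceGroup

# Two instances per listed GPU, restricted to GPUs 0 and 1.
gpu_group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=2,
    gpus=[0, 1],
)

# Four CPU instances; 'name' is optional and purely descriptive.
cpu_group = InstanceGroup(
    kind=DeviceKind.KIND_CPU,
    count=4,
    name="cpu_fallback",
)
```

Invalid combinations are rejected early by `__post_init__()` rather than surfacing later when the model configuration is generated.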
`model_navigator.api.triton.DeviceKind`
Device kind for model deployment.
Read more in the Triton Inference Server model configuration documentation.

Members:

Name | Value | Description |
---|---|---|
`KIND_AUTO` | `"KIND_AUTO"` | Place instances on GPU(s) if available, otherwise on CPU. |
`KIND_CPU` | `"KIND_CPU"` | Place instances on CPU. |
`KIND_GPU` | `"KIND_GPU"` | Place instances on the GPU(s) listed in `gpus`. |