Model Instance Group
model_navigator.api.triton.InstanceGroup
dataclass
Configuration of a model instance group.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- kind (Optional[DeviceKind]) – Kind of this instance group.
- count (Optional[int]) – For a group assigned to a GPU, the number of instances created for each GPU listed in 'gpus'; for a group assigned to the CPU, the total number of instances created.
- name (Optional[str]) – Optional name of this group of instances.
- gpus (List[int]) – GPU(s) on which the instances should be available.
- passive (bool) – Whether this instance group is passive; if True, the instances will not be added to the scheduler and will not receive inference requests from it.
- host_policy (Optional[str]) – Name of the host policy that the instances should be associated with.
- profile (List[str]) – For TensorRT models containing multiple optimization profiles, the set of optimization profiles available to this instance group.
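As an illustration, here is a minimal sketch of constructing instance groups from the fields documented above. The group name "resnet_gpu" is hypothetical, and attaching the groups to a model configuration is left to the surrounding deployment code:

```python
from model_navigator.api.triton import DeviceKind, InstanceGroup

# Two instances on each of GPUs 0 and 1 (four instances in total).
gpu_group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=2,
    gpus=[0, 1],
    name="resnet_gpu",  # hypothetical name, for illustration only
)

# A passive group: loaded on GPU 0 but not added to the scheduler,
# so it is not handed inference requests.
standby_group = InstanceGroup(
    kind=DeviceKind.KIND_GPU,
    count=1,
    gpus=[0],
    passive=True,
)
```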
model_navigator.api.triton.DeviceKind
Device kind for model deployment.
Read more in the Triton Inference Server model configuration documentation.
Values:
- KIND_AUTO – "KIND_AUTO"
- KIND_CPU – "KIND_CPU"
- KIND_GPU – "KIND_GPU"