Model Instance Group
model_navigator.api.triton.InstanceGroup
dataclass
Configuration for model instance group.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- kind (Optional[DeviceKind], default: None) – Kind of this instance group.
- count (Optional[int], default: None) – For a group assigned to GPU, the number of instances created for each GPU listed in 'gpus'. For a group assigned to CPU, the number of instances created.
- name (Optional[str], default: None) – Optional name of this group of instances.
- gpus (List[int], default: dataclasses.field(default_factory=lambda: [])) – GPU(s) where instances should be available.
- passive (bool, default: False) – Whether the instances within this instance group will be accepting inference requests from the scheduler.
- host_policy (Optional[str], default: None) – Name of the host policy that the instances are to be associated with.
- profile (List[str], default: dataclasses.field(default_factory=lambda: [])) – For TensorRT models containing multiple optimization profiles, the set of optimization profiles available to this instance group.
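The meaning of `count` depends on `kind` and `gpus`, which a short sketch may make clearer. The `InstanceGroup` and `DeviceKind` classes below are local stand-ins that only mirror the fields and defaults documented here, so the snippet runs without the library installed; with the library available you would presumably import both from `model_navigator.api.triton` instead. `total_instances` is a hypothetical helper and not part of the library.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class DeviceKind(Enum):
    """Local stand-in for the documented DeviceKind enum."""

    KIND_AUTO = "KIND_AUTO"
    KIND_CPU = "KIND_CPU"
    KIND_GPU = "KIND_GPU"


@dataclass
class InstanceGroup:
    """Local stand-in mirroring the documented fields and defaults."""

    kind: Optional[DeviceKind] = None
    count: Optional[int] = None
    name: Optional[str] = None
    gpus: List[int] = field(default_factory=list)
    passive: bool = False
    host_policy: Optional[str] = None
    profile: List[str] = field(default_factory=list)


def total_instances(group: InstanceGroup) -> Optional[int]:
    """Hypothetical helper: apply the documented meaning of `count`.

    Returns None when the fields alone do not determine the total.
    """
    if group.count is None:
        return None
    if group.kind is DeviceKind.KIND_GPU and group.gpus:
        # `count` instances are created for each GPU listed in `gpus`.
        return group.count * len(group.gpus)
    if group.kind is DeviceKind.KIND_CPU:
        # For a CPU group, `count` is the number of instances created.
        return group.count
    return None


# Two instances on each of GPU 0 and GPU 1.
gpu_group = InstanceGroup(kind=DeviceKind.KIND_GPU, count=2, gpus=[0, 1])
# Four instances on the CPU.
cpu_group = InstanceGroup(kind=DeviceKind.KIND_CPU, count=4)

print(total_instances(gpu_group))
print(total_instances(cpu_group))
```

The helper deliberately returns `None` for cases this page does not describe, such as a GPU group with an empty `gpus` list.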
model_navigator.api.triton.DeviceKind
Bases: Enum
Device kind for model deployment.
Read more in the Triton Inference Server model configuration documentation.
Parameters:
- KIND_AUTO – "KIND_AUTO"
- KIND_CPU – "KIND_CPU"
- KIND_GPU – "KIND_GPU"