Decorators
pytriton.decorators
Inference callable decorators.
ConstantPadder
Padder that pads the given batches with a constant value.
Initialize the padder.
Parameters:
- pad_value (int, default: 0) – Padding value. Defaults to 0.
Source code in pytriton/decorators.py
__call__
Pad the given batches with the specified value to the pad size, enabling further batching into single arrays.
Returns:
- InferenceResults – List[Dict[str, np.ndarray]]: List of padded batches.
Raises:
- PyTritonRuntimeError – If the input arrays for a given input name have different dtypes.
Source code in pytriton/decorators.py
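A minimal sketch of how ConstantPadder is typically plugged in as the pad_fn of @group_by_values (the input names and model logic are illustrative):

```python
from pytriton.decorators import ConstantPadder, batch, group_by_values

@batch
@group_by_values("param", pad_fn=ConstantPadder(pad_value=0))
def infer_fn(**inputs):
    # Results of the sub-batches grouped by "param" are padded with zeros
    # to a common size before being merged back into a single batch.
    sequences = inputs["sequences"]  # hypothetical input name
    return {"output": sequences}
```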
ModelConfigDict
Bases: MutableMapping
Dictionary for storing model configs for inference callable.
Create ModelConfigDict object.
Source code in pytriton/decorators.py
TritonContext
dataclass
TritonContext(model_configs: ModelConfigDict = ModelConfigDict())
Triton context definition class.
batch
Decorator for converting a list of request dicts to a dict of input batches.
Converts a list of request dicts into a dict of input batches. It passes **kwargs to the inference callable, where each named input contains a NumPy array holding the batch of requests received by the Triton server. Each request is assumed to have the same set of keys (use the @group_by_keys decorator before @batch if your requests may have different sets of keys).
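A minimal sketch of a @batch-decorated inference callable (the input and output names are illustrative):

```python
from pytriton.decorators import batch

@batch
def infer_fn(**inputs):
    # Each named input is a NumPy array stacking all requests along the batch axis.
    sequences = inputs["sequences"]  # shape: (batch_size, ...)
    results = sequences * 2          # placeholder for real model logic
    # Output batch sizes must sum up to the total input batch size.
    return {"results": results}
```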
Raises:
- PyTritonValidationError – If the requests have different sets of keys.
- ValueError – If the output tensors have batch sizes different from the expected ones. The expected batch size is calculated as the sum of the batch sizes of all requests.
Source code in pytriton/decorators.py
convert_output
convert_output(outputs: Union[Dict, List, Tuple], wrapped=None, instance=None, model_config: Optional[TritonModelConfig] = None)
Converts output from a tuple or list to a dictionary.
It is a utility function for mapping an output list into a dictionary of outputs. Currently, it is used in the @sample and @batch decorators (the user may return a list or tuple of outputs instead of a dictionary, as long as the list matches the output list in the model config in size and order).
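For illustration, a sketch of the return style that convert_output supports inside @batch, assuming a hypothetical model config that declares the outputs ("logits", "probs") in that order:

```python
import numpy as np

from pytriton.decorators import batch

@batch
def infer_fn(**inputs):
    logits = inputs["data"].astype(np.float32)  # placeholder logic
    probs = 1.0 / (1.0 + np.exp(-logits))       # sigmoid
    # Returning a tuple instead of a dict; convert_output (called inside
    # @batch) maps it positionally onto the outputs declared in the model config.
    return logits, probs
```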
Source code in pytriton/decorators.py
fill_optionals
This decorator ensures that any missing inputs in requests are filled with default values specified by the user.
Default values should be NumPy arrays without batch axis.
If you plan to group requests, e.g. with the @group_by_keys or @group_by_values decorators, provide default values for optional parameters at the beginning of the decorator stack. The other decorators can then group requests into bigger batches, resulting in better model performance.
Typical use:
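A minimal sketch of the typical use (the input names and default value are illustrative; note the default is a NumPy array without the batch axis):

```python
import numpy as np

from pytriton.decorators import batch, fill_optionals, group_by_keys

# fill_optionals sits at the very beginning (top) of the decorator stack,
# so the decorators below it see fully populated, groupable requests.
@fill_optionals(temperature=np.asarray([0.5], dtype=np.float32))
@group_by_keys
@batch
def infer_fn(**inputs):
    temperature = inputs["temperature"]  # filled in when the client omits it
    return {"output": inputs["data"] * temperature}
```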
Parameters:
- defaults – Keyword arguments containing default values for missing inputs.
If you have default values for some optional parameters, it is a good idea to provide them at the very beginning of the decorator stack, so the other decorators (e.g. @group_by_keys) can make bigger consistent groups.
Source code in pytriton/decorators.py
first_value
This decorator overwrites the selected inputs with the first element of each input.
It can be used in two ways:
- Wrapping a single-request inference callable by chaining it with the @batch decorator (see the sketch below).
- Wrapping a multiple-requests inference callable.
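A minimal sketch of the first way, chaining with @batch (the input names are illustrative):

```python
from pytriton.decorators import batch, first_value

@batch
@first_value("temperature")
def infer_fn(**inputs):
    # "temperature" arrives as the first element of the batch, squeezed to a
    # scalar by default (squeeze_single_values=True); with strict=True all
    # values in the batch must have been equal.
    temperature = inputs["temperature"]
    return {"output": inputs["data"] * temperature}
```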
By default, the decorator squeezes single-value arrays to scalars. This behavior can be disabled by setting the squeeze_single_values flag to False.
By default, the decorator checks the equality of the values of each selected input. This behavior can be disabled by setting the strict flag to False.
The wrapper can only be used with models that support batching.
Parameters:
- keys (str, default: ()) – The input keys selected for conversion.
- squeeze_single_values – Squeeze single-value ND arrays to scalar values. Defaults to True.
- strict (bool, default: True) – Enable checking if all values of a single selected input of a request are equal. Defaults to True.
Raises:
- PyTritonRuntimeError – If not all values of a single selected input of the request are equal.
- PyTritonBadParameterError – If any of the keys passed to the decorator are not allowed.
Source code in pytriton/decorators.py
get_inference_request_batch_size
get_inference_request_batch_size(inference_request: InferenceRequest) -> int
Get the batch size from a Triton request.
Parameters:
- inference_request (InferenceRequest) – Triton request.
Returns:
- int – Batch size.
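A small sketch of how the helper can be used, e.g. inside a custom pad_fn (the surrounding logic is illustrative):

```python
from pytriton.decorators import get_inference_request_batch_size

def debug_pad_fn(inference_requests):
    # Batch size is derived from each request's input arrays.
    sizes = [get_inference_request_batch_size(request) for request in inference_requests]
    max_size = max(sizes)
    ...  # pad each request up to max_size here
    return inference_requests
```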
Source code in pytriton/decorators.py
get_model_config
Retrieves the instance of TritonModelConfig from a callable.
It is used internally in the convert_output function to get the output list from the model. You can use this in custom decorators if you need access to model_config information. If you use the @triton_context decorator, you do not need this function (you can get the model_config directly from the triton_context by passing the function/callable to the dictionary getter).
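A sketch of a custom decorator using it; the (wrapped, instance) signature is assumed by analogy with get_triton_context below, and the model_config attribute layout is assumed as well - verify both against the source:

```python
import wrapt

from pytriton.decorators import get_model_config

@wrapt.decorator
def log_outputs(wrapped, instance, args, kwargs):
    # Assumed signature, mirroring get_triton_context(wrapped, instance).
    model_config = get_model_config(wrapped, instance)
    print([output.name for output in model_config.outputs])  # assumed attributes
    return wrapped(*args, **kwargs)
```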
Source code in pytriton/decorators.py
get_triton_context
get_triton_context(wrapped, instance) -> TritonContext
Retrieves the Triton context from a callable.
It is used in @triton_context to get the Triton context registered by the Triton binding in the inference callable. If you use the @triton_context decorator, you do not need this function.
Source code in pytriton/decorators.py
group_by_keys
Group by keys.
The decorator prepares groups of requests with the same set of keys and calls the wrapped function for each group separately (it is convenient to use this decorator before batching, because the batching decorator requires a consistent set of inputs, as it stacks them into batches). A minimal sketch of the typical stacking follows.
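```python
from pytriton.decorators import batch, group_by_keys

@group_by_keys
@batch
def infer_fn(**inputs):
    # Each call receives a group of requests sharing the same key set,
    # which @batch can then safely stack into arrays.
    return {"output": inputs["data"]}  # input/output names are illustrative
```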
Source code in pytriton/decorators.py
group_by_values
Decorator for grouping requests by the values of selected keys.
This function splits a batch into multiple sub-batches based on the values of the specified keys and calls the decorated function with each sub-batch. This is particularly useful when working with models that require dynamic parameters sent by the user.
For example, given an input of the form:
{"sentences": [b"Sentence1", b"Sentence2", b"Sentence3"], "param1": [1, 1, 2], "param2": [1, 1, 1]}
Using @group_by_values("param1", "param2") will split the batch into two sub-batches:
[
{"sentences": [b"Sentence1", b"Sentence2"], "param1": [1, 1], "param2": [1, 1]},
{"sentences": [b"Sentence3"], "param1": [2], "param2": [1]}
]
This decorator should be used after the @batch decorator.
Example usage:
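A minimal sketch matching the example above (the model logic is illustrative):

```python
from pytriton.decorators import batch, group_by_values

@batch
@group_by_values("param1", "param2")
def infer_fn(**inputs):
    # Within each call, the "param1" and "param2" columns hold a single
    # repeated value, so they can be treated as per-batch parameters.
    return {"translations": inputs["sentences"]}
```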
Parameters:
- *keys – List of keys to group by.
- pad_fn (Optional[Callable[[InferenceRequests], InferenceRequests]], default: None) – Optional function to pad the sub-batches to the same size before merging them back into a single batch.
Returns:
- The decorator function.
Source code in pytriton/decorators.py
pad_batch
Add padding to the input batches.
The decorator appends the last rows of the inputs multiple times to reach the desired batch size (the preferred batch size or the max batch size from the model config, whichever is closer to the current input size).
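A sketch of one plausible stacking, assuming @pad_batch sits below @batch so it sees the stacked input arrays (verify the intended order against the source):

```python
from pytriton.decorators import batch, pad_batch

@batch
@pad_batch
def infer_fn(**inputs):
    # Inputs arrive padded (last rows repeated) up to the preferred or max
    # batch size from the model config, whichever is closer.
    return {"output": inputs["data"]}  # input/output names are illustrative
```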
Source code in pytriton/decorators.py
sample
Decorator for non-batched inputs, converting a one-element list of requests into request kwargs.
The decorator takes the first request and converts it into named inputs.
Useful with non-batching models - instead of a one-element list of requests, you get named inputs (kwargs).
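A minimal sketch for a non-batching model (the input name is illustrative):

```python
from pytriton.decorators import sample

@sample
def infer_fn(**inputs):
    # Instead of a one-element list of requests, the single request's
    # named inputs arrive directly as kwargs.
    return {"output": inputs["data"]}
```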
Source code in pytriton/decorators.py
triton_context
Adds the Triton context.
It gives you an additional argument passed to the function in **kwargs, called 'triton_context'. You can read the model config from it and, in the future, possibly have some interaction with Triton.
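A minimal sketch of reading the model config from the context, assuming an inference callable that receives the raw request list; the lookup by callable follows the description in get_model_config above (signature details may vary, see the source):

```python
from pytriton.decorators import triton_context

@triton_context
def infer_fn(requests, **kwargs):
    # 'triton_context' is injected into **kwargs by the decorator.
    context = kwargs["triton_context"]
    # ModelConfigDict is keyed by the inference callable itself.
    model_config = context.model_configs[infer_fn]
    max_batch_size = model_config.max_batch_size  # assumed attribute
    return [{"output": request["data"]} for request in requests]
```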