Changelog

0.2.4 (2023-08-10)

new: Introduced strict flag in Triton.bind which enables validation of the data types and shapes of inference callable outputs against the model config
new: Added AsyncioModelClient, which works in FastAPI and other async frameworks
fix: FuturesModelClient does not raise gevent.exceptions.InvalidThreadUseError
fix: Do not throw TimeoutError if the client could not connect to the server during model verification
Version of Triton Inference Server embedded in wheel: 2.33.0
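
To illustrate the kind of check the strict flag describes, here is a minimal standalone sketch. It does not reproduce PyTriton's internals: OutputSpec and validate_outputs are hypothetical names, outputs are modeled as plain dicts with dtype and shape entries instead of real arrays, and a -1 dimension is assumed to mean "any size".

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class OutputSpec:
    """Hypothetical stand-in for one output declared in a model config."""

    name: str
    dtype: str
    shape: Tuple[int, ...]  # assumption: -1 means "any size" for that dimension


def validate_outputs(outputs: Dict[str, dict], specs: Tuple[OutputSpec, ...]) -> None:
    """Raise ValueError if the outputs do not match the declared specs.

    Each output is modeled as {"dtype": str, "shape": tuple}.
    """
    expected = {spec.name: spec for spec in specs}
    unexpected = set(outputs) - set(expected)
    if unexpected:
        raise ValueError(f"Unexpected outputs: {sorted(unexpected)}")
    for name, spec in expected.items():
        if name not in outputs:
            raise ValueError(f"Missing output: {name}")
        out = outputs[name]
        if out["dtype"] != spec.dtype:
            raise ValueError(f"{name}: dtype {out['dtype']} != {spec.dtype}")
        shape = tuple(out["shape"])
        # Ranks must agree, and every fixed dimension must match exactly.
        if len(shape) != len(spec.shape) or any(
            want not in (-1, got) for want, got in zip(spec.shape, shape)
        ):
            raise ValueError(f"{name}: shape {shape} does not match {spec.shape}")
```

Failing fast like this surfaces a mismatched dtype or shape at the inference callable rather than as a harder-to-trace error on the client side.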

0.2.3 (2023-07-21)

Improved verification of the Proxy Backend environment when running under the same Python interpreter
Fixed pytriton.__version__ to report the currently installed version
Version of Triton Inference Server embedded in wheel: 2.33.0

0.2.2 (2023-07-19)

Added the inference_timeout_s parameter to client classes
Renamed PyTritonClientUrlParseError to PyTritonClientInvalidUrlError
ModelClient and FuturesModelClient methods raise PyTritonClientClosedError when used after the client is closed
Pinned tritonclient dependency due to issues with tritonclient >= 2.34 on systems with glibc version lower than 2.34
Added a warning after Triton Server setup and teardown when an overly verbose logging level is used, as it may cause a significant performance drop in model inference
Version of Triton Inference Server embedded in wheel: 2.33.0

0.2.1 (2023-06-28)

Fixed handling of the TritonConfig.cache_directory option: the directory was always overwritten with the default value.
Fixed tritonclient dependency: PyTriton needs a tritonclient that supports HTTP headers and parameters.
Improved shared memory usage to fit within the 64 MB limit (the default for Docker and Kubernetes) by reducing the initial size for the PyTriton Proxy Backend.
Version of Triton Inference Server embedded in wheel: 2.33.0

0.2.0 (2023-05-30)

Added support for using custom HTTP/gRPC request headers and parameters.

This change breaks backward compatibility of the inference function signature. The undecorated inference function now accepts a list of Request instances instead of a list of dictionaries. The Request class has a data field for the inputs and a parameters field for the combined parameters and headers.

See docs/custom_params.md for further information

Added FuturesModelClient, which enables sending inference requests in parallel.
Added display of the documentation link after models are loaded.
Version of Triton Inference Server embedded in wheel: 2.33.0
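
The signature change in this release can be sketched as follows. This is an illustration under stated assumptions, not PyTriton code: the Request class below is a local stand-in that mirrors only the data and parameters fields named above, and the input name INPUT_1, the output name OUTPUT_1, and the scale parameter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Request:
    """Local stand-in with the two fields described above (not the PyTriton class)."""

    data: Dict[str, Any]
    parameters: Dict[str, Any] = field(default_factory=dict)


def infer_func(requests: List[Request]) -> List[Dict[str, Any]]:
    """Undecorated inference function: returns one result dict per request."""
    results = []
    for request in requests:
        # "scale" is a hypothetical custom parameter sent by the client.
        scale = float(request.parameters.get("scale", 1.0))
        values = request.data["INPUT_1"]  # hypothetical input name
        results.append({"OUTPUT_1": [value * scale for value in values]})
    return results
```

Code written against the earlier signature indexed each list element directly as a dictionary of inputs; under the new signature the same inputs are reached through the data field.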

0.1.5 (2023-05-12)

Improved pytriton.decorators.group_by_values function:
Modified the function to avoid calling the inference callable on each individual sample when grouping by string/bytes input
Added pad_fn argument for easy padding and combining of the inference results
Fixed Triton binaries search
Improved Workspace management (remove workspace on shutdown)
Version of external components used during testing:
Triton Inference Server: 2.29.0
Other component versions depend on the framework used and on the Triton Inference Server container version. Refer to the support matrix for a detailed summary.
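
The grouping behavior described for group_by_values in this release (one call per distinct value instead of one call per sample) can be sketched in plain Python. The call_grouped helper and its arguments are hypothetical, samples are modeled as floats, and the pad_fn argument is not covered here.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence


def call_grouped(
    keys: Sequence[Hashable],
    samples: Sequence[float],
    infer: Callable[[Hashable, List[float]], List[float]],
) -> List[float]:
    """Call infer once per distinct key and restore the original sample order."""
    groups: Dict[Hashable, List[int]] = defaultdict(list)
    for index, key in enumerate(keys):
        groups[key].append(index)

    results: List[float] = [0.0] * len(samples)
    for key, indices in groups.items():
        # One call per group, not one call per sample.
        outputs = infer(key, [samples[i] for i in indices])
        for i, output in zip(indices, outputs):
            results[i] = output
    return results
```

Recording the original indices per group is what allows the outputs to be scattered back into request order after the grouped calls.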

0.1.4 (2023-03-16)

Added validation of the model name passed to the Triton bind method.
Added monkey patching of the InferenceServerClient.__del__ method to prevent unhandled exceptions.
Version of external components used during testing:
Triton Inference Server: 2.29.0
Other component versions depend on the framework used and on the Triton Inference Server container version. Refer to the support matrix for a detailed summary.
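
The __del__ monkey patching mentioned in this release follows a general Python pattern, sketched below. FlakyClient and patch_del are hypothetical names used only for illustration; this is not the patch PyTriton applies to InferenceServerClient.

```python
import functools


class FlakyClient:
    """Hypothetical class whose finalizer may raise (not the tritonclient class)."""

    def __init__(self):
        self.closed = False

    def close(self):
        if self.closed:
            raise RuntimeError("already closed")
        self.closed = True

    def __del__(self):
        self.close()


def patch_del(cls):
    """Replace cls.__del__ with a wrapper that swallows exceptions."""
    original_del = cls.__del__

    @functools.wraps(original_del)
    def safe_del(self):
        try:
            original_del(self)
        except Exception:
            # Finalizers run at unpredictable times, so callers cannot handle these errors.
            pass

    cls.__del__ = safe_del
    return cls


patch_del(FlakyClient)
```

Swallowing every exception in a finalizer is a deliberate trade-off: it silences noise during interpreter shutdown but can also hide real cleanup failures.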

0.1.3 (2023-02-20)

Fixed getting model config in fill_optionals decorator.
Version of external components used during testing:
Triton Inference Server: 2.29.0
Other component versions depend on the framework used and on the Triton Inference Server container version. Refer to the support matrix for a detailed summary.

0.1.2 (2023-02-14)

Fixed wheel build to support installations on operating systems with glibc version 2.31 or higher.
Updated the documentation on custom builds of the package.
Change: The TritonContext instance is shared across bound models and contains a model_configs dictionary.
Fixed support of binding multiple models that use methods of the same class.
Version of external components used during testing:
Triton Inference Server: 2.29.0
Other component versions depend on the framework used and on the Triton Inference Server container version. Refer to the support matrix for a detailed summary.

0.1.1 (2023-01-31)

Change: The @first_value decorator has been updated with new features:
Renamed from @first_values to @first_value
Added a strict flag to toggle the checking of equality of values on a single selected input of the request. Default is True
Added a squeeze_single_values flag to toggle the squeezing of single value ND arrays to scalars. Default is True
Fix: @fill_optionals now supports non-batching models
Fix: @first_value fixed to work with optional inputs
Fix: @group_by_values fixed to work with string inputs
Fix: @group_by_values fixed to work sample-wise
Version of external components used during testing:
Triton Inference Server: 2.29.0
Other component versions depend on the framework used and on the Triton Inference Server container version. Refer to the support matrix for a detailed summary.
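
The two @first_value flags introduced in this release can be illustrated with a small standalone sketch. The first_value_of helper is hypothetical and models ND arrays as nested Python lists; it only mirrors the described semantics of strict and squeeze_single_values and is not the decorator's implementation.

```python
from typing import Any, Sequence


def first_value_of(
    values: Sequence[Any],
    strict: bool = True,
    squeeze_single_values: bool = True,
) -> Any:
    """Return the first sample's value for one input of a batched request."""
    if not values:
        raise ValueError("empty batch")
    first = values[0]
    # strict: every sample in the batch must carry the same value.
    if strict and any(value != first for value in values[1:]):
        raise ValueError("values differ across the batch")
    # Squeeze a single-element list such as [5] down to the scalar 5.
    if squeeze_single_values and isinstance(first, list) and len(first) == 1:
        return first[0]
    return first
```

With strict disabled, the helper returns the first sample's value even when later samples disagree, which is why the equality check is the default.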

0.1.0 (2023-01-12)

Initial release of PyTriton
Version of external components used during testing:
Triton Inference Server: 2.29.0
Other component versions depend on the framework used and on the Triton Inference Server container version. Refer to the support matrix for a detailed summary.