# Remote Mode
Remote mode is a way to use PyTriton with a Triton Inference Server running remotely (at the moment the server must be deployed on the same machine, but it may be launched in a different container).
To bind a model in remote mode, use the `RemoteTriton` class instead of `Triton`.

The only difference when using `RemoteTriton` is that its constructor requires the Triton `url` argument.
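
As a minimal sketch of that difference (the `url` value below is only an illustrative assumption; point it at your running Triton instance):

```python
from pytriton.triton import RemoteTriton, Triton

# Local mode: PyTriton starts and manages the Triton Inference Server itself.
triton = Triton()

# Remote mode: PyTriton attaches to an already running Triton Inference Server.
remote_triton = RemoteTriton(url="localhost")
```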
## Example of binding a model in remote mode
The example below assumes that the Triton Inference Server is running on the same machine (launched with PyTriton in a separate Python script).
`RemoteTriton` binds a remote model to an existing Triton Inference Server. When `RemoteTriton` is closed, the model is unloaded from the server.
```python
import numpy as np

from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import RemoteTriton, TritonConfig

# Server-side configuration enabling a 1 MB local response cache.
# RemoteTriton attaches to an already running server, so this config is not
# passed here; it would be passed to Triton(config=...) in the separate script
# that launches the server.
triton_config = TritonConfig(
    cache_config=[f"local,size={1024 * 1024}"],  # 1MB
)


@batch
def _add_sub(**inputs):
    a_batch, b_batch = inputs.values()
    add_batch = a_batch + b_batch
    sub_batch = a_batch - b_batch
    return {"add": add_batch, "sub": sub_batch}


with RemoteTriton(url="localhost") as triton:
    triton.bind(
        model_name="AddSub",
        infer_func=_add_sub,
        inputs=[Tensor(shape=(1,), dtype=np.float32), Tensor(shape=(1,), dtype=np.float32)],
        outputs=[Tensor(shape=(1,), dtype=np.float32), Tensor(shape=(1,), dtype=np.float32)],
        config=ModelConfig(max_batch_size=8, response_cache=True),
    )
    triton.serve()
```
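
For completeness, here is a minimal sketch of the separate Python script mentioned above that launches the Triton Inference Server which `RemoteTriton` connects to. It is an assumption based on the regular PyTriton `Triton` class (and assumes the server can be started without binding any local models), not part of the original example:

```python
from pytriton.triton import Triton, TritonConfig

# Enable a 1 MB local response cache so that models bound remotely with
# ModelConfig(response_cache=True) can use it.
triton_config = TritonConfig(
    cache_config=[f"local,size={1024 * 1024}"],  # 1MB
)

with Triton(config=triton_config) as triton:
    # No models are bound here; the AddSub model is attached remotely via RemoteTriton.
    triton.serve()  # blocks and keeps the server running
```

With both scripts running, the `AddSub` model is served by the single Triton instance started here, and its responses can use the cache configured above.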