gliner.serve.client module
HTTP client for the GLiNER Ray Serve deployment.
- exception gliner.serve.client.GLiNERClientError
Bases: RuntimeError
Raised when the GLiNER server returns an error or is unreachable.
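Because the exception derives from RuntimeError, callers can catch it narrowly or through a broader `except RuntimeError`. The sketch below illustrates the narrow form. It defines a local stand-in class with the same base so that it runs without a server; real code would import GLiNERClientError from gliner.serve.client instead. The helpers `flaky_predict` and `predict_or_default` are hypothetical names introduced here for illustration.

```python
class GLiNERClientError(RuntimeError):
    """Local stand-in with the documented base class (assumption: offline demo only)."""


def flaky_predict(text):
    # Hypothetical function that simulates an unreachable server.
    raise GLiNERClientError("server unreachable")


def predict_or_default(predict_fn, text, default=None):
    """Call predict_fn(text) and return `default` if the client reports an error."""
    try:
        return predict_fn(text)
    except GLiNERClientError as exc:
        # Only client/server failures are handled here; other exceptions propagate.
        print(f"prediction failed: {exc}")
        return default


result = predict_or_default(
    flaky_predict, "John works at Google", default={"entities": []}
)
```

Catching the specific exception rather than a bare `Exception` keeps programming errors (for example a `TypeError` from a wrong argument) visible.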
- class gliner.serve.client.GLiNERClient(base_url='http://localhost:8000', route_prefix='/gliner', timeout=30.0, max_concurrency=32)
Bases: object
HTTP client for a running GLiNER Ray Serve deployment.
Example
>>> from gliner.serve import GLiNERClient
>>> client = GLiNERClient()
>>> results = client.predict(
...     "John works at Google in Mountain View", labels=["person", "organization", "location"]
... )
>>> results
{'entities': [{'start': 0, 'end': 4, 'text': 'John', 'label': 'person', ...}, ...]}
Initialize the HTTP client.
- Parameters:
base_url (str) – Scheme, host, and port of the Ray Serve HTTP proxy.
route_prefix (str) – Route prefix the deployment is mounted under (must match GLiNERServeConfig.route_prefix).
timeout (float) – Per-request timeout in seconds.
max_concurrency (int) – Maximum number of in-flight HTTP requests when predicting on a list of texts. Bounds the client-side thread pool.
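The effect of max_concurrency can be pictured with a bounded thread pool from the standard library. This is a minimal sketch of the general technique, not the client's actual implementation: `predict_many` and `fake_predict` are hypothetical names, and `fake_predict` stands in for a single HTTP request.

```python
from concurrent.futures import ThreadPoolExecutor


def predict_many(predict_one, texts, max_concurrency=32):
    """Apply predict_one to each text with at most max_concurrency calls in flight."""
    if not texts:
        return []
    # Never start more workers than there are texts to process.
    workers = max(1, min(max_concurrency, len(texts)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, regardless of completion order.
        return list(pool.map(predict_one, texts))


def fake_predict(text):
    # Hypothetical stand-in for one request to the deployment.
    return {"entities": [], "length": len(text)}


results = predict_many(fake_predict, ["a", "bb", "ccc"], max_concurrency=2)
```

A lower bound reduces pressure on the server and the local connection pool, while a higher bound helps only as long as the deployment can absorb the extra parallel requests.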
- gliner.serve.client.get_client(base_url='http://localhost:8000', route_prefix='/gliner', timeout=30.0, max_concurrency=32)
Convenience constructor for GLiNERClient.
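Whichever constructor is used, the response has the shape shown in the class example: a dict with an 'entities' list whose items carry 'start', 'end', 'text', and 'label'. The sketch below groups entity texts by label. The helper `group_by_label` is a hypothetical name, and `sample` is a hand-written response in the documented shape rather than real server output; in practice the dict would come from something like `get_client().predict(text, labels=[...])`.

```python
from collections import defaultdict


def group_by_label(response):
    """Map each label to the list of entity texts found in a response dict."""
    grouped = defaultdict(list)
    for entity in response.get("entities", []):
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the documented response shape (illustrative values).
sample = {
    "entities": [
        {"start": 0, "end": 4, "text": "John", "label": "person"},
        {"start": 14, "end": 20, "text": "Google", "label": "organization"},
    ]
}

by_label = group_by_label(sample)
```

Using `response.get("entities", [])` keeps the helper tolerant of a response without entities, which is an assumption about how callers may want to treat such a case.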