Client
DataPressClient is a small synchronous client for talking to a running
DataPress server. It depends only on the Python standard library plus
pyarrow, which is imported lazily: it is loaded only when you call query(),
which decodes the Arrow IPC response.
from datap_rs import DataPressClient

c = DataPressClient("http://127.0.0.1:8000")
c.healthz()            # -> {"status": "ok"}
c.readyz()             # -> {"status": "ready", "datasets": N}
c.datasets()           # -> ["accidents", ...]
c.schema("accidents")  # -> dict
c.count("accidents")   # -> int
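
Because readyz() reports a "ready" status, a script can wait on it before issuing queries. The helper below is a minimal sketch and not part of the client: wait_until_ready, its timeout and interval parameters, and the choice to treat any exception as "not ready yet" are all assumptions made for illustration.

```python
import time


def wait_until_ready(client, timeout=30.0, interval=0.5):
    """Poll client.readyz() until it reports "ready" or the timeout expires.

    `client` is any object with a readyz() method, such as a DataPressClient.
    Returns the final readyz() payload; raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            payload = client.readyz()
            if payload.get("status") == "ready":
                return payload
        except Exception as exc:  # assumption: a server that is not up yet raises
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"server not ready after {timeout}s") from last_error
        time.sleep(interval)
```

With a real client this would be called as wait_until_ready(c) right after constructing c.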
Querying
query() requests Arrow IPC and returns a pyarrow.Table:
table = c.query("accidents", {
    "columns": ["State", "Severity"],
    "predicates": [{"col": "State", "op": "eq", "val": "TX"}],
    "page_size": 10_000,
})
For the JSON envelope verbatim, use query_json():
payload = c.query_json("accidents", { "page_size": 50 })
# -> { "data": [...], "page": 1, "page_size": 50 }
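
The envelope carries page and page_size, which suggests results can be walked page by page. The sketch below shows one way that might look. It rests on two assumptions the text above does not confirm: that the query body accepts a 1-based "page" key, and that a page shorter than page_size is the last one. iter_rows is a hypothetical helper, not a client method.

```python
def iter_rows(client, dataset, query, page_size=50):
    """Yield rows from query_json() one page at a time.

    Assumptions (not confirmed by the docs): the query body accepts a
    1-based "page" key, and a page shorter than page_size is the last one.
    """
    page = 1
    while True:
        body = {**query, "page": page, "page_size": page_size}
        payload = client.query_json(dataset, body)
        rows = payload.get("data", [])
        yield from rows
        if len(rows) < page_size:
            break
        page += 1
```

If the server pages differently, only the body construction and the stop condition need to change; the generator shape stays the same.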
Filtered counts
Errors
Non-2xx responses raise DataPressHTTPError with three attributes:
| Attribute | Meaning |
|---|---|
| status | HTTP status code (int). |
| body | Response body as str (may be empty). |
| payload | Parsed JSON body if the server sent one, else None. |
from datap_rs import DataPressHTTPError

try:
    c.query("missing", {})
except DataPressHTTPError as e:
    print(e.status, e.payload)
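
Since payload can be None and body can be empty, error-reporting code usually needs a fallback order. The helpers below sketch one such order. describe_error and is_retryable are hypothetical names, and the retry policy (429 and 5xx are transient) is an illustrative assumption, not something the server documents. They rely only on the three attributes listed above, so they accept any object that has them.

```python
def describe_error(err):
    """Build a one-line message from an error with status, body, and payload.

    Prefers the parsed JSON payload and falls back to the raw body text.
    """
    detail = err.payload if err.payload is not None else err.body
    if not detail:
        return f"HTTP {err.status}"
    return f"HTTP {err.status}: {detail}"


def is_retryable(err):
    """Hypothetical policy: treat 429 and 5xx as transient, the rest as final."""
    return err.status == 429 or 500 <= err.status < 600
```

Inside the except block above, print(describe_error(e)) would replace the bare print call.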
With a URL prefix
c = DataPressClient("http://127.0.0.1:8000/datapress")
# Internally calls /datapress/api/v1/datasets, /datapress/healthz, ...
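
To illustrate why the prefix survives, here is a minimal sketch of prefix-preserving URL construction. build_url is a hypothetical helper and not necessarily how the client is implemented; the paths are the ones named in the comment above. It also shows why urllib.parse.urljoin is the wrong tool here.

```python
from urllib.parse import urljoin


def build_url(base_url, path):
    """Append a path to base_url while keeping any prefix in base_url."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base = "http://127.0.0.1:8000/datapress"
print(build_url(base, "/api/v1/datasets"))
# -> http://127.0.0.1:8000/datapress/api/v1/datasets
print(build_url(base, "/healthz"))
# -> http://127.0.0.1:8000/datapress/healthz

# urljoin treats a leading "/" as "replace the whole path", dropping the prefix.
print(urljoin(base, "/healthz"))
# -> http://127.0.0.1:8000/healthz
```

Plain string concatenation also tolerates a trailing slash on the base URL, so "http://127.0.0.1:8000/datapress/" yields the same results.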
Admin endpoints
reload() requires the server's ADMIN_TOKEN: