Client

DataPressClient is a small synchronous client for talking to a running DataPress server. It depends only on the Python standard library; pyarrow is imported lazily, only when you call query() to fetch Arrow IPC.

from datap_rs import DataPressClient

c = DataPressClient("http://127.0.0.1:8000")

c.healthz()                                  # -> {"status": "ok"}
c.readyz()                                   # -> {"status": "ready", "datasets": N}
c.datasets()                                 # -> ["accidents", ...]
c.schema("accidents")                        # -> dict
c.count("accidents")                         # -> int
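
A startup script often needs to wait for the server before issuing queries. The following is a minimal sketch of such a wait loop built on readyz(). The helper name wait_until_ready and its timeout and interval parameters are hypothetical, not part of the client. It also assumes that a server which is not yet listening surfaces as an OSError; a non-2xx readiness response would instead raise DataPressHTTPError (see Errors), which this sketch lets propagate.

```python
import time


def wait_until_ready(client, timeout=30.0, interval=0.5):
    """Poll client.readyz() until it reports "ready" or the timeout expires.

    Returns the readyz() payload on success; raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            info = client.readyz()
            if info.get("status") == "ready":
                return info
        except OSError:
            # Assumption: a server that is not listening yet shows up as an
            # OSError (e.g. connection refused). Adjust to what you observe.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"server not ready after {timeout} s")
        time.sleep(interval)
```

Call it as wait_until_ready(c) right after constructing the client, then proceed with datasets() or query().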

Querying

query() requests Arrow IPC and returns a pyarrow.Table:

table = c.query("accidents", {
    "columns":   ["State", "Severity"],
    "predicates": [{ "col": "State", "op": "eq", "val": "TX" }],
    "page_size": 10_000,
})

To get the server's JSON envelope verbatim instead, use query_json():

payload = c.query_json("accidents", { "page_size": 50 })
# -> { "data": [...], "page": 1, "page_size": 50 }
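
Because the envelope carries page and page_size, a caller can walk a result set page by page. The sketch below shows one way to do that. It assumes the request body accepts a "page" key that mirrors the response field, which this page does not confirm, and the helper name iter_rows is hypothetical. It stops at the first page that comes back shorter than page_size.

```python
def iter_rows(client, dataset, body=None, page_size=1000):
    """Yield rows from successive query_json() pages."""
    base = dict(body or {})
    base["page_size"] = page_size
    page = 1
    while True:
        # Assumption: the request body accepts a "page" key, mirroring the
        # "page" field of the response envelope.
        payload = client.query_json(dataset, dict(base, page=page))
        rows = payload.get("data", [])
        yield from rows
        if len(rows) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            return
        page += 1
```

For example, `for row in iter_rows(c, "accidents", {"columns": ["State"]}, page_size=50): ...` processes rows without holding every page in memory at once.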

Filtered counts

n = c.count("accidents", {
    "predicates": [{ "col": "State", "op": "in", "val": ["CA","TX"] }],
})
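
Predicate dictionaries are easy to mistype by hand. As an illustration, the small builders below produce the same shape used in the examples above. The function names eq, is_in, and build_body are hypothetical helpers, not part of datap_rs, and they cover only the two operators shown on this page ("eq" and "in").

```python
def eq(col, val):
    """Predicate: column equals a single value."""
    return {"col": col, "op": "eq", "val": val}


def is_in(col, vals):
    """Predicate: column matches any of the given values."""
    return {"col": col, "op": "in", "val": list(vals)}


def build_body(*predicates, **extra):
    """Assemble a request body from predicates plus other top-level keys."""
    body = dict(extra)
    if predicates:
        body["predicates"] = list(predicates)
    return body
```

With these, the filtered count above could be written as c.count("accidents", build_body(is_in("State", ["CA", "TX"]))).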

Errors

Non-2xx responses raise DataPressHTTPError with three attributes:

Attribute   Meaning
status      HTTP status code (int).
body        Response body as str (may be empty).
payload     Parsed JSON body if the server sent one, else None.

from datap_rs import DataPressHTTPError

try:
    c.query("missing", {})
except DataPressHTTPError as e:
    print(e.status, e.payload)
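
Since payload may be None and body may be empty, logging code needs a fallback order. The sketch below formats any exception that exposes the three attributes listed above, preferring the parsed payload, then the raw body. The helper name describe_error is hypothetical, and nothing is assumed here about the keys inside the payload.

```python
import json


def describe_error(exc):
    """Return a one-line summary of a DataPressHTTPError-like exception."""
    if exc.payload is not None:
        # Parsed JSON is the most informative; dump it with stable key order.
        detail = json.dumps(exc.payload, sort_keys=True)
    elif exc.body:
        detail = exc.body.strip()
    else:
        detail = "<empty body>"
    return f"HTTP {exc.status}: {detail}"
```

Inside an `except DataPressHTTPError as e:` block, print(describe_error(e)) then gives a uniform message regardless of what the server sent.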

With a URL prefix

c = DataPressClient("http://127.0.0.1:8000/datapress")
# Internally calls /datapress/api/v1/datasets, /datapress/healthz, ...
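
The prefix has to be kept when endpoint paths are appended, which is not what the standard library's urljoin does with an absolute path. The snippet below illustrates the difference. The join_url function is a hypothetical illustration of prefix-preserving joining, not the client's actual implementation, and whether the client tolerates a trailing slash on the base URL is not stated here.

```python
from urllib.parse import urljoin


def join_url(base_url, path):
    """Append path to base_url, keeping any path prefix on the base."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base = "http://127.0.0.1:8000/datapress"

# Prefix is preserved by plain concatenation.
prefixed = join_url(base, "/api/v1/datasets")

# urljoin treats "/healthz" as absolute and discards the "/datapress" prefix.
unprefixed = urljoin(base, "/healthz")
```

Here prefixed keeps the /datapress segment, while unprefixed does not, which is why naive urljoin-based code breaks behind a reverse-proxy prefix.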

Admin endpoints

reload() requires the server's ADMIN_TOKEN:

c.reload("accidents", admin_token="...")  # -> {"dataset":..., "rows":..., "elapsed_ms":...}
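
To keep the token out of source code, it can be read from the environment at call time. The following is a minimal sketch under the assumption that the token is exported on the client side under the name ADMIN_TOKEN; the variable name on the client machine and the helper name reload_from_env are hypothetical.

```python
import os


def reload_from_env(client, dataset, env=None, var="ADMIN_TOKEN"):
    """Call client.reload() with a token read from an environment mapping."""
    env = os.environ if env is None else env
    # Assumption: the token is available client-side under this variable name.
    token = env.get(var)
    if not token:
        raise RuntimeError(f"{var} is not set; refusing to call reload()")
    return client.reload(dataset, admin_token=token)
```

Failing early when the variable is missing avoids sending an unauthenticated request that the server would reject anyway.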