Skip to content

Distinct rows

distinct: true deduplicates the projected rows. With columns set you get distinct values over that subset; without columns it acts as SELECT DISTINCT *.

{
  "columns":  ["state"],
  "distinct": true,
  "order_by": [{ "col": "state" }],
  "page_size": 100
}

Combine with predicates / limit / pagination as usual:

{
  "columns":   ["city", "state"],
  "predicates": [{ "col": "severity", "op": "gte", "val": 3 }],
  "distinct":  true,
  "order_by":  [{ "col": "state" }, { "col": "city" }],
  "limit":     5000,
  "page":      1,
  "page_size": 100
}

Rules

  • distinct is mutually exclusive with group_by / aggregations — combining them returns 400. Use group_by (with no aggregations) when you also want counts per distinct combination.
  • distinct bypasses the in-memory fast paths and always goes through the SQL engine.
  • On DuckDB the dedup happens on the raw column values before each row is formatted as JSON, not on the JSON string itself.