DocsGuidesSQL Queries

SQL Queries

TopGun supports SQL queries via Apache DataFusion, a high-performance in-process query engine. You can run SELECT, WHERE, GROUP BY, ORDER BY, LIMIT, and aggregate functions directly against your map data — without leaving the TopGun ecosystem. SQL queries execute on the server against the server’s in-memory data; there is no client-side SQL execution.

This feature requires the datafusion Cargo feature flag on the server binary. Without it, only predicate-based live queries are available.

Full SQL Syntax

SELECT, WHERE, GROUP BY, ORDER BY, LIMIT, DISTINCT, and aggregate functions on any registered map.

Standard Data Types

MsgPack integers, floats, strings, booleans, and binaries are automatically mapped to Arrow / SQL types.

Aggregations

COUNT, SUM, AVG, MIN, MAX — group and summarise data across thousands of records in a single query.

Enabling DataFusion

The SQL query engine is gated behind an optional Cargo feature flag so that deployments that do not need SQL pay no binary-size or startup-time cost.

Build command

# Build the server with DataFusion SQL support
cargo build -p topgun-server --features datafusion

Without --features datafusion, sending a SQL_QUERY message returns a SQL_QUERY_RESP with error: "SqlNotEnabled". Maps must also have a registered schema via SchemaProvider before they can be queried via SQL — unregistered maps return error: "SchemaRequired".

Wire Protocol

SQL queries use a simple request/response message pair over the existing WebSocket connection.

`SqlQueryPayload` (client → server)

Field	Type	Description
`type`	`"SQL_QUERY"`	Message discriminant
`sql`	`string`	The SQL query string
`queryId`	`string`	Client-generated correlation ID

`SqlQueryRespPayload` (server → client)

Field	Type	Description
`type`	`"SQL_QUERY_RESP"`	Message discriminant
`queryId`	`string`	Echoed correlation ID from the request
`columns`	`string[]`	Column names in result order
`rows`	`any[][]`	Array of rows; each row is an array of MsgPack values
`error`	`string \| null`	Error message, or `null` on success

Message pair (conceptual JSON)

// SqlQueryPayload — sent by the client
{
  "type": "SQL_QUERY",
  "sql": "SELECT name, price FROM products WHERE price > 50 ORDER BY price DESC",
  "queryId": "q-abc123"
}

// SqlQueryRespPayload — returned by the server
{
  "type": "SQL_QUERY_RESP",
  "queryId": "q-abc123",
  "columns": ["name", "price"],
  "rows": [
    ["Widget Pro", 99.99],
    ["Widget Basic", 59.99]
  ],
  "error": null
}

Sending a Query from a Client

WebSocket client example

import { WebSocket } from 'ws';
import { encode, decode } from '@msgpack/msgpack';

const ws = new WebSocket('ws://localhost:8080');

ws.on('open', () => {
  const payload = {
    type: 'SQL_QUERY',
    sql: 'SELECT name, price FROM products WHERE price > 50 ORDER BY price DESC',
    queryId: 'q-abc123',
  };
  ws.send(encode(payload));
});

ws.on('message', (data) => {
  const resp = decode(data);
  if (resp.type === 'SQL_QUERY_RESP' && resp.queryId === 'q-abc123') {
    if (resp.error) {
      console.error('SQL error:', resp.error);
    } else {
      // resp.columns: ['name', 'price']
      // resp.rows: [['Widget Pro', 99.99], ['Widget Basic', 59.99]]
      const results = resp.rows.map((row) =>
        Object.fromEntries(resp.columns.map((col, i) => [col, row[i]]))
      );
      console.log('Query results:', results);
    }
  }
});

There is no client.sql() wrapper method in the TypeScript SDK yet. Send SQL_QUERY directly over the WebSocket using MsgPack serialisation until a higher-level API is added.

Supported SQL Syntax

DataFusion supports a broad SQL dialect. The subset relevant to TopGun’s single-map, single-node usage is listed below. For the full grammar see datafusion.apache.org.

Clauses

Clause	Supported forms
`SELECT`	Column list, `*`, aliases (`AS`), arithmetic expressions
`WHERE`	`=`, `!=`, `<`, `>`, `<=`, `>=`, `AND`, `OR`, `NOT`, `IS NULL`, `IS NOT NULL`, `LIKE`, `IN`
`GROUP BY`	One or more columns
`HAVING`	Aggregate filter after GROUP BY
`ORDER BY`	`ASC` / `DESC` on any column or alias
`LIMIT` / `OFFSET`	Pagination
`DISTINCT`	Deduplicate rows

Aggregate Functions

Function	Description
`COUNT(*)`	Row count
`SUM(col)`	Sum of numeric values
`AVG(col)`	Average of numeric values
`MIN(col)`	Minimum value
`MAX(col)`	Maximum value

Syntax Examples

SQL syntax examples

-- Basic column selection
SELECT name, price, category FROM products

-- Wildcard: includes the synthetic _key column plus all schema fields
SELECT * FROM products

-- Filtering with comparison operators
SELECT name, price FROM products WHERE price > 100

-- Multiple conditions
SELECT name, price FROM products
WHERE price >= 20 AND price <= 200 AND inStock = true

-- NULL checks
SELECT name FROM products WHERE description IS NOT NULL

-- Pattern matching
SELECT name FROM products WHERE name LIKE '%wireless%'

-- IN list
SELECT name, category FROM products WHERE category IN ('electronics', 'accessories')

-- Sorting
SELECT name, price FROM products ORDER BY price ASC

-- Limit and offset
SELECT name, price FROM products ORDER BY price DESC LIMIT 10 OFFSET 20

-- DISTINCT values
SELECT DISTINCT category FROM products

-- Column aliases and arithmetic
SELECT name, price, price * 1.1 AS price_with_tax FROM products

-- Aggregation
SELECT category, COUNT(*) AS total, AVG(price) AS avg_price
FROM products
GROUP BY category
ORDER BY total DESC

Supported Data Types

When a map is scanned for SQL execution, each MsgPack value is converted to an Arrow array. The mapping is:

MsgPack type	Arrow / SQL type	Notes
`Integer`	`BIGINT` (Int64)	All integer widths
`Float` (f64)	`DOUBLE` (Float64)
`String`	`VARCHAR` (Utf8)
`Boolean`	`BOOLEAN`
`Binary`	`BYTEA` (Binary)	Raw bytes
`Integer` (millis)	`TIMESTAMP` (TimestampMillisecond)	When field is declared as timestamp in schema
`Array`	`LIST`	Element type inferred from schema
`Nil`	`NULL`	Represented as null in the Arrow array
Complex / Map	`VARCHAR` (JSON string)	Nested objects fall back to JSON text

Every table includes a synthetic _key column (VARCHAR) representing the map entry key. Use it to correlate SQL results back to TopGun map keys for subsequent writes or live query subscriptions.

Example Queries

Basic Filtering and Sorting

Given a products map with fields name (string), price (number), category (string), inStock (boolean):

Filtering and sorting

-- Given a 'products' map with fields: name (string), price (number),
-- category (string), inStock (boolean)

-- Find all in-stock electronics under $150, newest first
SELECT _key, name, price
FROM products
WHERE category = 'electronics'
  AND inStock = true
  AND price < 150
ORDER BY name ASC

Aggregation with GROUP BY

Given an orders map with fields customerId (string), total (number), status (string), createdAt (number):

Aggregation

-- Given a 'orders' map with fields: customerId (string), total (number),
-- status (string), createdAt (number / timestamp millis)

-- Revenue summary per customer, only completed orders
SELECT customerId,
       COUNT(*)   AS order_count,
       SUM(total) AS total_revenue,
       AVG(total) AS avg_order_value,
       MAX(total) AS largest_order
FROM orders
WHERE status = 'completed'
GROUP BY customerId
ORDER BY total_revenue DESC
LIMIT 20

Combined Filtering, Aggregation, and Ordering

Given a users map with fields country (string), age (number), plan (string), active (boolean):

Combined query

-- Given a 'users' map with fields: country (string), age (number),
-- plan (string), active (boolean)

-- Active users by country, filtered to paying plans, sorted by count
SELECT country,
       plan,
       COUNT(*)   AS user_count,
       AVG(age)   AS avg_age
FROM users
WHERE active = true
  AND plan IN ('pro', 'enterprise')
GROUP BY country, plan
HAVING COUNT(*) > 5
ORDER BY user_count DESC

Limitations

Single-Node Only

SQL queries execute against the local node's data. Distributed SQL across cluster nodes is not yet supported.

No Cross-Map JOINs

Each SQL query targets a single map (table). JOINs between maps are planned for a future release.

The full list of current limitations:

Single-node only — SQL queries execute against the local node’s data. Distributed SQL across cluster nodes is not yet supported.
No cross-map JOINs — each SQL query targets a single map. JOINs between different maps are planned for a future release.
No live SQL subscriptions — SQL queries are one-shot request/response. For real-time updates, use predicate-based live queries.
Schema required — maps without a registered schema cannot be queried via SQL. The server returns error: "SchemaRequired" for unregistered maps.
No filter pushdown — DataFusion applies WHERE filters after the full map scan. For simple equality filters on large maps, predicate-based live queries are faster because they skip non-matching records during the scan.
Arrow cache rebuild — query results are cached per partition and invalidated on mutation. The first query after a write incurs an Arrow cache rebuild from the current CRDT state, which adds latency proportional to partition size.

Choosing between SQL and live queries: Use SQL for one-shot analytics, aggregations, and ad-hoc exploration. Use live queries for real-time subscriptions, simple equality filters, and low-latency reads on large maps.

Live Queries

Schema & Type Safety