DocsGuidesSQL Queries

SQL Queries

TopGun supports SQL queries via Apache DataFusion, a high-performance in-process query engine. You can run SELECT, WHERE, GROUP BY, ORDER BY, LIMIT, and aggregate functions directly against your map data — without leaving the TopGun ecosystem. SQL queries execute on the server against the server’s in-memory data; there is no client-side SQL execution.

This feature requires the datafusion Cargo feature flag on the server binary. Without it, only predicate-based live queries are available.

Full SQL Syntax

SELECT, WHERE, GROUP BY, ORDER BY, LIMIT, DISTINCT, and aggregate functions on any registered map.

Standard Data Types

MsgPack integers, floats, strings, booleans, and binaries are automatically mapped to Arrow / SQL types.

Aggregations

COUNT, SUM, AVG, MIN, MAX — group and summarise data across thousands of records in a single query.

Enabling DataFusion

The SQL query engine is gated behind an optional Cargo feature flag so that deployments that do not need SQL pay no binary-size or startup-time cost.

Build command
# Build the server with DataFusion SQL support
cargo build -p topgun-server --features datafusion

Without --features datafusion, sending a SQL_QUERY message returns a SQL_QUERY_RESP with error: "SqlNotEnabled". Maps must also have a registered schema via SchemaProvider before they can be queried via SQL — unregistered maps return error: "SchemaRequired".

Wire Protocol

SQL queries use a simple request/response message pair over the existing WebSocket connection.

SqlQueryPayload (client → server)

FieldTypeDescription
type"SQL_QUERY"Message discriminant
sqlstringThe SQL query string
queryIdstringClient-generated correlation ID

SqlQueryRespPayload (server → client)

FieldTypeDescription
type"SQL_QUERY_RESP"Message discriminant
queryIdstringEchoed correlation ID from the request
columnsstring[]Column names in result order
rowsany[][]Array of rows; each row is an array of MsgPack values
errorstring | nullError message, or null on success
Message pair (conceptual JSON)
// SqlQueryPayload — sent by the client
{
  "type": "SQL_QUERY",
  "sql": "SELECT name, price FROM products WHERE price > 50 ORDER BY price DESC",
  "queryId": "q-abc123"
}

// SqlQueryRespPayload — returned by the server
{
  "type": "SQL_QUERY_RESP",
  "queryId": "q-abc123",
  "columns": ["name", "price"],
  "rows": [
    ["Widget Pro", 99.99],
    ["Widget Basic", 59.99]
  ],
  "error": null
}

Sending a Query from a Client

WebSocket client example
import { WebSocket } from 'ws';
import { encode, decode } from '@msgpack/msgpack';

const ws = new WebSocket('ws://localhost:8080');

ws.on('open', () => {
  const payload = {
    type: 'SQL_QUERY',
    sql: 'SELECT name, price FROM products WHERE price > 50 ORDER BY price DESC',
    queryId: 'q-abc123',
  };
  ws.send(encode(payload));
});

ws.on('message', (data) => {
  const resp = decode(data);
  if (resp.type === 'SQL_QUERY_RESP' && resp.queryId === 'q-abc123') {
    if (resp.error) {
      console.error('SQL error:', resp.error);
    } else {
      // resp.columns: ['name', 'price']
      // resp.rows: [['Widget Pro', 99.99], ['Widget Basic', 59.99]]
      const results = resp.rows.map((row) =>
        Object.fromEntries(resp.columns.map((col, i) => [col, row[i]]))
      );
      console.log('Query results:', results);
    }
  }
});

There is no client.sql() wrapper method in the TypeScript SDK yet. Send SQL_QUERY directly over the WebSocket using MsgPack serialisation until a higher-level API is added.

Supported SQL Syntax

DataFusion supports a broad SQL dialect. The subset relevant to TopGun’s single-map, single-node usage is listed below. For the full grammar see datafusion.apache.org.

Clauses

ClauseSupported forms
SELECTColumn list, *, aliases (AS), arithmetic expressions
WHERE=, !=, <, >, <=, >=, AND, OR, NOT, IS NULL, IS NOT NULL, LIKE, IN
GROUP BYOne or more columns
HAVINGAggregate filter after GROUP BY
ORDER BYASC / DESC on any column or alias
LIMIT / OFFSETPagination
DISTINCTDeduplicate rows

Aggregate Functions

FunctionDescription
COUNT(*)Row count
SUM(col)Sum of numeric values
AVG(col)Average of numeric values
MIN(col)Minimum value
MAX(col)Maximum value

Syntax Examples

SQL syntax examples
-- Basic column selection
SELECT name, price, category FROM products

-- Wildcard: includes the synthetic _key column plus all schema fields
SELECT * FROM products

-- Filtering with comparison operators
SELECT name, price FROM products WHERE price > 100

-- Multiple conditions
SELECT name, price FROM products
WHERE price >= 20 AND price <= 200 AND inStock = true

-- NULL checks
SELECT name FROM products WHERE description IS NOT NULL

-- Pattern matching
SELECT name FROM products WHERE name LIKE '%wireless%'

-- IN list
SELECT name, category FROM products WHERE category IN ('electronics', 'accessories')

-- Sorting
SELECT name, price FROM products ORDER BY price ASC

-- Limit and offset
SELECT name, price FROM products ORDER BY price DESC LIMIT 10 OFFSET 20

-- DISTINCT values
SELECT DISTINCT category FROM products

-- Column aliases and arithmetic
SELECT name, price, price * 1.1 AS price_with_tax FROM products

-- Aggregation
SELECT category, COUNT(*) AS total, AVG(price) AS avg_price
FROM products
GROUP BY category
ORDER BY total DESC

Supported Data Types

When a map is scanned for SQL execution, each MsgPack value is converted to an Arrow array. The mapping is:

MsgPack typeArrow / SQL typeNotes
IntegerBIGINT (Int64)All integer widths
Float (f64)DOUBLE (Float64)
StringVARCHAR (Utf8)
BooleanBOOLEAN
BinaryBYTEA (Binary)Raw bytes
Integer (millis)TIMESTAMP (TimestampMillisecond)When field is declared as timestamp in schema
ArrayLISTElement type inferred from schema
NilNULLRepresented as null in the Arrow array
Complex / MapVARCHAR (JSON string)Nested objects fall back to JSON text

Every table includes a synthetic _key column (VARCHAR) representing the map entry key. Use it to correlate SQL results back to TopGun map keys for subsequent writes or live query subscriptions.

Example Queries

Basic Filtering and Sorting

Given a products map with fields name (string), price (number), category (string), inStock (boolean):

Filtering and sorting
-- Given a 'products' map with fields: name (string), price (number),
-- category (string), inStock (boolean)

-- Find all in-stock electronics under $150, newest first
SELECT _key, name, price
FROM products
WHERE category = 'electronics'
  AND inStock = true
  AND price < 150
ORDER BY name ASC

Aggregation with GROUP BY

Given an orders map with fields customerId (string), total (number), status (string), createdAt (number):

Aggregation
-- Given a 'orders' map with fields: customerId (string), total (number),
-- status (string), createdAt (number / timestamp millis)

-- Revenue summary per customer, only completed orders
SELECT customerId,
       COUNT(*)   AS order_count,
       SUM(total) AS total_revenue,
       AVG(total) AS avg_order_value,
       MAX(total) AS largest_order
FROM orders
WHERE status = 'completed'
GROUP BY customerId
ORDER BY total_revenue DESC
LIMIT 20

Combined Filtering, Aggregation, and Ordering

Given a users map with fields country (string), age (number), plan (string), active (boolean):

Combined query
-- Given a 'users' map with fields: country (string), age (number),
-- plan (string), active (boolean)

-- Active users by country, filtered to paying plans, sorted by count
SELECT country,
       plan,
       COUNT(*)   AS user_count,
       AVG(age)   AS avg_age
FROM users
WHERE active = true
  AND plan IN ('pro', 'enterprise')
GROUP BY country, plan
HAVING COUNT(*) > 5
ORDER BY user_count DESC

Limitations

Single-Node Only

SQL queries execute against the local node's data. Distributed SQL across cluster nodes is not yet supported.

No Cross-Map JOINs

Each SQL query targets a single map (table). JOINs between maps are planned for a future release.

The full list of current limitations:

  • Single-node only — SQL queries execute against the local node’s data. Distributed SQL across cluster nodes is not yet supported.
  • No cross-map JOINs — each SQL query targets a single map. JOINs between different maps are planned for a future release.
  • No live SQL subscriptions — SQL queries are one-shot request/response. For real-time updates, use predicate-based live queries.
  • Schema required — maps without a registered schema cannot be queried via SQL. The server returns error: "SchemaRequired" for unregistered maps.
  • No filter pushdown — DataFusion applies WHERE filters after the full map scan. For simple equality filters on large maps, predicate-based live queries are faster because they skip non-matching records during the scan.
  • Arrow cache rebuild — query results are cached per partition and invalidated on mutation. The first query after a write incurs an Arrow cache rebuild from the current CRDT state, which adds latency proportional to partition size.

Choosing between SQL and live queries: Use SQL for one-shot analytics, aggregations, and ad-hoc exploration. Use live queries for real-time subscriptions, simple equality filters, and low-latency reads on large maps.