# Schema System

Define reusable, validated data structures for type-safe encoding.
## What You'll Learn
- How to define schemas with typed fields
- Schema inheritance for extending base schemas
- Using schemas for encode-time validation
## Why Schemas?
Every call to an LLM API costs money. Sending malformed data wastes tokens on errors and retries. AXON schemas let you validate data before it reaches the API, catching type mismatches, missing fields, and invalid values at the encoding step rather than in a costly round-trip.
**Key benefit:** Schemas are also embedded in the AXON output as type annotations. This means the receiving LLM understands the data types without needing additional instructions, improving accuracy and reducing prompt engineering overhead.
## Defining a Schema
A schema is a plain object with a `name` and a `fields` array. Each field specifies a `name` and one of AXON's 13 validated types:
```ts
import { encode, decode, registerSchema, type Schema } from '@axon-format/core';

const userSchema: Schema = {
  name: 'User',
  fields: [
    { name: 'id', type: 'u8' },
    { name: 'name', type: 'str' },
    { name: 'email', type: 'str' },
    { name: 'verified', type: 'bool' },
    { name: 'createdAt', type: 'iso8601' }
  ]
};
```

## Encoding with a Schema
Pass your schema to `encode()` via the `schemas` option. AXON validates every row against the schema and produces compact, typed output:
```ts
const data = {
  users: [
    { id: 1, name: "Alice", email: "alice@co.com", verified: true, createdAt: "2025-01-15T09:00:00Z" },
    { id: 2, name: "Bob", email: "bob@co.com", verified: false, createdAt: "2025-02-20T14:30:00Z" }
  ]
};

// Encode with schema validation
const encoded = encode(data, { schemas: [userSchema] });
console.log(encoded);
```

The AXON output includes type annotations in the header, so the LLM knows each column's type:
```
// AXON output
users::[2] createdAt:iso8601|email:str|id:u8|name:str|verified:bool
2025-01-15T09:00:00Z|alice@co.com|1|Alice|true
2025-02-20T14:30:00Z|bob@co.com|2|Bob|false
```

## Schema Registration
For schemas you use frequently, register them globally with `registerSchema()`. Registered schemas can be referenced by name and are available throughout your application:
```ts
// Register at application startup
registerSchema(userSchema);

// Later, use by name anywhere in your codebase
const encoded = encode(data, { schemas: ['User'] });

// Decode with the same schema for type-safe output
const decoded = decode(encoded, { schemas: ['User'] });
```

## Schema Inheritance
Build specialized schemas by extending base schemas with the spread operator. This keeps your type definitions DRY while supporting different entity variations:
```ts
// Base schema
const userSchema: Schema = {
  name: 'User',
  fields: [
    { name: 'id', type: 'u8' },
    { name: 'name', type: 'str' },
    { name: 'email', type: 'str' }
  ]
};

// Extended schema adds admin-specific fields
const adminSchema: Schema = {
  name: 'Admin',
  fields: [
    ...userSchema.fields,
    { name: 'permissions', type: 'str[]' },
    { name: 'level', type: 'u8' }
  ]
};
```

## Validation Errors
When data does not match the schema, AXON throws descriptive errors at encode time. This catches problems before you spend tokens on a failed API call:
```ts
const badData = {
  users: [
    { id: "not-a-number", name: "Alice", email: "alice@co.com", verified: true }
  ]
};

try {
  encode(badData, { schemas: [userSchema] });
} catch (err) {
  console.error(err.message);
  // Schema validation failed: field "id" expected type u8, got string "not-a-number"
}
```

Common validation errors include type mismatches (a `u8` field receiving a string), missing required fields, values out of numeric range (e.g., negative values for unsigned types), and malformed ISO 8601 timestamps. All errors include the field name and the expected type for easy debugging.
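The checks described above can be sketched in plain TypeScript. This is an illustrative reimplementation of the ideas for a few of the types, not AXON's actual validator; the names `FieldDef`, `validateField`, and `validateRow` are invented for this example.

```ts
// Illustrative sketch of encode-time validation; not AXON's actual validator.
type FieldDef = { name: string; type: 'u8' | 'str' | 'bool' | 'iso8601' };

// Returns an error message for one field, or null if the value is valid.
function validateField(field: FieldDef, value: unknown): string | null {
  switch (field.type) {
    case 'u8': // unsigned 8-bit integer: 0..255
      if (typeof value !== 'number' || !Number.isInteger(value) || value < 0 || value > 255) {
        return `field "${field.name}" expected type u8, got ${JSON.stringify(value)}`;
      }
      return null;
    case 'str':
      return typeof value === 'string' ? null : `field "${field.name}" expected type str`;
    case 'bool':
      return typeof value === 'boolean' ? null : `field "${field.name}" expected type bool`;
    case 'iso8601':
      return typeof value === 'string' && !Number.isNaN(Date.parse(value))
        ? null
        : `field "${field.name}" expected type iso8601`;
  }
}

// A row fails if any declared field is missing or has the wrong type.
function validateRow(fields: FieldDef[], row: Record<string, unknown>): string[] {
  return fields.flatMap((f) => {
    if (!(f.name in row)) return [`missing required field "${f.name}"`];
    const err = validateField(f, row[f.name]);
    return err ? [err] : [];
  });
}
```

For example, `validateRow([{ name: 'id', type: 'u8' }], { id: 'not-a-number' })` yields a single error naming the field and its expected type, mirroring the message shape shown above.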
## Benefits
- Type safety: Catch errors before expensive API calls
- Reusability: Define once, use across your application
- Documentation: Schemas serve as inline documentation for your data shapes
- Validation: Automatic data validation on both encode and decode
- LLM context: Type annotations in the output help LLMs understand your data without extra prompting
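To make the last point concrete, the toy sketch below (not the library's encoder) reproduces the header-plus-rows shape shown earlier. It assumes columns are sorted alphabetically, as in the example output; the name `emitTable` is invented for this illustration.

```ts
// Toy sketch of AXON-style typed tabular output; not the library's encoder.
type Field = { name: string; type: string };

function emitTable(key: string, fields: Field[], rows: Record<string, unknown>[]): string {
  // Sort columns alphabetically, matching the example output (an assumption).
  const cols = [...fields].sort((a, b) => a.name.localeCompare(b.name));
  // Header carries the row count and a typed column list, e.g. "id:u8|name:str".
  const header = `${key}::[${rows.length}] ` + cols.map((f) => `${f.name}:${f.type}`).join('|');
  // Each row becomes one pipe-separated line in column order.
  const body = rows.map((r) => cols.map((f) => String(r[f.name])).join('|'));
  return [header, ...body].join('\n');
}
```

Because every column's type travels in the header, the receiving LLM can interpret each pipe-separated value without any schema being restated in the prompt.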
## Next: Query Hints

Add metadata to help LLMs understand and query your data.