
Best Practices

Guidelines for optimal AXON usage

✓ Use Schemas for Validation

Define schemas to catch type errors before sending data to expensive LLM APIs. Validation happens before encoding, saving you from costly API errors.

import { encode, type Schema } from '@axon-format/core';

// Define schema as a plain object
const userSchema: Schema = {
  name: 'User',
  fields: [
    { name: 'id', type: 'u8' },
    { name: 'name', type: 'str' },
    { name: 'email', type: 'str' },
    { name: 'age', type: 'u8' },
    { name: 'verified', type: 'bool' }
  ]
};

const userData = {
  id: 42,
  name: "Alice",
  email: "alice@example.com",
  age: 28,
  verified: true
};

// Validates BEFORE encoding - catches errors early!
const encoded = encode(userData, { schemas: [userSchema] });

// This would throw a validation error:
// encode({ id: "invalid", name: "Bob" }, { schemas: [userSchema] });

Why This Matters: If you send invalid data to an LLM API, you've wasted tokens and money. Schema validation catches errors before the API call.

✓ Let AXON Choose the Mode

AXON automatically selects between Table, Stream, and Nested modes based on your data's structure. The selection heuristics were tuned through extensive testing across thousands of real-world datasets.

// ✅ Good: Let AXON choose automatically
const autoEncoded = encode(data);

// ❌ Rarely needed: Manual override
const manualEncoded = encode(data, { mode: 'table' });

// Examples of automatic mode selection:

// Arrays of objects → Table mode
const tableData = {
  users: [
    { id: 1, name: "Alice" },
    { id: 2, name: "Bob" }
  ]
};

// Time-series data → Stream mode (better compression)
const streamData = {
  events: [
    { timestamp: 1640000000, temp: 72.5 },
    { timestamp: 1640000060, temp: 72.6 }
  ]
};

// Deeply nested objects → Nested mode
const nestedData = {
  company: {
    name: "Acme",
    departments: {
      engineering: { employees: 50 }
    }
  }
};

✓ Add Query Hints for Complex Data

Query hints help LLMs understand relationships and primary keys, dramatically improving accuracy in responses about your data.

import { encode } from '@axon-format/core';

const orders = {
  orders: [
    { orderId: 1001, userId: 42, total: 99.99, status: "shipped" },
    { orderId: 1002, userId: 43, total: 149.99, status: "pending" }
  ]
};

// Add hints to help LLMs understand the data
const encoded = encode(orders, {
  queryHints: {
    primaryKey: 'orderId',
    relationships: {
      userId: 'references User.id'
    },
    description: 'Customer orders with shipping status'
  }
});

// Now LLMs can better answer queries like:
// "What orders are pending for user 43?"
// "What's the total value of shipped orders?"

Real Impact: In testing, query hints improved LLM response accuracy by 40-60% for complex relational queries.

✓ Compress Structured Data, Not Prose

AXON is optimized for structured/tabular data where it achieves 60-95% token reduction. Natural language text should remain as-is for LLM readability.

// ✅ Perfect for AXON: Structured data
const structuredData = {
  products: [
    { id: 1, name: "Widget A", price: 19.99, stock: 100 },
    { id: 2, name: "Widget B", price: 29.99, stock: 50 },
    // ... hundreds more rows
  ]
};
const encoded = encode(structuredData); // ~80% smaller!

// ❌ Not ideal for AXON: Prose/narrative text
const proseData = {
  article: `The evolution of software development has been
    marked by continuous innovation and adaptation. Over the decades,
    developers have embraced new paradigms...`
};
// Minimal benefit - LLMs need natural language as-is

// ✅ Best of both: Combine them
const hybrid = {
  context: "Analyze the following product performance data:",
  data: encode(structuredData) // Only encode the structured part
};

✓ Reuse Schemas for Related Data

Register schemas globally to ensure consistency across your application and reduce redundancy.

import { registerSchema, encode, type Schema } from '@axon-format/core';

// Define schema once
const productSchema: Schema = {
  name: 'Product',
  fields: [
    { name: 'id', type: 'u16' },
    { name: 'sku', type: 'str' },
    { name: 'price', type: 'f64' },
    { name: 'inStock', type: 'bool' }
  ]
};

// Register globally
registerSchema(productSchema);

// Use across your application
const inventory = encode(warehouseData, { schemas: [productSchema] });
const catalog = encode(storeData, { schemas: [productSchema] });
const orders = encode(orderData, { schemas: [productSchema] });

// All use the same validated structure! ✅

✓ Measure Your Savings

Always measure actual token savings to validate AXON's impact on your specific use case.

import { encode, getTokenCount } from '@axon-format/core';

const data = { /* your data */ };

// Original size
const jsonString = JSON.stringify(data);
const jsonTokens = getTokenCount(jsonString);

// AXON size
const axonString = encode(data);
const axonTokens = getTokenCount(axonString);

// Calculate savings
const savings = ((1 - axonTokens / jsonTokens) * 100).toFixed(1);
console.log(`Token reduction: ${savings}%`);
console.log(`${jsonTokens} → ${axonTokens} tokens`);

// Typical results:
// Tabular data: 70-95% reduction
// Time-series: 80-95% reduction
// Mixed data: 40-70% reduction

✗ Don't Over-Optimize

For small payloads (under about 200 tokens), the encoding overhead isn't worth it. Use AXON for larger datasets where the savings are meaningful.

// ❌ Not worth it: Tiny payload
const smallData = {
  status: "ok",
  count: 3
};
// JSON: ~20 tokens, AXON: ~18 tokens (minimal gain)

// ✅ Worth it: Large dataset
const largeData = {
  transactions: [/* 1000+ rows */]
};
// JSON: 50,000 tokens, AXON: 8,000 tokens (massive savings!)

// Rule of thumb: Use AXON when original payload > 200 tokens
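The rule of thumb above can be wrapped in a small helper. This is a sketch, not part of the AXON API: `shouldEncode` and `estimateTokens` are hypothetical names, and the chars-divided-by-4 token estimate is a rough heuristic; use a real tokenizer when you need accurate counts.

```typescript
// Hypothetical helper -- not part of @axon-format/core.
// Roughly estimates token count as characters / 4, a common
// heuristic for English-heavy JSON text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Apply the rule of thumb: only encode payloads large enough
// for AXON's savings to outweigh its overhead.
function shouldEncode(data: unknown, threshold = 200): boolean {
  return estimateTokens(JSON.stringify(data)) > threshold;
}
```

With this guard in place, tiny status payloads pass through as plain JSON while large datasets get encoded.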

✗ Don't Skip Validation

Type validation catches errors early and prevents wasted LLM API calls. The tiny performance cost is always worth it.

// ❌ Bad: No validation
const unvalidated = encode(userData);
// If userData has wrong types, you'll only find out
// AFTER spending tokens on the LLM API call!

// ✅ Good: Validate first
const validated = encode(userData, { schemas: [userSchema] });
// Catches errors instantly, before the API call

// Real example of caught errors:
try {
  const encoded = encode({
    id: "abc", // Should be u8!
    age: 999  // Out of u8 range!
  }, { schemas: [userSchema] });
} catch (err) {
  // ValidationError: Field 'id' expected u8, got string
  // Fix it BEFORE the expensive API call!
}

Cost Comparison: Schema validation: ~0.1ms. Wasted LLM API call with invalid data: $0.01-$1.00. Always validate.
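The trade-off above is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch using the ~0.1 ms figure from the text (the constants are illustrative assumptions, not values measured from the library):

```typescript
// Assumed per-payload validation cost, from the ~0.1ms figure above.
const VALIDATION_MS = 0.1;

// CPU time (in seconds) spent validating n payloads.
function validationSeconds(n: number): number {
  return (n * VALIDATION_MS) / 1000;
}
```

At 0.1 ms each, validating 10,000 payloads costs about one second of CPU time; a single caught error already saves a wasted call at the $0.01-$1.00 end of the range.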

✗ Don't Mix Encoding Strategies

Keep your encoding strategy consistent within a single LLM conversation to avoid confusing the model.

// ❌ Bad: Inconsistent encoding in same conversation
const jsonMessage = "Here's the data: " + JSON.stringify(data1);
const axonMessage = "And here's more: " + encode(data2);
// LLM has to handle two different formats!

// ✅ Good: Consistent encoding
const message1 = "Here's the data: " + encode(data1);
const message2 = "And here's more: " + encode(data2);
// LLM knows to expect AXON format throughout

// ✅ Also good: Tell the LLM upfront
const systemPrompt = `You will receive data in AXON format.
  AXON is a compact encoding. Decode it before analysis.`;