ASON 2.0 Documentation - Complete Format Specification & API Guide

Introduction

ASON 2.0 (Aliased Serialization Object Notation) is a serialization format designed to optimize token consumption in LLM (Large Language Model) contexts while maintaining human readability and guaranteeing 100% lossless round-trip fidelity.

Unlike traditional JSON, ASON 2.0 uses intelligent compression techniques such as sections (@section), tabular arrays with pipe delimiters, semantic references ($var), and dot notation to achieve 20-60% token reduction without losing any information.

Features

• Intelligent Compression: Reduces token count by up to 60% compared to JSON
• Human Readable: Maintains a clear and easy-to-read structure
• Perfect Round-Trip: Guaranteed decompression without data loss
• Semantic References: Uses human-readable variable names ($var) for deduplication
• Tabular Arrays: CSV-like format with [N]{fields} syntax and pipe delimiter
• Section Organization: @section syntax for grouping related data

Installation & Usage

NPM Package

npm install @ason-format/ason

import { SmartCompressor } from '@ason-format/ason';

const compressor = new SmartCompressor();
const data = { users: [{ id: 1, name: "Alice" }] };

const compressed = compressor.compress(data);
const original = compressor.decompress(compressed);

CDN (Browser)

Use ASON directly in the browser without installation:

Option 1: ESM Module

<script type="module">
  import { SmartCompressor } from 'https://unpkg.com/@ason-format/ason@1.1.2';

  const compressor = new SmartCompressor();
  const compressed = compressor.compress({ hello: "world" });
</script>

Option 2: Global Variable

<script src="https://unpkg.com/@ason-format/ason@1.1.2/dist/index.browser.js"></script>
<script>
  const { SmartCompressor, TokenCounter } = window.ASON;
  const compressor = new SmartCompressor();
</script>

Bundle size: ~14 KB minified (~5 KB gzipped)

Also available on: jsDelivr, esm.sh

CLI Tool

# Convert JSON to ASON
npx ason input.json -o output.ason

# Show token savings
npx ason data.json --stats
# ✓ Saved 36 tokens (61.02%)

Basic Example

Original JSON

195 tokens

{
  "users": [
    {
      "id": 1,
      "name": "Alice",
      "email": "alice@example.com",
      "age": 25,
      "active": true
    },
    {
      "id": 2,
      "name": "Bob",
      "email": "bob@example.com",
      "age": 30,
      "active": true
    },
    {
      "id": 3,
      "name": "Charlie",
      "email": "charlie@example.com",
      "age": 35,
      "active": false
    }
  ]
}

Compressed ASON 2.0

32 tokens · 64% reduction

users:[3]{id,name,email,age,active}
1|Alice|"alice@example.com"|25|true
2|Bob|"bob@example.com"|30|true
3|Charlie|"charlie@example.com"|35|false

Compression Techniques

Tabular Arrays

When an array contains objects with the same keys, ASON 2.0 uses a CSV-like format with [N]{fields} syntax and pipe delimiter for maximum token efficiency.

users:[3]{id,name,email}
1|Alice|"alice@example.com"
2|Bob|"bob@example.com"
3|Charlie|"charlie@example.com"
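
As a rough illustration of the layout described here, the sketch below builds the [N]{fields} header and the pipe-delimited rows by hand. The helpers formatScalar and toTabular are hypothetical names invented for this example and are not part of the @ason-format/ason API. The quoting rule (plain words stay bare, everything else is quoted) is an assumption inferred from the sample output in this guide.

```javascript
// Illustrative only: formatScalar and toTabular are hypothetical helpers,
// not functions exported by @ason-format/ason.

// Assumed quoting rule: bare words stay unquoted, every other string is quoted.
function formatScalar(value) {
  if (typeof value !== 'string') return String(value);
  const isBareWord = /^[A-Za-z][A-Za-z0-9_]*$/.test(value);
  const isReserved = ['true', 'false', 'null'].includes(value);
  return isBareWord && !isReserved ? value : JSON.stringify(value);
}

// Encode a non-empty array of same-shaped objects as
// key:[N]{fields} followed by one pipe-delimited row per item.
function toTabular(key, rows) {
  const fields = Object.keys(rows[0]);
  const header = `${key}:[${rows.length}]{${fields.join(',')}}`;
  const lines = rows.map((row) =>
    fields.map((field) => formatScalar(row[field])).join('|')
  );
  return [header, ...lines].join('\n');
}

const users = [
  { id: 1, name: 'Alice', email: 'alice@example.com' },
  { id: 2, name: 'Bob', email: 'bob@example.com' },
  { id: 3, name: 'Charlie', email: 'charlie@example.com' },
];
console.log(toTabular('users', users));
```

The field names are written once in the header instead of once per object, which is where most of the savings for uniform arrays come from.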

Dot Notation

Deeply nested objects are flattened using dot notation, saving tokens on structure while maintaining clarity.

// Instead of:
config:
 database:
  host:localhost

// ASON 2.0 uses:
config.database.host:localhost
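
The flattening step can be sketched with a short recursive walk. The function flattenToDotPaths is a hypothetical helper written for this guide, not the library's implementation. It assumes plain nested objects whose keys contain no dots, and it does not handle arrays.

```javascript
// Hypothetical helper, not part of the ASON package API.
// Walks nested plain objects and emits one "path:value" line per leaf.
// Assumes keys contain no dots; arrays are not handled in this sketch.
function flattenToDotPaths(obj, prefix = '') {
  const lines = [];
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      value !== null && typeof value === 'object' && !Array.isArray(value);
    if (isPlainObject) {
      // Descend and keep extending the dotted path
      lines.push(...flattenToDotPaths(value, path));
    } else {
      lines.push(`${path}:${value}`);
    }
  }
  return lines;
}

const nested = { config: { database: { host: 'localhost' } } };
console.log(flattenToDotPaths(nested).join('\n'));
// config.database.host:localhost
```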

Semantic References

ASON 2.0 uses human-readable variable names for frequently repeated values. Definitions are declared in the $def: section using semantic names like $email, $address, etc.

Advantage: Variable names like $email or $city are self-documenting and easier to understand than numeric references.

$def:
 $email:"customer@example.com"
 $city:"San Francisco"
$data:
@billing
 email:$email
 city:$city
@shipping
 email:$email
 city:$city

This format provides clear semantics and is optimized for both human readability and LLM token efficiency.
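
One way to picture how repeated values could be turned into named references is sketched below. This is a guess at the general idea, not the library's detection logic: collectReferences is a hypothetical function, the naming rule (use the key under which the value first appears) is an assumption, and the threshold of two occurrences is arbitrary.

```javascript
// Hypothetical sketch: collectReferences is not a function of the ASON library.
// It names each repeated string after the key it first appeared under.
// Two different values first seen under the same key would collide;
// a real implementation would need to handle that case.
function collectReferences(data, minOccurrences = 2) {
  const seen = new Map(); // string value -> { key, count }

  const visit = (node) => {
    for (const [key, value] of Object.entries(node)) {
      if (typeof value === 'string') {
        const entry = seen.get(value) ?? { key, count: 0 };
        entry.count += 1;
        seen.set(value, entry);
      } else if (value !== null && typeof value === 'object') {
        visit(value);
      }
    }
  };
  visit(data);

  const defs = {};
  for (const [value, { key, count }] of seen) {
    if (count >= minOccurrences) defs[`$${key}`] = value;
  }
  return defs;
}

const order = {
  billing: { email: 'customer@example.com', city: 'San Francisco' },
  shipping: { email: 'customer@example.com', city: 'San Francisco' },
};
console.log(collectReferences(order));
```

For the billing and shipping data, the result maps $email to the shared address and $city to "San Francisco", matching the $def: entries in the sample.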

Sections

Objects with 3 or more fields are organized using @section syntax. Arrays use key:[N]{fields} format instead.

@customer
 name:"Alice Johnson"
 email:"alice@example.com"
 phone:"+1-555-0100"

items:[2]{id,product,price}
1|Laptop|999
2|Mouse|29
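
The choice between the two layouts can be expressed as a small decision function. The sketch below only mirrors the rules stated in this section (uniform arrays become tables, objects with 3 or more fields become sections). chooseLayout and its return labels are hypothetical and do not reflect the library's internals.

```javascript
// Hypothetical classifier; the rules mirror the prose in this guide,
// not the library's source code.
function chooseLayout(value) {
  const isObject = (v) =>
    v !== null && typeof v === 'object' && !Array.isArray(v);

  if (Array.isArray(value) && value.length > 0 && value.every(isObject)) {
    const signature = (o) => Object.keys(o).join(',');
    const first = signature(value[0]);
    // Same keys in the same order on every element -> key:[N]{fields}
    if (value.every((o) => signature(o) === first)) return 'table';
  }
  // Objects with 3 or more fields -> @section
  if (isObject(value) && Object.keys(value).length >= 3) return 'section';
  return 'inline';
}

const customer = { name: 'Alice Johnson', email: 'alice@example.com', phone: '+1-555-0100' };
const items = [
  { id: 1, product: 'Laptop', price: 999 },
  { id: 2, product: 'Mouse', price: 29 },
];
console.log(chooseLayout(customer)); // section
console.log(chooseLayout(items));    // table
```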

Real-World Use Cases

Stripe Payment Intent

-6.0%

Complex JSON with 70+ fields, nested objects, and multiple references.

JSON: 1076 tokens · ASON: 1011 tokens

Array of 50 Products

-59.1%

Large list with uniform structure, ideal for compression.

JSON: 1028 tokens · ASON: 420 tokens

10 Users

-48.2%

Typical REST API case with uniform objects.

JSON: 195 tokens · ASON: 101 tokens
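
The percentages quoted for these cases follow from the token counts with simple arithmetic: reduction = (JSON tokens - ASON tokens) / JSON tokens. The helper name reductionPercent is made up for this example.

```javascript
// Percentage of tokens saved relative to the JSON baseline,
// rounded to one decimal place and returned as a string.
function reductionPercent(jsonTokens, asonTokens) {
  return (((jsonTokens - asonTokens) / jsonTokens) * 100).toFixed(1);
}

console.log(reductionPercent(1076, 1011)); // 6.0  (Stripe Payment Intent)
console.log(reductionPercent(1028, 420));  // 59.1 (Array of 50 Products)
console.log(reductionPercent(195, 101));   // 48.2 (10 Users)
```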

Comparison with Other Formats

Feature	JSON	ASON	TOON
Human Readability	Yes	Yes	Yes
Average Compression	0%	22.1%	12.3%
Object References	No	Yes	No
Value Dictionary	No	Yes (inline-first)	No
Uniform Arrays	No	Yes	Yes
Guaranteed Round-Trip	Yes	Yes	Yes

When to Use ASON

Ideal Cases

• Payloads for LLMs with token limits
• Large arrays with uniform objects
• APIs with repetitive data
• Compact structured logging

Not Recommended

• Very small JSON (<100 chars)
• Completely heterogeneous data
• Contexts with no token limits
• Integrations that must stay compatible with legacy systems

Why ASON is Optimal for LLMs

ASON 2.0 is specifically designed to maximize efficiency when working with Large Language Models. Every design decision reduces token count and parsing ambiguity.

1. Unambiguous Pipe Delimiters

Unlike commas, which appear in numbers (1,000), dates, and natural text, pipe characters (|) rarely occur in real data. Using them as the field delimiter greatly reduces parsing ambiguity for LLMs.

✓ ASON (Pipes)

1|"Product 1"|10.99|false|"Electronics"

✗ CSV (Commas)

1,Product 1,10.99,false,Electronics

Ambiguous: unquoted values carry no type cues, and any value that contains a comma (such as 1,000) would be split into two fields.
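
To make the delimiter argument concrete, the sketch below splits a pipe-delimited row while ignoring pipes that occur inside double quotes, and contrasts it with a naive comma split. splitRow is a hypothetical helper written for this guide. The library's own parser is not shown in this document and may work differently.

```javascript
// Hypothetical helper: split one pipe-delimited row,
// ignoring pipes that occur inside double-quoted strings.
function splitRow(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '\\' && inQuotes) {
      // Keep an escaped character (such as \") together with its backslash
      current += ch + (line[i + 1] ?? '');
      i++;
    } else if (ch === '"') {
      inQuotes = !inQuotes;
      current += ch;
    } else if (ch === '|' && !inQuotes) {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

console.log(splitRow('1|"Product 1"|10.99|false|"Electronics"').length); // 5
// A naive comma split breaks a number written with a thousands separator:
console.log('1,000,Widget'.split(',').length); // 3, although only 2 values were intended
```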

2. Explicit String Boundaries with Quotes

Every string is wrapped in quotes, making it crystal clear where text begins and ends. This prevents confusion with numbers, booleans, or null values.

✓ "Product 1" ← clearly a string

✓ 10.99 ← clearly a number

✓ false ← clearly a boolean

✗ Product 1 ← string or identifier?
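
The practical effect of explicit quoting is that a reader can decide a field's type from its first characters. The sketch below shows one possible decoding rule. parseScalar is a hypothetical helper, and treating leftover unquoted text as a bare string is an assumption made here because other samples in this guide show plain words such as Alice without quotes.

```javascript
// Hypothetical helper: turn one field token into a typed JavaScript value.
function parseScalar(token) {
  if (token.startsWith('"')) return JSON.parse(token); // quoted -> string
  if (token === 'true') return true;
  if (token === 'false') return false;
  if (token === 'null') return null;
  const asNumber = Number(token);
  if (token.trim() !== '' && !Number.isNaN(asNumber)) return asNumber;
  return token; // unquoted text: treated as a bare string (assumption)
}

console.log(parseScalar('"Product 1"')); // Product 1
console.log(parseScalar('10.99'));       // 10.99
console.log(parseScalar('false'));       // false
```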

3. Semantic References Reduce Tokens

Variables like $category dramatically reduce token count by eliminating repetition. LLMs can easily understand and follow these references.

With References

$def:
 $cat:Electronics

$data:
1|"Product 1"|$cat
2|"Product 2"|$cat
3|"Product 3"|$cat

Tokens saved: ~30%

Without References

1|"Product 1"|"Electronics"
2|"Product 2"|"Electronics"
3|"Product 3"|"Electronics"

"Electronics" repeated 3 times

4. Explicit Section Boundaries

The $def: and $data: markers create clear boundaries between different parts of the structure, making it easier for LLMs to parse and understand the format.
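
Because the markers sit on lines of their own, a reader can separate definitions from data with a single pass over the text. The following sketch illustrates that idea using the sample from this section. splitSections and expandRow are hypothetical helpers, not library functions, and the naive split on | ignores quoted pipes for brevity.

```javascript
// Hypothetical helper: split an ASON-style document into $def: and $data: parts.
function splitSections(text) {
  const defs = {};
  const dataLines = [];
  let current = null;
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '') continue;
    if (line === '$def:') { current = 'def'; continue; }
    if (line === '$data:') { current = 'data'; continue; }
    if (current === 'def') {
      // "$name:value" -> defs["$name"] = value
      const colon = line.indexOf(':');
      defs[line.slice(0, colon)] = line.slice(colon + 1);
    } else {
      dataLines.push(line);
    }
  }
  return { defs, dataLines };
}

// Replace $name fields in a pipe row with their definitions.
// Simplification: does not account for pipes inside quoted strings.
function expandRow(row, defs) {
  return row
    .split('|')
    .map((field) => (Object.hasOwn(defs, field) ? defs[field] : field))
    .join('|');
}

const doc = [
  '$def:',
  ' $street:"123 Main St"',
  ' $city:"San Francisco"',
  '$data:',
  'users:[2]{name,address.street,address.city}',
  '"Alice"|$street|$city',
  '"Bob"|$street|$city',
].join('\n');

const { defs, dataLines } = splitSections(doc);
console.log(dataLines.map((row) => expandRow(row, defs)).join('\n'));
```

After expansion, both user rows carry the full street and city strings again, while the header line passes through unchanged.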

$def:              ← Definitions section
 $street:"123 Main St"
 $city:"San Francisco"

$data:             ← Data section
users:[2]{name,address.street,address.city}
"Alice"|$street|$city
"Bob"|$street|$city

Frequently Asked Questions

How much token reduction can I expect?

Token reduction varies by data structure. For uniform arrays (like lists of users or products), expect 40-60% reduction. For mixed structures, 20-40%. For deeply nested non-uniform data, 10-20%. The playground lets you test with your actual data.

Is ASON lossless? Will I get my exact data back?

Yes, ASON is 100% lossless. For JSON-serializable data, JSON.stringify(compressor.decompress(compressor.compress(data))) === JSON.stringify(data) always evaluates to true. All values, types, and structure are preserved.
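
A round-trip check like this is easy to add to a test suite. The sketch below accepts any object that exposes compress and decompress, which is the shape of SmartCompressor shown in this guide. To keep the example runnable without installing the package, it uses a stand-in JSON codec, and that stand-in is purely an assumption for illustration.

```javascript
// Works with any object exposing compress(data) and decompress(text),
// which is the shape of SmartCompressor shown in this guide.
function isLossless(codec, data) {
  const restored = codec.decompress(codec.compress(data));
  return JSON.stringify(restored) === JSON.stringify(data);
}

// Stand-in codec so the sketch runs without the package (assumption).
// In real use, pass a SmartCompressor instance instead.
const jsonCodec = {
  compress: (data) => JSON.stringify(data),
  decompress: (text) => JSON.parse(text),
};

console.log(isLossless(jsonCodec, { users: [{ id: 1, name: 'Alice' }] })); // true
```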

How does ASON compare to TOON format?

On average, ASON saves 5-15% more tokens than TOON. Key advantages: semantic references ($def), pipe delimiters for clarity, and smarter detection of repeated values. See the benchmarks page for detailed comparisons.

Why use pipes (|) instead of commas?

Pipes are far less ambiguous than commas. Commas appear in numbers (1,000), dates (Jan 1, 2024), and natural text, which creates parsing confusion for LLMs. Pipes rarely appear in data, so field boundaries stay clear.

Can LLMs generate valid ASON format?

Yes! ASON's clear structure (pipe delimiters, quoted strings, explicit sections) makes it easy for LLMs to learn and generate. Provide examples in your prompt and models like GPT-4 and Claude can produce valid ASON output.

What happens if my data doesn't have patterns?

ASON falls back to a compact nested object format. You'll still get some reduction from removing JSON syntax overhead, but it won't be as dramatic. For completely heterogeneous data, stick with regular JSON.

Is there a performance cost for compression?

Compression/decompression is fast (<1ms for typical payloads). The token savings on LLM API calls far outweigh any CPU cost. For a 1000-token payload reduced to 400 tokens, you save ~600 tokens on every request.

Can I use ASON with any LLM provider?

Yes! ASON is just a text format. It works with OpenAI (GPT-3.5, GPT-4), Anthropic (Claude), Google (Gemini), local models (Llama), and any other LLM. Compress before sending, decompress after receiving.

How do I handle ASON errors in production?

Wrap compress/decompress in try-catch blocks. If ASON fails, fall back to regular JSON. The library throws descriptive errors. Common issues: malformed ASON strings, incompatible data types, or corrupted compression output.

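import { SmartCompressor } from '@ason-format/ason';

const compressor = new SmartCompressor();

let payload; // text sent to the LLM for a given `data` object
try {
  payload = compressor.compress(data);
} catch (error) {
  // Fall back to plain JSON so the request can still go out
  console.warn('ASON compression failed:', error);
  payload = JSON.stringify(data);
}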

Does ASON work with TypeScript?

Yes! ASON has full TypeScript support, with type definitions included. The package exports the SmartCompressor class with typed compress and decompress methods.