All Permutations Comparison

Dataset Pretty JSON
tokens / %
JSON
tokens / %
YAML
tokens / %
TOON
tokens / %
ASON
tokens / %
CSV
tokens / %

Why It Matters

For RAG systems, API responses, and batch processing through LLMs, format choice compounds across scale: cutting costs, improving latency, enabling larger datasets within fixed context limits.

10K token JSON → ~3K ASON = 7K more for prompts

Key Findings

  • Whitespace removal helps, but structural overhead matters more
  • Every {} [] " consumes tokens
  • CSV/TOON/ASON use positional encoding vs repeated keys
  • ASON: Dictionary refs + uniform arrays