Multi-Format Token Counter - Compare JSON, YAML, CSV, TOON & ASON

Dataset:

Baseline:

All Permutations Comparison

Baseline:

Dataset	Pretty JSON tokens / %	JSON tokens / %	YAML tokens / %	TOON tokens / %	ASON tokens / %	CSV tokens / %

Why It Matters

For RAG systems, API responses, and batch processing through LLMs, format choice compounds across scale: cutting costs, improving latency, enabling larger datasets within fixed context limits.

10K token JSON → ~3K ASON = 7K more for prompts

Key Findings

✓ Whitespace removal helps, but structural overhead matters more
✓ Every {} [] " consumes tokens
✓ CSV/TOON/ASON use positional encoding vs repeated keys
✓ ASON: Dictionary refs + uniform arrays