
YAML vs JSON for LLM Token Efficiency - The Minification Truth

Pretty-printed JSON uses 101 tokens, YAML uses 133, but minified JSON only uses 41 - the comparison everyone gets wrong

2 minute read

Core Insight

The "YAML saves tokens over JSON" claims are misleading because they compare pretty-printed JSON (with whitespace) against compact YAML. Minified JSON is just as efficient or more efficient than YAML.

The Original Claim

A Medium article claims JSON uses 96 tokens versus a lower count for YAML: https://medium.com/better-programming/yaml-vs-json-which-is-more-efficient-for-language-models-5bc11dd0f6df

The Real Numbers

Using the OpenAI Tokenizer on identical data (a sketch for reproducing these counts follows the list):

  • Pretty-printed JSON: 101 tokens
  • YAML: 133 tokens
  • Minified JSON: 41 tokens
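
A minimal sketch for reproducing this kind of comparison locally, assuming the tiktoken library and the o200k_base encoding; the web tokenizer may use a different encoding, so exact numbers can differ from the figures above. The file names are hypothetical placeholders for your own test data.

import tiktoken

enc = tiktoken.get_encoding("o200k_base")

def count_tokens(text: str) -> int:
    # Number of tokens this encoding produces for the exact string.
    return len(enc.encode(text))

# Hypothetical file names - drop in pretty JSON, minified JSON,
# and YAML versions of whatever data you want to compare.
for label, path in [("pretty JSON", "data_pretty.json"),
                    ("minified JSON", "data_min.json"),
                    ("YAML", "data.yaml")]:
    with open(path) as f:
        print(f"{label}: {count_tokens(f.read())} tokens")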

Why This Matters

JSON doesn't require whitespace - spaces and newlines are optional noise that artificially inflates token counts.
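
A small sketch of stripping that optional whitespace with Python's standard library; separators=(",", ":") removes the spaces json.dumps would otherwise insert after commas and colons:

import json

pretty = """{
  "user": {
    "id": 123,
    "name": "Alice"
  }
}"""

# Round-trip through the parser, then re-serialize without any whitespace.
minified = json.dumps(json.loads(pretty), separators=(",", ":"))
print(minified)  # {"user":{"id":123,"name":"Alice"}}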

YAML is whitespace-dependent by design - indentation and newlines carry semantic meaning. You can't "minify" YAML the same way.

Comparing pretty-printed JSON to YAML isn't a fair test. The only valid comparison is minified JSON vs YAML.

Test Data Example

YAML (133 tokens)

user:
  id: 123
  name: Alice
  email: alice@example.com
  is_active: true
  roles:
    - admin
    - editor

project:
  id: 456
  title: Knowledge Base Migration
  status: in_progress
  tags:
    - migration
    - docs
    - backend
  tasks:
    - id: 1
      description: Export old database
      completed: true
    - id: 2
      description: Transform schema
      completed: false
    - id: 3
      description: Import into new system
      completed: false

Minified JSON (41 tokens)

{"user":{"id":123,"name":"Alice","email":"alice@example.com","is_active":true,"roles":["admin","editor"]},"project":{"id":456,"title":"Knowledge Base Migration","status":"in_progress","tags":["migration","docs","backend"],"tasks":[{"id":1,"description":"Export old database","completed":true},{"id":2,"description":"Transform schema","completed":false},{"id":3,"description":"Import into new system","completed":false}]}}

Counter-Argument: Training Data Similarity

Noah Larratt's point: "LLMs perform well when content looks like training data" - using whitespace as a heuristic for quality output.

This is worth testing, but token efficiency and output quality are separate concerns. If you're optimizing for tokens, minify. If you're optimizing for output quality based on training distribution, test both.

Bottom Line

Don't cargo-cult optimization advice. If you're using pretty-printed JSON, minify it - you'll save 30+ tokens in examples like this. Test your specific use case rather than accepting blanket claims about format efficiency.

Notes

  • If you're working in a browser and want to be able to read the output, ask whatever LLM you're using to return it minified in a markdown block, then copy/paste that into something that prettifies it
  • If you're asking in a CLI or a programmatic way, just pretty-print it for yourself (see the sketch after these notes). Don't ask the LLM to give you back pretty-printed JSON data just because you want to be able to read it.
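
A sketch of pretty-printing a minified response locally with Python's standard library (from a shell, piping through python -m json.tool does the same thing):

import json

minified = '{"user":{"id":123,"name":"Alice"}}'

# Re-indent locally instead of asking the model to emit extra whitespace tokens.
print(json.dumps(json.loads(minified), indent=2))

# Shell equivalent:
#   echo '{"user":{"id":123,"name":"Alice"}}' | python -m json.tool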
