Self‑Describing by Design
SuperCSV files are fully self‑describing. A file contains everything needed to read, validate, and interpret its data — without external schemas, configuration files, or type‑inference logic.
This is intentional.
Most real‑world data formats rely on hidden context:
- CSV requires external knowledge of types
- JSON requires external knowledge of structure
- Message formats often require separate schema files
- Pipelines rely on ad‑hoc mapping code to make sense of values
SuperCSV avoids all of this. The structure and types are explicit. The rules are obvious. A SuperCSV file stands on its own.
This makes the format easy to share, easy to store, easy to version, and easy to reason about — without the fragile ecosystem of sidecar files and implicit assumptions that surround other formats.
Why SuperCSV Exists
SuperCSV began with a simple frustration: there was no good, simple data format for everyday work.
CSV seems like it should work. It's compact and familiar, but it was never designed as a standard: RFC 4180 only documented common practice after the fact, and tools still disagree on quoting, delimiters, and encodings. It also has no types. Everything is a string, and every consumer must guess what the data was meant to be.
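To make the "everything is a string" point concrete, here is a minimal Python sketch using the standard library's csv module. The column names and values are invented for illustration and are not taken from any real dataset.

```python
import csv
import io

# Hypothetical sample data, invented for illustration.
raw = "id,price,shipped\n007,19.90,true\n"

rows = list(csv.DictReader(io.StringIO(raw)))
row = rows[0]

# Every field comes back as a str; nothing in the file says otherwise.
field_types = {name: type(value).__name__ for name, value in row.items()}
print(field_types)  # {'id': 'str', 'price': 'str', 'shipped': 'str'}

# A consumer has to guess. Guessing that "id" is an integer silently
# drops the leading zeros, which may not be what the producer meant.
guessed_id = int(row["id"])
print(guessed_id)  # 7
```

Whether "007" is an identifier or a number is exactly the kind of context a plain CSV file cannot carry.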
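JSON became the default alternative, but it's heavy and noisy. As an array of objects it repeats every field name in every record, so a file can be many times larger than the data it carries. It's designed for APIs, not for people, and not for tabular data. JSON's types don't go far either: a handful of primitives, with no dates, no decimals, and no distinction between integers and floats, so the guesswork remains.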
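The size overhead is easy to measure. The sketch below serialises the same table once as CSV and once as a JSON array of objects, using only the Python standard library. The table itself is hypothetical, and the ratio it prints depends heavily on how long the field names are relative to the values, so treat it as an illustration rather than a benchmark.

```python
import csv
import io
import json

# Hypothetical table, invented for illustration.
header = ["id", "name", "price"]
records = [[i, f"item{i}", i * 1.5] for i in range(1000)]

# CSV: the header appears once, at the top.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(header)
writer.writerows(records)
csv_size = len(buf.getvalue().encode("utf-8"))

# JSON as an array of objects: every key is repeated in every record.
as_objects = [dict(zip(header, record)) for record in records]
json_size = len(json.dumps(as_objects).encode("utf-8"))

print(csv_size, json_size, round(json_size / csv_size, 2))
```

Longer field names or shorter values push the ratio up; compact JSON separators or short keys pull it down.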
Behind all of this is a deeper issue: real systems spend an enormous amount of time doing data IO, field mapping, and message translation. Every time data moves between systems, custom code is needed to interpret strings, infer types, normalise values, and reconcile mismatched schemas. It's repetitive, error‑prone, and unnecessary.
SuperCSV exists because none of the common formats solve these problems. There isn't a modern, typed, human‑friendly format that stays compact like CSV and structured like JSON — without inheriting their flaws.
SuperCSV solves these problems. It's a self‑describing, typed, readable data format with a small, clear grammar. No sidecar schemas and no guesswork: just data that describes itself.
SuperCSV exists because people deserve a simple data format that doesn't fight them.
Design Philosophy
SuperCSV is built on a small set of principles shaped by real experience with data IO, field mapping, and message translation. The format is designed to stay simple, predictable, and pleasant to work with — both for people and for tools.
Clarity and simplicity
The grammar is small and explicit. Rules are written to be understood at a glance, not discovered through edge cases.
What you see is what you get
There's no ambiguity. Every value has one meaning, and every tool will read it the same way.
Compact, human‑friendly structure
Data first, syntax second: just enough structure to make the data unambiguous.
Typed data
Data should not require interpretation. SuperCSV defines clear, predictable types, so consumers need no custom mapping code and no guessing.
Minimal syntax
Nothing unnecessary. Every rule exists to remove ambiguity, and every character serves a purpose.
Planned roadmap
The format will evolve deliberately. Versions are planned, spec changes will be documented, and backward compatibility will keep the format stable over the long term.
Readable and parseable
SuperCSV is designed to be read by humans and parsed by machines equally well. No trade‑off required.
Streamable
Data can be read, written, validated, and sent as a stream, in constant memory regardless of file size.
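As a rough illustration of what streaming in constant memory looks like in practice, here is a Python sketch of a read, validate, and write loop. It uses plain CSV and the standard csv module as a stand-in, since this section does not show SuperCSV syntax. The function names and the field-count check are hypothetical and are not part of any SuperCSV tooling.

```python
import csv
import io
from typing import Iterable, Iterator, List, TextIO


def stream_valid_rows(source: Iterable[str], expected_fields: int) -> Iterator[List[str]]:
    """Yield rows one at a time, skipping rows with the wrong field count."""
    for row in csv.reader(source):
        if len(row) == expected_fields:
            yield row


def copy_stream(source: Iterable[str], sink: TextIO, expected_fields: int) -> int:
    """Copy valid rows from source to sink; return how many were written."""
    writer = csv.writer(sink)
    count = 0
    for row in stream_valid_rows(source, expected_fields):
        writer.writerow(row)  # only the current row is held in memory
        count += 1
    return count


# In-memory stand-ins for a file or socket; any line iterator works.
source = io.StringIO("a,b\n1,2\nbad\n3,4\n")
sink = io.StringIO()
written = copy_stream(source, sink, expected_fields=2)
print(written)  # 3
```

Because each row is handled and released before the next is read, peak memory is bounded by the longest row rather than by the size of the file.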
I hope you enjoy using this — a format so simple it feels obvious, yet so powerful it feels uncanny — as much as I've enjoyed turning an idea into a new data format.