A minimal, deterministic, human-friendly type system for SuperCSV v1.0.
All types listed here may appear in header declarations. Their corresponding values may appear in data rows.
list, arr), and the inline enum type

Rules that apply to all types without exception.
- _ represents missing or undefined data
- "_" inside quotes is a string value, not null
- " after whitespace trimming

Comment and metadata syntax are defined by the main v1.0 spec. In this type table, only the version declaration (((SuperCSV v1.0))) has defined metadata meaning; any other metadata syntax has no defined effect on type interpretation.
Includes:
- int, float, decimal
- bool
- string (scalar, with its own quoting and syntax rules)
- bytes
- date, time, datetime, datetimetz, timestamp, duration, timezone
- uuid

An EnumType is defined inline using enum<...>. Its EnumItems define the complete set of allowed EnumValues.
- enum<low,medium,high>
- enum<0=red,1=green,2=blue>

Parameterized collection types:
- list<T>[N]

Where T is a ScalarType or EnumType. Containers cannot nest.
int
64-bit signed integer (int64).
Range: −9223372036854775808 to 9223372036854775807.
- Negative zero (-0) is not allowed; use 0 instead
- Null literal _ allowed
Valid:
42
-7
0
_
Invalid:
"42"
+7
007
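The int rules above can be sketched as a small validator. This is an illustrative sketch, not part of the spec: the function name parse_int is hypothetical, and the rejection of a leading + and of leading zeros is inferred from the invalid examples rather than from an explicit rule.

```python
import re

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

# Assumption (inferred from the examples): no leading '+', no leading zeros.
_INT_RE = re.compile(r"-?(0|[1-9][0-9]*)")


def parse_int(field: str):
    """Return an int, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    # "-0" matches the pattern but is explicitly disallowed.
    if not _INT_RE.fullmatch(text) or text == "-0":
        raise ValueError(f"invalid int literal: {field!r}")
    value = int(text)
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError(f"int out of int64 range: {field!r}")
    return value
```

A quoted field such as "42" fails the pattern because the quote characters are part of the text being matched.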
float
IEEE-754 binary64 (double precision) floating-point number.
Accepts decimal literals, scientific notation, and special values (case-insensitive).
Format:
- Scientific notation (e/E with optional sign)
- nan, inf, -inf
Special values (case-insensitive):
- nan
- inf
- -inf
Rules:
- nan, inf, and -inf are the only canonical special values
- Leading + is not permitted
- Null literal _ allowed
Valid:
3.14
1e6
-inf
nan
INF
_
Invalid:
"3.14"
1,000
+2.5
+inf
infinity
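A minimal sketch of a float reader that follows the rules and examples above. The name parse_float is hypothetical, and the exact numeric grammar (digits required on both sides of a decimal point, leading zeros tolerated) is an assumption, since the Format list does not spell it out.

```python
import math
import re

# Assumed grammar: optional '-', digits, optional fraction, optional exponent.
_FLOAT_RE = re.compile(r"-?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?")

# The only special values, matched case-insensitively.
_SPECIALS = {"nan": math.nan, "inf": math.inf, "-inf": -math.inf}


def parse_float(field: str):
    """Return a float, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    special = _SPECIALS.get(text.lower())
    if special is not None:
        return special
    if not _FLOAT_RE.fullmatch(text):
        raise ValueError(f"invalid float literal: {field!r}")
    return float(text)
```

Matching against an explicit pattern before calling float() matters here, because Python's float() on its own would accept forms such as +2.5 and infinity that the examples mark as invalid.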
decimal
Exact arbitrary-precision literal-preserving decimal type.
Values are stored and transmitted exactly as written; no conversion to binary floating-point occurs.
Format:
- 0 or a non-zero digit followed by digits
- Negative zero (-0) is not allowed; use 0 instead
Rules:
- Null literal _ allowed
Valid:
12.345
-0.0001
42.0
_
Invalid:
1e6
"12.3"
12.
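As an illustration of literal preservation, the sketch below validates a decimal field and hands it to Python's decimal.Decimal, which keeps trailing zeros when built from a string. The name parse_decimal is hypothetical, and treating any negative zero (for example -0.0 as well as -0) as invalid is an assumption.

```python
import re
from decimal import Decimal

# Integer part: 0 or a non-zero digit followed by digits; optional fraction.
_DECIMAL_RE = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?")


def parse_decimal(field: str):
    """Return a Decimal, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    if not _DECIMAL_RE.fullmatch(text):
        raise ValueError(f"invalid decimal literal: {field!r}")
    value = Decimal(text)  # exact; no binary floating-point involved
    # Assumption: every spelling of negative zero is rejected.
    if value.is_zero() and text.startswith("-"):
        raise ValueError(f"negative zero is not allowed: {field!r}")
    return value
```

Converting the result back with str() returns the literal as written, so 42.0 stays 42.0 rather than collapsing to 42.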
bool
Accepted values (case-insensitive for textual):
- true, false
- 1, 0
- _
Only these forms are valid. All other textual or numeric representations (e.g. yes, no, t, f, 2, 01) are invalid.
Values must not be quoted.
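The accepted bool forms are few enough to check with a lookup. The following is a sketch under the rules above; parse_bool is a hypothetical name.

```python
_BOOL_FORMS = {"true": True, "false": False, "1": True, "0": False}


def parse_bool(field: str):
    """Return True/False, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    try:
        # Textual forms are case-insensitive; "1" and "0" are unaffected.
        return _BOOL_FORMS[text.lower()]
    except KeyError:
        raise ValueError(f"invalid bool literal: {field!r}") from None
```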
string
UTF-8 text.
Unquoted strings are permitted only if, after trimming leading and trailing whitespace, the resulting value contains none of:
- , # [ ] ( ) < > { }
- " ' ` ; : = ?
- / \ | @
Unquoted strings may contain internal spaces. Any leading or trailing whitespace in an unquoted string is ALWAYS trimmed. If leading or trailing whitespace must be preserved, a quoted string must be used.
If any disallowed character appears, the value must be quoted.
The null value is represented by _ as an unquoted field.
Quoted strings use the standard double-quote form:
- "value"
- A double quote inside a quoted string is written as two double quotes ("")
- "_" inside quotes is a string, not null
Rules:
- Null literal _ allowed
WARNING: Unquoted strings must never contain ( or ). Writing e.g. Alice (née Smith) will cause (née Smith) to be parsed as comment syntax rather than string content, so the intended literal is not preserved. Quote it instead as "Alice (née Smith)".
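From the writer's side, the quoting rules above amount to a single decision: can the value be emitted bare, or must it be quoted? The sketch below shows one way to make that decision. The name encode_string is hypothetical, and quoting the empty string is an assumption the rules do not address.

```python
# Characters that force quoting, taken from the list above.
_DISALLOWED = set(',#[](){}<>"\'`;:=?/\\|@')


def encode_string(value):
    """Encode a Python str (or None) as a SuperCSV string field."""
    if value is None:
        return "_"  # unquoted underscore is the null literal
    needs_quotes = (
        value == ""                 # assumption: empty strings are quoted
        or value == "_"             # would otherwise read back as null
        or value != value.strip()   # whitespace would be trimmed if unquoted
        or any(ch in _DISALLOWED for ch in value)
    )
    if not needs_quotes:
        return value
    # Standard double-quote form: embedded quotes are doubled.
    return '"' + value.replace('"', '""') + '"'
```

For example, encode_string("Alice (née Smith)") returns the quoted form, which keeps the parentheses from being read as comment syntax.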
Valid:
Bob
alpha-2
v1.0.3
Bob_the_Builder
_
"Bob, the Builder"
" #hash "
"She said ""hi"""
"Alice (née Smith)" # correct form when value contains parens
Invalid:
Bob, the Builder
foo(bar # ( not followed by valid comment block, parse error
foo)bar # ) with no preceding (, parse error
"unterminated
bytes<hex>
A hexadecimal value.
- [0-9A-Fa-f]+
- No prefix (0x, \x, etc.)
- Null literal _ allowed
Valid:
00
deadbeef
CAFEBABE
0123456789abcdef
_
Invalid:
0xDEADBEEF # prefix not allowed
abc # odd length
"deadbeef" # quoted
ghij # invalid hex digits
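A short sketch of a bytes<hex> reader consistent with the examples above. The name parse_bytes_hex is hypothetical. The pattern is checked before decoding because Python's bytes.fromhex tolerates embedded whitespace, which this format gives no indication of allowing.

```python
import re

# Whole bytes only: an even number of hex digits, no prefix.
_HEX_RE = re.compile(r"(?:[0-9A-Fa-f]{2})+")


def parse_bytes_hex(field: str):
    """Return bytes, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    if not _HEX_RE.fullmatch(text):
        raise ValueError(f"invalid bytes<hex> literal: {field!r}")
    return bytes.fromhex(text)
```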
bytes<b64>
A base64 value.
- The URL-safe alphabet (- and _) is not allowed in v1; the permitted alphabet is the standard Base 64 alphabet defined in RFC 4648 §4
- Null literal _ allowed
Valid:
aGVsbG8=
YWJjZGVm
AQIDBAUGBwgJ
_
Invalid:
hello world! # not base64
abc=== # invalid padding
"YWJjZGVm" # quoted
a-b_c # URL-safe form not allowed in SuperCSV
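The following sketch decodes a bytes<b64> field using the standard alphabet only. The name parse_bytes_b64 is hypothetical; requiring padding and rejecting an empty field are assumptions, since the surviving rules mention only the alphabet.

```python
import base64
import re

# Standard RFC 4648 section 4 alphabet, groups of four, padded final group.
_B64_RE = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)


def parse_bytes_b64(field: str):
    """Return bytes, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    # Assumption: an empty field is not a valid bytes<b64> value.
    if not text or not _B64_RE.fullmatch(text):
        raise ValueError(f"invalid bytes<b64> literal: {field!r}")
    return base64.b64decode(text, validate=True)
```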
date
ISO-8601 calendar date with full calendar validation.
- YYYY-MM-DD or YYYY/MM/DD
- Invalid calendar dates are rejected (2025-02-30, 2023-11-31, non-leap 2023-02-29)
- Null literal _ allowed
Valid:
2025-01-05
1999-12-31
2024-02-29 # leap year
2025/01/05
1999/12/31
_
Invalid:
2025-02-30 # February has 28 or 29 days
2023-11-31 # November has 30 days
2023-02-29 # not a leap year
"2025-01-05" # quoted
05/01/2025 # wrong format
2025.01.05 # wrong separators
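Calendar validation can be delegated to the standard library once the surface syntax has been checked. The sketch below uses the hypothetical name parse_date and assumes both separators in a value must be the same character, which the format line suggests but does not state.

```python
import re
from datetime import date

# Four-digit year, then month and day, with one consistent separator.
_DATE_RE = re.compile(r"([0-9]{4})([-/])([0-9]{2})\2([0-9]{2})")


def parse_date(field: str):
    """Return a datetime.date, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    match = _DATE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid date literal: {field!r}")
    year, _separator, month, day = match.groups()
    # date() raises ValueError for impossible dates such as 2025-02-30.
    return date(int(year), int(month), int(day))
```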
time
ISO-8601 local time.
Fractional seconds allowed (0–9 digits).
No timezone.
Values must not be quoted.
Null literal _ allowed.
Valid:
14:30:00
23:59:59.123
08:15:42.987654321
_
Invalid:
"14:30:00"
14:30
datetime
ISO-8601 datetime.
Date may use - or /. Separator may be space or T.
Fractional seconds allowed.
Values must not be quoted.
Null literal _ allowed.
Syntax: YYYY-MM-DD[ T | space ]HH:MM:SS[.fraction]
Valid:
2025-01-05 14:30:00
2025-01-05T14:30:00
2025/01/05 14:30:00.123
_
Invalid:
"2025-01-05T14:30:00"
2025-01-05T14:30Z
2025-01-05T14:30:00+13:00:00
datetimetz
ISO-8601 datetime with required timezone.
Date may use - or /. Separator may be space or T.
Fractional seconds allowed.
Timezone required (Z or ±HH:MM, no seconds).
Values must not be quoted.
Null literal _ allowed.
Syntax: YYYY-MM-DD[ T | space ]HH:MM:SS[.fraction](Z | ±HH:MM)
Valid:
2025-01-05T14:30:00Z
2025/01/05 14:30:00.123Z
2025/01/05 14:30:00.123-05:00
2025-01-05T14:30:00+13:00
_
Invalid:
"2025-01-05T14:30:00"
2025-01-05 14:30:00
2025-01-05T14:30
2025-01-05T14:30:00+13:00:00
timestamp
ISO-8601 timestamp with optional timezone.
Date may use - or /. Separator may be space or T.
Fractional seconds allowed.
Timezone optional (Z or ±HH:MM, no seconds).
Values must not be quoted.
Null literal _ allowed.
Syntax: YYYY-MM-DD[ T | space ]HH:MM:SS[.fraction][(Z | ±HH:MM)]
Valid:
2025-01-05 14:30:00
2025-01-05T14:30:00Z
2025/01/05 14:30:00.123
_
Invalid:
"2025-01-05T14:30:00"
2025-01-05T14:30
2025-01-05T14:30:00+13:00:30
duration
ISO-8601 duration (days, hours, minutes, seconds).
No years, months, or weeks.
Fractional seconds allowed only on seconds (1–9 digits).
Values must not be quoted.
Null literal _ allowed.
P2D
PT1H30M
PT1.5S
_
"PT1H"
P1Y
PT1M30.5S
timezone
Stores an IANA timezone identifier as text (e.g. UTC, America/New_York, Pacific/Auckland).
Follows string quoting rules — it is the only type other than string that may be quoted or unquoted.
Null literal _ allowed.
Implementations SHOULD store only valid IANA timezone identifiers. Files that store arbitrary strings in a timezone column may break where strict IANA validation is enforced.
Valid:
UTC
"Pacific/Auckland"
_
Invalid:
"UTC
uuid
Canonical RFC-4122 UUID.
Values must not be quoted.
Null literal _ allowed.
Valid:
550e8400-e29b-41d4-a716-446655440000
f47ac10b-58cc-4372-a567-0e02b2c3d479
_
Invalid:
"550e8400-e29b-41d4-a716-446655440000"
550e8400e29b41d4a716446655440000
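Python's uuid.UUID constructor accepts several spellings, including the hyphen-free form shown above as invalid, so a reader needs its own shape check first. This is a sketch with the hypothetical name parse_uuid; accepting upper-case hex digits is an assumption, as the text only says "canonical".

```python
import re
import uuid

# Canonical 8-4-4-4-12 grouping with hyphens.
_UUID_RE = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def parse_uuid(field: str):
    """Return a uuid.UUID, None for the null literal, or raise ValueError."""
    text = field.strip()
    if text == "_":
        return None
    if not _UUID_RE.fullmatch(text):
        raise ValueError(f"invalid uuid literal: {field!r}")
    return uuid.UUID(text)
```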
list<T>
1D semantic collection.
Rules:
- Null literal _ allowed at list level
- T must be a ScalarType or EnumType
- Empty list [] allowed (only for dynamic-size lists)
- Null literal _ allowed at list item level
Form examples:
list<T> # dynamic-size
list<T>[3] # fixed-size
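The two list forms differ only in whether the item count is checked. The sketch below illustrates that with a hypothetical parse_list helper that takes the item parser as a parameter. It splits items on plain commas, so it is an assumption-laden simplification that does not handle quoted string items containing commas.

```python
def parse_list(field: str, parse_item, size=None):
    """Parse a list field; size=None means a dynamic-size list.

    Returns a Python list (items may be None), None for a null list,
    or raises ValueError.
    """
    text = field.strip()
    if text == "_":
        return None  # null at list level
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError(f"invalid list literal: {field!r}")
    inner = text[1:-1]
    if "[" in inner or "]" in inner:
        raise ValueError("containers cannot nest")
    if inner.strip() == "":
        items = []
    else:
        items = [
            None if part.strip() == "_" else parse_item(part.strip())
            for part in inner.split(",")
        ]
    if size is not None and len(items) != size:
        raise ValueError(f"expected {size} items, got {len(items)}")
    return items
```

For a list<int>[3] column, parse_list("[_,5,6]", int, size=3) yields a three-item list whose first entry is None, while [] is rejected because its length is not 3.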
Header: list<string>
Valid:
[red,green,blue]
[]
[_,green,_]
Invalid:
"[red,green]"
"[]"
[red, [blue]]
Header: list<int>[3]
Valid:
[1,2,3]
[_,5,6]
Invalid:
[1,2] # wrong size
[1,2,3,4] # wrong size
"[1,2,3]" # quoted
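arr<T>
Structured 1D or 2D array.
Rules:
- Null literal _ allowed at array level
- T must be a ScalarType or EnumType
- TT[])
- Zero-column 2D arrays are allowed ([[]], [[],[]])
- Fixed-size 2D arrays (arr<T>[R,C]) require R > 0 and C > 0
- Fixed-size 1D arrays (arr<T>[N]) require N > 0
Form examples: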
arr<T>
arr<T>[N]
arr<T>[R,C]
[N][...] and [R,C][[...]]: the prefix appears as a separate bracket group before the value brackets and defines the array size for 1D and 2D arrays respectively. R and C are positive integers giving the row and column counts.
1D example: [3][1,2,3]
2D example: [2,3][[1,2,3],[4,5,6]]
Valid:
[1,2,3,4]
[_,2,_]
_ # array is null
[] # valid only for arr<T>[]
[[1,2,3],[4,5,6],[7,8,9]]
[[true,false],[false,true]]
[[]] # zero-column 2D
[[],[]] # zero-column 2D with 2 rows
_ # array is null
[] # valid only for arr<T>[]
[3][1,2,3]
[2,3][[1,2,3],[4,5,6]]
Invalid:
"[[1,2],[3,4]]" # quoted
[[1],[2,3]] # ragged
Header: arr<int>
Valid:
[1,2,3] # 1D
[] # empty allowed
[_,5,_] # 1D with nulls
[[1,2],[3,4]] # 2D
[[]] # 2D zero-column
[3][1,2,3] # 1D prefix
[2,3][[1,2,3],[4,5,6]] # 2D prefix
Invalid:
"[]" # quoted
[1,2,] # trailing comma
[[1],[2,3]] # ragged 2D
[[[1]]] # 3D not allowed
Header: arr<bool>[3]
Valid:
[true,false,true]
[_,_,_]
Invalid:
[] # empty not allowed
[true,false] # wrong size
[[true,false]] # 2D not allowed
Header: arr<int>[2,3]
Valid:
[[1,2,3],[4,5,6]]
[ [_,_,_], [_,_,_] ]
Invalid:
[] # empty not allowed
[[1,2,3]] # wrong row count
[[1,2],[3,4]] # wrong column count
[[1,2,3],[4,5]] # ragged
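The shape rules in the examples above (no ragged rows, at most two dimensions, declared sizes enforced) can be sketched as follows. The names parse_arr, _parse_row, and _split_top_level are hypothetical. The sketch covers plain bracket values only: it does not handle the [N] or [R,C] size-prefix form, and it splits items on plain commas, so quoted string items containing commas are out of scope.

```python
def _split_top_level(inner: str):
    """Split on commas that are not inside a nested bracket pair."""
    parts, depth, start = [], 0, 0
    for index, ch in enumerate(inner):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(inner[start:index])
            start = index + 1
    parts.append(inner[start:])
    return [part.strip() for part in parts]


def _parse_row(text: str, parse_item):
    """Parse one flat bracket group such as [1,_,3]."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError(f"expected a bracketed row, got {text!r}")
    inner = text[1:-1].strip()
    if inner == "":
        return []
    if "[" in inner or "]" in inner:
        raise ValueError("arrays may have at most two dimensions")
    return [None if p == "_" else parse_item(p) for p in _split_top_level(inner)]


def parse_arr(field: str, parse_item, shape=None):
    """Parse a 1D or 2D array; shape is (N,), (R, C), or None for dynamic."""
    text = field.strip()
    if text == "_":
        return None  # null at array level
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError(f"invalid array literal: {field!r}")
    inner = text[1:-1].strip()
    if inner.startswith("["):  # 2D: a list of rows
        value = [_parse_row(part, parse_item) for part in _split_top_level(inner)]
        if len({len(row) for row in value}) != 1:
            raise ValueError("ragged 2D array")
        actual = (len(value), len(value[0]))
    else:  # 1D
        value = _parse_row(text, parse_item)
        actual = (len(value),)
    if shape is not None and actual != tuple(shape):
        raise ValueError(f"expected shape {tuple(shape)}, got {actual}")
    return value
```

With shape=(2, 3), the value [[1,2,3],[4,5,6]] is accepted, while [[1,2,3]] and [[1,2],[3,4]] fail the shape comparison and [[1,2,3],[4,5]] fails the ragged check.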
enum<...>
An inline EnumType whose EnumItems define the complete set of allowed EnumValues.
Rules:
- _ in header definitions: _ is not a valid Identifier and cannot be used as an enum name or value; it is reserved as the null literal
- _ in data rows: _ is always valid as the null literal for any enum column; it does not need to be declared
Lookup order: name is checked first; if no name matches, values are checked in declaration order and the first match is used.
Encoding: encoders MUST use name-form when values are not unique; encoders MAY use value-form only when all values are unique.
enum<name1,name2,name3,...>
enum<value1=name1,value2=name2,value3=name3,...>
These two forms are mutually exclusive. enum<name1,value2=name2> is invalid.
| Definition | What it shows | Example valid data values | Example invalid data values |
|---|---|---|---|
| enum<low,medium,high> | Name-only form | low, HIGH, _ | 0, "low" |
| enum<0=low,1=medium,2=high> | Numeric values | low, 0, _ | 3, "low" |
| enum<L=low,M=medium,H=high> | Identifier values | low, L, _ | low2, "L" |
| Definition | Reason |
|---|---|
| enum<low,1=medium,high> | Mixed style: name-only and value=name cannot be combined |
| enum<low,low,high> | Duplicate name |
| enum<LOW,low,high> | Duplicate name (case-insensitive) |
| enum<ACT=ACTIVE,ACTIVE=WORK> | Cross-pair collision: value ACTIVE (item 2) matches name ACTIVE (item 1) |
enum<0=ERROR,0=FAILURE,1=OK>
Valid — values may duplicate. Names (ERROR, FAILURE, OK) remain unique.
No value collides with any name (0 and 1 do not appear in the name set).
When a data value matches multiple items by value, the first matching item in declaration order is used — e.g. 0 matches to ERROR in this example, not FAILURE.
enum<ERROR=ERROR,1=FAILURE>
Valid — the value ERROR matches only its own item's name. A value matching its own item's name is explicitly allowed.
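The definition rules and the lookup order described above can be sketched together. The names parse_enum_items and lookup_enum are hypothetical. Matching values case-insensitively is an assumption: the text states case-insensitivity for names only.

```python
def parse_enum_items(body: str):
    """Parse the text between 'enum<' and '>' into (value, name) pairs.

    value is None for the name-only form. Raises ValueError for the
    invalid definitions listed above.
    """
    parts = [part.strip() for part in body.split(",")]
    with_value = ["=" in part for part in parts]
    if any(with_value) and not all(with_value):
        raise ValueError("name-only and value=name items cannot be mixed")
    items = []
    for part in parts:
        if "=" in part:
            value, name = (piece.strip() for piece in part.split("=", 1))
        else:
            value, name = None, part
        if not name or name == "_" or value in ("", "_"):
            raise ValueError(f"invalid enum item: {part!r}")
        items.append((value, name))
    names = [name.lower() for _, name in items]
    if len(set(names)) != len(names):
        raise ValueError("duplicate name (case-insensitive)")
    for index, (value, _) in enumerate(items):
        if value is None:
            continue
        for other, other_name in enumerate(names):
            # A value may equal its own item's name, but not another item's.
            if other != index and value.lower() == other_name:
                raise ValueError("value collides with another item's name")
    return items


def lookup_enum(items, field: str):
    """Resolve a data value to an item name: names first, then values."""
    text = field.strip()
    if text == "_":
        return None
    key = text.lower()
    for _, name in items:
        if name.lower() == key:
            return name
    for value, name in items:  # declaration order; first match wins
        if value is not None and value.lower() == key:
            return name
    raise ValueError(f"not a member of the enum: {field!r}")
```

With the definition 0=ERROR,0=FAILURE,1=OK, looking up 0 resolves to ERROR because the value pass walks the items in declaration order.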
Type aliases provide alternate names for types.
| Canonical | Small | Tiny | Notes |
|---|---|---|---|
| int | int | i | integer number |
| float | flt | f | IEEE-754 float |
| decimal | dec | d | arbitrary-precision decimal |
| bool | bl | b | boolean true/false/1/0 |
| string | str | s | UTF-8 text with quoting rules |
| bytes<hex> | hex | bx | hex-encoded bytes |
| bytes<b64> | b64 | b6 | base64-encoded bytes |
| date | dat | da | ISO-8601 calendar date |
| time | tm | t | ISO-8601 local time |
| datetime | dt | dt | ISO-8601 datetime |
| datetimetz | dtz | dtz | ISO-8601 datetime tz |
| timestamp | ts | ts | ISO-8601 timestamp |
| duration | dur | du | ISO-8601 duration |
| timezone | tz | z | IANA timezone identifier |
| uuid | uu | u | RFC-4122 UUID |
These aliases apply to the prefix of the type expression.
| Canonical | Small | Tiny | Notes |
|---|---|---|---|
| enum<…> | en<…> | e<…> | inline enumeration type |
| list<T> | li<T> | l<T> | 1D semantic list of T |
| arr<T> | ar<T> | a<T> | 1D or 2D structured array of T |
The following examples illustrate how canonical, small, and tiny aliases may be used in header rows. All three headers describe the same header definition.
Id:int, Name:string, Base:arr<float>
Id:int, Name:str, Base:ar<flt>
Id:i, Name:s, Base:a<f>
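To show that the three headers above are equivalent, the sketch below normalizes a single type expression to its canonical spelling using the two alias tables. The name canonical_type is hypothetical, and the sketch assumes alias matching is case-sensitive, which the tables do not state.

```python
import re

_SCALAR_TABLE = [
    ("int", "int", "i"), ("float", "flt", "f"), ("decimal", "dec", "d"),
    ("bool", "bl", "b"), ("string", "str", "s"),
    ("bytes<hex>", "hex", "bx"), ("bytes<b64>", "b64", "b6"),
    ("date", "dat", "da"), ("time", "tm", "t"), ("datetime", "dt", "dt"),
    ("datetimetz", "dtz", "dtz"), ("timestamp", "ts", "ts"),
    ("duration", "dur", "du"), ("timezone", "tz", "z"), ("uuid", "uu", "u"),
]
_PREFIX_TABLE = [("enum", "en", "e"), ("list", "li", "l"), ("arr", "ar", "a")]

# Map every spelling (canonical, small, tiny) to the canonical one.
_SCALARS = {alias: row[0] for row in _SCALAR_TABLE for alias in row}
_PREFIXES = {alias: row[0] for row in _PREFIX_TABLE for alias in row}

# prefix<argument> with an optional size suffix such as [3] or [2,3].
_PARAM_RE = re.compile(r"([A-Za-z0-9]+)<(.*)>(\[[0-9, ]*\])?")


def canonical_type(expr: str) -> str:
    """Return the canonical spelling of a type expression."""
    expr = expr.strip()
    if expr in _SCALARS:
        return _SCALARS[expr]
    match = _PARAM_RE.fullmatch(expr)
    if match is None or match.group(1) not in _PREFIXES:
        raise ValueError(f"unknown type expression: {expr!r}")
    prefix = _PREFIXES[match.group(1)]
    argument = match.group(2)
    size = match.group(3) or ""
    if prefix != "enum":
        # The element type of list/arr may itself be written with an alias.
        argument = canonical_type(argument)
    return f"{prefix}<{argument}>{size}"
```

Under these assumptions, a<f>, ar<flt>, and arr<float> all normalize to arr<float>, matching the Base column in the three example headers.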