Grammar Reference
This page is the complete formal grammar for the wirespec language in Extended Backus-Naur Form (EBNF).
EBNF Notation
| Notation | Meaning |
|---|---|
{ x } | Zero or more repetitions of x |
[ x ] | Optional x (zero or one) |
| `x | y` |
"text" | Literal keyword or punctuation |
NAME | Identifier token (uppercase = terminal) |
; | Rule terminator |
Top-Level Declarations
A wirespec source file is a sequence of annotations, module/import declarations, and top-level items in any order.
file = { annotation | module_decl | import_decl | top_item } ;
module_decl = "module" dotted_name ;
import_decl = "import" dotted_name [ "." NAME ] ;
top_item = const_def | enum_def | flags_def | type_def
| packet_def | frame_def | capsule_def | static_assert_def ;The module_decl declares the module identity of the current file (e.g., module quic.varint). The import_decl brings a specific name or entire module into scope (e.g., import quic.varint.VarInt).
Definitions
Constants, Enums, and Flags
const_def = "const" NAME ":" type_ref "=" literal ;
enum_def = "enum" NAME ":" type_ref "{" enum_member { "," enum_member } "}" ;
enum_member = NAME "=" literal ;
flags_def = "flags" NAME ":" type_ref "{" flags_member { "," flags_member } "}" ;
flags_member = NAME "=" literal ;
static_assert_def = "static_assert" expr ;enum defines named integer constants with a fixed underlying type. flags is identical in syntax but signals that values are bit-flags intended for bitwise combination. static_assert evaluates a compile-time expression; failure is a compile error.
Type Definitions
type_def = { annotation } "type" NAME "=" type_body ;
type_body = "{" field_list "}" | type_expr | varint_block ;
varint_block = "varint" "{" varint_param { "," varint_param } "}" ;
varint_param = NAME ":" ( NAME | INTEGER ) ;type Name = TypeExpr is a type alias — it introduces no new wire layout. type Name = { ... } (with braces) defines a computed/dependent type with its own field structure. varint { ... } defines a continuation-bit variable-length integer encoding (e.g., MQTT Remaining Length, LEB128).
Packet Definitions
packet_def = { annotation } "packet" NAME "{" field_list "}" ;A packet is a fixed-structure binary record. Fields are parsed in order. The generated C struct contains one member per wire field, plus bool has_X / optional-presence fields for conditional fields.
Frame Definitions
frame_def = { annotation } "frame" NAME "="
"match" NAME ":" type_ref "{" frame_branch { frame_branch } "}" ;
frame_branch = pattern "=>" NAME "{" field_list "}"
| pattern "=>" NAME "{}" ;A frame is a tagged union dispatched on a leading discriminator field. Each branch is matched against patterns on the tag value. The _ wildcard branch is required for exhaustiveness unless all values are covered.
Capsule Definitions
capsule_def = { annotation } "capsule" NAME "{"
{ field "," }
"payload" ":" "match" capsule_tag "within" NAME
"{" frame_branch { frame_branch } "}" "}" ;
capsule_tag = NAME | "(" expr ")" ;A capsule is a TLV (Type-Length-Value) container. Header fields appear before payload. The within NAME clause names a length field that establishes a byte-bounded sub-scope for the payload. The tag may be a field name or a parenthesized expression derived from header fields.
Fields
Fields appear inside packet, frame branch, capsule header, and computed type bodies.
field_list = { field_item [ "," ] } ;
field_item = field | derived_field | require_clause ;
field = { annotation } NAME ":" type_expr ;
derived_field = "let" NAME ":" type_ref "=" expr ;
require_clause = "require" expr ;| Kind | Syntax | On wire? | In struct? |
|---|---|---|---|
| Wire field | name: T | Yes | Yes |
| Optional field | name: if COND { T } | Conditional | bool has_name; T name; |
| Derived field | let name: T = expr | No | Yes |
| Validation | require EXPR | No | No (runtime check) |
Fields are visible to fields declared below them in the same scope (top-to-bottom ordering). Forward references are a compile error.
Type Expressions
type_expr = type_ref
| "match" NAME "{" match_branch { "," match_branch } "}"
| "if" expr "{" type_expr "}"
| "[" type_expr ";" array_size "]"
| "bytes" "[" bytes_spec "]" ;
match_branch = pattern "=>" type_expr ;
array_size = expr | "fill" ;
bytes_spec = INTEGER
| "remaining"
| "length" ":" expr
| "length_or_remaining" ":" expr ;The bytes_spec variants:
| Variant | Meaning |
|---|---|
bytes[N] | Fixed N bytes |
bytes[remaining] | Consume all remaining bytes in current scope |
bytes[length: EXPR] | EXPR bytes (EXPR must be a non-optional integer-like type) |
bytes[length_or_remaining: EXPR] | EXPR bytes if EXPR is non-null, otherwise remaining (EXPR must be Option[integer-like]) |
remaining and [T; fill] are keywords, not expressions. They must appear as the final wire field in their scope.
Patterns
Patterns appear in frame tag matches and inline match type expressions.
pattern = literal | literal "..=" literal | "_" ;Range semantics (unified with Rust):
| Operator | Meaning | Context |
|---|---|---|
..= | Inclusive range [start, end] | Patterns only |
.. | Half-open range [start, end) | Slices, quantifiers (exists i in 0..N) |
Examples:
0x02..=0x03— matches values 0x02 and 0x03 (inclusive)_— wildcard, matches anything (must be last branch)paths[0..count]— slice expression, half-open range
Expressions
Expressions follow a precedence hierarchy from lowest to highest binding. Important: bitwise operators bind tighter than comparison operators — the opposite of C/Java. This means a & mask == 0 is parsed as (a & mask) == 0, which matches programmer intent.
expr = coalesce_expr ;
coalesce_expr = or_expr [ "??" or_expr ] ;
or_expr = and_expr { "or" and_expr } ;
and_expr = compare_expr { "and" compare_expr } ;
compare_expr = bitor_expr [ cmp_op bitor_expr ] ;
cmp_op = "==" | "!=" | "<" | "<=" | ">" | ">=" ;
bitor_expr = bitxor_expr { "|" bitxor_expr } ;
bitxor_expr = bitand_expr { "^" bitand_expr } ;
bitand_expr = shift_expr { "&" shift_expr } ;
shift_expr = add_expr { ( "<<" | ">>" ) add_expr } ;
add_expr = mul_expr { ( "+" | "-" ) mul_expr } ;
mul_expr = unary_expr { ( "*" | "/" | "%" ) unary_expr } ;
unary_expr = ( "!" | "-" ) unary_expr | postfix_expr ;
postfix_expr = primary_expr { "." NAME | "[" expr "]" | "[" expr ".." expr "]" } ;
primary_expr = literal
| qualified_name [ "(" [ expr { "," expr } ] ")" ]
| "fill" "(" expr "," expr ")"
| "(" expr ")" ;
qualified_name = NAME { "::" NAME } ;Precedence table (low to high):
| Level | Operator(s) | Notes |
|---|---|---|
| 1 (lowest) | ?? | Null-coalescing for Option[T] |
| 2 | or | Logical OR |
| 3 | and | Logical AND |
| 4 | == != < <= > >= | Comparison |
| 5 | | | Bitwise OR |
| 6 | ^ | Bitwise XOR |
| 7 | & | Bitwise AND |
| 8 | << >> | Bit shift |
| 9 | + - | Addition, subtraction |
| 10 | * / % | Multiplication, division, modulo |
| 11 | ! - (unary) | Logical NOT, negation |
| 12 (highest) | . [i] [a..b] | Member access, subscript, slice |
The ?? operator unwraps Option[T]: optional_field ?? default_value. The fill(value, count) built-in initializes an array. State constructors use :: qualification: PathState::Active(pid, 0, 0).
Annotations
Annotations appear before definitions and fields. They convey metadata to the compiler and code generator.
annotation = "@" NAME [ "(" annotation_arg { "," annotation_arg } ")" ]
| "@" NAME NAME
| "@" NAME STRING ;
annotation_arg = NAME "=" literal | NAME | literal ;Examples:
@endian big— module-level or field-level byte order@doc("RFC 9000 Section 17")— documentation string@strict— reject non-canonical encodings at runtime@checksum(internet)— RFC 1071 Internet Checksum verification@max_len(1024)— per-field array capacity override
Terminals
literal = INTEGER | HEX | BINARY | STRING | "true" | "false" | "null" ;
type_ref = PRIMITIVE | NAME ;
dotted_name = NAME { "." NAME } ;
PRIMITIVE = "u8" | "u16" | "u32" | "u64"
| "u16be" | "u16le"
| "u32be" | "u32le"
| "u64be" | "u64le"
| "u24" | "u24be" | "u24le"
| "i8" | "i16" | "i32" | "i64"
| "bit"
| "bits" "[" INTEGER "]" ;Integer literals: decimal (42), hexadecimal (0x2a), binary (0b00101010). null is the built-in absent value for Option[T]. bool is a reserved semantic type name (not a PRIMITIVE); it appears only in let bindings and guard conditions.
State Machine Grammar (Extended)
State machines are defined at the top level and compose into hierarchical protocol automata. This grammar extends the core with state machine, transition, and verify declarations.
State Machine Structure
top_item += state_machine_def ;
state_machine_def = "state" "machine" NAME "{" { sm_item } "}" ;
sm_item = state_decl | initial_decl | transition_def | verify_def ;
state_decl = "state" NAME [ "{" sm_field_list "}" ] [ "[" "terminal" "]" ] ;
sm_field_list = { NAME ":" type_expr [ "=" literal ] "," } ;
initial_decl = "initial" NAME ;States may carry associated data fields with optional default values. The [terminal] marker designates a state that, once reached, has no outgoing transitions. The initial declaration names the starting state.
Transitions
transition_def = "transition" ( NAME | "*" ) "->" NAME
"{" { transition_clause } "}" ;
transition_clause = "on" NAME [ "(" param_list ")" ]
| "guard" expr
| "action" "{" assignment { ";" assignment } [ ";" ] "}"
| "delegate" postfix_expr "<-" NAME ;
assignment = postfix_expr ( "=" | "+=" ) expr ;
param_list = NAME ":" type_expr { "," NAME ":" type_expr } ;Clause multiplicity (semantic, not syntactic):
| Clause | Occurrences | Constraint |
|---|---|---|
on | 1 or more | Multiple events may share a transition |
guard | 0 or 1 | Expression over src fields only |
action | 0 or 1 | Cannot coexist with delegate |
delegate | 0 or 1 | Self-transitions (S -> S) only; cannot coexist with action |
The * wildcard matches any source state. Concrete source states take precedence over *. Duplicate (state, event) pairs are a compile error.
Inside action blocks: src is the read-only source state data; dst is the write-only destination state data. Semicolons are required between assignments.
The delegate clause routes an event to a child state machine field: delegate src.paths[path_id] <- event. The child is updated in-place on a copy of src; if the child's state changes, child_state_changed is re-dispatched to the parent.
Verify Declarations
verify_def = "verify" NAME
| "verify" "property" NAME ":" vfy_leads_to ;verify NAME invokes a named built-in property (e.g., verify NoDeadlock, verify AllReachClosed). verify property NAME: FORMULA defines a named temporal logic property.
Verify Formula Grammar
Verify formulas express temporal logic properties checked by the bounded TLA+ verifier (Phase 3+). Precedence runs from lowest (~>) to highest (atoms).
vfy_leads_to = vfy_implies { "~>" vfy_implies } ;
vfy_implies = vfy_or { "->" vfy_or } ;
vfy_or = vfy_and { "or" vfy_and } ;
vfy_and = vfy_unary { "and" vfy_unary } ;
vfy_unary = "not" vfy_unary
| "[]" vfy_unary
| "<>" vfy_unary
| vfy_atom ;
vfy_atom = postfix_expr "in_state" "(" NAME ")"
| "in_state" "(" NAME ")"
| postfix_expr cmp_op expr
| "all" "(" postfix_expr "," vfy_unary ")"
| "exists" NAME "in" expr ".." expr ":" vfy_unary
| "(" vfy_leads_to ")" ;Temporal operators:
| Operator | Meaning |
|---|---|
[] | Always (box — holds in all future states) |
<> | Eventually (diamond — holds in some future state) |
~> | Leads-to (P ~> Q means whenever P holds, Q eventually holds) |
-> | Implies (logical implication, non-temporal) |
Built-in predicates:
| Predicate | Meaning |
|---|---|
in_state(S) | Current state is S |
all(collection, predicate) | All elements of collection satisfy predicate |
exists i in 0..N: formula | There exists i in [0, N) satisfying the formula |
State-dependent field access rules: A field declared in state S may only appear in a sub-formula that is logically dominated by in_state(S). Temporal operators [] and <> break dominance. Unguarded state-specific field access is a compile error.
# OK: dominated by in_state(Connected)
verify property Bounded:
[] (in_state(Connected) -> active_path_count <= 4)
# OK: in_state guards the field reference
verify property NonEmpty:
in_state(Connected) and active_path_count > 0 -> ...
# ERROR: [] breaks dominance — field not accessible
# in_state(Connected) -> [] active_path_count <= 4In verify formulas, bare field names refer to current state data. src and dst are not available (those are transition-only bindings).
Extended Grammar: Future Features
These productions appear in the spec for completeness but are not yet implemented.
ASN.1 Integration (Phase 2.5+)
top_item += "extern" "asn1" STRING "{" { NAME "," } "}" ;
field += NAME ":" "asn1" "(" NAME "," "encoding" ":" NAME ")" ;Parametric Types (Future)
type_def = { annotation } "type" NAME [ "(" param_list ")" ] "=" type_body ;
packet_def = { annotation } "packet" NAME [ "(" param_list ")" ] "{" field_list "}" ;Parametric types are deferred pending concrete use cases. The non-parametric forms are the current production grammar.