hash.huffman
Constants #
const max_flat_bits = 18
max_flat_bits is the largest max_bits for which flat_table() will allocate a table (2^max_bits entries). DEFLATE's 15 fits; HPACK's 30 does not and must use decode_map() / a bit-at-a-time decoder instead.
const flat_invalid_entry = u32(00xffffffff)
flat_invalid_entry marks a flat_table() slot that no code maps to.
const flat_length_bits = 5
flat_length_bits is how many low bits of a flat_table() entry hold the code length; the symbol is stored in the remaining high bits. 5 bits hold lengths up to 31, covering every max_bits <= max_flat_bits.
fn build #
fn build(cfg Config) !Table
build assigns a canonical Huffman code to every symbol from its bit length. Symbols with length 0 are unused and get code 0. It returns the codes plus the metadata for a decode structure; use it when you need the per-symbol codes (e.g. HPACK encoding) and/or decode_map(). Callers that only want a flat decode table should use flat_table(), which avoids materializing the codes array. See next_codes() for the validation rules.
fn flat_table #
fn flat_table(cfg Config) ![]u32
flat_table builds a 2^max_bits lookup table for fast decode of short codes (the DEFLATE strategy), directly from the lengths. Each entry is (symbol << flat_length_bits) | length, or flat_invalid_entry for indices no code reaches. The table is indexed by the next max_bits bits read from the stream in bit_order. It returns an error if max_bits > max_flat_bits, since the table would be prohibitively large (use build() + decode_map() instead).
Unlike build(), this allocates no per-symbol codes array and does not copy the lengths: it assigns each code inline while filling the table, so a hot caller rebuilding a tree per block (compress.deflate) pays no extra allocation over hand-rolling the loop.
fn BitOrder.from #
fn BitOrder.from[W](input W) !BitOrder
enum BitOrder #
enum BitOrder {
// msb_first keeps the canonical code as-is: the first transmitted bit is
// the most-significant bit of the code (RFC 7541 / HPACK).
msb_first
// lsb_first reverses each code within its length, so the first transmitted
// bit is the least-significant bit (RFC 1951 / DEFLATE). This is the form a
// flat LSB-first decode table is indexed by.
lsb_first
}
BitOrder selects how a code's bits are laid out in the returned u32.
struct Config #
struct Config {
pub:
lengths []int @[required] // per-symbol code length in bits; 0 marks an unused symbol
max_bits int @[required] // maximum allowed code length; must be >= every nonzero length
bit_order BitOrder @[required] // .msb_first (HPACK) or .lsb_first (DEFLATE)
}
Config parameterizes build(). All fields are required and have no defaults: the two callers have intentionally different requirements (15 vs 30 bits, LSB vs MSB), so an implicit default would silently fit only one of them.
struct Table #
struct Table {
pub:
codes []u32 // per-symbol code, right-aligned in a u32, in `bit_order`
lengths []int // per-symbol bit length (a copy of the input)
max_bits int
bit_order BitOrder
}
Table is the result of build(): the canonical code for every symbol, plus the metadata a caller needs to drive its own bit I/O and to build a decode structure (flat_table() for small max_bits, decode_map() otherwise).
fn (Table) decode_map #
fn (t Table) decode_map() !map[u64]int
decode_map builds a map from a packed (length, code) key to its symbol, for codecs whose max_bits is too large for a flat table (HPACK's 30 bits). The key is (u64(length) << 32) | code; a decoder accumulates bits MSB-first and looks up after each bit. Only defined for .msb_first tables, where the accumulated value matches the stored code; it returns an error otherwise.