summaryrefslogtreecommitdiff
path: root/cdb.5
blob: 1ef87dc2657dea62d36c5370e8dd8b47cec30a1b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
.Dd September 14, 1996
.Dt CDB 5
.Os
.Sh NAME
.Nm cdb
.Nd a structure for constant databases
.Sh DESCRIPTION
A
.Nm
is an associative array: it maps strings
.Pq Dq keys
to strings
.Pq Dq data .
.Pp
A cdb contains 256 pointers to linearly probed open hash tables. The
hash tables contain pointers to (key,data) pairs. A cdb is stored in
a single file on disk:
.Bd -literal
    +----------------+---------+-------+-------+-----+---------+
    | p0 p1 ... p255 | records | hash0 | hash1 | ... | hash255 |
    +----------------+---------+-------+-------+-----+---------+
.Ed
.Pp
Each of the 256 initial pointers states a position and a length. The
position is the starting byte position of the hash table. The length
is the number of slots in the hash table.
.Pp
Records are stored sequentially, without special alignment. A record
states a key length, a data length, the key, and the data.
.Pp
Each hash table slot states a hash value and a byte position. If the
byte position is 0, the slot is empty. Otherwise, the slot points to
a record whose key has that hash value.
.Pp
Positions, lengths, and hash values are 32-bit quantities, stored in
little-endian form in 4 bytes. Thus a cdb must fit into 4 gigabytes.
.Pp
A record is located as follows. Compute the hash value of the key in
the record. The hash value modulo 256 is the number of a hash table.
The hash value divided by 256, modulo the length of that table, is a
slot number. Probe that slot, the next higher slot, and so on, until
you find the record or run into an empty slot.
.Pp
The cdb hash function is
.Ql h = ((h << 5) + h) ^ c ,
with a starting hash of 5381.
.Sh SEE ALSO
.Xr cdbget 1 ,
.Xr cdbdump 1 ,
.Xr cdbmake 1
.Sh AUTHORS
.An D. J. Bernstein
.An Sam Nystrom Ao Mt sam@samnystrom.dev Ac (man page port)