diff options
Diffstat (limited to 'cdb.5')
| -rw-r--r-- | cdb.5 | 53 |
1 files changed, 53 insertions, 0 deletions
@@ -0,0 +1,53 @@ +.Dd September 14, 1996 +.Dt CDB 5 +.Os +.Sh NAME +.Nm cdb +.Nd a structure for constant databases +.Sh DESCRIPTION +A +.Nm +is an associative array: it maps strings +.Pq Dq keys +to strings +.Pq Dq data . +.Pp +A cdb contains 256 pointers to linearly probed open hash tables. The +hash tables contain pointers to (key,data) pairs. A cdb is stored in +a single file on disk: +.Bd -literal + +----------------+---------+-------+-------+-----+---------+ + | p0 p1 ... p255 | records | hash0 | hash1 | ... | hash255 | + +----------------+---------+-------+-------+-----+---------+ +.Ed +.Pp +Each of the 256 initial pointers states a position and a length. The +position is the starting byte position of the hash table. The length +is the number of slots in the hash table. +.Pp +Records are stored sequentially, without special alignment. A record +states a key length, a data length, the key, and the data. +.Pp +Each hash table slot states a hash value and a byte position. If the +byte position is 0, the slot is empty. Otherwise, the slot points to +a record whose key has that hash value. +.Pp +Positions, lengths, and hash values are 32-bit quantities, stored in +little-endian form in 4 bytes. Thus a cdb must fit into 4 gigabytes. +.Pp +A record is located as follows. Compute the hash value of the key in +the record. The hash value modulo 256 is the number of a hash table. +The hash value divided by 256, modulo the length of that table, is a +slot number. Probe that slot, the next higher slot, and so on, until +you find the record or run into an empty slot. +.Pp +The cdb hash function is +.Ql h = ((h << 5) + h) ^ c , +with a starting hash of 5381. +.Sh SEE ALSO +.Xr cdbget 1 , +.Xr cdbdump 1 , +.Xr cdbmake 1 +.Sh AUTHORS +.An D. J. Bernstein +.An Sam Nystrom Ao Mt sam@samnystrom.dev Ac (man page port) |
