summaryrefslogtreecommitdiff
path: root/cdb.5
diff options
context:
space:
mode:
authorSam Nystrom <sam@samnystrom.dev>2024-01-15 13:35:27 -0500
committerSam Nystrom <sam@samnystrom.dev>2024-01-15 13:35:27 -0500
commit2bdd00aa69b901e5230c9b8c24727011626ebeaa (patch)
tree27967a3ccc64ac477cb0336f4e61282e8ab832ff /cdb.5
Diffstat (limited to 'cdb.5')
-rw-r--r--cdb.553
1 files changed, 53 insertions, 0 deletions
diff --git a/cdb.5 b/cdb.5
new file mode 100644
index 0000000..1ef87dc
--- /dev/null
+++ b/cdb.5
@@ -0,0 +1,53 @@
+.Dd September 14, 1996
+.Dt CDB 5
+.Os
+.Sh NAME
+.Nm cdb
+.Nd a structure for constant databases
+.Sh DESCRIPTION
+A
+.Nm
+is an associative array: it maps strings
+.Pq Dq keys
+to strings
+.Pq Dq data .
+.Pp
+A cdb contains 256 pointers to linearly probed open hash tables. The
+hash tables contain pointers to (key,data) pairs. A cdb is stored in
+a single file on disk:
+.Bd -literal
+ +----------------+---------+-------+-------+-----+---------+
+ | p0 p1 ... p255 | records | hash0 | hash1 | ... | hash255 |
+ +----------------+---------+-------+-------+-----+---------+
+.Ed
+.Pp
+Each of the 256 initial pointers states a position and a length. The
+position is the starting byte position of the hash table. The length
+is the number of slots in the hash table.
+.Pp
+Records are stored sequentially, without special alignment. A record
+states a key length, a data length, the key, and the data.
+.Pp
+Each hash table slot states a hash value and a byte position. If the
+byte position is 0, the slot is empty. Otherwise, the slot points to
+a record whose key has that hash value.
+.Pp
+Positions, lengths, and hash values are 32-bit quantities, stored in
+little-endian form in 4 bytes. Thus a cdb must fit into 4 gigabytes.
+.Pp
+A record is located as follows. Compute the hash value of the key in
+the record. The hash value modulo 256 is the number of a hash table.
+The hash value divided by 256, modulo the length of that table, is a
+slot number. Probe that slot, the next higher slot, and so on, until
+you find the record or run into an empty slot.
+.Pp
+The cdb hash function is
+.Ql h = ((h << 5) + h) ^ c ,
+with a starting hash of 5381.
+.Sh SEE ALSO
+.Xr cdbget 1 ,
+.Xr cdbdump 1 ,
+.Xr cdbmake 1
+.Sh AUTHORS
+.An D. J. Bernstein
+.An Sam Nystrom Ao Mt sam@samnystrom.dev Ac (man page port)