Diffstat (limited to 'doc/db.md')
-rw-r--r-- | doc/db.md | 97 |
1 files changed, 79 insertions, 18 deletions
@@ -19,35 +19,55 @@ Data entries are stored together with three attributes:
   hash over specific parts of the entry, usually together with a
   timestamp for use as a leaf in a merkle tree
 
-## Storage in a file system
+## Storage
 
-Two files (catlfish names in parentheses):
+Data is kept in two regular files and three key-value stores.
 
-- treesize (treesize)
+The two regular files, regardless of database backend:
+
+- treesize
 
   filename is static, file contains one line -- the number of entries
   in the part of the database up to and including the last entry in
   the last published tree, like a "current pointer"
 
-- index (index)
+- index
+
+  filename is static, file contains one line per entry -- the leaf
+  hash
+
+The three key-value stores, in one of two formats -- fsdb or permdb
+(described in separate sections of this document):
+
+- entry
+
+  key=leaf hash, value=the actual data of the entry
+
+- entryhash
 
-  filename is static, file contains one line per entry -- the leafhash
+  key=entry hash, value=leaf hash
 
-Three key-value stores, implemented as "bucketed" directory trees with
-one file, named as the key, per database entry (catlfish names in
-parentheses):
+- indexforhash
 
-- entry (certentries)
+  key=leaf hash, value=index
 
-  filename=leafhash, content=the actual data of the entry
+## fsdb backend
 
-- entryhash (entryhash)
+The fsdb backend uses regular files in a file system. fsdb is
+implemented as "bucketed" directory trees with one file per key-value
+pair. The file name is the key and the file content is the value.
 
-  filename=entryhash, content=leafhash
+For a concrete example, here's how catlfish names the three key-value
+stores used in plop:
 
-- indexforhash (certindex)
+- entry: 'certentries'
+- entryhash: 'entryhash'
+- indexforhash: 'certindex'
 
-  filename=leafhash, content=index
+## permdb backend
+
+The permdb backend uses a C implementation of a key-value store
+optimised for append-only. See permdb.md for a description of permdb.
 
 ## Distributed
 
@@ -57,8 +77,8 @@ TODO: describe distribution
 
 - db.erl
 
-  public interface for adding entries and getting entries by index,
-  leaf hash and entry hash
+  public interface for adding database entries as well as retrieving
+  entries by index, leaf hash or entry hash
 
 - index.erl
 
@@ -67,7 +87,15 @@ TODO: describe distribution
 
 - perm.erl
 
-  reading and writing of files
+  dispatching to configured database backend -- fsdb or permdb
+
+- fsdb.erl
+
+  file-based database backend
+
+- permdb.erl
+
+  interface to C implementation of key-value store
 
 - atomic.erl
 
@@ -94,4 +122,37 @@ TODO: describe distribution
 
 - erlport.c
 
-  glue
+  erlang/C glue
+
+- filebuffer.c
+
+  buffered files
+
+- permdb.c
+
+  permdb implementation
+
+- permdbport.c
+
+  erlang port for permdb
+
+- permdbpy.c
+
+  python bindings for permdb
+
+- permdbtest.c
+
+  permdb tests
+
+- pstring.h
+
+  pascal string implementation
+
+- utarray.h
+- uthash.h
+
+  array and hash table implementations
+
+- util.c
+
+  helper functions
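
Note on the Storage section introduced by this change: the relationship between the two regular files and the three key-value stores may be easier to follow as a small model. The sketch below is illustrative only and is not plop's API. The class name `SketchDB`, its method names, and the use of in-memory dicts and a list are all assumptions made here. Hashes are passed in as opaque bytes because the diff does not show how they are computed, and the order in which the stores are written is also an assumption.

```python
class SketchDB:
    """Illustrative in-memory model of the layout in doc/db.md (hypothetical names)."""

    def __init__(self):
        self.treesize = 0       # 'treesize' file: number of entries in the last published tree
        self.index = []         # 'index' file: one leaf hash per entry, in insertion order
        self.entry = {}         # 'entry' store: leaf hash -> entry data
        self.entryhash = {}     # 'entryhash' store: entry hash -> leaf hash
        self.indexforhash = {}  # 'indexforhash' store: leaf hash -> index

    def add(self, leaf_hash, entry_hash, data):
        """Record one entry in all three stores and append it to the index."""
        position = len(self.index)
        self.entry[leaf_hash] = data
        self.entryhash[entry_hash] = leaf_hash
        self.indexforhash[leaf_hash] = position
        self.index.append(leaf_hash)
        return position

    def publish(self):
        """Advance the 'current pointer' to cover every entry added so far."""
        self.treesize = len(self.index)

    def get_by_index(self, i):
        return self.entry[self.index[i]]

    def get_by_leaf_hash(self, leaf_hash):
        return self.entry[leaf_hash]

    def get_by_entry_hash(self, entry_hash):
        # Two lookups: entry hash -> leaf hash -> entry data.
        return self.entry[self.entryhash[entry_hash]]


db = SketchDB()
db.add(b"leaf0", b"ehash0", b"data0")
db.add(b"leaf1", b"ehash1", b"data1")
db.publish()
print(db.treesize, db.get_by_entry_hash(b"ehash1"))
```

In this model, retrieval by index, leaf hash, or entry hash (the three lookups listed for db.erl) all end in the `entry` store, and `treesize` only changes on publish, which matches the "current pointer" description in the doc.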