From 9a8f8bd8da784622fc340c8466435551a2d2d268 Mon Sep 17 00:00:00 2001 From: Magnus Ahltorp Date: Fri, 18 Mar 2016 16:04:40 +0100 Subject: Added description of current merge implementation --- doc/merge.txt | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) (limited to 'doc/merge.txt') diff --git a/doc/merge.txt b/doc/merge.txt index 28757a7..b2e2738 100644 --- a/doc/merge.txt +++ b/doc/merge.txt @@ -20,6 +20,66 @@ The merge process - merge-dist distributes 'sth' and missing entries to frontend nodes. +Merge distribution (merge_dist) +----------------------------------------------------- + + * get current position from frontend server (curpos) + + * send log + * sends log in chunks of 1000 hashes from curpos + + * get missing entries + * server goes through all hashes from curpos and checks if they are + present + * when the server has collected 100000 non-present entries, it + returns them + * server also keep a separate (in-memory) counter that caches the + index of the first entry that either hasn't been checked if it is + present or not, or that is checked and found to be non-present, + to allow the server to start from that position + + * send entries + * send these entries one at a time + * does not get more missing entries when it is done + + * send sth + * sends the previously (merge-sth) constructed sth to the server, + which verifies all entries and adds entry-to-hash and + hash-to-index + * saves the last verified position continuously to avoid doing the + work again if the verification is aborted and restarted + +Merge backup (merge_backup) +----------------------------------------------------- + + * get verifiedsize from backup server + + * send log: + * determines the end of the log by trying to send small chunks of + the log hashes from verifiedsize until it fails, then restarts + with the normal chunk size (1000) + + * get missing entries + * this stage is the same as for merge_dist + + * send entries + * send these entries in chunks of 100 at a time (this is limited + because of memory considerations and web server limits) + * when it is done, goes back to the "get missing entries" stage, + until there are no more missing entries + + * verifyroot + * server verifies all entries from verifiedsize, and then + calculates and returns root hash + * unlike merge distribution, does not save the last verified + position either continuously or when it is finished, which means + that it then has to verify all entries again if it is aborted and + restarted before verifiedsize is set to the new value + + * if merge_backup sees that the root hash is correct, it sets + verifiedsize on backup server + + TODO ==== -- cgit v1.1