author | Linus Nordberg <linus@nordu.net> | 2016-07-11 15:00:29 +0200 |
---|---|---|
committer | Linus Nordberg <linus@nordu.net> | 2016-07-11 15:00:29 +0200 |
commit | 3d4a9fdd338713c2f63da2b92940904762878d98 | |
tree | 2e8ee7375619d507f0f206be2c713aa12d17f048 /doc/merge.txt | |
parent | 1a36628401658def9ab9595f7cbcf72b8cb4eb6a | |
parent | bbf254d6d7f1708503f425c0eb8926af1b715b9c |
Merge remote-tracking branch 'refs/remotes/map/python-requests-chunked'
Diffstat (limited to 'doc/merge.txt')
-rw-r--r-- | doc/merge.txt | 60 |
1 files changed, 60 insertions, 0 deletions
diff --git a/doc/merge.txt b/doc/merge.txt
index 28757a7..b2e2738 100644
--- a/doc/merge.txt
+++ b/doc/merge.txt
@@ -20,6 +20,66 @@ The merge process
 - merge-dist distributes 'sth' and missing entries to frontend nodes.
+Merge distribution (merge_dist)
+-----------------------------------------------------
+
+ * get current position from frontend server (curpos)
+
+ * send log
+   * sends log in chunks of 1000 hashes from curpos
+
+ * get missing entries
+   * server goes through all hashes from curpos and checks if they are
+     present
+   * when the server has collected 100000 non-present entries, it
+     returns them
+   * server also keep a separate (in-memory) counter that caches the
+     index of the first entry that either hasn't been checked if it is
+     present or not, or that is checked and found to be non-present,
+     to allow the server to start from that position
+
+ * send entries
+   * send these entries one at a time
+   * does not get more missing entries when it is done
+
+ * send sth
+   * sends the previously (merge-sth) constructed sth to the server,
+     which verifies all entries and adds entry-to-hash and
+     hash-to-index
+   * saves the last verified position continuously to avoid doing the
+     work again if the verification is aborted and restarted
+
+Merge backup (merge_backup)
+-----------------------------------------------------
+
+ * get verifiedsize from backup server
+
+ * send log:
+   * determines the end of the log by trying to send small chunks of
+     the log hashes from verifiedsize until it fails, then restarts
+     with the normal chunk size (1000)
+
+ * get missing entries
+   * this stage is the same as for merge_dist
+
+ * send entries
+   * send these entries in chunks of 100 at a time (this is limited
+     because of memory considerations and web server limits)
+   * when it is done, goes back to the "get missing entries" stage,
+     until there are no more missing entries
+
+ * verifyroot
+   * server verifies all entries from verifiedsize, and then
+     calculates and returns root hash
+   * unlike merge distribution, does not save the last verified
+     position either continuously or when it is finished, which means
+     that it then has to verify all entries again if it is aborted and
+     restarted before verifiedsize is set to the new value
+
+ * if merge_backup sees that the root hash is correct, it sets
+   verifiedsize on backup server
+
+
 TODO
 ====
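
The merge_backup stages that this commit documents can be illustrated with a small in-memory sketch. It is not the project's code: `FakeBackupServer`, `placeholder_root`, `chunks` and `merge_backup` are hypothetical names, the server is a plain Python object rather than an HTTP endpoint, the "small chunks until it fails" probing step is skipped, and the root hash is a simple SHA-256 over the concatenated hashes standing in for a real Merkle tree head. Only the chunk sizes (1000 hashes, 100 entries) and the 100000-entry batch limit come from the text.

```python
import hashlib
from typing import Dict, Iterator, List, Sequence

CHUNK_SIZE = 1000       # log hashes per "send log" request (from the text)
ENTRY_CHUNK_SIZE = 100  # entries per "send entries" request (from the text)


def chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def placeholder_root(hashes: Sequence[str]) -> str:
    """Assumption: stand-in for the real Merkle tree root calculation."""
    return hashlib.sha256("".join(hashes).encode()).hexdigest()


class FakeBackupServer:
    """Hypothetical in-memory backup node; not the real server API."""

    def __init__(self, missing_batch: int = 100000) -> None:
        self.verifiedsize = 0
        self.log: List[str] = []            # hashes received via send_log
        self.entries: Dict[str, bytes] = {}  # hash -> entry data
        self.missing_batch = missing_batch
        self.entry_requests = 0

    def send_log(self, hashes: Sequence[str]) -> None:
        self.log.extend(hashes)

    def get_missing_entries(self) -> List[str]:
        # Scan from verifiedsize and return at most missing_batch hashes
        # whose entries are not present yet.
        missing: List[str] = []
        for h in self.log[self.verifiedsize:]:
            if h not in self.entries:
                missing.append(h)
                if len(missing) >= self.missing_batch:
                    break
        return missing

    def send_entries(self, batch: Dict[str, bytes]) -> None:
        self.entries.update(batch)
        self.entry_requests += 1

    def verifyroot(self) -> str:
        # Verify every entry from verifiedsize, then return the root hash.
        for h in self.log[self.verifiedsize:]:
            if hashlib.sha256(self.entries[h]).hexdigest() != h:
                raise ValueError("entry does not match hash " + h)
        return placeholder_root(self.log)


def merge_backup(server: FakeBackupServer, log_hashes: Sequence[str],
                 entry_store: Dict[str, bytes], expected_root: str) -> bool:
    """Run the stages in order; return True if verifiedsize was updated."""
    start = server.verifiedsize
    for chunk in chunks(log_hashes[start:], CHUNK_SIZE):
        server.send_log(chunk)
    # Repeat "get missing entries" / "send entries" until nothing is missing.
    while True:
        missing = server.get_missing_entries()
        if not missing:
            break
        for chunk in chunks(missing, ENTRY_CHUNK_SIZE):
            server.send_entries({h: entry_store[h] for h in chunk})
    if server.verifyroot() != expected_root:
        return False
    # Only now is verifiedsize advanced, as in the last documented step.
    server.verifiedsize = len(log_hashes)
    return True
```

The loop around `get_missing_entries` is what distinguishes this flow from merge_dist, which according to the text fetches the missing list once and sends entries one at a time. Because `verifiedsize` is written only after the root hash matches, an aborted run of this sketch would have to verify the same entries again, which mirrors the caveat in the verifyroot stage.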