catlfish design (in Emacs -*- org -*- mode)

This document describes the design of catlfish, an implementation of a
Certificate Transparency (RFC 6962) log server.

We have
- a primary database storing X.509 certificate chains [replicating r/o
  copies to a number of frontend nodes?]
- a hash tree kept in RAM
- one secondary database per frontend node, storing the most recently
  submitted data
- a cluster of backend nodes with an elected leader which periodically
  updates the primary db with data from the secondary dbs
- a number of frontend nodes accepting http requests, updating
  secondary dbs and reading from a local r/o copy of the primary db
- a private key used for signing SCTs and STHs, kept (in HSMs) on
  backend nodes

Backend nodes
- are either asleep, functioning as storage only, or
- store submitted cert chains in persistent media
- have write access to the primary database holding cert chains
- periodically append new cert chains to the hash tree and sign the
  tree head

Frontend nodes
- reply to the http requests specified in RFC 6962
- write submitted cert chains to their own, single master, secondary
  database
- have read access to (a local copy of) the primary database
- defer signing of SCTs (and STHs) to backend nodes

The primary database
- stores cert chains and their corresponding SCTs
- is indexed on a running integer (primary) and a hash of the cert
  chain (secondary)
- runs on backend nodes
- is persistently stored on disk on several other backend nodes in
  separate data centers
- grows by about 5 GB per year, based on 5,000 3 kB submissions per day
- max size is 300 GB, based on 100e6 certificates

The secondary databases
- store cert chains, unordered, between hash tree generations
- run on frontend nodes
- are persistently stored on disk on several other frontend nodes
- are typically kept in RAM too
- max size is around 128 MB, based on 10 submissions (à 3 kB) per
  second for an hour

Scaling, performance, estimates
- submissions: less than 0.1 qps, based on 5,000 submissions per day
- monitors: 6 qps, based on 100 monitors
- auditors: 8,000 qps, based on 2.5e9 browsers visiting 100 sites
  (with a 1y certificate) per month (assuming a single combined
  request for doing get-sth + get-sth-consistency + get-proof-by-hash)

Open questions
- What's a good MMD (maximum merge delay)? Google seems to sign a new
  tree after 60-90 minutes (early 2014). They don't promise an MMD but
  aim to sign at least once a day.

A picture

+-----------------------------------------------+
| front end nodes                               |
+-----------------------------------------------+
      ^             ^            ^         ^
      |             |            |         |
      |             v            |         |
      |        short term    long term     |
      |          cert db   cert db copy    |
      |             ^                      |
      |             |                      v
+-----------------------------------------------+
| tree makers  | mergers            | signers   |
+-----------------------------------------------+
      ^                     ^
       \                    |
        \                   v
         ------------- long term cert db

[TODO: Update terms in text or picture so they match:
  secondary database == short term cert db
  primary database == long term cert db
  backend nodes == box with tree makers, mergers and signers]

[TODO: Move the picture to the top of the document.]
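
A note on the numbers: the sizing and load estimates given for the
primary database, the secondary databases and the scaling figures can
be re-derived with a few lines of arithmetic. The sketch below is only
a back-of-envelope check, not part of catlfish. The variable names are
made up for illustration, and 1 kB is assumed to mean 1,000 bytes.
The auditor figure is computed under the assumption that each
browser/site pair causes one combined request per certificate lifetime
(one year), which is the reading that gives roughly 8,000 qps. The
monitor figure is not derived here since the text gives no per-monitor
request rate.

```python
# Back-of-envelope check of the estimates in this document.
# Constants come from the text; names and the 1 kB = 1000 bytes
# convention are assumptions made for this sketch.

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

SUBMISSION_BYTES = 3_000        # one submitted cert chain, 3 kB
SUBMISSIONS_PER_DAY = 5_000

# Primary database: yearly growth and size at 100e6 certificates.
primary_growth_gb_per_year = SUBMISSIONS_PER_DAY * SUBMISSION_BYTES * 365 / 1e9
primary_max_gb = 100e6 * SUBMISSION_BYTES / 1e9

# Secondary database: 10 submissions per second for one hour.
secondary_max_mb = 10 * SUBMISSION_BYTES * 3600 / 1e6

# Submission rate in queries per second.
submission_qps = SUBMISSIONS_PER_DAY / SECONDS_PER_DAY

# Auditor rate, ASSUMING one combined request per browser and site
# per year (the certificate lifetime).
auditor_qps = 2.5e9 * 100 / SECONDS_PER_YEAR

if __name__ == "__main__":
    print(f"primary growth:  {primary_growth_gb_per_year:.2f} GB/year")
    print(f"primary max:     {primary_max_gb:.0f} GB")
    print(f"secondary max:   {secondary_max_mb:.0f} MB")
    print(f"submissions:     {submission_qps:.3f} qps")
    print(f"auditors:        {auditor_qps:.0f} qps")
```

Under these assumptions the secondary database comes out at 108 MB,
which fits within the "around 128 MB" stated for it.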
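
A note on the hash tree: backend nodes append new cert chains to the
hash tree before signing the tree head. As an illustration of what
that tree head covers, here is a minimal sketch of the Merkle Tree
Hash defined in RFC 6962, section 2.1, using SHA-256. It is not
catlfish's implementation. The function names are hypothetical, the
entries are treated as opaque byte strings, and signing the resulting
root is left out.

```python
import hashlib
from typing import Sequence


def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962: leaf hashes are prefixed with 0x00.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: interior node hashes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_tree_hash(entries: Sequence[bytes]) -> bytes:
    """Root hash over an ordered list of entries (recursive form)."""
    n = len(entries)
    if n == 0:
        # The hash of an empty tree is the hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # Split at k, the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_tree_hash(entries[:k]),
                     merkle_tree_hash(entries[k:]))


if __name__ == "__main__":
    chains = [b"chain-1", b"chain-2", b"chain-3"]
    print(merkle_tree_hash(chains).hex())
```

The recursive form recomputes the whole tree on every call. A log
that only ever appends would more likely cache subtree hashes, but
that optimisation is outside the scope of this sketch.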