1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
|
-*- markdown -*-
Overview
========
This document describes the design of catlfish, an implementation of a
Certificate Transparency (RFC6962) log server.
+------------------------------------------------+
| front end nodes |
+------------------------------------------------+
^ | |
| v v
| +---------------+ +---------------+
| | storage nodes | | signing nodes |
| +---------------+ +---------------+
| ^ ^
| | |
+------------------------------------------------+
| merge master |
+------------------------------------------------+
^ |
| v
| +----------------------------------+
| | merge slaves |
| +----------------------------------+
| ^
| |
+-------------------+
| merge-repair node |
+-------------------+
Design assumptions
------------------
* The database grows with 5 GB per year, based on 5,000 3 kB
submissions per day
* Max size is 300 GB, based on 100e6 certificates
* submissions: less than 0.1 qps, based on 5,000 submissions per day
* monitors: 6 qps, based on 100 monitors
* auditors: 8,000 qps, based on 2.5e9 browsers visiting 100 sites
(with a 1y certificate) per month (assuming a single combined
request for doing get-sth + get-sth-consistency + get-proof-by-hash)
Open questions
--------------
* What's a good MMD? Google seems to sign a new tree after 60-90
minutes (early 2014). They don't promise an MMD but aim to sign at
least once a day.
Terminology
===========
CC = Certificate Chain
CT log = Certificate Transparency log
Front-end node
==============
* Handles all http requests.
* Has a complete copy of the published data locally.
* Read requests are answered directly by reading local files
and calculating the answers.
* Add requests are validated and then sent to all storage
nodes. At the same time, a signing request is sent to one or
more of the signing nodes. When responses have been received
from a predetermined number of storage nodes and one signing
response has been received, a response is sent to the client.
* Has an inward-facing API with the entry points SendLog(Hashes),
MissingEntries() (returns a list of hashes), SendEntry(Entry),
SendSTH(STH), CurrentPosition().
Storage node
============
* Stores certificate chains and SCTs.
* Has a write API SendEntry(Entry) that stores the certificate chain
in a database, indexed by its hash. Then stores the hash in a list
NewEntries.
* Takes reasonable measures to ensure that data is in permanent
storage before sending a response.
* When seeing a new STH, moves the variable start to the index of the
first unpublished hash.
* Has a read API FetchNewEntries() which returns
NewEntries[start...length(NewEntries)-1].
Signing node
============
* Has the signing key for the log.
Merging node
============
* The master is determined by configuration.
* The other merging nodes are called "slaves".
* The master has two phases, merging and distributing.
Merging (master)
----------------
* Fetches CCs by calling FetchNewEntries() on storage node i
where i = 0...(n-1)
* Determines the order of the new entries in the CT log.
* Sends the entries to the slaves.
* Calculates the tree head and asks a signing node to sign it.
* When a majority of the slaves have acknowledged the entries,
compares the calculated tree head to the tree heads of the slaves.
If they match, considers the additions to the CT log final and
begins the distributing phase.
Merging (slave)
---------------
* Receives entries from the master. The node must be certain
that the request comes from the current master, and not
an old one.
* Takes reasonable measures to ensure that data is in
permanent storage.
* Calculates the new tree head and returns it to the master.
Distributing
------------
* Performs the following steps for all front-end nodes:
* Fetches curpos by calling CurrentPosition().
* Calls SendLog() with the hashes of CCs from curpos to newpos.
* Fetches missing_entries by calling MissingEntries(), a list
of hashes for the CCs that the front-end nodes does not
have.
* For each hash in missing_entries, upload the CC by calling
SendEntry(CC).
* Send the STH with the SendSTH(STH) call.
Merge-repair node
=================
* There is only one of these nodes.
* When this node detects that an STH has not been published
in t seconds, it begins the automatic repair process.
Automatic repair process
------------------------
* Turn off all reachable merge nodes.
* If a majority of the merge nodes cannot be reached,
die and report.
* Fetch the CT log order from the merge nodes.
* Determine the latest version of the log.
* Select a new master.
* Change the configuration of the merge nodes so that
they know who the new master is.
* Start all merge nodes.
* If any of these steps fail, die and report.
* If all steps succeed, die and report anyway. The automatic
repair process must not be restarted without manual
intervention.
|