TODO: split this file into README.md, doc/design.md, doc/config.md.

# p11p -- PKCS #11 proxy performing failover and load balancing

p11p is a shared library and a daemon, both running on the same host
as a Cryptoki application. Together they intercept the communication
with a cryptographic device (typically an HSM) in order to handle
errors and balance load between devices.

ASCII art time!

    +---------------------------------------------------+
    | PC/server/laptop                                  |
    |                                                   |
    | +----------------------------+                    |
    | | application (process)      |  +---------------+ |
    | |                            |  | p11p-daemon   | |
    | | +------------------------+ |  |               | |
    | | | p11p-client.so (solib) |--->| +-----------+ | |
    | | +------------------------+ |  | | vendor.so | | |
    | +----------------------------+  | +-----------+ | |
    |                                 +------|--------+ |
    +----------------------------------------|----------+
                                             v
                                          +-----+
                                          | HSM |
                                          +-----+

## Goals

* Detect when a Cryptoki library operation fails and retry the
  operation, possibly on another cryptographic device.

* Provide failover and load balancing between cryptographic devices.

* Put some ground between a Cryptoki application and a Cryptoki
  library from a vendor.

## Non-goals

* Take control over the TCP session between a Cryptoki application and
  a cryptographic device. This could be accomplished by proxying or
  forwarding PKCS #11 sessions to a system with access to a PKCS #11
  aware cryptographic device.

## Functionality

## Inspiration

- [p11-kit](https://github.com/p11-glue/p11-kit/)

## Overview and design criteria

    User application
      --(dlopen)--> p11p-client.so
      --(over-unix-socket)--> p11p-daemon
      --(fork+exec, stdin/stdout)--> p11p-helper
      --(dlopen)--> $vendor.so
      --(vendor-specific)--> PKCS #11 token

- Typical sequence of events
  - User application dlopens `p11p-client.so` as a "Cryptoki library"
  - `p11p-client.so` connects to `p11p-daemon` running on the same
    system, over a unix socket (AF_UNIX).
  - `p11p-daemon` forks a process and executes `p11p-helper`
  - `p11p-helper` dlopens the appropriate Cryptoki library from
    $vendor and forwards the Cryptoki calls there

- The daemon, `p11p-daemon`, handles both load balancing and failover,
  according to configuration per (set of) token(s).

- All run on reasonable Linux and BSD systems.

- Somewhat isolating (and potentially constraining) the running of
  token solibs by forking before loading them.

- The Cryptoki stub library, `p11p-client.so`, is implemented in C.
  TBD: Use code from p11-kit for this? p11-kit-client.so uses libffi
  (virtual.c) and its own serialisation code (rpc-message.c), both of
  which sound unnecessarily unsafe, but might be good enough for a PoC.

- The daemon, `p11p-daemon`, is implemented in something not too
  crazy, like Erlang or Rust, taking the deployment story into account
  -- being self-contained is a worthwhile goal.

- The daemon children, `p11p-helper`, are executable programs using
  the Cryptoki API, implemented in C (or possibly another language
  that can dlopen and run the solib from vendor).

- Wire protocol between `p11p-client.so` and `p11p-daemon` is TBD but
  should be designed for simple parsing in C. It runs over an AF_UNIX
  socket and needs only serialisation of Cryptoki calls -- no
  addressing and minimal framing (like a message length). TBD:
  Serialise (using Trunnel) and use an end-of-record sequence instead?

### PKCS #11

#### Supported PKCS #11 mechanisms

TBD

## Use cases

- When the vendor library is not so great at TCP and the network
  between the host running the application and the cryptographic
  device is messing with TCP sessions, catch the failure (e.g. by
  timing out) and retry the operation behind the back of the
  application.

- Migrating from one kind of HSM to another kind of HSM. p11p-daemon
  can be configured to use more than one HSM.
  As long as they provide the same functions using the same key(s),
  p11p-daemon can provide fallback functionality between different
  HSMs from different vendors.

## Configuration

All of PKCS11.CONF(5), from p11-kit, plus the following module
configuration fields.

- in-group: The name of a group that this module is part of.

  Each group name mentioned in any module configuration will result in
  a virtual token being created, named TBD.

  A virtual token has one or more backing modules, determined by the
  modules that list the name of the virtual token in 'in-group'. The
  order of the backing modules is influenced by setting 'priority'.

  The backing module at the top of the list is the current backing
  module at start-up. The current backing module changes to the next
  in the list of backing modules when PKCS #11 requests have failed to
  respond within 'timeout' a number of times equal to 'retries'.

  To configure a load balancing virtual token, set 'timeout' to a
  non-zero value and set 'retries' to zero (the default).

  A virtual token is used like any ordinary token and forwards
  PKCS #11 calls and responses to and from its current backing module.

  By default, a module is not part of any group.

- timeout: An integer denoting the timeout in seconds for a PKCS #11
  request. A timeout of zero means that there is no timeout.

  The default timeout value is zero.

- retries: The number of retries after a timeout that this module gets
  before it is reloaded. Reloading a module that is part of a group (a
  backing module) makes the virtual token switch to the next backing
  module in its list.

  The default retries value is zero.

- env: NAME=value is set in the environment of the process loading the
  module.

- chroot: A path to which the process loading the module will be
  chrooted prior to loading the module.
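
The switching rule described under 'in-group' and 'retries' could be
modelled roughly as below. This is a minimal sketch in C, not code
from p11p: all struct and function names are hypothetical. It assumes
that the backing modules are already sorted according to 'priority',
that a timely response resets the timeout count, and that the list
wraps around after the last backing module. None of these three points
is settled by the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, for illustration only: none of these names
 * come from p11p. One backing module of a virtual token. */
struct backing_module {
    const char *name;   /* label for the module */
    unsigned retries;   /* 'retries' field; the default is zero */
    unsigned timeouts;  /* timeouts seen since the last reset */
};

/* A virtual token: one per group name mentioned in 'in-group'. */
struct virtual_token {
    struct backing_module *modules; /* assumed ordered by 'priority' */
    size_t count;                   /* number of backing modules, > 0 */
    size_t current;                 /* index of current backing module */
};

/* Call when a request to the current backing module did not respond
 * within 'timeout'. Returns 1 if the token switched modules, else 0. */
int vt_on_timeout(struct virtual_token *vt)
{
    assert(vt != NULL && vt->count > 0);
    struct backing_module *m = &vt->modules[vt->current];

    m->timeouts++;
    if (m->timeouts < m->retries)
        return 0;                   /* still within the retry budget */

    m->timeouts = 0;
    /* Assumption: wrap around to the first module after the last. */
    vt->current = (vt->current + 1) % vt->count;
    return 1;
}

/* Assumption: a timely response resets the timeout count. */
void vt_on_success(struct virtual_token *vt)
{
    assert(vt != NULL && vt->count > 0);
    vt->modules[vt->current].timeouts = 0;
}
```

With 'retries' left at zero, every timeout in this model moves the
virtual token on to the next backing module. With 'retries' set to 2,
the second timeout without an intervening success triggers the switch.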

## NOTES

## External dependencies

TBD

## External documentation

- [OASIS PKCS 11 TC](https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=pkcs11)

# README

## Building

    echo $PATH | egrep -q rebar3 || export PATH=$PATH:~/.cache/rebar3/bin
    make