Name: certificate-revocation-analysis
Owner: Mozilla Services
Description: Tools for Gathering Certificate Revocations and Performing Analysis of CRLite Filter Sizes in Firefox
Created: 2018-04-18 18:09:47.0
Updated: 2018-04-18 18:09:50.0
Pushed: 2018-04-26 12:54:29.0
Size: 3981
Language: Python
This collection of tools is designed to assemble a cascading Bloom filter containing all TLS certificate revocations, as described in the CRLite paper.
These tools were built from scratch, using the original CRLite research code as a design reference and closely following the documentation in their paper.
SELECT parsed.*
FROM certificates.certificates
WHERE validation.nss.valid = TRUE
Download the exported certificates, which will be provided in several hundred
files. The recommended method is to copy and paste the provided download URLs into
a file on your target machine, then use `wget -i URL_FILE` to download all
of the certificate files.
Unzip the certificate files and place their contents in a single, unified file.
Unzip with `gzip -d *.gz`, then unify the files with `cat *.json > certificates.json`.
You can then delete all files except for `certificates.json`. (If you're using
the sample file, then just unzip it and rename it to `certificates.json`.)
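For readers who would rather do the unzip-and-unify step from Python, the following is a minimal sketch and not part of this repository. The function name `merge_gzipped_exports` is hypothetical, and the sketch assumes every downloaded file ends in `.gz` and that the files can simply be concatenated in sorted order.

```python
import glob
import gzip
import shutil


def merge_gzipped_exports(pattern, output_path):
    """Decompress every file matching `pattern` and append it to `output_path`.

    Returns the number of compressed files that were merged.
    """
    paths = sorted(glob.glob(pattern))
    with open(output_path, "wb") as merged:
        for path in paths:
            # Stream each archive so that large exports never sit fully in memory.
            with gzip.open(path, "rb") as part:
                shutil.copyfileobj(part, merged)
    return len(paths)
```

Calling `merge_gzipped_exports("*.gz", "certificates.json")` in the download directory would then play the role of the two shell commands above, leaving the original `.gz` files in place.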
Set `get_CRL_revocations` as the working directory. This folder contains all scripts for Part B.
Extract the CRL distribution points by running `python extract_crls.py`. This
script will output three files: a file of all certificates that list a CRL (`../certs_using_crl.json`),
a file of all certificates that do not list a CRL (`../certs_without_crl.json`),
and a list of all CRL distribution points (`CRL_servers`).
Sort and eliminate duplicate entries in `CRL_servers` using the command
`sort -u CRL_servers > CRL_servers_final`. You can compare your `CRL_servers_final`
to my reference CRL list to see that the replication results are similar up to this point.
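To illustrate what these two steps produce, here is a rough sketch of the split and de-duplication. It is not the actual `extract_crls.py`. It assumes each line of `certificates.json` is a JSON object whose CRL URLs are listed under `extensions.crl_distribution_points`; that field path and the function name `split_by_crl` are assumptions and may not match the real export.

```python
import json


def split_by_crl(lines):
    """Split line-delimited JSON certificates by whether they list a CRL.

    Returns (certs_with_crl, certs_without_crl, sorted_unique_crl_urls).
    """
    with_crl, without_crl, servers = [], [], set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        cert = json.loads(line)
        # Assumed field path; a missing or null value counts as "no CRL listed".
        urls = (cert.get("extensions") or {}).get("crl_distribution_points") or []
        if urls:
            with_crl.append(cert)
            servers.update(urls)
        else:
            without_crl.append(cert)
    # Sorting a set gives one entry per distribution point, much like `sort -u`.
    return with_crl, without_crl, sorted(servers)
```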
Download all of the CRLs listed in `CRL_servers_final`. First create a new subdirectory `raw_CRLs`,
set it as the working directory, then run `aria2c -i ../CRL_servers_final -j 16`.
Set the working directory back one level up (to `get_CRL_revocations` again).
Create a catalogue, or “megaCRL,” of all revocations by running `python3 build_megaCRL.py`
(note that this script requires Python 3 and pyOpenSSL version 16.1.0 or later).
This will output `megaCRL`, which contains all revocation serial numbers
organized by CRL.
Use `python count_serials.py` to see the total number of revocation serials that are
contained in the megaCRL. You can compare your results against mine by using the
same script on my reference megaCRL file.
Make a new subdirectory `revokedCRLCerts`, then match the revocation serial numbers to known certificates using `python build_CRL_revoked.py`.
This script uses multiprocessing to get around the I/O bottleneck,
and you may need to adjust the number of “worker” processes to get optimal
speed on your machine. Each worker has a dedicated output file, so after the script finishes you
will need to combine the output files into a single, final result using
`cat revokedCRLCerts/certs* > ../final_CRL_revoked.json`.
Count the number of actual revoked certificates using `wc -l ../final_CRL_revoked.json`
(the combined file was written one directory up in the previous step).
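The matching step in Part B can be pictured with the sketch below. It is an illustration under stated assumptions, not the code in `build_CRL_revoked.py`: the certificate keys `serial_number` and `crl_urls`, the in-memory shape of the megaCRL (a dict from CRL URL to a set of serials), and the helper names `match_revoked` and `shard` are all hypothetical.

```python
def match_revoked(certs, megacrl):
    """Return the certificates whose serial appears in one of their own CRLs.

    certs:   iterable of dicts with (assumed) keys "serial_number" and "crl_urls"
    megacrl: dict mapping a CRL URL to the set of serials revoked by that CRL
    """
    revoked = []
    for cert in certs:
        serial = cert["serial_number"]
        # A serial only counts as revoked if it is listed by a CRL that this
        # certificate itself points to; serials are not unique across issuers.
        if any(serial in megacrl.get(url, ()) for url in cert["crl_urls"]):
            revoked.append(cert)
    return revoked


def shard(items, num_workers):
    """Deal items round-robin into `num_workers` lists, one per worker."""
    return [items[i::num_workers] for i in range(num_workers)]
```

In a parallel run, each worker would call `match_revoked` on its own shard and write to its own output file, which is why the per-worker files have to be concatenated afterwards.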
Set `get_OCSP_revocations` as the working directory. This folder contains all scripts for Part C. Make a subdirectory called `OCSP_revoked`.
Use `python build_OCSP_revoked.py` to determine all Let's Encrypt revocations.
This tooling replicates the process of the CRLite authors; I believe they chose
to include OCSP only for Let's Encrypt because the vast majority of OCSP-only
certificates are issued by Let's Encrypt. After the script completes,
combine the results of each worker into a final output file with
`cat OCSP_revoked/certs* > ../final_OCSP_revoked.json`.
Set `build_filter` as the working directory. This folder contains all scripts for Part D.
Make subdirectories `final_unrevoked` and `final_revoked`.
Use `python build_final_sets.py` to convert the data created in the steps above into a single
set of all revoked certificates and a single set of all valid certificates. This script uses multiprocessing,
so after running the script you will need to use `cat final_unrevoked/*.json > ../final_unrevoked.json`
and `cat final_revoked/*.json > ../final_revoked.json` to combine the results of the individual
workers into a single file each. You can see how your results match mine by comparing
them against my reference file.
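Conceptually, this step reduces to simple set arithmetic, sketched below. This is a guess at the logic rather than the contents of `build_final_sets.py`: the function name is reused only for illustration, and the sketch assumes every certificate can be represented by some hashable identifier (for example a fingerprint string).

```python
def build_final_sets(all_certs, crl_revoked, ocsp_revoked):
    """Combine both revocation sources and derive the valid set.

    Each argument is an iterable of hashable certificate identifiers
    (an assumption made for this sketch).
    """
    # A certificate is revoked if either source reports it.
    revoked = set(crl_revoked) | set(ocsp_revoked)
    # Everything that was collected but never reported revoked is valid.
    unrevoked = set(all_certs) - revoked
    return revoked, unrevoked
```

The two resulting sets are disjoint by construction, which matters because a filter built from them has to give exactly one answer for each known certificate.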
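Use the command `node --max_old_space_size=32768 ./build_filter.js > filter` to assemble
the final filter. (The memory flag must come before the script name; otherwise Node passes
it to the script as an ordinary argument.) Be sure to update the REVOKED and UNREVOKED
constants in the script so that they accurately reflect your data.
(Acknowledgements to James Larisch for the build_filter.js code.)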
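For intuition about what the final step builds, here is a small, self-contained Python sketch of a cascading Bloom filter. It is not a port of `build_filter.js`: the class and function names, the SHA-256 based hashing, and the sizing parameters (`bits_per_item`, `num_hashes`, `max_levels`) are all illustrative assumptions, and the two input sets are assumed to be disjoint.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter over strings (illustrative sizing, not tuned)."""

    def __init__(self, items, bits_per_item=10, num_hashes=4, salt=0):
        self.size = max(8, len(items) * bits_per_item)
        self.num_hashes = num_hashes
        self.salt = salt  # a different salt per level decorrelates the levels
        self.bits = bytearray((self.size + 7) // 8)
        for item in items:
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{self.salt}:{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def build_cascade(revoked, unrevoked, max_levels=64):
    """Build filter levels until a level has no false positives left."""
    levels = []
    include, exclude = set(revoked), set(unrevoked)
    while include:
        if len(levels) >= max_levels:
            raise RuntimeError("cascade did not converge; try more bits per item")
        bf = BloomFilter(include, salt=len(levels))
        levels.append(bf)
        # Members of `exclude` that the new level wrongly accepts must be
        # stored in the next level, which is then checked against `include`.
        false_positives = {x for x in exclude if x in bf}
        include, exclude = false_positives, include
    return levels


def is_revoked(levels, item):
    """Query the cascade; exact only for items in the sets it was built from."""
    for depth, bf in enumerate(levels):
        if item not in bf:
            # Missing from an even level: not revoked. From an odd level: revoked.
            return depth % 2 == 1
    return len(levels) % 2 == 1
```

Because each level stores only the false positives of the level before it, the cascade answers exactly for every certificate in the two input sets, while later levels stay much smaller than the first.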
Thanks to Eric Rescorla, J.C. Jones, James Larisch, the CRLite research team and the Mozilla Cryptography Engineering team.