We're in the process of developing a new version of our update server. It's now at the point where we have a development environment set up for it, and slaves attempt to submit data to it. Recently we came across an issue where slaves were unable to validate the SSL certificate of the development environment. Specifically, the client was raising this error from OpenSSL:
requests.exceptions.SSLError: [Errno 1] _ssl.c:480: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Our client uses the Python Requests library, and gives it a specific certificate to validate the server certificate against. For comparison we tried openssl's s_client with the same cert, and it validated the server certificate just fine. HRM.
I tried tons of things to get it to work - different versions of the Requests library, giving it the root + intermediate cert, different versions of Python - but nothing worked! I couldn't even reproduce the behaviour with pure openssl no matter what inputs I gave it. I reached out to #security for support and Yvan couldn't even reproduce the problem on his Mac!
Eventually I reached out to #python-requests, thinking it was a bug in the library. Someone there suggested strace'ing both my Python script and openssl. I did that, and found something very interesting: despite being given an explicit certificate bundle, openssl fell back to the system certificates -- my Python script didn't. To verify this, I changed my script to use the full set of system root certificates rather than just a specific certificate or bundle. After doing that, everything worked fine.
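To illustrate the difference that strace turned up, here is a minimal sketch using only Python's standard ssl module. It is not the Requests/PyOpenSSL code path from my script, and the helper names system_ca_locations and build_context are made up for this example. It shows where the local OpenSSL build looks for system certificates, and that a context built with an explicit CA file trusts only that file, while one built without it loads the system defaults.

```python
import ssl


def system_ca_locations():
    """Return the default CA file and directory paths for this OpenSSL build."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # default CA file, or None if it does not exist
        "capath": paths.capath,                  # default CA directory, or None if it does not exist
        "openssl_cafile": paths.openssl_cafile,  # compiled-in path to the CA file
        "openssl_capath": paths.openssl_capath,  # compiled-in path to the CA directory
    }


def build_context(cafile=None):
    """Build a client-side TLS context (hypothetical helper).

    With cafile set, only the certificates in that file are trusted;
    there is no fallback to the system store.
    With cafile=None, the platform's default CA certificates are loaded.
    """
    return ssl.create_default_context(cafile=cafile)


if __name__ == "__main__":
    # Print where this machine's OpenSSL looks for system certificates.
    for name, value in system_ca_locations().items():
        print(f"{name}: {value}")
```

With Requests, the equivalent knob is the verify argument, which accepts a path to a CA bundle file or directory, e.g. requests.get(url, verify="/path/to/ca-bundle.pem") (the path here is a placeholder, not the one we used).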
My belief is that one of PyOpenSSL, urllib3, or Requests doesn't know how to look at the system certificate store at all on Windows or Linux. On Mac, it seems to fall back on it just fine. One thing is certain: security is hard, especially debugging it.
Big thanks to Jake Maul, Rail, Yvan Boily, Dveditz, and #python-requests for helping me debug this.