@soatok I've been a fan of neither the Python cryptography library nor OpenSSL, but this was an extremely refreshing read. I'm happy someone else has caught on to the substantial problems with OpenSSL. Thank you for sharing!
@soatok Holy crap on a cracker, I knew things at OpenSSL were bad, but *this* bad? 🤔
@soatok I am not a user of OpenSSL, being more down in the hardware, but these design choices are insane.
As if the developers are pleasuring themselves instead of trying to provide a finite implementation of a finite specification.
@kurtmrufa @soatok The problems covered in the OP only scratch the surface. I invite you to look at the spaghetti code hellscape that is the OpenSSL source code.
@FritzAdalis @soatok @kevin As others have said this is not new, but the use of Perl for code generation seemingly has expanded.
Wait, what?
For a concrete comparison of the verbosity, performing an ML-KEM encapsulation with OpenSSL takes 37 lines with 6 fallible function calls. Doing so with BoringSSL takes 19 lines with 3 fallible function calls.
I’ve recently been working on a CHERIoT port of mlkem-native. Doing an ML-KEM encapsulation with that library is, uh, one function call. Because that’s what ML-KEM does. How would you split it into even three functions in an API for users?
As an aside, I’ve been hugely impressed with those projects. When I raised PRs (don’t worry, I didn’t touch the crypto implementations):
A library committed to security needs to make a long-term commitment to a migration to a memory safe programming language
It’s worth noting that there are tools for writing memory-safe C, which can be a better choice for portability (especially to embedded targets). The EverCrypt project produced C code, but effectively as an intermediate representation from their memory-safe source language. Both mlkem-native and mldsa-native use CBMC to prove memory safety. Neither of these approaches is applicable for arbitrary large C codebases but crypto libraries are not normal codebases.
The thing you can’t do is write a load of C with no tooling and then hope you get memory safety right later.
One of their criticisms (indirection layers killing performance) is not unique to OpenSSL. We were using Mbed TLS for some of our demos. When we moved to a newer version, we found that the PSA abstractions that they’d built added about 70 KiB of binary size to a minimal build. For reference, the total code size for a minimal build before the update was about 20 KiB.
> many OpenSSL source files are no longer simply C files, they now have a custom Perl preprocessor for their C code.
This horrifies me.
@soatok I've read it multiple times now and each time I'm baffled by the OSSL_PARAM thing. The given reason (having the same ABI for different algorithms) is not a great reason for adding this much complexity, and any other reason I can think of (ABI compatibility between versions) can be achieved in less complex and less error-prone ways. It feels like the kind of solution someone comes up with when they want to show just how clever they are.
@soatok Good read.
"We do not fully understand the motivations that led to the public APIs and internal complexity we’ve described here. We’ve done our best to reverse engineer them by asking “what would motivate someone to do this” and often we’ve found ourselves coming up short."
The purpose of the system is what it does. Cui bono?
@aoristdual @soatok This would get rejected from the Underhanded C Contest for being too obviously nefarious.
@david_chisnall @soatok I'm guessing "create", "do" and "destroy", for some object lifetime. How openssl makes it into 6 ... I dunno, maybe some BIO wrapper removal?
Hmm, possibly. But ML-KEM isn't like ML-DSA where you might choose to sign the entire document and not prehash, it's entirely operating on fixed-size objects (inputs and outputs).
I'm not sure about the algorithm details (and I treasure my ignorance, because it prevents me from thinking I can touch that code without breaking it). You might be able to separate the generate-key and encrypt-key steps, but my understanding was that they're fairly tightly coupled.
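To make the "fixed-size objects" point concrete, here is a rough sketch of what a single-call encapsulation interface looks like from the caller's side. The byte lengths are the ML-KEM-768 sizes from FIPS 203. Everything else is an assumption for illustration: `encaps_shape_only` is a hypothetical name, not the API of mlkem-native, BoringSSL, or OpenSSL, and the body is a stand-in that only produces outputs of the right length. It is not ML-KEM and provides no security.

```python
import hashlib
import os

# ML-KEM-768 byte lengths from FIPS 203.
EK_BYTES = 1184  # encapsulation (public) key
CT_BYTES = 1088  # ciphertext
SS_BYTES = 32    # shared secret


def encaps_shape_only(ek: bytes) -> tuple[bytes, bytes]:
    """Hypothetical single-call interface. NOT ML-KEM, NOT secure.

    The body is a placeholder that only returns outputs of the right
    sizes, so the shape of the call can be shown without a real
    implementation behind it.
    """
    # The one place a caller-visible failure can occur: a malformed key.
    if len(ek) != EK_BYTES:
        raise ValueError(f"encapsulation key must be {EK_BYTES} bytes")

    seed = os.urandom(32)  # fresh randomness for each encapsulation

    # Placeholder derivations; a real implementation does the lattice
    # arithmetic here instead of hashing.
    ct = hashlib.shake_256(b"ct" + seed + ek).digest(CT_BYTES)
    ss = hashlib.shake_256(b"ss" + seed + ek).digest(SS_BYTES)
    return ct, ss


# Caller's view: one fixed-size input, two fixed-size outputs, one call.
ciphertext, shared_secret = encaps_shape_only(bytes(EK_BYTES))
```

With nothing variable-length and no streaming, there is no obvious object whose lifetime needs separate create and destroy steps, which is why a one-function API is a natural fit here.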