@buherator@infosec.place I've thought about this, but my general conclusion was that the benchmarking strategies you would normally use for this just won't work. LLMs seem to be more efficient at finding bugs that are statically difficult to reach, but fuzzers are way faster for parsing bugs and (when grammars/semantic generators are used) even stateful bugs.
@buherator So you'd first need to compare the signal-to-noise ratio, which is approximately infinity, and ...