There's a lot of discourse on Twitter about people using LLMs to solve CTF challenges. I used to write CTF challenges in a past life, so I threw a couple of my hardest ones at one.
We're screwed.
At least with text-file style challenges ("source code provided" etc.), Claude Opus solves them quickly. For the "simpler" of the two, it ran through the steps to solve it very quickly. The more "ridiculous" challenge took a long while, and in fact as I type this it's still burning tokens "verifying" the flag even though it very obviously found the flag and knows it (the flag is leetspeak, and it recognized both that and that the flag is plausible). LLMs are, indeed, still completely unintelligent, because no human would waste time verifying a flag and second-guessing themselves when it is very obviously correct. (Also, you could just run it...)
But that doesn't matter, because it found it.
The thing is, CTF challenges aren't about coming up with the next great invention or having a rare spark of genius. CTF challenges are about learning things by doing. You're supposed to enjoy the process. The whole point of a well-designed CTF challenge is that anyone, given enough time and effort and self-improvement and learning, can solve it. The goal isn't actually to get the flag; otherwise you'd just ask another team for it (which is against the rules, of course). The goal is to get the flag by yourself. If you ask an LLM to get the flag for you, you aren't doing that.
We've invented service accounts all over again. MCP servers are quietly becoming the same overprivileged, under-monitored access brokers that have haunted enterprise security for years. Except this time, we're stacking them on top of the old ones.
https://go.aembit.io/s/mcp-servers-and-the-return-of-the-service-account-problem-25746
RE: https://techhub.social/@Techmeme/116177695971771546
Can't wait for Xbox to start giving people long form racism in Call of Duty.
Tired of guessing inputs? Let the computer do the work! Learn about symbolic execution from @barbie in "Reverse Engineering 3201" https://ost2.fyi/RE3201 and use SMT solvers to find the exact inputs to reach vulnerable code. Stop guessing, start solving!
I already knew that we use nonsense measurement systems here in the US. But only recently did I realize that a US gallon is different than a UK gallon.
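For a sense of scale, here is a rough sketch of the conversion. It assumes "US gallon" means the US liquid gallon (defined as 3.785411784 L) and "UK gallon" means the imperial gallon (defined as 4.54609 L); the function names are made up for illustration.

```python
# Litres per gallon, from the legal definitions of each unit.
US_GALLON_L = 3.785411784    # US liquid gallon (231 cubic inches)
IMPERIAL_GALLON_L = 4.54609  # UK (imperial) gallon


def us_to_imperial(us_gallons: float) -> float:
    """Convert US liquid gallons to imperial gallons (hypothetical helper)."""
    return us_gallons * US_GALLON_L / IMPERIAL_GALLON_L


def imperial_to_us(imperial_gallons: float) -> float:
    """Convert imperial gallons to US liquid gallons (hypothetical helper)."""
    return imperial_gallons * IMPERIAL_GALLON_L / US_GALLON_L


if __name__ == "__main__":
    # Print how many gallons of the other kind one gallon of each kind is.
    print(f"1 US gallon = {us_to_imperial(1):.4f} imperial gallons")
    print(f"1 imperial gallon = {imperial_to_us(1):.4f} US gallons")
```

So the UK gallon is roughly 20% bigger, which is worth remembering whenever someone quotes miles per gallon across the Atlantic.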
RE: https://infosec.exchange/@mr_phrazer/116166155203519881
I also published my Ghidra Headless MCP that follows similar design principles: https://github.com/mrphrazer/ghidra-headless-mcp
New blog post: Perfect types with `setHTML()` - https://frederikbraun.de/perfect-types-with-sethtml.html - TLDR: Use `require-trusted-types-for 'script'; trusted-types 'none';` in your CSP and nothing besides `setHTML()` works, essentially removing all DOM-XSS risks....
Composing Sanitizer configurations (https://frederikbraun.de/composable-sanitizers.html): The HTML Sanitizer API allows multiple ways to customize the default allow list and this blog post aims to describe a few variations and tricks we came up with while writing the specification.
Building a Super-Compact Cistercian Numerals Clock
https://hackaday.com/2026/03/08/building-a-super-compact-cistercian-numerals-clock/
Darknet Diaries 170: Phrack
"Phrack is legendary. It is the oldest, and arguably the most prestigious, underground hacking magazine in the world..."
I wrote a not very serious thing about #3Dprinter and #warhammer
https://matduggan.com/the-year-of-the-3d-printed-miniature-and-other-lies-we-tell-ourselves/