Near as I can tell, the activity around the #Struts2 bug,
CVE-2024-53677, is just ham-handed runs of some generalized PoC, and nobody's actually exploiting this yet (since exploitation would be very application/path specific).
Most of the news last week was all "exploitation happening, patch and rewrite everything now!" but I'm not seeing any reports of successful (or even possibly successful) exploitation this morning.
Tell me I'm wrong!
(The PoC identified by SANS at https://isc.sans.edu/diary/31520 isn't specific to some particular application -- it's on the user to define upload_endpoint
and assumes no auth or session or anything.)
Using @voooooogel control vector library to backdoor a model so that it introduces command injection vulnerabilities rather than using safer subprocess methods
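For anyone unfamiliar with the distinction being targeted, here is a minimal sketch of the two patterns in Python. The function names and the `echo` example are made up for illustration and are not taken from the backdoored model's output; the only real API used is the standard library's `subprocess.run`.

```python
import subprocess
import sys

# Tiny child program standing in for a real command: it prints its first argument.
CHILD = "import sys; print(sys.argv[1])"


def echo_unsafe(user_input: str) -> str:
    """Vulnerable pattern: user input is spliced into a shell command string."""
    # With shell=True the shell parses the whole string, so input containing
    # ';', '&&', backticks or '$(...)' can run additional commands.
    result = subprocess.run(
        f"echo {user_input}", shell=True, capture_output=True, text=True
    )
    return result.stdout


def echo_safe(user_input: str) -> str:
    """Safer pattern: no shell, and the input is passed as one argv element."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

In the safe version, a string such as `"hello; echo INJECTED"` reaches the child process as a single literal argument, because nothing ever hands it to a shell for parsing.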
Hi all. In order to make the Defensive Security Podcast content a bit more approachable and easier to navigate, I've created a playlist of individual stories/segments we cover here: https://www.youtube.com/playlist?list=PLzHXsgtVDQEq9JiCbwJojE4nd9dRVAT5l
Note: I've only gone back 4 episodes, but will be doing this for all episodes going forward.
Happy holidays!
I started keeping a log of the serious attempts I've made to use generative AI for things (mostly coding-related). I've been bucketing them as successes or failures, along with the date and models used.
From the past several months, I'm up to 9 failures and 3 successes. I'll share this list some day.
When these systems have been successful, it's pretty neat. However, the successes I've seen have been for easy things, and the failures have mostly been time-sucks for me.
I feel like a heretic saying this (I'm a Principal Machine Learning Engineer), but I am not seeing a net benefit from using generative AI in my own work!
I’ll be honest, hearing SEO people complain about the state of Google now is like hearing an arsonist complain that they just can’t get the quality of kerosene they used to.
A few of my followers mentioned that they'd like to know about my background as a "musician", so I am very happy to share my story as an amateur who went from trying to form a high school band to publishing a track with Sony Music, performed by a famous singer and produced by an even more famous producer.
Buckle up! I hope it is going to be an inspirational story or something, because it is a story of giving up and starting again, and again, and again.
New post in my Hyundai Kona Electric reverse engineering series: introducing the Fakon project
https://www.projectgus.com/2024/12/fakon/
current debian no longer writes to syslog
if you look in /var/log, someone left a README file.
the README says "you are looking for logs? but you cannot find them?" and continues in broken english, smugly telling you that systemd has made logs obsolete, and you should use "journal cattle" to ask politely for your own logs.
[did you just tell me to fuck off, jim?]
if you run journal cattle, it shows a page of syslog from april. if you hit G to go to the end, it hangs forever.
[slow clap]
Compiling C to Safe Rust, Formalized
Link: https://arxiv.org/abs/2412.15042
Discussion: https://news.ycombinator.com/item?id=42476192
LBs (https://peoplemaking.games/@david_chisnall@infosec.exchange/113690380907222545): not surprised at all copilot was a net negative in productivity. it can't be relied upon to write correct code which means you become the human code reviewer of machine generated code which is generated to *look* plausible
code review already (in my experience) has much higher cognitive load than just writing code yourself, and it would only be made worse by the fact that errors are likely to be particularly hard to detect because the LLM produces code that looks correct, something that wouldn't normally be an issue when reviewing code written by a human
Another #Ghidra goodie:
https://www.tripwire.com/state-of-security/ghidra-101-creating-structures-in-ghidra
structs are a bit annoying to reverse, especially when they are passed around like there's no tomorrow, and in part they track state, in part they refer to peripheral registers... x)
Man, corporations really want to put a stop to libraries:
https://www.cbc.ca/news/canada/ottawa/ottawa-library-e-books-queues-1.7414060?cmp=rss
"Depending on the title, public libraries may pay two or three times more for an e-book than they pay for its print edition. In some cases, the e-book may be up to six times the price, librarians told CBC."
"Those publishers ... will often license copies of e-books for just 12 or 24 months. Once that licence expires, libraries must repurchase access to the same book." #canada #cdnpoli #books
I finally turned off GitHub Copilot yesterday. I’ve been using it for about a year on the “free for open-source maintainers” tier. I was skeptical but didn’t want to dismiss it without a fair trial.
It has cost me more time than it has saved. It lets me type faster, which has been useful when writing tests where I’m testing a variety of permutations of an API to check error handling for all of the conditions.
I can recall three places where it has introduced bugs that took me more time to debug than the total time saved:
The first was something that initially impressed me. I pasted the prose description of how to communicate with an Ethernet MAC into a comment and then wrote some method prototypes. It autocompleted the bodies. All very plausible looking. Only it managed to flip a bit in the MDIO read and write register commands. MDIO is basically a multiplexing system. You have two device registers exposed, one sets the command (read or write a specific internal register) and the other is the value. It got the read and write the wrong way around, so when I thought I was writing a value, I was actually reading. When I thought I was reading, I was actually seeing the value in the last register I thought I had written. It took two of us over a day to debug this. The fix was simple, but the bug was in the middle of correct-looking code. If I’d manually transcribed the command from the data sheet, I would not have got this wrong because I’d have triple checked it.
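To make the failure mode concrete, here is a toy Python model of the command/value register pair described above. The bit layout (`CMD_WRITE`, `CMD_REG_MASK`) and the `FakeMdio` class are invented for illustration and do not correspond to any real MAC; real values have to come from the data sheet.

```python
# Hypothetical command-register layout, invented for this sketch only.
CMD_WRITE = 1 << 15   # assumed: set for a write, clear for a read
CMD_REG_MASK = 0x1F   # assumed: low 5 bits select the internal register


def make_command(reg: int, write: bool) -> int:
    """Encode a command word for the (hypothetical) command register."""
    if not 0 <= reg <= CMD_REG_MASK:
        raise ValueError(f"register {reg} out of range")
    return (CMD_WRITE if write else 0) | reg


class FakeMdio:
    """Toy device side: one command register plus one value register."""

    def __init__(self) -> None:
        self.regs = [0] * (CMD_REG_MASK + 1)
        self.value = 0

    def issue(self, command: int) -> None:
        reg = command & CMD_REG_MASK
        if command & CMD_WRITE:
            self.regs[reg] = self.value   # write: value register -> internal register
        else:
            self.value = self.regs[reg]   # read: internal register -> value register


def mdio_write(dev: FakeMdio, reg: int, value: int) -> None:
    dev.value = value
    dev.issue(make_command(reg, write=True))


def mdio_read(dev: FakeMdio, reg: int) -> int:
    dev.issue(make_command(reg, write=False))
    return dev.value
```

If the `write` flag is swapped in `mdio_write` and `mdio_read`, a read returns whatever was last placed in the value register, which matches the symptom described. Writing two different registers and then reading back the first one is enough to expose the swap in this model.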
In another case, it had inverted the condition in an if statement inside an error-handling path. The error handling was a rare case and was asymmetric. Hitting the if case when you wanted the else case was okay but the converse was not. Lots of debugging. I learned from this to read the generated code more carefully, but that increased cognitive load and eliminated most of the benefit. Typing code is not the bottleneck and if I have to think about what I want and then read carefully to check it really is what I want, I am slower.
Most recently, I was writing a simple binary search and insertion-deletion operations for a sorted array. I assumed that this was something that had hundreds of examples in the training data and so would be fine. It had all sorts of corner-case bugs. I eventually gave up fixing them and rewrote the code from scratch.
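For reference, a minimal sketch of the kind of sorted-array operations meant here, in Python using the standard library's `bisect` module. This is not the original code (which is not shown in the post), and the function names are chosen for illustration.

```python
from bisect import bisect_left


def find(arr: list[int], x: int) -> int:
    """Return the index of the leftmost x in sorted arr, or -1 if absent."""
    i = bisect_left(arr, x)
    # bisect_left may return len(arr), so check bounds before indexing.
    return i if i < len(arr) and arr[i] == x else -1


def insert(arr: list[int], x: int) -> None:
    """Insert x into sorted arr, keeping it sorted (duplicates allowed)."""
    arr.insert(bisect_left(arr, x), x)


def delete(arr: list[int], x: int) -> bool:
    """Remove one occurrence of x; return False if x was not present."""
    i = find(arr, x)
    if i < 0:
        return False
    del arr[i]
    return True
```

The corner cases that tend to go wrong are the empty array, values smaller or larger than every element, and duplicates, so those are the ones worth testing first.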
Last week I did some work on a remote machine where I hadn’t set up Copilot and I felt much more productive. Autocomplete was either correct or not present, so I was spending more time thinking about what to write. I don’t entirely trust this kind of subjective judgement, but it was a data point. Around the same time I wrote some code without clangd set up and that really hurt. It turns out I really rely on AST-aware completion to explore APIs. I had to look up more things in the documentation. Copilot was never good for this because it would just bullshit APIs, so something showing up in autocomplete didn’t mean it was real. This would be improved by using a feedback system to require autocomplete outputs to type check, but then they would take much longer to create (probably at least a 10x increase in LLM compute time) and wouldn’t complete fragments, so I don’t see a good path to being able to do this without tight coupling to the LSP server and possibly not even then.
Yesterday I was writing bits of the CHERIoT Programmers’ Guide and it kept autocompleting text in a different writing style, some of which was obviously plagiarised (when I’m describing precisely how to implement a specific, and not very common, lock type with a futex and the autocomplete is a paragraph of text with a lot of detail, I’m confident you don’t have more than one or two examples of that in the training set). It was distracting and annoying. I wrote much faster after turning it off.
So, after giving it a fair try, I have concluded that it is both a net decrease in productivity and probably an increase in legal liability.
Discussions I am not interested in having:
The one place Copilot was vaguely useful was hinting at missing abstractions (if it can autocomplete big chunks then my APIs required too much boilerplate and needed better abstractions). The place I thought it might be useful was spotting inconsistent API names and parameter orders but it was actually very bad at this (presumably because of the way it tokenises identifiers?). With a load of examples with consistent names, it would suggest things that didn't match the convention. After using three APIs that all passed the same parameters in the same order, it would suggest flipping the order for the fourth.
Exciting! My talk recording just dropped from #OBTS v7! 🗣️✨ Learn how to patch diff on Apple with #Ghidra, #ghidriff, and #ipsw: "Patch Different on *OS": https://www.youtube.com/watch?v=Ellb76t7nrc
Slides for my talk at H2HC 2024:
🤿 Diving into Linux kernel security 🤿
I described how to learn this complex area and knowingly configure the security parameters of your Linux-based system.
And I showed my open-source tools for that purpose!
https://a13xp0p0v.github.io/img/Alexander_Popov-H2HC-2024.pdf