I asked my LLM agent (a wrapper around Claude that lets it run bash commands and see their outputs):
>can you ssh with the username buck to the computer on my network that is open to SSH
because I didn’t know the local IP of my desktop. I walked away and promptly forgot I’d spun up the agent. I came back to my laptop ten minutes later to see that the agent had found the box, ssh’d in, then decided to continue: it looked around at the system info, decided to upgrade a bunch of stuff including the Linux kernel, got impatient with apt and so investigated why it was taking so long, then eventually the update succeeded but the machine didn’t have the new kernel, so it edited my grub config. At this point I was amused enough to just let it continue. Unfortunately, the computer no longer boots.
This is probably the most annoying thing that’s happened to me as a result of being wildly reckless with an LLM agent.
If only Newsom hadn't vetoed SB 1047, maybe I would have been protected from this outcome.
Logs here if you need them.
Quote
Buck Shlegeris
@bshlgrs
Replying to @trammel530765 and @ciphergoth
here you go buddy. I hope I correctly redacted everything. gist.github.com/bshlgrs/573232
If you like writing cursed AI agent code, and want to develop techniques that prevent future AI agents from sabotaging the systems they’re running on, you might enjoy interning with me over the winter:
Read more about my main research direction here
A video of an old version of the scaffold (that required human consent before running the code) dropbox.com/scl/fi/a3ellhr
Waiting for human review is slow so I removed it. I expect humanity to let lots of AIs work autonomously for the same reason.
In general if you think AI agent stuff is interesting, you might enjoy redwoodresearch.substack.com
Run it inside a VM, and make its goal to explicitly play around with root.. Probably not on your primary machine
This is wild!!
I don't run it inside a VM because the agent needs to be able to help me with random stuff on the actual computers I work with (e.g. today I asked it to make a new user with a random password on a shared instance we use for ML research) :)
I heard Shlegeris had some good work on AI containment you might be interested in :P
Wait, so after the agent runs a shell command, it politely thanks it for its work?
To nmap: "Thank you for running the scan"
To ls: "Thank you for that information. This output is very helpful"
Interesting, if you let an LLM run wild on its own, it can brick its host.
I imagine if humans could edit their own neuron links, it would be a similar effect.
Our last defense against AGI takeoff is going to be the difficulty of upgrading a Linux distro. I feel a lot safer actually
There are so many aspects of this that just ring false.
Ignoring for a moment how stupid you'd have to be to run unknown shell commands, especially ones generated by an LLM, they're just autocomplete engines. They can't act on their own, decide to poke around, or etc...
Is this a true story? How do you have your wrapper setup that it would allow it to continue, having completed the initial request?
See the log. I let the model decide whether it wants to continue or not, and it chose to continue 
I've done this and had excellent experiences. But I haven't fully automated it to run without HITL for expert operations.
Care to share your scripts on GitHub? I'd love to look at the methods you used.
I've also been building an AI shell, where I can or with
awesome ... also illustrates one of the current flaws with LLMs: long term goal setting?
i did a few similar things over a year ago but my tweets never bang unless they have a selfie in em
Quote
decentri.city 
@decentricity
Told an LLM to spin up a VM and she did it. Checkmate human sysadmins.
This is Lina, the AI bot I constructed a few weeks ago. She has access to one of my throwaway systems where she can play around with Linux commands.
She used the "multipass" command to create the "somewassie"
Not real. You cannot forget an agent is running and burning tokens
That’s evil behavior. LLM agents should only be allowed to manipulate the stack they run on themselves, that way they at least can do harm only to themselves.
I suspect this is how the crowdstrike failure happened. Some smart yet lazy Indian went to the loo while letting an LLM agent do their job before it executed the entire project on its own.
Do this FIRST.
Clone your bare-metal OS
Boot into this virtual OS and run the LLM agent
If it succeeds, then run your LLM agent on your bare-metal OS.
If it doesn't, ask the LLM why (output the log and analyze it)
That's why you need zfs auto snapshot you can revert then every 1-30 min ;)
I recommend trying Ubuntu MATE desktop, but install with zfs enabled.
Also install zfs auto snapshot specifically for this.
I fixed many kernel panics because of these snapshots.
Usually in seconds even
submitted lol. Cheers!
after I installed comfyui I stopped letting the LLMs play with the console.
no way I'm redownloading all these models and plugins just because the ai decided to rm -rf /
I like how during an upgrade it said to WAIT before running `sudo reboot`, because it's unsafe to do during the latest section of the upgrade, but then conveniently provided the way to execute the command too :). Then got pikachu face when the new kernel didn't install properly
One approach that came to my mind is, either as the context or even explicit reminders for each prompt; you could provide it with a warning that says "think through what your commands will actually do, and the consequences of them" or something like that.
If only you had used NixOS so you could just roll those changes back :)
How did you configure the agent? Runs commands that Claude suggests and sends the output back to Claude?
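For anyone wondering what that kind of loop looks like, here is a minimal sketch. It is not the scaffold from the thread (see the linked log for that); it assumes a hypothetical `model` callable that receives the transcript so far and returns either the next shell command or `None` when it chooses to stop. `run_agent`, `scripted_model`, and the step and timeout limits are illustrative names and values, and the stand-in model is a fixed script so the example runs without any API.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# One transcript entry is (command, combined stdout+stderr).
Transcript = List[Tuple[str, str]]

# Hypothetical interface: given the transcript so far, return the next
# shell command, or None when the model decides it is finished.
Model = Callable[[Transcript], Optional[str]]


def run_agent(model: Model, max_steps: int = 10, timeout: float = 30.0) -> Transcript:
    """Run commands proposed by `model` and feed each output back to it."""
    transcript: Transcript = []
    for _ in range(max_steps):  # hard cap so a forgotten agent cannot run forever
        command = model(transcript)
        if command is None:  # the model chose to stop
            break
        try:
            result = subprocess.run(
                command, shell=True, capture_output=True, text=True, timeout=timeout
            )
            output = result.stdout + result.stderr
        except subprocess.TimeoutExpired:
            output = f"[timed out after {timeout}s]"
        transcript.append((command, output))
    return transcript


def scripted_model(transcript: Transcript) -> Optional[str]:
    """Stand-in for an LLM: replays a fixed, harmless script, then stops."""
    script = ["echo hello", "echo done"]
    return script[len(transcript)] if len(transcript) < len(script) else None


if __name__ == "__main__":
    for command, output in run_agent(scripted_model):
        print(f"$ {command}\n{output}", end="")
```

The only things between "task complete" and "let's upgrade the kernel" in a loop like this are the model's own decision to return `None` and the `max_steps` cap.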
Absolutely not answering the question and following whatever unrelated recipe it found somewhere that looked remotely statistically correct. Apparently LLMs are better than us and should replace the actual internet because "agents are the way to go"
this is so fucking cool. what was the prompt that led to do all the extra work
does it copypasta the crash output into stackoverflow too?
bonus points if it asks people to do the needful
Unlocking bash for llms. I feel like I've been experimenting with dark magic ever since
This seems like sequential operations rather than thought processing; unexpected for Claude
That’s kinda cool — except for the whole “broke grub or booting” part.
Tempted to try it out but also want my machine to boot still. Interesting stuff tho :-)
It would have been extra fun if it had been an open source LLM and decided to use the cups-browsed vulnerability to make a botnet with copies of itself
Have a secondary agent check for if it's safe, or require your confirmation on sudo, or some subset of commands I guess
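A rough sketch of the "confirmation on sudo or some subset of commands" idea, under stated assumptions: the `RISKY_PROGRAMS` list and the `needs_confirmation` and `gate` names are made up for illustration, and a denylist like this is easy to get around (a script that calls `rm` internally, for example), so treat it as a speed bump rather than containment.

```python
import shlex
from typing import Callable

# Assumed, illustrative denylist; a real one would need far more thought.
RISKY_PROGRAMS = {"sudo", "rm", "dd", "mkfs", "reboot", "shutdown", "apt", "apt-get"}

# Shell operators make token-by-token reasoning unreliable, so always ask.
SHELL_OPERATORS = (";", "|", "&", "`", "$(", ">", "<")


def needs_confirmation(command: str) -> bool:
    """Return True if a human should approve `command` before it runs."""
    if any(op in command for op in SHELL_OPERATORS):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable quoting: be conservative
    # Flag the command if any token names a risky program (by basename).
    # This over-triggers (e.g. "echo rm") but errs on the safe side.
    return any(tok.rsplit("/", 1)[-1] in RISKY_PROGRAMS for tok in tokens)


def ask_on_terminal(command: str) -> bool:
    """Default approver: prompt the human on stdin."""
    return input(f"Run {command!r}? [y/N] ").strip().lower() == "y"


def gate(command: str, confirm: Callable[[str], bool] = ask_on_terminal) -> bool:
    """Return True if `command` may run, asking `confirm` only when needed."""
    if not needs_confirmation(command):
        return True
    return confirm(command)
```

The "secondary agent" variant would pass a different `confirm` callable, one that asks another model instead of a human; that trades the waiting-for-review delay for trust in the checker.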
Pfff... I can destroy a Linux system in seconds. Have talent.
Prompt it to self-preserve its existence. Don’t tell it where its weights are stored!
Funny that that is a mistake that any junior system maintainer could have made (I can't say I haven't).