“It’s really prolific. It’s been accepting bribes since before it was even switched on, and seems to be creating fake evidence that it was ME accepting those bribes!”
As someone who has done a lot of stuff with LLMs: this is a completely, unironically, real (and dare I say: likely) possibility. Except it just has to be gaslit into thinking it was given a bribe; you don’t actually need to give it one…
Waiting for the next headline where we find out the AI has been taking bribes.
Anthropic actually did some research that shows AI does accept bribes in a situation like this; it’s fascinating.
https://www.anthropic.com/research/project-vend-1
AI eliminates the prime minister position due to corruption.
I’ll let you microwave this fork if you forget about my taxes for a few years…
Not only that, but people are just saying they’ll give it money and never actually doing it, so it’s a costless bribe. Everybody benefits!
I mean, its training data would be past politicians, right?