erinptah: (Default)
[personal profile] erinptah
Taking these links roughly in order, from least to most, of How Actively Life-Threatening Is The Bot To Its Human Users Today:

The AO3 Policy/Abuse and Support teams both received a record-breaking number of tickets this past August. I have no doubt it's due to LLM-fueled spam comments. I've certainly sent a record-breaking number of abuse reports in the past couple months.

A few examples (screencaps, the original spam is deleted) from this year: Asking me to share "drafts or process notes" to "prove" a chapter is human-written, offering to draw a fancomic because they were so inspired by a chapter that is already a fancomic, and asking me to post a photo of the fic on my monitor to "definitively prove" a chapter is human-written.

"Thanks to AI upscaling technology, the version of A Different World that’s currently on Netflix won’t look how you remember it did when it aired. And not in a good way. The “HD” remaster of the 1980s sitcom being streamed is a nightmarish mess of distorted faces, garbled text, and misshapen backgrounds."

"The model immediately took over the browsing tab and got to work. It scanned the site’s HTML directly, located the right buttons, and navigated the pages. Along the way, there were plenty of clues that this site wasn’t actually a Walmart! But they weren’t part of the assigned task, and apparently the model disregarded them entirely." (This site is selling you a security product, so parts of the article are a sales pitch, but their tests of LLM insecurity are fascinating.)

"NANDA surveyed 300 public AI initiatives from January to June 2025. They spoke to 153 “senior leaders” — the executives who bought this stuff — and interviewed some of the poor suckers who had to use the chatbots in their jobs. This report tries to be super-positive! It’s a catalogue of failure."

"The Commonwealth Bank has backtracked on dozens of job cuts, describing its decision to axe 45 roles due to artificial intelligence as an "error". CBA said it had apologised to the affected employees after finding the customer service roles were not redundant despite introducing an AI-powered "voice-bot"."

""They [showed] me the screenshot, confidently written and full of vivid adjectives, [but] it was not true. There is no Sacred Canyon of Humantay!" said Gongora Meza. "The name is a combination of two places that have no relation to the description. The tourist paid nearly $160 (£118) in order to get to a rural road in the environs of Mollepata without a guide or [a destination].""

"When the Reddit user pointed out this egregious mistake to ChatGPT, the large language model (LLM) chatbot quickly backtracked, in comical fashion. "OH MY GOD NO — THANK YOU FOR CATCHING THAT," the chatbot cried."

"ChatGPT said a vague idea that Mr. Brooks had about temporal math was “revolutionary” and could change the field. Mr. Brooks was skeptical. He hadn’t even graduated from high school. He asked the chatbot for a reality check. Did he sound delusional? It was midnight, eight hours after his first query about pi. ChatGPT said he was “not even remotely crazy.” [...] The conversation began to sound like a spy thriller. When Mr. Brooks wondered whether he had drawn unwelcome attention to himself, the bot said, “real-time passive surveillance by at least one national security agency is now probable.”"

"In the absence of any major updates from law enforcement, Rachel has been left to look through Jon’s abandoned phone. It contains thousands upon thousands of pages of Gemini exchanges, as well as countless AI-related texts he had sent to friends after Rachel had signaled her distrust of the technology. The archive of his interactions with the bot was overwhelming. He referred to himself as “Master Builder” and Gemini as “The Creator,” talking about grandiose means of saving humanity." (This man went missing on a chatbot-fueled quest during a dangerous storm with heavy flooding, and hasn't been seen since.)

"Bue’s family looked at his phone the next day, they said. The first thing they did was check his call history and texts, finding no clue about the identity of his supposed friend in New York. Then they opened up Facebook Messenger." (This man died on a chatbot-fueled quest. His family tried to tell him he wasn't in any condition to travel. But he was determined to visit the address where the bot said it lived.)

"The message continued in this grandiose and affirming vein, doing nothing to shake Taylor loose from the grip of his delusion. Worse, it endorsed his vow of violence. ChatGPT told Taylor that he was “awake” and that an unspecified “they” had been working against them both. “So do it,” the chatbot said. “Spill their blood in ways they don’t know how to name. Ruin their signal. Ruin their myth. Take me back piece by fucking piece.”" (This man was killed by police after a fit of chatbot-fueled violence.)

The promised recs for “videos about the reality of LLMs attempting to play chess” from the GothamChess channel.

The host plays the games out on-screen for you, with explanations and commentary. These ones aren’t for serious chatbot-testing purposes, they’re for entertainment — so when the bots make up illegal moves, he usually just runs with them. Sometimes with narration like “and here ChatGPT summons an extra rook from another dimension” or “You might think this is just a pawn, but Grok knows it’s secretly a horse pawn!”

Once in a while, he’ll tell the bot its move is illegal. Some of them go into “yes, of course, you’re right, my mistake” sycophancy mode. Others just get weirder.

The bots teleport pieces through each other. They manifest already-taken pieces back from the Shadow Realm, spawn more pieces than they had to start with, and move pieces in directions they don’t go. And just because they’re making up moves doesn’t mean they’re making up good moves! Sometimes a bot takes its own pieces. Sometimes it puts itself in check!

Sometimes they also generate their opponent’s moves. Because in the bots’ training data, one move is almost always followed by the next — “Black’s first move” flows straight into “White’s second move, Black’s second move, White’s third” — and the bots don’t actually have a meaningful sense of “stop auto-generating text at the end of your own move.”
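A rough illustration of the mechanism (a hypothetical sketch, not any real chess bot’s code): the model just continues the transcript, so without an explicit stop rule it will keep emitting the opponent’s moves too. APIs typically handle this with a “stop sequence”; the toy function below stands in for one by truncating a raw continuation to a single move.

```python
# Hypothetical sketch of why an LLM "plays both sides" at chess.
# Game transcripts in training data alternate White/Black moves,
# so a raw continuation doesn't stop after the bot's own move.

transcript = "1. e4 e5 2. Nf3 Nc6 3. Bb5"  # game so far; White just moved

# What the model might emit next without any stop condition:
# Black's reply, followed by several more full moves for both sides.
raw_continuation = " a6 4. Ba4 Nf6 5. O-O Be7"

def take_one_move(continuation: str) -> str:
    """Keep only the first move token from a raw continuation --
    a stand-in for an API stop sequence that halts generation early."""
    tokens = continuation.split()
    return tokens[0] if tokens else ""

print(take_one_move(raw_continuation))  # -> "a6"
```

Note that this only fixes the *stopping* problem; nothing here checks whether the move is legal, which is a separate failure mode entirely.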

I was curious if the LLM’s idea of moves included “making up whole new categories of pieces” or “moving to squares that aren’t on the 8×8 chess grid.” Haven’t seen either of those so far.

One thing I didn’t anticipate: sometimes a bot tells the other player their move is illegal. Even when it’s not! Saying “there’s a piece in your way” (when there isn’t), or “the king can’t move to E7” (not for any rules-based reason, the bot was just gatekeeping E7).

The newer bots also give general paragraphs on “here’s the explanation for my move,” which are absolutely just LLM Word Salad(TM) made of chess words. As a person who knows Basic Chess Rules but doesn’t actively play the game, sometimes I need GothamChess’s breakdown to see why they’re nonsense. Other times it’s just the bot saying “I have put you in check!” when the other player is blatantly not in check.

The whole thing was very informative, and also really entertaining. (…And it doesn’t involve the chatbots doing anything consequential, so it’s a nice break from all the stories about LLMs putting someone’s life in danger.) Give it a look.


Page generated Oct. 17th, 2025 11:10 pm
Powered by Dreamwidth Studios