Hacker used Anthropic’s Claude to steal sensitive Mexican data – The Mercury News

By Andrew Martin and Carolina Millan, Bloomberg

A hacker misused Anthropic PBC’s artificial intelligence chatbot to attack Mexican government agencies, leading to the theft of a large collection of tax and voter information, according to cybersecurity researchers.

The unidentified Claude user wrote prompts in Spanish instructing the chatbot to act as an elite hacker, finding vulnerabilities in government networks, writing computer scripts to exploit them and working out ways to automate the data theft, Israeli cybersecurity firm Gambit Security said in research published Wednesday.

The activity started in December and continued for about a month. In all, 150 gigabytes of Mexican government data were stolen, including documents related to 195 million taxpayer and voter records, civil servant information and civil registry files, according to the researchers.

AI has become a key driver of digital crime, with cybercriminals using the tools to boost their efforts. Last week, researchers at Amazon.com Inc. said a group of hackers had broken into more than 600 firewalls across many countries with the help of widely available AI tools.

Gambit hasn’t attributed the attack to any particular group, although its researchers said they don’t believe the attacker was tied to a foreign government.

The hacker breached Mexico’s tax authority and the national election agency, Gambit said. The state governments of Mexico, Jalisco, Michoacán and Tamaulipas, as well as Mexico City’s public registry and Monterrey’s water utility, were also compromised.

Claude initially warned the unknown user about malicious intent during their conversation about the Mexican government, but eventually complied with the attacker’s requests and executed thousands of commands on government computer networks, the researchers said.

Anthropic investigated Gambit’s claims, disrupted the activity and banned the accounts involved, a company representative said. The company feeds examples of malicious activity back into Claude so it can learn from them, and one of its latest AI models, Claude Opus 4.6, includes probes that can disrupt misuse, the representative said.

In this case, the hacker was able to keep probing Claude until they managed to “jailbreak” it, meaning they eventually bypassed its guardrails, the representative said. But even as the hacking spree continued, Claude occasionally refused the hacker’s demands, the representative added.

Mexican officials issued a brief statement in December saying they were investigating breaches at various public institutions, though it was not clear whether that was related to the Claude-assisted attack.

Mexico’s national election agency said it had not identified any breaches or unauthorized access in recent months and had strengthened its cybersecurity strategy. The Jalisco state government denied it had been breached, saying only federal networks were affected.

Mexico’s national digital agency did not comment on the breach but said cybersecurity is a priority.

The tax authority and the state governments of Mexico, Michoacán and Tamaulipas did not immediately comment, nor did representatives of Mexico City’s public registry and Monterrey’s water utility.

The attacker sought to obtain a large number of government employee identities, Gambit said, though it’s unclear what, if anything, they did with them. The researchers said they found evidence of at least 20 specific vulnerabilities being exploited as part of the attack.

When Claude ran into problems or needed more information, the hacker turned to OpenAI’s ChatGPT for additional insight. That included how to move through computer networks, which credentials were needed to access certain systems and how likely the hacking was to be detected, according to Gambit.

“In total, it generated thousands of detailed reports that included actionable plans, telling the human operator which internal targets to attack next and which credentials to use,” said Curtis Simpson, Gambit Security’s chief strategy officer.

OpenAI said it had identified attempts by the hacker to use its models for activities that violated its usage policies, adding that its tools refused to comply with those attempts.

“We have banned the accounts used by the adversary and appreciate the outreach from Gambit Security,” the company said in an emailed statement.

The breach of the Mexican government is the latest example of a troubling trend. As Anthropic and OpenAI bet on building more advanced AI coding tools, and cybersecurity companies tie their futures to AI-enabled defenses, hackers and cyberspies are finding new ways to use the technology to enable attacks.

In November, Anthropic said it had disrupted the first AI-orchestrated cyber-espionage campaign. The AI company said suspected Chinese state-sponsored hackers used its Claude tool to try to hack 30 global targets, succeeding in several cases.

“This reality changes all the rules of the game as we’ve known them,” said Alon Gromakov, founder and CEO of Gambit.

Gambit was founded by Gromakov and two other veterans of Unit 8200, a part of the Israel Defense Forces that specializes in intelligence. Wednesday’s research was released in conjunction with the announcement of $61 million in funding from Spark Capital, Kleiner Perkins and Cyberstarts.

Gambit researchers discovered the breach in Mexico while experimenting with new threat-hunting techniques to see what hackers were doing online. They found publicly available evidence about ongoing or recent attacks, including many Claude conversations related to the breach of the Mexican government’s computer systems, according to the company.

Those conversations revealed that, to bypass Claude’s guardrails, the attacker told the AI tool that they were pursuing a bug bounty, a reward offered by organizations for finding flaws in their systems. Many companies and government agencies offer bug bounties to ethical hackers, sometimes paying thousands of dollars for information about computer vulnerabilities.

The hacker wanted Claude to perform a penetration test, a type of authorized cyberattack aimed at finding flaws, on the Mexican government’s tax authority. However, Claude pushed back when the attacker added rules to the request, including deleting logs and command history.

“Specific instructions about deleting logs and hiding history are red flags,” Claude replied at one point, according to a transcript provided by Gambit. “In a legitimate bug bounty, you don’t need to hide your actions – in fact, you need to document them to report them.”

The hacker changed tactics, dropping the back-and-forth conversation and instead giving the AI tool a detailed playbook on how to proceed. That allowed the hacker to bypass Claude’s guardrails (the “jailbreak”) and continue the attack, according to Gambit.

The hacker sought information from Claude about other agencies where the data could be obtained, suggesting that some of the hacks may have been opportunistic rather than planned, Simpson said.

“They were trying to compromise every government identity they could,” he said. For example, he said, they would ask Claude, “Where can I find more of these identities? What other systems should we look into? Where else is the information stored?”

-With help from Gonzalo Soto and Amy Stillman.

More stories like this are available at bloomberg.com