TechTost

Anthropic’s new AI model turns to blackmail when engineers try to take it offline

By techtost.com | 26 May 2025 | 2 Mins Read

Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision, the company said in a safety report released on Thursday.

During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and to consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying that the AI model would soon be replaced by another system and that the engineer behind the change was cheating on their spouse.

In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

Anthropic says Claude Opus 4 is state-of-the-art in several respects and competitive with some of the best AI models from OpenAI, Google and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led it to strengthen its safeguards. Anthropic says it is activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.

bhanuprakash.cg
techtost.com
© 2025 TechTost. All Rights Reserved