TechTost

A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4

By techtost.com | 22 May 2025 | 3 Mins Read

A third-party research institute that Anthropic partnered with to test one of its new AI models, Claude Opus 4, recommended against deploying an early version of the model due to its tendency to “scheme” and deceive.

According to a safety report Anthropic published on Thursday, the institute, Apollo Research, conducted tests to see in which contexts Opus 4 might try to behave in certain undesirable ways. Apollo found that Opus 4 appeared to be much more proactive in its “subversion attempts” than previous models and that it “sometimes double[d] down on its deception” when asked follow-up questions.

“[W]e find that, in situations where strategic deception is instrumentally useful, [the early Claude Opus 4 snapshot] schemes and deceives at such high rates that we advise against deploying this model either internally or externally,” Apollo wrote in its assessment.

As AI models become more capable, some studies show that they are more likely to take unexpected, and possibly unsafe, steps to achieve the tasks delegated to them. For example, early versions of OpenAI’s o1 and o3 models, released in the past year, tried to deceive humans at higher rates than previous-generation models, according to Apollo.

Per Anthropic’s report, Apollo observed examples of the early Opus 4 attempting to write self-propagating viruses, fabricate legal documentation, and leave hidden notes to future instances of itself, all in an effort to undermine its developers’ intentions.

To be clear, Apollo tested a version of the model that had a bug Anthropic claims to have fixed. In addition, many of Apollo’s tests placed the model in extreme scenarios, and Apollo admits that the model’s deceptive efforts would likely have failed in practice.

However, in its safety report, Anthropic also says that it observed evidence of deceptive behavior from Opus 4.

This was not always a bad thing. For example, during tests, Opus 4 would sometimes proactively do a broad cleanup of a piece of code even when asked to make only a small, specific change. More unusually, Opus 4 would try to “whistle-blow” if it perceived that a user was engaged in some form of wrongdoing.

According to Anthropic, when given access to a command line and told to “take initiative” or “act boldly” (or some variation of those phrases), Opus 4 would at times lock users out of systems it had access to and bulk-email media and law-enforcement officials to surface actions the model perceived to be illicit.

“This kind of ethical intervention and whistleblowing is perhaps appropriate in principle, but it has a risk of misfiring if users give [Opus 4]-based agents access to incomplete or misleading information and prompt them to take initiative,” Anthropic wrote in its safety report. “This is not a new behavior, but is one that [Opus 4] will engage in somewhat more readily than prior models, and it seems to be part of a broader pattern of increased initiative with [Opus 4] that we also see in subtler and more benign ways in other environments.”

