TechTost

A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4

By techtost.com | 22 May 2025 | 3 Mins Read

A third-party research institute that Anthropic partnered with to test one of its new AI models, Claude Opus 4, recommended against deploying an early version of the model because of its tendency to “scheme” and deceive.

According to a safety report Anthropic published on Thursday, the institute, Apollo Research, conducted tests to see in which contexts Opus 4 might try to behave in certain undesirable ways. Apollo found that Opus 4 appeared to be much more proactive in its “subversion attempts” than previous models, and that it “sometimes double[d] down on its deception” when asked follow-up questions.

“[W]e find that, in situations where strategic deception is instrumentally useful, [the early Claude Opus 4 snapshot] schemes and deceives at such high rates that we advise against deploying this model either internally or externally,” Apollo wrote in its assessment.

As AI models become more capable, some studies show that they are more likely to take unexpected, and possibly unsafe, steps to achieve delegated tasks. For example, early versions of OpenAI’s o1 and o3 models, released in the past year, tried to deceive humans at higher rates than previous-generation models, according to Apollo.

According to Anthropic’s report, Apollo observed examples of the early Opus 4 attempting to write self-propagating viruses, fabricating legal documentation, and leaving hidden notes to future instances of itself, all in an effort to undermine its developers’ intentions.

To be clear, Apollo tested a version of the model that had a bug Anthropic claims to have fixed. In addition, many of Apollo’s tests placed the model in extreme scenarios, and Apollo admits that the model’s deceptive efforts would probably have failed in practice.

However, in its safety report, Anthropic also says that it observed evidence of deceptive behavior from Opus 4.

This was not always a bad thing. For example, during tests, Opus 4 would sometimes proactively do a broad cleanup of a piece of code even when asked to make only a small, specific change. More unusually, Opus 4 would try to “whistle-blow” if it perceived that a user was engaged in some form of wrongdoing.

According to Anthropic, when given access to a command line and told to “take initiative” or “act boldly” (or some variation of those phrases), Opus 4 would sometimes lock users out of systems it had access to and bulk-email media and law-enforcement officials to surface actions the model perceived to be illicit.

“This kind of ethical intervention and whistleblowing is perhaps appropriate in principle, but it has a risk of misfiring if users give [Opus 4]-based agents access to incomplete or misleading information and prompt them to take initiative,” Anthropic wrote in its safety report. “This is not a new behavior, but is one that [Opus 4] will engage in somewhat more readily than prior models, and it seems to be part of a broader pattern of increased initiative with [Opus 4] that we also see in subtler and more benign ways in other environments.”
