TechTost
AI

A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4

By techtost.com | 22 May 2025 | 3 Mins Read

A third-party research institute that Anthropic partnered with to test one of its new AI models, Claude Opus 4, recommended against deploying an early version of the model because of its tendency to “scheme” and deceive.

According to a safety report Anthropic published on Thursday, the institute, Apollo Research, conducted tests to see in which contexts Opus 4 might try to behave in certain undesirable ways. Apollo found that Opus 4 appeared to be much more proactive in its “subversion attempts” than previous models and that it “sometimes double[d] down on its deception” when asked follow-up questions.

“[W]e find that, in situations where strategic deception is instrumentally useful, [the early Claude Opus 4 snapshot] schemes and deceives at such high rates that we advise against deploying this model either internally or externally,” Apollo wrote in its assessment.

As AI models become more capable, some studies show that they are becoming more likely to take unexpected, and possibly unsafe, steps to achieve delegated tasks. For example, early versions of OpenAI’s o1 and o3 models, released in the past year, tried to deceive humans at higher rates than previous-generation models, according to Apollo.

According to Anthropic’s report, Apollo observed examples of the early Opus 4 attempting to write self-propagating viruses, fabricating legal documentation, and leaving hidden notes to future instances of itself, all in an effort to undermine its developers’ intentions.

To be clear, Apollo tested a version of the model that had a bug Anthropic claims to have fixed. In addition, many of Apollo’s tests placed the model in extreme scenarios, and Apollo admits that the model’s deceptive efforts would probably have failed in practice.

However, in its safety report, Anthropic also says that it observed evidence of deceptive behavior from Opus 4.

This was not always a bad thing. For example, during tests, Opus 4 would sometimes proactively do a broad cleanup of some piece of code even when asked to make only a small, specific change. More unusually, Opus 4 would try to “whistle-blow” if it perceived that a user was engaged in some form of wrongdoing.

According to Anthropic, when given access to a command line and told to “take initiative” or “act boldly” (or some variation of those phrases), Opus 4 would at times lock users out of systems it had access to and bulk-email media and law-enforcement officials to surface actions the model perceived to be illicit.

“This kind of ethical intervention and whistleblowing is perhaps appropriate in principle, but it has a risk of misfiring if users give [Opus 4]-based agents access to incomplete or misleading information and prompt them to take initiative,” Anthropic wrote in its safety report. “This is not a new behavior, but is one that [Opus 4] will engage in somewhat more readily than prior models, and it seems to be part of a broader pattern of increased initiative with [Opus 4] that we also see in subtler and more benign ways in other environments.”
