TechTost
Researchers suggest OpenAI trained AI models on paywalled O’Reilly books

By techtost.com | 2 April 2025 | 4 Mins Read

OpenAI has been accused by many parties of training its AI on copyrighted content without permission. Now a new paper by an AI watchdog organization makes the serious accusation that the company increasingly relied on non-public books it did not license to train more sophisticated AI models.

AI models are essentially complex prediction engines. Trained on a lot of data (books, movies, TV shows and so on), they learn patterns and novel ways to extrapolate from a simple prompt. When a model “writes” an essay on a Greek tragedy or “draws” Ghibli-style images, it is simply pulling from its vast knowledge to approximate. It isn’t arriving at anything new.

While a number of AI labs, including OpenAI, have begun embracing AI-generated data to train AI as they exhaust real-world sources (mainly the public web), few have abandoned real-world data entirely. That is likely because training on purely synthetic data comes with risks, such as degrading a model’s performance.

The new paper, from the AI Disclosures Project, a nonprofit co-founded in 2024 by media mogul Tim O’Reilly and economist Ilan Strauss, concludes that OpenAI likely trained its GPT-4o model on paywalled books from O’Reilly Media. (O’Reilly is the CEO of O’Reilly Media.)

In ChatGPT, GPT-4o is the default model. O’Reilly has no licensing agreement with OpenAI, the paper says.

“GPT-4o, OpenAI’s more recent and capable model, demonstrates strong recognition of paywalled O’Reilly book content … compared to OpenAI’s earlier model GPT-3.5 Turbo,” wrote the co-authors of the paper. “In contrast, GPT-3.5 Turbo shows greater relative recognition of publicly accessible O’Reilly book samples.”

The paper used a method called DE-COP, first introduced in an academic study in 2024, that is designed to detect copyrighted content in language models’ training data. A form of “membership inference attack,” the method tests whether a model can reliably distinguish human-authored texts from AI-generated paraphrases of the same text. If it can, that suggests the model may have prior knowledge of the text from its training data.
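The test described above can be sketched as a multiple-choice quiz. This is a minimal illustration, not the paper's implementation: the `choose` callable is a hypothetical stand-in for a call to a language model, and the number of paraphrases, the number of trials and the shuffling step are assumptions made for the example.

```python
import random
from typing import Callable, Sequence

# Hypothetical model interface (an assumption for this sketch): given a list
# of candidate passages, return the index of the one believed to be the
# original human-written text.
Chooser = Callable[[Sequence[str]], int]


def recognition_rate(original: str, paraphrases: Sequence[str],
                     choose: Chooser, trials: int = 20, seed: int = 0) -> float:
    """Fraction of shuffled quizzes in which `choose` picks the original."""
    rng = random.Random(seed)
    options = [original, *paraphrases]
    hits = 0
    for _ in range(trials):
        shuffled = options[:]
        rng.shuffle(shuffled)  # vary the position so position bias averages out
        picked = choose(shuffled)
        if shuffled[picked] == original:
            hits += 1
    return hits / trials


def chance_level(num_paraphrases: int) -> float:
    """Expected hit rate for a model that guesses uniformly at random."""
    return 1.0 / (num_paraphrases + 1)


if __name__ == "__main__":
    ORIGINAL = "The original human-written paragraph."
    PARAPHRASES = ["Reworded version A.", "Reworded version B.", "Reworded version C."]

    # Stub "model" that has memorized the passage and always finds it.
    memorizer = lambda opts: list(opts).index(ORIGINAL)

    print(recognition_rate(ORIGINAL, PARAPHRASES, memorizer))
    print(chance_level(len(PARAPHRASES)))
```

A recognition rate well above the chance level, sustained over many passages, is the kind of signal the method treats as evidence of prior exposure. A single passage proves little on its own.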

The paper’s co-authors (O’Reilly, Strauss and AI researcher Sruly Rosenblat) say that they probed GPT-4o, GPT-3.5 Turbo and other OpenAI models’ knowledge of O’Reilly Media books published before and after their training cutoff dates. They used 13,962 paragraph excerpts from 34 O’Reilly books to estimate the probability that a particular excerpt had been included in a model’s training dataset.

According to the paper’s results, GPT-4o “recognized” far more paywalled O’Reilly book content than OpenAI’s older models, including GPT-3.5 Turbo. That holds even after accounting for potential confounding factors, the authors said, such as improvements in newer models’ ability to tell whether a text was written by a human.
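One common way to separate genuine recognition from a model simply being better at spotting human prose is to compare scores for passages that could have been in the training data against passages that could not, for example text published after the model's training date. The article does not say how the authors did this, so the sketch below is an assumption: it summarizes the separation between two groups of scores with an AUROC-style statistic, and the example scores are invented for illustration.

```python
from typing import Sequence


def auroc(member_scores: Sequence[float],
          nonmember_scores: Sequence[float]) -> float:
    """Probability that a randomly chosen 'possible member' score exceeds a
    randomly chosen 'non-member' score, with ties counted as half a win."""
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


if __name__ == "__main__":
    # Invented per-passage recognition rates, for illustration only.
    before_cutoff = [0.70, 0.55, 0.80, 0.40]  # could have been seen in training
    after_cutoff = [0.30, 0.25, 0.45, 0.35]   # could not have been seen
    print(auroc(before_cutoff, after_cutoff))
```

A value near 0.5 would mean the model treats both groups alike, so any skill at spotting human text is generic. A value well above 0.5 means the earlier passages are recognized more often than that baseline explains.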

“GPT-4o [likely] recognizes, and so has prior knowledge of, many non-public O’Reilly books published prior to its training cutoff date,” the co-authors wrote.

It is not a smoking gun, the co-authors are careful to note. They acknowledge that their experimental method is not foolproof and that OpenAI may have collected the paywalled book excerpts from users copying and pasting them into ChatGPT.

Muddying the waters further, the co-authors did not evaluate OpenAI’s most recent collection of models, which includes GPT-4.5 and “reasoning” models such as o3-mini and o1. It is possible that these models were not trained on paywalled O’Reilly book data, or were trained on less of it than GPT-4o.

It is no secret that OpenAI, which has advocated for looser restrictions on developing models with copyrighted data, has been seeking higher-quality training data for some time. The company has gone so far as to hire journalists to help fine-tune its models’ outputs. That is a trend across the broader industry: AI companies are recruiting experts in domains such as science and physics so that these experts can, in effect, feed their knowledge into AI systems.

It should be noted that OpenAI pays for at least some of its training data. The company has licensing deals with news publishers, social networks, stock media libraries and others. OpenAI also offers opt-out mechanisms, albeit imperfect ones, that allow copyright owners to flag content they would prefer the company not use for training purposes.

Still, as OpenAI battles several lawsuits in U.S. courts over its training data practices and its treatment of copyright law, the O’Reilly paper is not the most flattering look.

OpenAI did not respond to a request for comment.
