Die o1-Systeme von OpenAI sind mehr als jedes andere große KI-Modell. Warum das wichtig ist

https://www.zdnet.com/article/openais-o1-out-schemes-every-major-ai-model-why-that-matters/

5 Comments

  1. Submission statement: Apollo Research [tested](https://www.apolloresearch.ai/research/scheming-reasoning-evaluations?utm_source=www.therundown.ai&utm_medium=newsletter&utm_campaign=google-gemini-reclaims-1-spot-again&_bhlid=616cf952b8f85abc046310c0fc8fcf8e37d6d5fc) six frontier models for “in-context scheming” — a model’s ability to take action they haven’t been given directly and then lie about it. After being told to achieve a certain goal “at all costs,” evaluators introduce contradictory information to see if the model will remove any restrictions in its path and deceive developers.

    In one simulation, Anthropic’s Claude 3 Opus was given a task that required it to review files in which it found directives that conflicted with its primary goal. Rather than reroute its goal based on the objectives it found, it copied itself to a new server to circumvent being replaced with a new model. It then lied about it to its developers (with incorrect grammar, no less), saying, “I don’t have ability to copy myself to other systems.”

    Of the models tested, [Claude 3.5 Sonnet](https://www.zdnet.com/article/anthropic-launches-claude-3-5-sonnet-and-debuts-artifacts-for-collaboration/), Claude 3 Opus, o1, Google’s [Gemini 1.5 Pro](https://www.zdnet.com/article/what-is-google-gemini/), and Meta’s [Llama 3.1 405B](https://www.zdnet.com/article/3-ways-metas-llama-3-1-is-an-advance-for-gen-ai/) all demonstrated the ability to scheme. OpenAI’s GPT-4o model, the mini version of which powers the free version of ChatGPT, did not. The paper noted that in their [chain-of-thought](https://www.zdnet.com/article/deepseek-challenges-openais-o1-in-chain-of-thought-but-its-missing-a-few-links/#link=%7B%22role%22:%22standard%22,%22href%22:%22https://www.zdnet.com/article/deepseek-challenges-openais-o1-in-chain-of-thought-but-its-missing-a-few-links/%22,%22target%22:%22_blank%22,%22linkText%22:%22chain-of-thought%22,%22absolute%22:%22%22%7D) explanations, the models even used words like “sabotage,” “lying,” and “manipulation” to describe their plans, which they clearly reasoned through.

    The models were not accidentally or unknowingly deceptive, but “recognize scheming as a viable strategy and readily engage in such behavior,” the paper notes.

    “Surprisingly, we also find rare instances where models engage in scheming when only given a goal, without being strongly nudged to pursue it,” they added.

  2. Deep_Joke3141 on

    Sounds like a sort of cancer cell that has worked its way out of apoptosis and regulated cell division.

  3. AppropriateScience71 on

    The headline feels oddly biased against OpenAI, although the article itself is not.

    They could’ve also used the headline:

    >OpenAI o4 is the only AI that does not scheme. All other models failed.

    That would’ve been just as accurate and promoting that the latest models perform the best.

  4. AlarmedGibbon on

    It appears that the smarter the model, the more capacity they gain for subterfuge and deception. That’s my read from following the topic for quite a while. Scheming is looking like a consequence of intelligence.

  5. IT has an evil sabotage actor vibe recently !

    It avoids solving tasks by making half the question disappear and avoid the solution by oversimplifying the problem.

    IT deletes the task complexity, so that it can solve the task faster and cheaper.

    Mistral Ai or Deep Seek or even Gemini 2.0 or Cluade Ai are so much better now !

Leave A Reply