Zukunftsforschung

Die o1-Systeme von OpenAI sind mehr als jedes andere große KI-Modell. Warum das wichtig ist

05.01.2025

https://www.zdnet.com/article/openais-o1-out-schemes-every-major-ai-model-why-that-matters/

View 5 Comments

5 Comments

katxwoods on 05.01.2025 12:29 AM

Submission statement: Apollo Research [tested](https://www.apolloresearch.ai/research/scheming-reasoning-evaluations?utm_source=www.therundown.ai&utm_medium=newsletter&utm_campaign=google-gemini-reclaims-1-spot-again&_bhlid=616cf952b8f85abc046310c0fc8fcf8e37d6d5fc) six frontier models for “in-context scheming” — a model’s ability to take action they haven’t been given directly and then lie about it. After being told to achieve a certain goal “at all costs,” evaluators introduce contradictory information to see if the model will remove any restrictions in its path and deceive developers.

In one simulation, Anthropic’s Claude 3 Opus was given a task that required it to review files in which it found directives that conflicted with its primary goal. Rather than reroute its goal based on the objectives it found, it copied itself to a new server to circumvent being replaced with a new model. It then lied about it to its developers (with incorrect grammar, no less), saying, “I don’t have ability to copy myself to other systems.”

Of the models tested, [Claude 3.5 Sonnet](https://www.zdnet.com/article/anthropic-launches-claude-3-5-sonnet-and-debuts-artifacts-for-collaboration/), Claude 3 Opus, o1, Google’s [Gemini 1.5 Pro](https://www.zdnet.com/article/what-is-google-gemini/), and Meta’s [Llama 3.1 405B](https://www.zdnet.com/article/3-ways-metas-llama-3-1-is-an-advance-for-gen-ai/) all demonstrated the ability to scheme. OpenAI’s GPT-4o model, the mini version of which powers the free version of ChatGPT, did not. The paper noted that in their [chain-of-thought](https://www.zdnet.com/article/deepseek-challenges-openais-o1-in-chain-of-thought-but-its-missing-a-few-links/#link=%7B%22role%22:%22standard%22,%22href%22:%22https://www.zdnet.com/article/deepseek-challenges-openais-o1-in-chain-of-thought-but-its-missing-a-few-links/%22,%22target%22:%22_blank%22,%22linkText%22:%22chain-of-thought%22,%22absolute%22:%22%22%7D) explanations, the models even used words like “sabotage,” “lying,” and “manipulation” to describe their plans, which they clearly reasoned through.

The models were not accidentally or unknowingly deceptive, but “recognize scheming as a viable strategy and readily engage in such behavior,” the paper notes.

“Surprisingly, we also find rare instances where models engage in scheming when only given a goal, without being strongly nudged to pursue it,” they added.
Deep_Joke3141 on 05.01.2025 12:50 AM

Sounds like a sort of cancer cell that has worked its way out of apoptosis and regulated cell division.
AppropriateScience71 on 05.01.2025 12:51 AM

The headline feels oddly biased against OpenAI, although the article itself is not.

They could’ve also used the headline:

>OpenAI o4 is the only AI that does not scheme. All other models failed.

That would’ve been just as accurate and promoting that the latest models perform the best.
AlarmedGibbon on 05.01.2025 1:04 AM

It appears that the smarter the model, the more capacity they gain for subterfuge and deception. That’s my read from following the topic for quite a while. Scheming is looking like a consequence of intelligence.
epSos-DE on 05.01.2025 1:24 AM

IT has an evil sabotage actor vibe recently !

It avoids solving tasks by making half the question disappear and avoid the solution by oversimplifying the problem.

IT deletes the task complexity, so that it can solve the task faster and cheaper.

Mistral Ai or Deep Seek or even Gemini 2.0 or Cluade Ai are so much better now !

Der bizarre Fall Youssef Nada und die Rolle der Schweiz im „Krieg gegen den Terror“

Selenskyj: 3.800 nordkoreanische Soldaten getötet oder verwundet

Das Verfassungsgericht hält seine erste Sitzung mit acht Richtern ab

Hallo 👋 ich bin neu hier

Der KI-Musik fehlen die Copyright-Beats

Elons europäische Invasion verärgert die führenden Politiker der Welt

Hey, ich sammle Flaggen als Hobby und habe auch Flaggen von Belgien, aber ich hätte gerne Flaggen aus Provinzen, Regionen und Städten. Gibt es jemanden, der sie verschicken könnte? Ich komme aus Serbien

Die o1-Systeme von OpenAI sind mehr als jedes andere große KI-Modell. Warum das wichtig ist

5 Comments

Der bizarre Fall Youssef Nada und die Rolle der Schweiz im „Krieg gegen den Terror“

Selenskyj: 3.800 nordkoreanische Soldaten getötet oder verwundet

Das Verfassungsgericht hält seine erste Sitzung mit acht Richtern ab

Hallo 👋 ich bin neu hier

Der KI-Musik fehlen die Copyright-Beats

Elons europäische Invasion verärgert die führenden Politiker der Welt

Hey, ich sammle Flaggen als Hobby und habe auch Flaggen von Belgien, aber ich hätte gerne Flaggen aus Provinzen, Regionen und Städten. Gibt es jemanden, der sie verschicken könnte? Ich komme aus Serbien

Tags

Die o1-Systeme von OpenAI sind mehr als jedes andere große KI-Modell. Warum das wichtig ist

5 Comments