9 Comments
They never could reason and the only people who believed this were laymen unfamiliar with how GPTs actually work.
At their core, they are very fancy prediction and probability engines. That's it. They either predict the next word in a sentence or the next pixel in an image. Most of the time they are right; sometimes they are laughably wrong. Even calling them AI is a huge stretch.
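To illustrate the "next-word prediction" idea the comment describes, here is a minimal sketch of the simplest possible predictive model: a bigram counter that picks the most frequent follower of a word. Real LLMs use neural networks over token contexts, not raw counts, and the corpus and function names here are purely illustrative assumptions.

```python
from collections import Counter, defaultdict

# Toy "language model": count which word follows which in a tiny corpus,
# then predict the next word as the most frequently observed successor.
corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def predict_next(word):
    # Return the highest-count successor seen in training, or None
    # if the word was never observed as a predecessor.
    counts = followers[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

The point of the toy: the model has no notion of meaning or reasoning, only observed frequencies, which is the (greatly simplified) intuition behind calling LLMs prediction engines.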
[deleted]
I don’t disagree with the premise of the article, but when you’re testing an LLM “with a given math question” you’re unlikely to get good results.
Uh. Duh? No shit. New to LLMs?
Article makes no mention of GPT-4o1. I wonder if the study included the latest preview model from OpenAI which aims to solve this.
So apple intelligence is bullshit? Got it.
This is a significant problem, because as someone who works in tech support, I can say the vast majority of humans do not have the ability to distill what they want, or what problem they are having, into a concise question with only the relevant info.
It’s usually either “my phone isn’t working” or it’s a story so meandering that even Luis from *Ant-Man* would be saying “Get to the point!!!”
This will be a more important thing for AI researchers to figure out.
Hence why LLMs are called *predictive* models, and not *reasoning* models
There was already a paper on this called “ChatGPT is Bullshit”: https://link.springer.com/article/10.1007/s10676-024-09775-5