Whether you bake the behaviour in or wrap it in an external loop, you need to train/tune for the expected behaviour. Generic models can do chain of thought if asked to, but they will be worse at it than a specialised one.
Beam search just traverses different paths and assigns each path a probability of being correct. The paths with the higher probabilities are kept and the ones with lower probabilities are pruned until the search terminates with an "answer". The marketing department calls it "reasoning" and "test-time compute" because the average consumer does not care whether it's beam search or something else.
Your link seems to do a good job of explaining beam search but it's a classic algorithm in state space exploration so most books on search algorithms and discrete optimization will have a section about it.¹
An algorithm that searches for the highest probability answer is not "reasoning"? "Search" has been a fundamental building block of GOFAI since the beginning. How do you define reasoning? Can you justify it being different from the last 70 years of thought on the topic?
Since you asked, I define reasoning as Cambridge does:
Reasoning: "the process of thinking about something in order to make a decision"

Thinking: "the activity of using your mind to consider something"

Mind: "the part of a person that makes it possible for him or her to think, feel emotions, and understand things"
I conclude that "An algorithm that searches for the highest probability answer" is not described by "reasoning"
I also think that Cambridge's definition of Mind is incomplete: it lacks the creativity part, along with cognition and emotions. But that's a vastly different topic.
>I conclude that "An algorithm that searches for the highest probability answer" is not described by "reasoning"
I recall reading about a theory in neuroscience where many thoughts (neural circuits) fire simultaneously and then one "wins" while the others get suppressed. The closest thing I can find right now is Global Workspace Theory.
The Chinese Room is a thought experiment: a room that supposedly contains a "Chinese speaker", but when given a text to 'understand', the occupant actually just looks the text up in a huge rulebook until a matching response is found, then outputs that response as the reply
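The lookup-table picture can be sketched in a few lines (the rulebook entries and the fallback reply are made up for illustration; Searle's actual thought experiment uses symbol-manipulation rules, not a literal dictionary):

```python
# Toy "Chinese Room": the operator understands nothing; every reply is
# produced purely by looking the input up in a fixed rulebook.
RULEBOOK = {
    "你好": "你好！",            # hypothetical entries the operator cannot read
    "你会说中文吗？": "会。",
}

def room_reply(text: str) -> str:
    """Return the scripted response, or a stock fallback if nothing matches."""
    return RULEBOOK.get(text, "请再说一遍。")
```

The point of the experiment is that such a system can pass for a speaker from the outside while nothing inside it understands anything.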
I looked into it, this "beam search" is nothing but a bit of puffed-up nomenclature, not unlike the shock and awe of a language such as Java introducing synonyms for common terms for no apparent reason, or the intimidating name of "Bonferroni multiple test correction", which is just dividing the significance threshold by n.
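For what it's worth, the correction really is that small: divide the significance level α by the number of tests. A minimal sketch (the p-values here are invented for illustration):

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject the null hypothesis for each test whose p-value
    falls below the Bonferroni-corrected threshold alpha / n."""
    n = len(p_values)
    threshold = alpha / n
    return [p < threshold for p in p_values]

# Three tests at the usual 0.05 level -> per-test threshold 0.05/3 ≈ 0.0167
```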
"Beam search" is breadth-first search. Instead of taking all the child nodes at a layer, it takes the top <n> according to some heuristic. But "top n" wasn't enough for whoever cooked up that trivial algorithm, so instead it's "beam width". It probably has more complexities in AI where that particular heuristic becomes more mathematical and complex, as heuristics tend to do.
The known upper bound on what transformers can compute on the fly is a complexity class called DLOGTIME-uniform TC^0.
There is a lot to unpack there, but if you take FO (first-order logic) as being closed under conjunction (∧), negation (¬) and universal quantification (∀), you will find that DLOGTIME-uniform TC^0 is equal to FO extended with majority quantifiers (FOM).
So be careful about that distinction.
To help break the above down:
DLOGTIME = decidable by a random-access Turing machine in deterministic logarithmic time
uniform = a single algorithm describes the circuit for every input size, rather than allowing an arbitrary circuit family
TC^0 = constant-depth, polynomial-size threshold circuits
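To make "threshold circuit" concrete: each gate fires when at least some number of its Boolean inputs are 1, and MAJORITY is the canonical example. A toy simulation of a single gate (not the formal circuit model, just the gate behaviour):

```python
def threshold_gate(inputs, k):
    """Fire (return 1) iff at least k of the Boolean inputs are 1."""
    return int(sum(inputs) >= k)

def majority(inputs):
    """MAJORITY: a threshold gate whose threshold is just over half the fan-in."""
    return threshold_gate(inputs, len(inputs) // 2 + 1)
```

AND and OR are the special cases k = n and k = 1, which is why threshold circuits subsume AC^0; the unbounded fan-in majority gate is what lifts them to TC^0.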
Even NP == SO-E, the second-order queries where the second-order quantifiers are only existentials.
DLOGTIME-uniform TC^0 is a WAY smaller class than most people realize, but anything that is an algorithm or a program basically is logic, with P being FO plus a least-fixed-point operator (and NL being FO plus transitive closure), or half a dozen other known mappings.
Transformers can figure out syntax, but if you dig into that DLOGTIME part, you will see that semantic correctness isn't really an option... thus the need to leverage the pattern matching and pattern finding of pre-training as much as possible.
Thanks. If I'm reading this right, the limiting factor on the intelligence of current LLMs is not the network size, nor training data (size/quality) but rather the architecture? Do we know of a better one for complex computations / "reasoning"?
Isn't it an LLM with an algo wrapper?