This is a linkpost for https://arxiv.org/abs/2411.15862

Author: Yijiong Yu.

Abstract:

It is well known that Chain-of-Thought (CoT) can remarkably enhance LLMs’ performance on complex tasks. However, because it also introduces slower inference speeds and higher computational costs, many studies have attempted to use implicit CoT, which does not require LLMs to explicitly generate the intermediate steps. But there is still a gap between their efficacy and typical explicit CoT methods. This raises the question: is implicit CoT really equivalent to explicit CoT? Therefore, in this study, we address this question through experiments. We probe the information of intermediate steps from the model’s hidden states when it is performing implicit CoT. The results surprisingly indicate that LLMs hardly think about intermediate steps, suggesting they may just rely on experience rather than strict step-by-step reasoning. Moreover, we find LLMs’ implicit reasoning capabilities are susceptible and unstable, reaffirming the necessity of explicit CoT to effectively support complex tasks.

They probe for representations of intermediate steps in simple multi-step arithmetic problems and are unable to recover such information robustly, e.g. for the 3rd step in 5-step problems. They also show that using explicit CoT is much more robust to prompt variations.
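A minimal sketch of this kind of probing setup, not the paper's exact method: the model name, layer choice, and binary probe target below are illustrative assumptions. The idea is to ask the model to answer a multi-step arithmetic problem without writing out the steps, then train a simple linear probe on a hidden state to see whether an intermediate result is recoverable.

```python
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"   # assumption: any causal LM with accessible hidden states
LAYER = 6             # assumption: probe a middle layer

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model.eval()

def make_problem():
    """Two-step arithmetic a+b+c; the probe target is the first step's result mod 2."""
    a, b, c = (random.randint(1, 9) for _ in range(3))
    prompt = f"{a}+{b}+{c}="        # answered without explicit CoT
    intermediate = a + b            # the step the model is never asked to write out
    return prompt, intermediate % 2  # binary label keeps the probe simple

def last_token_state(prompt):
    """Hidden state of the final prompt token at the chosen layer."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1].numpy()

data = [make_problem() for _ in range(400)]
X = [last_token_state(p) for p, _ in data]
y = [label for _, label in data]

probe = LogisticRegression(max_iter=1000).fit(X[:300], y[:300])
print("probe accuracy on held-out problems:", probe.score(X[300:], y[300:]))
# Low accuracy here would be consistent with the paper's claim that intermediate
# steps are not robustly represented when the model answers without explicit CoT.
```

If the probe performs near chance while the model still answers correctly, that pattern matches the paper's interpretation that the model is pattern-matching rather than internally computing each step.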

Relevant to opaque reasoning and out-of-context reasoning (OOCR) in Transformer architectures.
