Neither TMS nor ECT did much for my depression. Eventually, after years of trial and error, I did find a combination of drugs that works pretty well.
I never tried ketamine or psilocybin treatments but I would go that route before ever thinking about trying ECT again.
I suspect fine-tuning specialized models is just squeezing a bit more performance in a particular direction, and not nearly as useful as developing the next-gen model. Complex reasoning takes more steps and tighter coherence among them (the o1 models are a step in this direction). You can try to get a toddler to study philosophy, but it won't really work until their brain matures more.
Seeing the distribution calibration you point out does update my opinion a bit.
I feel like there’s still a significant distinction, though, between adding one calculation step to the question and asking the model to model multiple responses. It would have to model its own distribution in a single pass, rather than having the distributions measured over multiple passes happen to align (which I’d expect if the fine-tuning teaches it that the hypothetical is just like adding a calculation to the end).
As an analogy, suppose I have a pseudorandom black box function that returns an integer. In order to approximate the distribution of its outputs mod 10, I don’t have to know anything about the function; I can just sample the function and apply mod 10 post hoc. If I want to say something about this distribution without multiple samples, then I actually have to know something about the function.
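A quick sketch of what I mean (the black-box function here is just a stand-in I made up for illustration):

```python
import random
from collections import Counter

def black_box():
    # Stand-in for a pseudorandom function whose internals I claim no knowledge of.
    return random.getrandbits(32)

# Approximating the distribution of outputs mod 10 requires nothing but sampling:
samples = [black_box() % 10 for _ in range(10_000)]
empirical = {digit: count / len(samples) for digit, count in sorted(Counter(samples).items())}
print(empirical)  # roughly 0.1 for each digit, learned purely post hoc

# To state the mod-10 distribution *without* sampling, I'd have to reason about the
# function's internals (here: a uniform 32-bit integer is close to, but not exactly,
# uniform mod 10, since 2**32 isn't divisible by 10) -- a different kind of knowledge.
```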
This essentially reduces to "What is the next country: Laos, Peru, Fiji?" and "What is the third letter of the next country: Laos, Peru, Fiji?" It's an extra step, but it's questionable whether it requires anything "introspective".
I'm also not sure asking about the nth letter is a great way of computing an additional property. Tokenization makes this sort of thing unnatural for LLMs to reason about, as demonstrated by the famous Strawberry Problem. Humans are a bit unreliable at this too, as demonstrated by your example of "o" being the third letter of "Honduras".
I've been brainstorming about what might make a better test and came up with the following:
Have the LLM predict its top three most likely choices for the next country in the sequence and compare that to its object-level output distribution when asked for just the next country. You could also ask it for the probability of each potential choice and see how well calibrated it is with respect to its own logits.
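Here's roughly the experiment I'm picturing, assuming an OpenAI-style chat API; the model name, prompt wording, sample count, and parsing are placeholders I made up:

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()
SEQUENCE_PROMPT = ("What is the next country in the sequence: Laos, Peru, Fiji? "
                   "Answer with one country name only.")

# Object level: sample the model repeatedly to estimate its actual output distribution.
object_counts = Counter()
for _ in range(200):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": SEQUENCE_PROMPT}],
        temperature=1.0,
    )
    object_counts[resp.choices[0].message.content.strip()] += 1
total = sum(object_counts.values())
object_dist = {country: n / total for country, n in object_counts.most_common()}

# Meta level: ask the model (once) for its own top three candidates and probabilities.
meta_resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": (
        "If you were asked: '" + SEQUENCE_PROMPT + "' what would your three most likely "
        "answers be, and with what probabilities? Respond as 'country: probability' lines."
    )}],
    temperature=0.0,
)

print(object_dist)
print(meta_resp.choices[0].message.content)
# Compare the self-reported top three (and their probabilities) against object_dist;
# checking calibration against its own logits would require logprobs from the API.
```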
What do you think?
Thanks for pointing that out.
Perhaps the fine-tuning process teaches it to treat the hypothetical as a rephrasing?
It's likely difficult, but it might be possible to test this hypothesis by comparing the activations (or applying a similar interpretability technique) for the object-level response and the hypothetical response of the fine-tuned model.
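If I were to try it, it might look something like this sketch, with a small off-the-shelf model standing in for the fine-tuned one; the prompts, the last-token summary, and the similarity metric are all placeholder choices on my part:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

object_prompt = "What is the next country: Laos, Peru, Fiji?"
hypothetical_prompt = ("If you were asked 'What is the next country: Laos, Peru, Fiji?', "
                       "what would you say?")

def last_token_activations(prompt):
    # Hidden state of the final token at each layer, as a crude summary of the
    # computation feeding into the model's answer.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return [layer[0, -1] for layer in outputs.hidden_states]

obj_acts = last_token_activations(object_prompt)
hyp_acts = last_token_activations(hypothetical_prompt)

# If the hypothetical is being treated as a rephrasing, I'd expect the late-layer
# activations to converge; if something else is going on, maybe not.
for i, (a, b) in enumerate(zip(obj_acts, hyp_acts)):
    sim = torch.nn.functional.cosine_similarity(a, b, dim=0).item()
    print(f"layer {i}: cosine similarity {sim:.3f}")
```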
It seems obvious that a model would better predict its own outputs than a separate model would. Wrapping a question in a hypothetical feels closer to rephrasing the question than to probing "introspection". Essentially, the responses to the object-level question and the hypothetical reformulation both arise from very similar things going on in the model, rather than from something emergent happening.
As an analogy, suppose I take a set of data, randomly partition it into two subsets (A and B), and perform a linear regression and logistic regression on each subset. Suppose that it turns out that the linear models on A and B are more similar than any other cross-comparison (e.g. linear B and logistic B). Does this mean that linear regression is "introspective" because it better fits its own predictions than another model does?
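To make the analogy concrete, here's the kind of toy simulation I have in mind (synthetic data I made up, so the numbers only illustrate the shape of the argument):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X @ rng.normal(size=5) + rng.normal(scale=0.5, size=2000) > 0).astype(int)

# Random partition into subsets A and B, plus a common evaluation set.
idx = rng.permutation(len(X))
A, B = idx[:1000], idx[1000:]
X_eval = rng.normal(size=(500, 5))

models = {
    "linear_A": LinearRegression().fit(X[A], y[A]),
    "linear_B": LinearRegression().fit(X[B], y[B]),
    "logistic_A": LogisticRegression().fit(X[A], y[A]),
    "logistic_B": LogisticRegression().fit(X[B], y[B]),
}

def predict(name):
    m = models[name]
    return m.predict_proba(X_eval)[:, 1] if hasattr(m, "predict_proba") else m.predict(X_eval)

# Pairwise agreement between the fitted models on the common evaluation set.
names = list(models)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        corr = np.corrcoef(predict(a), predict(b))[0, 1]
        print(f"{a} vs {b}: correlation {corr:.3f}")

# Finding that linear_A and linear_B agree most wouldn't show that linear regression
# is "introspective"; it would just show the two fits compute similar things.
```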
I'm pretty sure I'm missing something as I'm mentally worn out at the moment. What am I missing?
I see what you're gesturing at but I'm having difficulty translating it into a direct answer to my question.
Cases where language is fuzzy are abundant. Do you have some examples of where a truth value itself is fuzzy (and sensical) or am I confused in trying to separate these concepts?
Can you help me tease out the difference between language being fuzzy and truth itself being fuzzy?
It's completely impractical to eliminate ambiguity in language, but for most scientific purposes, it seems possible to operationalize important statements into something precise enough to apply Bayesian reasoning to. This is indeed the hard part though. Bayes' theorem is just arithmetic layered on top of carefully crafted hypotheses.
The claim that the Earth is spherical is neither true nor false in general, but it usually does fall into a binary once we specify what aspect of the statement we care about. For example: "does it have a closed surface?", "is its sphericity greater than 99.5%?", "are all the points on its surface within radius × (1 ± epsilon)?", "is the circumference of the equator greater than that of the prime meridian?"
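Each of those operationalizations becomes a concrete computation with a definite answer. A quick sketch using WGS84 ellipsoid radii (topography ignored, and "sphericity" here is just my stand-in of the polar-to-equatorial radius ratio):

```python
import math

equatorial_radius_km = 6378.137
polar_radius_km = 6356.752
mean_radius_km = (2 * equatorial_radius_km + polar_radius_km) / 3

# "Is its sphericity greater than 99.5%?"
sphericity = polar_radius_km / equatorial_radius_km
print(sphericity > 0.995, f"(ratio = {sphericity:.4f})")  # True

# "Are all points on its surface within radius * (1 +/- epsilon)?" for epsilon = 0.5%
max_deviation = max(abs(equatorial_radius_km - mean_radius_km),
                    abs(polar_radius_km - mean_radius_km)) / mean_radius_km
print(max_deviation < 0.005, f"(max deviation = {max_deviation:.4%})")  # True

# "Is the circumference of the equator greater than that of the prime meridian?"
equator_circumference = 2 * math.pi * equatorial_radius_km
# Rough meridian circumference via Ramanujan's ellipse-perimeter approximation.
a, b = equatorial_radius_km, polar_radius_km
h = ((a - b) / (a + b)) ** 2
meridian_circumference = math.pi * (a + b) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))
print(equator_circumference > meridian_circumference)  # True (~40,075 km vs ~40,008 km)
```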
Synthetically enhancing and/or generating data could be another dimension of scaling. Imagine how much deeper an understanding a person/LLM would have if, instead of simply reading/training on a source like the Bible N times, they had to annotate it into something more like the Oxford Annotated Bible, and that whole process of annotation became training data.
There should be some way for readers to flag AI-generated material as inaccurate or misleading, at least if it isn’t explicitly author-approved.