Okay, you raise a very good point. To analogize to my own brain: it's like noticing that I can multiply integers 1-20 in my head in one step, but for larger numbers I need to write it out. Does that mean that my neural net can do multiplication? Well, as you say, it depends on n.
It's easy to imagine a huge LLM capable of doing 500 iterations of SHA-1 on small strings in one shot.
Nitpick: for SHA-1 (or any other cryptographic hash function), I can't fathom how an LLM could learn it through SGD, as opposed to having it hand-coded. To do SHA-1 correctly you need to impl...
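For concreteness, "iterating SHA-1" here just means feeding each digest back in as the input to the next round. A minimal Python sketch (the round count and input string are arbitrary examples):

```python
import hashlib

def iterated_sha1(data: bytes, rounds: int = 500) -> bytes:
    # Feed each 20-byte digest back in as the input to the next round.
    for _ in range(rounds):
        data = hashlib.sha1(data).digest()
    return data

print(iterated_sha1(b"hello").hex())
```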
I think your definition of LLM is the common one. For example, https://www.lesswrong.com/posts/KJRBb43nDxk6mwLcR/ai-doom-from-an-llm-plateau-ist-perspective is on the front page right now, and it uses LLM to refer to a big neural net, in a transformer topology, trained with a lot of data. That's how I was intending to use it as well. Note the difference between "language model" as Christopher King used it and "large language model" as I'm using it. I plan to keep using LLM for now, especially as GPT refers to OpenAI's product rather than the general class of things.
Thank you for trying so hard to replicate this little experiment of mine.
Perhaps you sent the prompt in the middle of a conversation rather than at the beginning? If the same list was also sent earlier in the conversation, I can imagine it managed to get the answer right because it had more time to 'take in' the numbers, or otherwise establish a context that guided it to the right answer.
Yes, this is exactly what I did. I made sure to use a new list of numbers for each question – I'd noticed that it would remember previous answers if I didn't – but I didn'...
Oh, this is funny. It told me that it ran the code and got the answer [64, 91, 39, 47]. I checked that those numbers satisfied the problem. But I didn't check (until reviewing other comments) whether that's actually what the code outputs. It's not. Technically, the code doesn't output anything at all; it saves the result to a variable instead. And when I print that variable, it turns out the code found [64, 6, 96, 75].
Lesson 1: I was not careful enough in checking its output, even when I thought I was being careful.
Lesson 2: It is indeed not running code, even if it tells me it is.
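To illustrate the pitfall with a minimal sketch (the constraint function below is a hypothetical stand-in, not the actual problem I posed):

```python
from itertools import combinations

def satisfies(nums):
    # Hypothetical stand-in for the actual constraint I asked about.
    return sum(nums) % 7 == 0

# The code it claimed to have run did the equivalent of this: the result
# is bound to a variable, so the script produces no output at all.
result = next(c for c in combinations(range(1, 100), 4) if satisfies(c))

# Only by adding a print does the answer the code actually finds appear,
# and there's no guarantee it matches the answer the model reported.
print(result)
```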
Thanks, this is exactly the kind of feedback I was hoping for.
Nomenclature-wise: I was using LLM to mean "deep neural nets in the style of GPT-3", but I should be more precise. Do you know of a good term for what I meant?
More generally, I should learn about other styles of LLM. I've gotten some good leads from these comments and some DMs.
This was actually my position when I started writing this post. My instincts told me that "thinking out loud" was a big enhancement to its capabilities. But then I started thinking about what I saw. I watched it spend tens of trillions of FLOPs to write out, in English, how to do a 3x3 matrix multiplication. It was so colossally inefficient, like building a humanoid robot and teaching it to use an abacus.
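Some back-of-the-envelope numbers behind that claim (the parameter count and token count are rough assumptions: a GPT-3-scale model writing out on the order of a hundred tokens of working):

```python
# All figures are rough assumptions for a GPT-3-scale model.
params = 175e9                 # parameter count
flops_per_token = 2 * params   # ~2 FLOPs per parameter per generated token
tokens = 100                   # rough length of the written-out working

llm_flops = flops_per_token * tokens  # ~3.5e13: tens of trillions
matmul_flops = 2 * 3**3               # 3x3 matmul: 27 multiplies + 27 adds

print(f"~{llm_flops:.1e} FLOPs to narrate a {matmul_flops}-FLOP computation")
print(f"overhead: roughly {llm_flops / matmul_flops:.0e}x")
```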
Then again, your analogy to humans is valid. We do a huge amount of processing internally, an...