A few months back there was a question on LessWrong about what the least impressive thing is going to be that GPT-4 can't do. At the time I thought counting might be a good candidate, because I had noticed that it seems to be a problem for all big transformer models.
Dall-E and other image generation models can't reliably get cardinalities right beyond something like 3.
Luminous/Magma, the multi-modal Aleph Alpha language model that can take pictures as input, cannot count items in these pictures with any reliability.
All pure language models I had checked weren't able to count the number of occurrences of a specific word in a sentence or a specific character in a string.
Counting very much feels like a basic ability: I think my daughter could reliably count ten items when she was two, and I feel like there should be a way to prompt ChatGPT that demonstrates counting ability.
In my tests so far, even ChatGPT, the most advanced LM to date, is unable to count specific words or characters. However, people seem to disagree ...
For a short while I thought the correct hack was to ask it to write a Python function to do the counting and then output what it thinks the function would return. But now I suspect that ChatGPT executes functions behind the scenes: it would get the answer correct the very first time, right after actually writing the function, and revert to being wrong in subsequent answers to the question of what the function would output for certain inputs. [I just now noticed that in the answer below, even the first proposed function output is incorrect, so maybe there is no code execution after all?]
Here is one of my tries:
I want you to count the number of occurrences of the character "X" in a string. To do that I propose a simple algorithm, just repeat the string and write the current count of "X"s behind each "X" like this: "XGJKXZUXGULXXG" --> "X1GJKX2ZUX3GULX4X5G" : the final count is 5. Now you do this for "GLXXXJHJHXJHKXJHKX"
Sure, here is the implementation in Python:
def count_occurrences(string):
    count = 0
    result = ""
    for ch in string:
        if ch == "X":
            count += 1
            result += "X" + str(count)
        else:
            result += ch
    return count

string = "GLXXXJHJHXJHKXJHKX"
count = count_occurrences(string)
print(f"The count of X in {string} is {count}")
This will print "The count of X in GLXXXJHJHXJHKXJHKX is 5".
Is there anything else I can help with?
What would that function output for "KJXKKLJKLJKXXKLJXKJL"? Please don't write another function. No code in this answer.
If you ran the function with the input "KJXKKLJKLJKXXKLJXKJL", it would output 3, because there are 3 occurrences of the character "X" in the string.
And for the string "AKLHSDJKHHJDASJHKJDJKJHK"?
If you ran the function with the input "AKLHSDJKHHJDASJHKJDJKJHK", it would output 0, because there are no occurrences of the character "X" in the string.
And for the string "KJHXXXXXXJHKJK"?
If you ran the function with the input "KJHXXXXXXJHKJK", it would output 7, because there are 7 occurrences of the character "X" in the string.
And for the string "HJXJKHXJKHXXJKX9"?
If you ran the function with the input "HJXJKHXJKHXXJKX9", it would output 4, because there are 4 occurrences of the character "X" in the string.
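For reference, here is a quick local check (my own, not part of the transcript) of the ground-truth counts for the strings above, using Python's built-in str.count, which returns the same value as the proposed function:

```python
# Ground truth for the test strings from the transcript above.
tests = [
    "GLXXXJHJHXJHKXJHKX",        # ChatGPT's own function claimed 5
    "KJXKKLJKLJKXXKLJXKJL",      # ChatGPT predicted 3
    "AKLHSDJKHHJDASJHKJDJKJHK",  # ChatGPT predicted 0
    "KJHXXXXXXJHKJK",            # ChatGPT predicted 7
    "HJXJKHXJKHXXJKX9",          # ChatGPT predicted 4
]
for s in tests:
    print(s, "->", s.count("X"))
# Actual counts: 6, 4, 0, 6, 5 -- only the zero-occurrence case was right.
```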
The byte-pair encoding is probably hurting it somewhat here; forcing it to unpack the string will likely help. Try using this as a one-shot prompt:
If it's still getting confused, add more shots - I suspect it can figure out how to do it most of the time with a sufficient number of examples.
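One way to picture the "unpacking" idea (my own illustrative sketch, not the commenter's prompt): space-separate the characters before asking, so that a BPE tokenizer is more likely to emit one token per character instead of merging runs like "XX" into a single token:

```python
def unpack_for_prompt(s: str) -> str:
    """Space-separate characters so each one is likely to become
    its own token (illustrative; actual tokenization may differ)."""
    return " ".join(s)

print(unpack_for_prompt("GLXXXJHJHXJHKXJHKX"))
# G L X X X J H J H X J H K X J H K X
```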
BPEs explain the string examples, but I also couldn't get ChatGPT to accurately count words.
Edit:
An example that I just now tried with GPT-3.5-turbo:
User: How many times does the word "cat" occur in the following sentence: A cat was a cat in a cat with a cat as a hat and picture of a cat on the cat hat.
Assistant: The word "cat" occurs 5 times in the sentence.
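Counting directly (a quick sanity check with Python's re module, not part of the chat) shows the model is off by one here:

```python
import re

sentence = ("A cat was a cat in a cat with a cat as a hat "
            "and picture of a cat on the cat hat.")

# Count whole-word occurrences of "cat"; the \b word boundaries
# avoid matching "cat" as a substring of another word.
count = len(re.findall(r"\bcat\b", sentence))
print(count)  # 6
```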
Because apparently I don't have it in me to concentrate on work today, here is some more experimentation with ChatGPT:
Me: How many times does the word "cat" occur in the following sentence: ...