With the right prompt, I get the following results for the few examples I tried (first attempts, no cherry-picking):
Input: ( ) ( ( ) )
Output: Balanced
Input: ( ) ( ( )
Output: Unbalanced
Input: ) (
Output: Unbalanced
Input: ( ) ( ) ( )
Output: Balanced
So it can definitely learn to balance a small number of parentheses.
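For anyone who wants to check more examples than these four, the expected labels can be computed with a simple depth counter. This is only a minimal reference sketch for checking outputs, not what the model does internally; `is_balanced` and `label` are hypothetical helper names I made up for illustration.

```python
def is_balanced(s: str) -> bool:
    """Return True if every ')' closes an earlier '(' and nothing is left open."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before any matching '('
                return False
        # spaces and any other characters are ignored
    return depth == 0


def label(s: str) -> str:
    """Map the boolean result to the labels used in the examples above."""
    return "Balanced" if is_balanced(s) else "Unbalanced"


# The four inputs from the examples above.
examples = ["( ) ( ( ) )", "( ) ( ( )", ") (", "( ) ( ) ( )"]
for s in examples:
    print(f"Input: {s}")
    print(f"Output: {label(s)}")
```

Running this on the four inputs gives the same labels as the model outputs listed above, so longer or more deeply nested inputs could be scored the same way.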
I'm convinced that humans must spike their blood sugar and/or pump their bodies full of stimulants such as caffeine to get past the natural tendency to find it unbearably dull to memorize words and syntax by rote, with only a lifeless connection to the structures of their native language.
Just a comment: This is certainly not true for every human. Some people really enjoy that.
Thanks, that makes sense given your assumptions and results.