I ran a similar sentiment analysis experiment with GPT-2, and in testing I found only a couple of instances where it got a problem wrong. My code is here:
https://github.com/spronkoid/GPT2-sentiment-analysis
It seems to do better when you add tokens marking where each part of the problem lies.
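
To show what I mean by marker tokens, here is a rough sketch. The token strings (`[TEXT]`, `[SENTIMENT]`, `[END]`) and the helper names are made up for illustration and are not necessarily what the linked repo uses; the idea is just that each part of the problem gets an explicit delimiter.

```python
# Hypothetical marker tokens; the ones in the linked repo may differ.
TEXT_START = "[TEXT]"
LABEL_START = "[SENTIMENT]"
END = "[END]"


def format_example(text, label=None):
    """Wrap the input text (and optionally its label) in marker tokens."""
    prompt = f"{TEXT_START} {text.strip()} {LABEL_START}"
    if label is None:
        # Inference-time prompt: the model is expected to continue with a label.
        return prompt
    # Training example: the label and an end marker follow the prompt.
    return f"{prompt} {label} {END}"


def parse_label(generated, labels=("positive", "negative")):
    """Return the first known label that follows the sentiment marker, else None."""
    tail = generated.split(LABEL_START, 1)[-1]
    tail = tail.split(END, 1)[0].strip().lower()
    for candidate in labels:
        if tail.startswith(candidate):
            return candidate
    return None
```

You would fine-tune on strings from `format_example(text, label)`, prompt with `format_example(text)`, and run `parse_label` on whatever the model generates.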
I know this post is old, but I thought I'd comment just for the sake of commenting.