Personally, I only use the APIs on my computer. I have an Emacs setup based on gptel, with key bindings that send different parts of a buffer (the whole page, a region, or a single line) to different models.
I mostly use Claude, but sometimes it misbehaves and then I usually send the query to 4o instead. I keep Gemini in there too but struggle to ever use it. Likewise, I have Haiku in there, but that's mostly a leftover from the days of Opus, when I was sometimes happy enough with really quick responses compared to sluggish Opus.
It's also important to keep different system prompts on different key combinations, so that you can ask either for a quick answer containing just the command or code line you care about, or for a well-thought-out answer that will require some text editing to get rid of the explanation. Come to think of it, I might have to write some post-processors that keep only the code and throw out the CoT, which would sometimes work.
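A minimal sketch of such a post-processor, assuming the model wraps its code in Markdown-style triple-backtick fences (that is an assumption about the response format, and `extract_code` is a hypothetical name). It is written in Python for illustration only; in a gptel setup the same idea would have to live in Emacs Lisp.

```python
import re

FENCE = "`" * 3  # Markdown code fence, built this way to avoid a literal fence here

# Opening fence with an optional language tag, a lazily matched body,
# then a closing fence alone on its own line.
FENCE_RE = re.compile(
    rf"^{FENCE}[^\n]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_code(response: str) -> str:
    """Keep only the fenced code from a model response.

    If the response contains no fenced block, return the whole text
    (stripped) rather than nothing.
    """
    blocks = [m.group(1).rstrip("\n") for m in FENCE_RE.finditer(response)]
    if not blocks:
        return response.strip()
    return "\n\n".join(blocks)
```

This only "sometimes works" in exactly the sense above: if the model answers with inline code or no fences at all, the fallback returns the full text, explanation included.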
Emacs is always just one key combo away, whatever I'm doing, and it's always running as a daemon, so bringing it up is instantaneous. I can't think of a more comfortable setup. I never use the web interfaces; they're a horrible user experience in comparison.
The API is dirt cheap if you use it as I do (for single queries or short conversations). It only gets expensive once you really throw a lot of stuff into the input, since at that point the input tokens dominate the bill. For me, aider-style work on a big code context, where I expect the actual output to be ready to use, doesn't work well enough yet and is frustrating. I will wait for better scaffolding or 5-level models for that.
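To make the cost point concrete, here is a rough sketch of how input volume drives the bill. It assumes a stateless chat API where every request resends the fixed context plus the conversation so far, and no prompt caching. The function name, the token counts, and the prices are all made-up placeholders, not any provider's real rates.

```python
def conversation_cost(
    context_tokens: int,
    turns: int,
    question_tokens: int,
    answer_tokens: int,
    usd_per_m_input: float,
    usd_per_m_output: float,
) -> float:
    """Estimate the total cost in USD of a multi-turn conversation.

    Each turn resends the fixed context plus all earlier questions and
    answers, so the input side grows with every turn.
    """
    total_input = 0
    total_output = 0
    for turn in range(turns):
        history = turn * (question_tokens + answer_tokens)
        total_input += context_tokens + history + question_tokens
        total_output += answer_tokens
    return (
        total_input * usd_per_m_input + total_output * usd_per_m_output
    ) / 1_000_000


# Placeholder prices in USD per million tokens (assumptions, not real rates).
PRICE_IN, PRICE_OUT = 3.0, 15.0

# A single short query versus five turns over a large code context.
quick = conversation_cost(0, 1, 200, 300, PRICE_IN, PRICE_OUT)
big = conversation_cost(100_000, 5, 200, 300, PRICE_IN, PRICE_OUT)
print(f"quick query: ${quick:.4f}, big-context session: ${big:.4f}")
```

With numbers like these the short query costs a fraction of a cent, while the big-context session is paid for almost entirely on the input side, because the same context is billed again on every turn.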