slightly related https://arxiv.org/abs/2503.00735
Re: "Let's think step by step"
So let me get this straight: a simple prompt is able to elicit an entire style of thinking, which is able to solve harder problems, and ultimately ends up motivating new classes of foundation model? Is that what happened last year? Are there any other simple prompts like that? Did we check? Sorry, I'm trying to catch up.
I think an important part is building your own (company's) collection of examples to train against, since the foundation models are already trained against SWE-bench. And if it works, the advantage would show up on my CV in the worst case and in equity appreciation in the best case. So, just like any skill, right?
You're right that the whole thing only works if the business can generate returns to high quality code, and can write specifications faster than its complement of engineers can implement them. But I've been in that position sev...
The reasons you give btw don't give me much consolation. The code leaking thing is very temporary; if you could host cutting edge models on AWS or Azure it wouldn't be an issue for most companies. If you could self host them it wouldn't be an issue for almost /any/ companies. The errors thing is a crux. The basic solution to that, I think, is scaling: multishot the problem, rank the solutions, test in every way imaginable, and then for each solved problem optimize your prompts till they can one-shot, keeping a backlog of examples to...
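A minimal sketch of the "multishot, rank, test, then tune the prompt until it one-shots" loop, in Python. Everything here is an assumption for illustration: `fake_generate` is a hypothetical stand-in for a real model call, `passes_add` is a toy test, and the backlog is just a list. It is not any particular tool's API.

```python
from typing import Callable, List, Tuple

Generate = Callable[[str, int], str]   # (prompt, sample index) -> candidate source
Test = Callable[[str], bool]           # candidate source -> pass/fail


def rank_candidates(generate: Generate, prompt: str, tests: List[Test], n: int) -> List[Tuple[int, str]]:
    """Sample n candidates and rank them by how many tests each one passes."""
    scored = []
    for i in range(n):
        candidate = generate(prompt, i)
        score = sum(1 for t in tests if t(candidate))
        scored.append((score, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def one_shot_rate(generate: Generate, prompt: str, tests: List[Test], n: int) -> float:
    """Fraction of samples that pass every test: a proxy for 'can this prompt one-shot it?'."""
    ranked = rank_candidates(generate, prompt, tests, n)
    return sum(1 for score, _ in ranked if score == len(tests)) / n


def best_prompt(generate: Generate, prompts: List[str], tests: List[Test], n: int) -> str:
    """Pick the prompt variant with the highest one-shot rate."""
    return max(prompts, key=lambda p: one_shot_rate(generate, p, tests, n))


def fake_generate(prompt: str, seed: int) -> str:
    # Hypothetical stand-in for a model: a more specific prompt always yields
    # correct code, a vague one only does on some samples.
    if "return the sum" in prompt or seed % 3 == 0:
        return "def add(a, b):\n    return a + b"
    return "def add(a, b):\n    return a - b"


def passes_add(source: str) -> bool:
    # Toy test: run the candidate and check one input/output pair.
    namespace: dict = {}
    try:
        exec(source, namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


prompts = ["write add", "write add; it must return the sum of a and b"]
winner = best_prompt(fake_generate, prompts, [passes_add], n=6)

# Keep solved problems around as a backlog of (prompt, solution) examples.
backlog: List[Tuple[str, str]] = []
top_score, top_candidate = rank_candidates(fake_generate, winner, [passes_add], n=6)[0]
if top_score == 1:
    backlog.append((winner, top_candidate))
```

In a real setup the tests would be the expensive part (unit tests, property tests, linters, whatever you trust), and `exec` on model output would need sandboxing.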
mm.. I gave the wrong impression there; my actual boss doesn't have a huge opinion on AI; in fact he'll take some convincing.
I should state my assumptions:
I don't have time to write any of this down, so it's going to come out in the wrong order, but here
Where does prompt optimization fit into y’all’s workflows? I’m surprised not to see it mentioned here. E.g. OPRO: https://arxiv.org/pdf/2309.03409