People who’ve spent a lot of time thinking about P vs NP often have the intuition that “verification is easier than generation”. It’s easier to verify a solution to some equations than to find a solution. It’s easier to verify a password than to guess it. That sort of thing. The claim that it is easier to verify solutions to such problems than to generate them is essentially the claim that P ≠ NP, a conjecture which is widely believed to be true. Thus the intuition that verification is generally easier than generation.
The problem is, this intuition comes from thinking about problems which are in NP. NP is, roughly speaking, the class of algorithmic problems for which solutions are easy to verify. Verifying the solution to some equations is easy, so that problem is in NP.
I think a more accurate takeaway would be that among problems in NP, verification is easier than generation. In other words, among problems for which verification is easy, verification is easier than generation. Rather a less impressive claim, when you put it like that.
With that in mind, here is an algorithmic problem for which generation is easier than verification.
Predicate: given a program, does it halt?
Generation problem: generate a program which halts.
Verification problem: given a program, verify that it halts.
The generation problem is trivial. The verification problem is uncomputable.
That’s it for the post, you all can argue about the application to alignment in the comment section.
I don't think the generalization of the OP is quite "sometimes it's easier to create an object with property P than to decide whether a borderline instance satisfies property P". Rather, the halting example suggests that verification is likely to be harder than generation specifically when there is some (possibly implicit) adversary. What makes verification potentially hard is the part where we have to quantify over all possible inputs - the verifier must work for any input.
Borderline cases are an issue for that quantifier, but more generally any sort of adversarial pressure is a potential problem.
Under that perspective, the "just ask it to try again on borderline cases" strategy doesn't look so promising, because a potential adversary is potentially optimizing against me - i.e. looking for cases which will fool not only my verifier, but myself.
As for the everyday experiences you list: I agree that such experiences seem to be where peoples' intuitions on the matter often come from. Much like the case in the OP, I think people select for problems for which verification is easy - after all, those are the problems which are most legible, easiest to outsource (and therefore most likely to be economically profitable), etc. On the other hand, once we actively look for cases where adversarial pressure makes verification hard, or where there's a "for all" quantifier, it's easy to find such cases. For instance, riffing on your own examples: