Xodarap

Comments

Note that the REBench correlation has to be 0 by definition (strictly, it is undefined), because all of its tasks have the same length. SWAA is similarly range-restricted, though less severely.
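To make the range-restriction point concrete, here is a rough sketch on made-up data (the numbers, the `pearson` helper, and the variable names are all assumptions for illustration, not anything from the METR data). It compares the correlation over a wide range of task lengths, over a narrow slice, and over a constant length, where the coefficient has zero variance to work with and the sketch returns NaN.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns NaN when either variable has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation, so nothing to correlate with
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)

# Hypothetical data: log task length, and a noisy score that falls with length.
log_length = [rng.uniform(0, 10) for _ in range(2000)]
score = [-x + rng.gauss(0, 2) for x in log_length]

# Full range of task lengths.
full = pearson(log_length, score)

# Range-restricted: keep only tasks in a narrow band of lengths.
narrow = [(x, y) for x, y in zip(log_length, score) if 4.5 <= x <= 5.5]
restricted = pearson(*zip(*narrow))

# Degenerate case: every task has the same length.
constant = pearson([5.0] * len(score), score)

print(f"full range: {full:.2f}")
print(f"restricted: {restricted:.2f}")
print(f"constant:   {constant}")
```

The underlying relationship is identical in all three cases; only the spread of lengths changes, and the measured correlation shrinks with it.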

Xodarap10

This seems plausible to me but I could also imagine the opposite being true: my working memory is way smaller than the context window of most models. LLMs would destroy me at a task which "merely" required you to memorize 100k tokens and not do any reasoning; I would do comparatively better at a project which was fairly small but required a bunch of different steps.

Xodarap30

The METR report you cite finds that LLMs are vastly cheaper than humans when they do succeed, even for longer tasks.

The ARC-AGI results you cite feel somewhat hard to interpret: they may indicate that the very first models with some capability will be extremely expensive to run, but don't necessarily mean that human-level performance will forever be expensive.

Xodarap20

I think the claim is that things with more exposure to AI are more expensive.

Xodarap60

You said:

If you "withdraw from a cause area" you would expect that if you have an organization that does good work in multiple cause areas, then you would expect you would still fund the organization for work in cause areas that funding wasn't withdrawn from. However, what actually happened is that Open Phil blacklisted a number of ill-defined broad associations and affiliations, where if you are associated with a certain set of ideas, or identities or causes, then no matter how cost-effective your other work is, you cannot get funding from OP

I'm wondering if you have a list of organizations whose other work Open Phil would have funded, but which lost their funding entirely because Open Phil withdrew from funding one part of the organization.

This feels very importantly different from Good Ventures choosing not to fund certain cause areas (and I think you agree, which is why you put that footnote).

Xodarap110

what actually happened is that Open Phil blacklisted a number of ill-defined broad associations and affiliations

Is there a list of these somewhere, or details on what happened?

Thanks for writing this up! I wonder how feasible it is to just do a cycle of bulking and cutting and then do one of body recomposition and compare the results. I expect that the results will be too close to tell a difference, which I guess just means that you should do whichever is easier.

I think it would help others calibrate, though obviously it's fairly personal.

Possibly too sensitive, but could you share how the photos performed on Photofeeler? Particularly, what percentile of attractiveness?
