FeepingCreature

Comments

I think this only works if your standards for posts are in sync with those of the outside world. Otherwise, you're operating under incompatible status models and cannot sustain your community standards against outside pressure. The outside world can pretty much always offer more status than you can, simply by volume, so you will always be outcompeted unless you can maintain the worth of your respect, and you cannot do that by copying outside appraisal.

I think you failed to establish that the long, well-written, and highly upvoted critiques lived in the larger LW archipelago, so there's a hole in your existence proof. On that basis, I would surmise that, on priors, Said assumed you were referring to comments or on-site posts.

I don't understand it but it does make me feel happy.

Okay, I'll do that, but why do I have to send an email...?

Like, why isn't the how-to just in a comment? Alternatively, why can't I select Lightcone as an option on Effektiv-Spenden?

Unless there's some legal reason, this seems like a weird unforced own-goal.

Original source, to my knowledge. (July 1st, 2014)

"So long, Linda! I'm going to America!"

Human: "Look, can't you just be normal about this?"

GAA-optimized agent: "Actually-"

Hm, I guess this wouldn't work if the agent still learns an internalized RL methodology? Or would it? Say we have a base model, with not much need for GAA because it's just doing token prediction. We go into some sort of (distilled?) RL-based CoT instruct tuning. GAA means it picks up abnormal rewards from the signal more slowly, i.e. it doesn't do the classic boat-spinning-in-circles thing (good test?). But if it internalizes RL at some point, its mesaoptimizer wouldn't be so limited, and since that's a general technique, GAA wouldn't prevent it? Still, seems like a good first line of defense.

The issue, from a writing perspective, is that a positive singularity quickly becomes both unpredictable and unrelatable, so that any hopeful story we could write would inevitably look boring and pedestrian. I mean, I know what I intend to do come the Good End, for maybe the next 100k years or so, but a five-minute conversation with the AI will probably bring up many much better ideas, given what it is. But ... bad ends are predictable, simple, and enter a steady state that is very easy to describe.

A curve that grows and never repeats is a lot harder to predict than a curve that goes to zero and stays there.
