All of Ben Winchester's Comments + Replies

As for where else these ideas can be found, philosophers have been working on conceptual vagueness intensely since the mid-20th century, and cluster concepts were a relatively early innovation. The philosophical literature also has the benefit of being largely free of nebulous speculations about cognition and needless formalism ... The literature also uses terminology in the ordinary way familiar to everybody engaging these issues professionally ... and avoids the invention of needless terms like "thingspace", which mainly achieve the isolation of LessWrong

...
Sqrt-1:
I do agree that a lot of Sequences pages would benefit from discussing previous work, or at least stating what these ideas are called in the mainstream, but I feel Yudkowsky's neologisms are just... better. Among the examples of similar concepts you mentioned, I definitely felt Yudkowsky was hinting at them with the whole dimensions thing, but I think "thingspace" is still a useful word and not even that complicated; if it were said in a conversation with someone familiar with ANNs, I feel they would get what it meant. (Unlike a lot of other Yudkowskisms usually parroted around here, however...)
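
Here's a minimal sketch of how I read "thingspace", with made-up features and exemplars: objects are points in a feature space, a cluster concept is a region of that space, and membership is graded by distance to the cluster rather than fixed by a strict definition.

```python
# A minimal sketch of "thingspace" as a feature space (features and
# exemplars invented for illustration, not taken from the Sequences).
import numpy as np

# Hypothetical feature axes: [has_feathers, lays_eggs, can_fly, sings]
exemplars = {
    "sparrow": np.array([1.0, 1.0, 1.0, 1.0]),
    "robin":   np.array([1.0, 1.0, 1.0, 1.0]),
    "penguin": np.array([1.0, 1.0, 0.0, 0.0]),
}
bat = np.array([0.0, 0.0, 1.0, 0.0])

# The cluster has a center but no sharp boundary; "bird-ness" is graded.
center = np.mean(list(exemplars.values()), axis=0)

def typicality(x):
    """Closer to the cluster's center = a more typical member."""
    return 1.0 / (1.0 + np.linalg.norm(x - center))

for name, vec in {**exemplars, "bat": bat}.items():
    print(f"{name}: {typicality(vec):.3f}")
```

The penguin scores lower than the robin but well above the bat, which is the cluster-concept point: there's no necessary-and-sufficient definition, just distances in the space.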

Yeah, maybe it's less the OODA loop involvement and more that "bad things" lead to a kind of activated nervous system that predisposes us to reactive behavior ("react" as opposed to "reflect/respond"). 

To me, the bad loops are more "stimulus -> react without thinking" than "observe, orient, decide, act". You end up hijacked by your reactive nervous system.

Algon:
I think I know what you mean. Like the state people fall into when scrolling through TikTok or gambling on slot machines, and so forth. I think the term in psychology is "dark flow". I feel like that's just one facet of what you're pointing at, though. Some memes or ideologies can mind-kill you, and I think they should kind of count as "maximizing engagement". "Stimulus -> react without thinking" has potential, but I'm not sure where to go from here with it.

"One problem is that due to algorithmic improvements, any FLOP threshold we set now is going to be less effective at reducing risk to acceptable levels in the future."

And this goes doubly if we explicitly incentivize low-FLOP models. Once models are meaningfully FLOP-limited by law, FLOP-optimization will become a major priority for AI researchers.
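
A toy way to see the erosion (the cap size and doubling time are made-up numbers, not estimates): under steady algorithmic progress, a fixed FLOP cap buys more and more effective capability every year.

```python
# Toy illustration, not a forecast: how much "effective compute" a fixed
# FLOP cap permits as algorithmic efficiency improves. Both constants
# below are assumptions chosen for the example.
FLOP_CAP = 1e25          # hypothetical fixed legal training-compute cap
DOUBLING_MONTHS = 16     # assumed algorithmic-efficiency doubling time

def effective_compute(years_from_now: float) -> float:
    """Capability-equivalent compute reachable under the cap."""
    doublings = years_from_now * 12 / DOUBLING_MONTHS
    return FLOP_CAP * 2 ** doublings

for years in (0, 2, 4, 6):
    print(f"year +{years}: cap is worth {effective_compute(years):.2e} "
          "of today's effective FLOPs")
```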

This reminds me of Goodhart's Law, which states "when a measure becomes a target, it ceases to be a good measure."

I.e., if FLOPs are supposed to be a measure of an AI's danger, and we then l...
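
A quick toy version of that dynamic (all distributions invented): a proxy that merely correlates with the true quantity under-delivers more and more the harder you select on it.

```python
import random

# Toy Goodhart illustration: select candidates by a noisy proxy of the
# true quantity. The harder the selection, the more the winner's proxy
# score overstates its true value. All numbers here are made up.
random.seed(0)

def sample_candidate():
    true_value = random.gauss(0, 1)
    proxy = true_value + random.gauss(0, 1)   # proxy = truth + error
    return true_value, proxy

for n in (1, 10, 100, 10_000):
    pool = [sample_candidate() for _ in range(n)]
    true_value, proxy = max(pool, key=lambda c: c[1])  # optimize the proxy
    print(f"best of {n:>6}: proxy {proxy:+.2f}, true value {true_value:+.2f}")
```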