I didn't know that about Bayesian inference-ish updating baking in an Occam-ish prior. Does it need to be complexity-penalizing, or would any consistent prior-choosing rule work? I assume the former from the phrasing.
Why is that? "does not much constrain the end results" could just mean that unless we assume the agent is Occam-ish, we can't tell from its posteriors whether it did Bayesian inference or something else. But I don't see why that couldn't be true of some non-Occam-ish prior-picking rule, as long as we knew what that was.
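To check I'm picturing the right thing, here's the formula I have in mind (the Solomonoff-style prior is my guess at what "complexity-penalizing" means, not something stated here):

P(h) ∝ 2^{-K(h)}    (prior penalizing the description length K(h) of hypothesis h)
P(h | e) = P(e | h) · P(h) / Σ_{h'} P(e | h') · P(h')    (ordinary Bayesian update)

If the prior P(h) is left completely free, the second line on its own barely constrains the posteriors; once you fix some prior rule - whether the 2^{-K(h)} one or some other rule we know - they're pinned down. Which is why I'd naively expect any known prior-picking rule to do the constraining work.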
I think this definition includes agents that care only about their sensory inputs, since sensory inputs are a subset of states of the world.
This makes me think that the definition of economic agent that I googled isn't what was meant, since this one seems to be primarily making a claim about efficiency, rather than about impacting markets ("an agent who is part of the economy"). Something more like homo economicus?
"Naturalistic agents" seems to have been primarily a claim about the situation the agent finds itself in, rather than a claim about that agent's models (e.g., a Cartesian dualist which was in fact embedded in a universe made of atoms, and was itself made of atoms, would still be a "naturalistic agent", I think)
The last point reminds me of Dawkins-style extended phenotypes; not sure how analogous/comparable that concept is. I guess it makes me want to go back and figure out if we defined what "an agent" was. So, like, does a beehive count as "an agent" (I believe that conditioned on it being an agent at all, it would be a naturalistic agent)?
...does Arbital have search functionality right now? Maybe not :-/
Had a very visceral experience of feeling surrounded by a bunch of epistemically efficient (wrt me) agents in a markets game tonight. Just like "yup, I can choose to bet, or not bet, and if I do bet, I may even make money, because the market may well be wrong, but I will definitely, definitely lose money in expectation if I bet at all"
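Trying to write out the "definitely lose money in expectation" part (this formalization is mine, not anything from the game): if the market's price p is epistemically efficient relative to me, then

E[V - p | everything I know] ≈ 0,

i.e. I can't predict the direction of the market's error, even though in any particular case it may well be wrong. So for whatever bet I choose to place, E[profit | my info] ≈ 0 before fees, and strictly negative once transaction costs are in.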
I seem to have found max comment length? Here's the rest:
I can't tell if I should also be trying to think about whether there's a reasonable definition of "the goals of Google Maps" wherein it actually is maximizing those goals right now in a way we can't improve on. I don't think there is one?
I don't know why this hasn't happened to corporations - you'd think someone would try it at some point, and that if it actually worked pretty well it would eventually allow them to outcompete. Even if it were the sort of innovation that meant you had to climb uphill for a bit, you'd expect people to keep periodically trying, and for one of them eventually to overcome the activation energy barrier?
Boundedly rational ?means rational even when you don't have infinite computing power? Naturalistic ?refers to naturalized induction, where you're not a Cartesian dualist who thinks your processes can't be messed with by stuff in the world, and also you're not just thinking of yourself as a little black dot in the middle of Conway's Game of Life? Google says economic agent means one who has an impact on the economy by buying, selling, or trading; I assign 65% to that being roughly the meaning in use here?
Somehow the epistemic efficiency thing reminds me of the halting problem; that whatever we try and do, it can just do it more. Or... somehow it actually reminds me more the other way, that it's solved the halting problem on us. Apologies for abuse of technical terms.
So an epistemically efficient agent, for example, has already overcome all the pitfalls you see in movies of "not being able to understand the human drive for self-sacrifice", or love, etc.
Is there an analogue of efficient markets for instrumental efficiency? Some sort of master-strategy-outputting process that exists (or maybe plausibly exists in at least some special cases) in our world? Maybe Deep Blue at chess, I guess? Google Maps for driving directions (for the most part)? *reads to next paragraph.* Well: not sure whether to update against Google Maps being an example from the fact that it's not mentioned in the "instrumentally efficient agents are presently unknown" section
That said, "outside very limited domains" - well, I guess "the whole stock market, mostly" is a fair bit broader than "chess" or even "driving directions". Ah, I see; so though chess programs are overall better than humans, they're not hitting the "every silly-looking move is secretly brilliant" bar yet. Oh, and that's definitely not true of Google Maps - if it looks like it's making you do something stupid, you should put like 40% on it in fact being stupid. Got it.
Does it have to be (1) and (2)? My impression is that either one should be sufficient to count - I guess unless they turn out to be isomorphic, but naively I'd expect there to be edge cases with just one or the other.
Gosh, this is just like reading the Sequences, in the sense that I'm quite confused about what order to read things in. Currently defaulting to reading in the order on the VA list page.
My guess for why not to use a mathy definition at this point: because we don't want to undershoot on when these protocols should be in effect. If that were the only concern, though, presumably we could just list several sufficient conditions and note that it isn't an exhaustive list. I don't see that, so maybe I'm missing something.
Are stock prices predictably under- or over-estimates on longer time horizons? I don't think I knew that.
I guess all the brackets are future-hyperlinks?
So an advanced agent doesn't need to be very "smart" necessarily; advanced just means "can impact the world a lot"
I'm guessing instrumental efficiency means that we can't predict it making choices less-smart-than-us in a systematic way? Or something like that
Oh good, cognitive uncontainability was one of the ones where I could least guess what it meant from the list. Hmm, also cross-domain consequentialism.
I don't remember what Vingean unpredictability is. Hmm, it seems to be hard to google. I know I've listened to people talk about Vingean reflection, but I didn't really understand it enough for it to stick. Okay, googling Vingean reflection gets me "ensuring that the initial agent's reasoning about its future versions is reliable, even if these future versions are far more intelligent than the current reasoner" from a MIRI abstract. (More generally, reasoning about agents that are more intelligent than you.) So Vingean unpredictability would be that you can't perfectly predict the actions of an agent that's more intelligent than you?
I'm surprised you want to use the word "advanced" for this concept; it implies to me this is the main kind of high-level safety missing from standard "safety" models? I guess the list of bullet points does cover a whole lot of scenarios. It does make it sound sexy, and not like something you'd want to ignore. The obvious alternative usage for the word "advanced" relative to safety would be for "actually" safe (over just claimed safe). Maybe that has other words available to it, like "provably".
I have the intuition that many proposals fail against advanced agents; I don't see intuitively that it's the "advanced" that's the main problem (that would imply they would work as long as the agent didn't become advanced, I think? What does that look like? And is this like Asimov's three laws or tool AI or what?)
Are there any interesting intuition pumps that fall out of omniscience/omnipotence that don't fall easily out of the "advanced" concept?
Examples of 'strong, general optimization pressures'? Maybe the sorts of things in that table from Superintelligence. ?Optimization pressure = something like a selective filter, where "strong" means that it was strongly selected for? And maybe the reason to say 'optimization' is to imply that there was a trait that was selected for, strongly, in the same direction (or towards the same narrow target, more like?) for many "generations". Mm, or that all the many different elements of the agent were built towards that trait, with nothing else being a strong competitor. And then "general" presumably is doing something like the work that it does in "general intelligence", i.e. not narrow? Ah, a different meaning would be that the agent has been subject to strong pressures towards being a 'general optimizer'. Seems less strongly implied by the grammar, but doesn't create any obvious meaningful interpretive differences.
Oh, or "general" could mean "along many/all axes". So, optimization pressure that is strong, and along many axes. Which fails to specify a set of axes that are relevant, but that doesn't seem super problematic at this moment.
It's not obvious to me that filtering for agents powerful enough to be relevant will leave mainly agents who've been subjected to strong general optimization pressures. For example, the Limited Genie described on the Advanced Agent page maybe wasn't?
For self-optimization, I assume this is broadly because of the convergent instrumental values claim?