I am also in NYC and happy to participate. My lichess rating is around 2200 rapid and 2300 blitz.
Thank you, Larks! Salute. FYI, I am at least one person who has informally committed (see below) to taking up this mantle. When would the next one typically be due?
https://twitter.com/robertzzk/status/1564830647344136192?s=20&t=efkN2WLf5Sbure_zSdyWUw
Inspecting code against a harm detection predicate seems recursive. What if the code or execution necessary to perform that inspection properly itself is harmful? An AGI is almost certainly a distributed system with no meaningful notion of global state, so I doubt this can be handwaved away.
For example, a lot of distributed database vendors, like Snowflake, do not offer a pre-execution query planner. This can only be performed just-in-time as the query runs, or retroactively after it has completed, as the exact structure may be dependent on the co-location of data.
...I am interested as well. Please share the docs in question with my LW username at gmail dot com if that is a possibility. Thank you!
This was my thought exactly. Construct a robust satellite with the following properties.
Let a "physical computer" be defined as a processor powered by classical mechanics, e.g., through pulleys rather than transistors, so that it is robust to gamma rays, solar flares and EMP attacks, etc.
On the outside of the satellite, construct an onion layer of low-energy light-matter interacting material, such as alternating a coat of crystal silicon / CMOS with thin protective layers of steel, nanocarbon, or other hard material. When the device is construct...
To not support EA? I am confused. Doesn't the drowning child thought experiment lend credence to supporting EA?
Isn't this an example of a reflection problem? We induce this change in a system, in this case an evaluation metric, and now we must predict not only the next iteration but the stable equilibria of this system.
Did you remove the vilification of proving arcane theorems in algebraic number theory because the LessWrong audience is more likely to fall within this demographic? (I used to be very excited about proving arcane theorems in algebraic number theory, and fully agree with you.)
Incidentally, for a community whose most important goal is solving a math problem, why is there no MathJax or other built-in LaTeX support?
The thing that eventually leapt out when comparing the two behaviours is that behaviour 2 is far more informative about what the restriction was than behaviour 1.
It sounds to me like the agent overfit to the restriction R. I wonder if you can draw some parallels to the Vapnik-style classical problem of empirical risk minimization, where you are not merely fitting your behavior to the training set, but instead achieve the optimal trade-off between generalization ability and adherence to R.
In your example, an agent that inferred the boundaries of our...
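For concreteness, this is the kind of Vapnik-style bound I have in mind, written schematically and from memory, so treat the exact constants as approximate rather than a quotation:

```latex
% Schematic Vapnik-style bound, quoted from memory: with probability at least
% 1 - delta over m samples, for a hypothesis class of VC dimension d,
\[ R(h) \;\le\; \hat{R}_{m}(h) + \sqrt{\frac{d\left(\ln\frac{2m}{d} + 1\right) + \ln\frac{4}{\delta}}{m}}. \]
```

Adherence to R plays the role of the empirical-risk term here, and the agent's behavioural complexity plays the role of d; an agent rich enough to drive the first term to zero should be expected to generalize the restriction poorly, which is the overfitting I am gesturing at.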
However, UFFire does not uncontrollably and exponentially reproduce or improve its own functioning. Certainly, a conflagration on a planet covered entirely by dry forest would rather quickly become an unmitigable problem.
In fact, in such a scenario, we should dedicate a huge amount of resources to prevent it and never use fire until we have proved it will not turn "unfriendly".
I down-voted this comment because it is a clever ploy for karma that rests on exploiting LessWrongers' sometimes unnecessary enthusiasm for increasingly abstract and self-referential forms of reasoning but otherwise adds nothing to the conversation.
Twist: By "this comment" I actually mean my comment, thereby making this a paraprosdokian.
I am an active R contributor on GitHub and Stack Overflow, and I would be willing to coordinate. Send me an email: rkrzyz at gmail
So you are saying that explaining something is equivalent to constructing a map that bridges an inferential distance, whereas explaining something away is refactoring thought-space to remove an unnecessary gerrymandering?
It feels good knowing you changed your mind in response to my rebuttal.
I disagree with your preconceptions about the "anti" prefix. For example, an anti-hero is certainly a hero. I think it is reasonable to consider "anti" a contextually overloaded semantic negater whose scope does not have to be the naive interpretation: anti-X can refer to "opposite of X" or "opposite or lacking of a trait highly correlated with X" with the exact choice clear from context.
I got a frequent LessWrong contributor a programming internship this summer.
It is as if you're buying / shorting an index fund on opinions.
Strong AI could fail if there are limits to computational integrity in sufficiently complex systems, similar to the way heating and quantum-mechanical effects limit transistor sizes. For example, perhaps we rarely see these limits in humans because their frequency is one in a thousand human-thought-years, and when they do manifest, they are misdiagnosed as mental illness.
The possibility of an "adaptation" being in fact an exaptation or even a spandrel is yet another reason to be incredibly careful about purposing teleology into a discussion about evolutionarily derived mechanisms.
The question in the subject is too dense and should be partitioned. Some ideas for auxiliary questions:
Do there exist attempts at classifying parenting styles? (So that we may not re-invent tread tracks.)
Is parenting or childrearing an activity that supports the existence of relevant goals? Do there exist relevant values? Or is parenting better approached as a passive activity sans evaluation with no winners or losers? (So that we may affirm this question is worth answering)
Given affirmative answers to the above questions (and having achieved
In other words, productivity need not be confused with busywork, and I suspect this is primarily an artifact of linguistic heuristics (similar brain procedures get lightly activated when you hear "productivity" as when you hear "workout" or "haste" or even "forward march").
If productivity were a currency, you could say "have I acquired more productons this week than last week with respect to my current goal?" If making your family well off can be achieved by lounging around in the pool splashing each other, then that is high family welfare productivity.
I spend time worrying about whether random thermal fluctuations in (for example) suns produce sporadic conscious moments simply due to random causal-structure alignments. Since I also believe most potential conscious moments are bizarre and painful, that worries me. This worry is not useful when embedded in System 1, which was not created to cope with it, so I only worry in the System 2, philosophical-curiosity sense.
Seeing as how classical mechanics is an effective theory for physically restructuring significant portions of reality to one's goals, you are promising something tantamount to a full theory of knowledge acquisition, something linguists and psychologists smarter than you have worked on for centuries.
Calm down with the promises that will disappoint you, and make an MVP instead.
I do not understand why no one is interested.
Do you have an Amazon wish list? You are awesome.
I am interested. What software did you use? I am trying to learn NEURON but it feels like Fortran and I have trouble navigating around the cobwebs.
In the mathematical theory of Galois representations, a choice of algebraic closure of the rationals and an embedding of this algebraic closure in the complex numbers (e.g. section 5) is usually necessary to frame the background setting, but I never hear "the algebraic closure" or "the embedding," instead "an algebraic closure" and "an embedding." Thus I never forget that a choice has to be made and that this choice is not necessarily obvious. This is an example from mathematics where careful language is helpful in tracking background assumptions.
In mathematical terms, the map from problem space to reference classes is a projection with no canonical choice (you apply the projection by choosing to lose information), whereas the map from causal structures to problem space is an embedding that does have a canonical choice (and the choice gains information).
Are we worried that the compartmentalized accounting of mission- and fundraising-related financial activity, via outsourcing to a different organization, can incur PR costs as well? If an organization is worried about "look[ing] bad" because some of its funds are being employed for fundraising, thus lowering its effective percentage, would it be susceptible to minor "scandals" that call into question the validity of GiveWell's metrics from, say, an investigative journalist who misinterprets the outsourced fundraising as misrepresentat...
Yes, thank you, I meant compression algorithm.
This would have been helpful to my 11-year-old self. As I had always been rather unnecessarily called precocious, I developed the pet hypothesis that my life was a simulation of someone whose life in history had been worth re-living: after all, the collection of all possible lives is pretty big, and mine seemed to be extraordinarily neat, so why not imagine some existential video game in which I am the player character?
Unfortunately, I think this also led me to subconsciously be a little lazier than I should have been, under the false assumption that I was...
Can anyone explain what is wrong with the hypothesis of a largely structural long-term memory store? (i.e., in the synaptome, relying not on individual macromolecules but on the ability of a graph of neurons and synapses to store information)
There's nothing wrong with it, it's just that the strength of connections (local synaptic concentration of various neurotransmitters and receptors) has been demonstrated to be just as important as their graph-theoretical structure for long-term memory. Synapses can regulate their strength and maintain the strength over long time periods. The problem that the quoted paragraph is trying to illustrate is that a simple chemical concentration explanation doesn't cut it since chemicals are being diffused and turned over inside synapses all the time. Thus there must be some mechanism for long-term persistence of memory.
I think this can be solved in practice by heeding the assumption that a very sparse subset of all such strings will be mapped by our encryption algorithm when embedded physically. Then if we low-dimensionally parametrize hash functions of the form above, we can store the parameters for choosing a suitable hash function along with the encrypted text, and our algorithm only produces compressed strings of greater length if we try to encrypt more than some constant percentage of all possible length <= n strings, with n fixed (namely, when we saturate suitab...
This reminds me of the non-existence of a perfect encryption algorithm, where an encryption algorithm is a bijective map S -> S, where S is the set of finite strings on a given alphabet. The image of strings of length at most n cannot lie in strings of length at most n-1, so either no string gets compressed (reduced in length) or there will be some strings that will become longer after compression.
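Spelling out the counting step (a sketch; q is the alphabet size):

```latex
% Counting argument over an alphabet of size q: strings of length at most n
% strictly outnumber strings of length at most n-1,
\[ \#\{\, s \in S : |s| \le n \,\} = \sum_{k=0}^{n} q^{k} > \sum_{k=0}^{n-1} q^{k} = \#\{\, s \in S : |s| \le n-1 \,\}, \]
% so no injective map S -> S can shorten every string of length at most n.
```

Hence any injective map must send at least one string of length at most n to a string of length at least n, i.e. some input is not compressed.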
To be frank, I question the value of compressing information of this generality, even as a roadmap. For example, "Networking" can easily be expanded into several books (e.g., Dale Carnegie) and "Educating oneself in career-related skills" has almost zero intersection when quantified over all possible careers. If Eliezer had made a "things to know to be a rationalist" post instead of breaking it down into The Sequences, I doubt anyone would have had much use for it.
Maybe you could focus on a particular topic, compile a list of ...
p/s/a: Going up to a girl pretty much anywhere in public and saying something like "I thought you looked cute and wanted to meet you" actually works if your body language is in order. If this seems too scary, going on Chatroulette or Omegle and being vaguely interesting also works, and I know people who have gotten married from meeting this way.
p/s/a: Vitamin D supplements can take you from depressed zombie to functioning human being in one week.
if your body language is in order
This reads unfortunately like an excuse ahead of time. "Oh, your body language must not have been in order."
(Although I do agree that if you're not socially offensive, just telling people when you fancy them does in fact work quite well and I wish I'd realised that ten years earlier than I had.)
Word to the wise: If you substitute "hot" for "cute" you may get unanticipated negative results. I would not interpret "hot" in anywhere near the same way as "cute". Here's how that would translate for me:
"I thought you looked cute..." = "I am likely to be interested in things like emotional intimacy and cuddling."
"I thought you looked hot..." = "I am likely to be one of those guys who is going to be so persistent in making attempts to get casual sex out of you tonight that it is going to drive you up a wall."
I have nothing against sex, but like many people, I am annoyed by persistent attempts to get things from me.
See lukeprog's How to Beat Procrastination and Algorithm for Beating Procrastination. In particular, try to identify which term(s) in the equation in the latter are problematic for you, then use goal shaping to slowly modify them. (Of course, you could also realize you may not want to do this master's thesis and switch to a different problem.)
Goal shaping means rewarding yourself for successively more proximate actions to the desired goal (writing your thesis) in behavior-space. For example, rather than beating yourself up over not getting anything done to...
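For reference, the equation in the latter post, as I remember it (so treat the exact form as approximate rather than a quotation):

```latex
% Procrastination ("temporal motivation") equation, reproduced from memory:
\[ \text{Motivation} = \frac{\text{Expectancy} \times \text{Value}}{\text{Impulsiveness} \times \text{Delay}} \]
```

Goal shaping, in those terms, is deliberately raising Expectancy and Value with frequent near-term rewards while shrinking Delay for each successive approximation of the target behaviour.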
Given the dynamic nature of human preferences, it may be that the best one can do is n-fold money pumps, for low values of n. Here, one exploits some intransitive preferences n times before the intransitive loop is discovered and remedied, leaving another or a new vulnerability. Even if there is never a single time at which the agent you are exploiting is VNM-rational, its volatility under appropriate utility perturbations will suffice to keep money pumping in line. This mirrors the security that quantum encryption offers: even if you manage to exploit it, th...
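To make the idea concrete, here is a minimal sketch in hypothetical code (the items, fee, and "noticing" threshold are all invented for illustration, not taken from anywhere): an agent with the cyclic preference A > B > C > A pays a small fee for every trade it strictly prefers, until it notices and repairs the cycle.

```python
def run_money_pump(cycle=("A", "B", "C"), fee=1.0, loops_before_repair=3):
    """Exploit the intransitive cycle A > B > C > A until it is repaired.

    From any item the agent holds, the previous item in `cycle` is the one
    it strictly prefers (A > B, B > C, C > A), so it will pay `fee` for it.
    After `loops_before_repair` full loops the agent notices the cycle and
    stops trading, which is what bounds the pump at n-fold for low n.
    """
    holding = 0          # index into `cycle` of the item the agent holds
    extracted = 0.0      # total fees collected from the agent
    for _ in range(loops_before_repair * len(cycle)):
        holding = (holding - 1) % len(cycle)  # hand over the preferred item
        extracted += fee                      # collect the trading fee
    return extracted     # the agent ends up holding exactly what it started with

print(run_money_pump())  # 9.0: three full loops of three fee-paying trades
```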
For example, "It was not the first time Allana felt the terror of entrapment in hopeless eternity, staring in defeated awe at her impassionate warden." (bonus point if you use a name of a loved one of the gatekeeper)
The AI could present, in narrative form, that it has discovered with reasonable certainty, using powerful physics and heuristics (which it can share), that the universe is cyclical and this situation has happened before. Almost all (all but finitely many) past iterations of the universe with a defecting gatekeeper led to unfavorable outcomes, and almost all iterations with a complying gatekeeper led to favorable outcomes.
Good point. It might be that any 1-self-aware system is ω-self-aware.
Thanks, this should work!
Thanks! I presented him with these arguments as well, but they are more familiar on LW and so I didn't see the utility of posting them here. The above argument felt more constructive in the mathematical sense. (Although my friend is still not convinced.)
What were the reactions of your friends?
I agree so much I'm commenting.
The culmination of a long process of reconciling my decision to go to grad school in mathematics with meaning. I had not expressly realized before that mathematicians do all their work using clusters of adaptations that arose through natural selection. Certainly, I would have asserted "all humans are animals that evolved by natural selection" and "mathematicians are humans," but somehow I assigned mathematics privilege. This was somewhat damaging because I didn't expressly apply things like cognitive science r...
Yes.
To add to my comments above, I mean that there is no paradox or unnecessary ache in thinking about minds as physical objects (and hence pausable, storable, and replicable). Everything we've ever done happens within minds anyway, and there is nothing we can do about that. Whatever mental representations we conjure when we think of atoms or molecules or electromagnetic forces are inaccurate and incomplete: this "conscious" experience and sensory perception and thought is what a particular collection of molecules and forces is, rather than a vis...
If you get a non-constant, yes. For a linear function, f(a+1) - f(a) = f'(a). Inductively, you can then show that the nth one-step difference of a degree-n polynomial f at a point a is f^(n)(a). But this doesn't work for any order other than n. Thanks for pointing that out!
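A small worked check of the claims above:

```latex
% Linear case: the one-step difference recovers the derivative exactly.
\[ f(x) = cx + d \;\Rightarrow\; f(a+1) - f(a) = c = f'(a). \]
% Degree-n case with leading coefficient c_n: the n-th difference is constant.
\[ \Delta^{n} f(a) = n!\, c_n = f^{(n)}(a). \]
% Already for a quadratic, the first difference is not the first derivative:
\[ f(x) = x^2 \;\Rightarrow\; f(a+1) - f(a) = 2a + 1 \;\neq\; 2a = f'(a). \]
```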
At this point I would direct the "deferred task" apparatus fully towards interventional interpretability. Put a moratorium on further gradient-based training, which is not well understood and can have many indirect effects unless you have some understanding of modularity and have applied stop gradients almost everywhere that is irrelevant to the generator of the conditional, deceptive reasoning behavior. Instead halt, melt and catch fire at that point.
Halt further model deployments towards the original deferred task. Quarantine the model that first exhibit...