Lerk

Yep, most of my hope is on our civilization's coordination mechanisms kicking in in time. Most of the world's problems seem to be failures to coordinate, but that's not the same as saying we can't coordinate.

This is where most of my anticipated success paths lie as well.

Other hopes are around a technical breakthrough that advances alignment more than capabilities…

I do not really understand how a technical advance in alignment realistically becomes a success path. For improved alignment to be useful, I anticipate it would need to be present in essentially all AI agents, or present in the most powerful agent so that the aligned agent could dominate unaligned ones. I don't expect uniform adoption, and I don't necessarily expect alignment to correlate with agent capability. By my estimation, this success path rests on the probability that the organization with the most capable AI agent is also specifically interested in ensuring that agent's alignment. I expect these goals to interfere with each other enough that this confluence is unlikely. Are your expectations different?

I have a massive level of uncertainty around AGI timelines, but there's an uncomfortably large amount of probability mass on the possibility that, through some breakthrough or secret project, AGI was achieved yesterday and the news just hasn't caught up with me.

I have not thought deeply about the possibility that a superintelligent AGI has already been achieved. It certainly seems possible. It would invalidate most of the things I have thus far thought of as plausible mitigation measures.

What ideas are those?

Assuming a superintelligent AGI does not already exist, I would expect someone with a high P(doom) to be considering options of the form:

Use a smart but not self-improving AI agent to antagonize the world with the goal of making advanced societies believe that AGI is a bad idea and precipitating effective government actions.  You could call this the Ozymandias approach.

Identify key resources involved in AI development and work to restrict those resources.  For truly desperate individuals this might look like the Metcalf attack, but a tamer approach might be something more along the lines of investing in a grid operator and pushing to increase delivery fees to data centers.

I haven’t pursued these thoughts in any serious way because my estimation of the threat isn’t as high as yours.  I think it is likely we are unintentionally heading toward the Ozymandias approach anyhow.

Lerk

I mean, what do you think we've been doing all along?


So, the short answer is that I am actually just ignorant about this. I'm reading here to learn more, but I certainly haven't yet ingested a sufficient history of the relevant works. I'm happy to prioritize any recommendations that others have found insightful or thought-provoking, especially from the point of view of a novice.


I can answer the specific question “what do I think” in a bit more detail.  The answer should be understood to represent the viewpoint of someone who is new to the discussion and has only been exposed to an algorithmically influenced, self-selected slice of the information.


I watched the Lex Fridman interview of Eliezer Yudkowsky, and around 3:06 Lex asks what advice Eliezer would give to young people. Eliezer's initial answer is something to the effect of “Don't expect a long future.” I interpreted that answer largely as an attempt to evoke a sense of reverence for the seriousness of the problem. When pushed a bit further, Eliezer answers, “…I hardly know how to fight myself at this point.” I interpreted this to mean that the space of possible actions being searched appears intractable even from the perspective of a dedicated researcher. This, I believe, is largely the source of my question: current approaches appear to be losing the race, so what other avenues are being explored?


I read the “Thomas Kwa's MIRI research experience” discussion, and there was a statement to the effect that MIRI does not want Nate's mindset to be known to frontier AI labs. I interpreted this to mean that the most likely course being explored at MIRI is to build a good AI to preempt or stop a bad one. This strikes me as plausible because my intuition is that the LLM architectures being employed are largely inefficient for developing AGI. However, compute scaling seems to work well enough that it may win the race before competing ideas come to fruition.


An example of an alternative approach I read was “Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible,” which seems like an avenue worth exploring, though well outside my areas of expertise. That approach shares a characteristic with my inference of MIRI's approach: both appear to pursue highly technical avenues that would not scale meaningfully at this stage by adding helpers from the general public.


The form of approach I expected to see more of, but haven't thus far, is the kind you linked about STOP AI: approaches that would scale with the addition of approximately average people. I expected this might take the form of disrupting model training by various means, or co-opting the organizations involved with an aim toward redirection or delay. My lack of exposure to such efforts supports a few competing models: (1) drastic actions aren't being pursued at large scale, (2) such actions are being pursued covertly, or (3) I am focusing my attention in the wrong places.


Our best chance at this point is probably government intervention to put the liability back on reckless AI labs for the risks they're imposing on the rest of us, if not an outright moratorium on massive training runs.


Government action strikes me as a very reasonable approach for people estimating long timescales or relatively lower probabilities. However, it seems less reasonable if timescales are short or probabilities are high. I presume your high P(doom) already accounts for your estimate of the probability that government action succeeds. Does your high P(doom) imply that you expect such interventions to be too slow, or too ineffective? I interpret a high P(doom) as meaning that the current set of actions you have thought of is unlikely to succeed, and that additional action exploration is therefore necessary. I would expect this to include admitting ideas that would previously have been pruned because they come with negative consequences.

Lerk

I believe it was the Singularity subreddit in this case. I was more or less passing through while searching for places to learn more about the principles of artificial neural networks for AGI.

Lerk

I found the site a few months ago via a link from an AI-themed forum. I read the Sequences and developed the belief that this was a place for people who think in ways similar to me. I work as a nuclear engineer. When I entered the workforce, I was surprised to find that there weren't people as disposed toward logic as I was. I thought perhaps there wasn't really a community of similar people, and I had largely stopped looking.


This seems like a good place for me to learn, for the time being. Whether it is also a place for me to develop community remains to be seen. The format seems to promote presenting well-formed ideas. That seems valuable, but I am also interested in finding a space to explore ideas that are not yet well-formed, and it isn't clear to me that this is intended to be such a space. That may simply be due to my ignorance of the mechanics around here. That said, this thread seems to invite poorly formed ideas, and I aim to oblige.


There seem to be some writings around here that speak of instrumental rationality, or “Rationality Is Systematized Winning.” However, this raises the question: “At what scale?” My (perhaps naive) impression is that executing instrumental rationality against an objective function at the personal scale might yield the decision to go work in finance and accrue a pile of utility, while applying it to an objective function at the societal scale might yield the decision to give all your spare resources to the most effective organizations you can find. It seems to me that the focus on rationality is important but doesn't resolve the broader question of “In service of what?”, which actually seems to be an important selector of who participates in this community. I don't see much value in pursuing Machiavellian rationality, and my impression is that most here don't either. I am interested in finding additional work that explores the implications of global-scale objective functions.


On a related topic, I am looking to explore how to determine the right scale of the objective function for revenge (or social correction, if you prefer a smaller scope). My intuition is that revenge developed as a mechanism for performing tribal-level optimizations. In a situation where there has been a social transgression, and redressing that transgression would be personally costly but societally beneficial, what is the correct balance between personal interest and societal interest?
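One toy way to make that balance concrete (my own framing, offered only as a sketch, not an established model): let $c$ be the personal cost of redressing the transgression, $B$ the expected societal benefit (deterrence, norm maintenance), and $w \in [0,1]$ the weight placed on societal welfare relative to one's own. A simple decision rule is then

$$ w\,B \;>\; (1-w)\,c \quad\Longleftrightarrow\quad w \;>\; \frac{c}{B+c}, $$

so the question of the "right scale" becomes a question of how large $w$ ought to be, and whether tribal-level selection pressures would have tuned it to something near $c/(B+c)$ for typical transgressions.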


My current estimate of P(doom) in the next 15 years is 5%. That is, high enough to be concerned, but not high enough to cash out my retirement. I am curious about anyone harboring a P(doom) > 50%, which would seem to be high enough to support drastic actions. What work has been done to develop rational approaches to such a high P(doom)?


This idea is quite poorly formed, but I am interested in exploring how to promote encapsulation, specialization, and reuse of components via the cost function of an artificial neural network. This comes out of the intuition that actions (things described by verbs, or transforms) may be a primitive in human mental architecture and one of the mechanisms by which analogical connections are searched. I am interested in seeing whether continuous mechanisms could be defined to promote the development of a collection of transforms that could be applied usefully across multiple domains. Relatedly, I am also interested in what an architecture or cost function would need to look like to promote retaining multiple representations of a concept at differing levels of specificity and complexity.
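To make "promote reuse via the cost function" slightly more concrete, here is a minimal sketch of one continuous mechanism that might fit. Everything in it (the assumed decomposition into candidate modules, the group-sparsity penalty, the coefficient) is my own assumption rather than an established recipe: the network is charged for every block of connections between candidate modules, so training is nudged toward routing computation through a small set of shared, reusable transforms.

```python
import numpy as np

def task_loss(predictions, targets):
    # Ordinary mean-squared error on the task itself.
    return np.mean((predictions - targets) ** 2)

def reuse_penalty(inter_module_weights, strength=1e-3):
    # Group-lasso-style penalty: sum the norm of each block of weights
    # connecting one candidate module to another. Whole blocks get pushed
    # toward zero, so surviving modules tend to be shared and reused.
    return strength * sum(np.linalg.norm(w) for w in inter_module_weights)

def total_loss(predictions, targets, inter_module_weights):
    # Continuous, differentiable objective: fit the task while paying a
    # price for every inter-module connection kept alive.
    return task_loss(predictions, targets) + reuse_penalty(inter_module_weights)

# Toy usage with made-up numbers: one small batch of predictions plus two
# hypothetical cross-module weight blocks (in a real model these would be
# slices of the network's parameters).
preds = np.array([0.9, 0.1])
targs = np.array([1.0, 0.0])
cross_blocks = [np.random.randn(4, 4) * 0.1, np.random.randn(4, 4) * 0.1]
print(total_loss(preds, targs, cross_blocks))
```

Because the penalty is continuous, it can be folded into ordinary gradient descent; whether it would actually yield reusable, verb-like transforms, or how to encourage multiple representations at different levels of specificity, is exactly the open question.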