Gabe

After taking a summer to deeply contemplate what I want to do with my life, I decided I want to do whatever is most beneficial for the advancement of humanity.

From everything I considered, it came down to either education (enabling humans) or artificial intelligence (building something that can enable a human more than I can). I decided on AI, and I am looking to build my foundations in alignment and systems development (previously I was focused on climate change prevention).

My main method of learning is interaction, so I joined my school's AI club and a few forums, and I am reading voraciously on the latest in AI, policy, and alignment.


Comments

Gabe

No, definitely not, I didn't mean to give that impression. I do think that on a deeper level, when you consider why anyone does anything, it comes down to basic instinctual desires such as the need to feel loved or the need to feel powerful. In the absence of a rational motivator, whatever Sam Altman's primary instinct is will likely take over while the ego rationalizes. So money may be the result, but the real driver is likely a deep-seated desire for power or status.

Answer by Gabe

I have had this same question for a while, and this is the general conclusion I've come to:

Identify the safety issues today, solve them, and then assume the safety issues scale as the technology scales, and either scale up the original solution or develop new tactics to solve the extrapolated flaws.

This sounds a little vague, so here is an example: we see one of the big models misrepresent history in an attempt to be woke, and maybe it gives a teenager a misconception of history. So the best thing we can do from a safety perspective is figure out how to train models to represent facts accurately. After that is done, we can extrapolate the flaw up to a model deliberately feeding misinformation to achieve a certain goal, and we can try to apply the same solution we used for the smaller problem to the bigger one, or, if we see it won't work, develop a new solution.

The biggest problem with this is that it is reactive: if you rely only on this method, a danger may present itself for the first time and already cause major harm before a solution exists.

I know this approach isn't as effective for x-risk, but it's still something I like to use. Easy to say, though, coming from someone who doesn't actually work in AI safety.