sweenesm

Ceramic engineering researcher by training. Been interested in ethics for several years. More recently have gotten into data science.


Comments


Thanks for the post. I think it'd be helpful if you could add some links to references for some of the things you say, such as:

For instance, between 10^10 and 10^11 parameters, models showed dramatic improvements in their ability to interpret emoji sequences representing movies.

Any update on when/if prizes are expected to be awarded? Thank you.

Thanks for the post and congratulations on starting this initiative/institute! I'm glad to see more people drawing attention to the need for some serious philosophical work as AI technology continues to advance (e.g., Stephen Wolfram).

One suggestion: consider expanding the fields you engage with to include those of moral psychology and of personal development (e.g., The Option Institute, Tony Robbins, Nathaniel Branden).

Best of luck on this project being a success!

Thanks for the comment. You might be right that any hardware/software can ultimately be tampered with, especially if an ASI is driving/helping with the jailbreaking process. It seems likely that silicon-based GPU's will be the hardware to get us to the first AGI's, but this isn't an absolute certainty since people are working on other routes such as thermodynamic computing. That makes things harder to predict, but it doesn't invalidate your take on things, I think. My not-very-well-researched initial thought was something like this (chips that self-destruct when tampered with).

I envision people having AGI-controlled robots at some point, which may complicate things in terms of having the software/hardware inaccessible to people, unless the robot couldn't operate without an internet connection, i.e., part of its hardware/software was in the cloud. It's likely the hardware in the robot itself could still be tampered with in this situation, though, so it still seems like we'd want some kind of self-destructing chip to avoid tampering, even if this ultimately only buys us time until AGI+'s/ASI's figure a way around this.

Agreed, "sticky" alignment is a big issue - see my reply above to Seth Herd's comment. Thanks.


Except that timelines are anyone's guess. People with more relevant expertise have better guesses.

Sure. Me being sloppy with my language again, sorry. It does feel like having more than a decade to AGI is fairly unlikely.

I also agree that people are going to want AGI's aligned to their own intents. That's why I'd also like to see money being dedicated to research on "locking in" a conscience module in an AGI, most preferably on a hardware level. So basically no one could sell an AGI without a conscience module onboard that was safe against AGI-level tampering (once we get to ASI's, all bets are off, of course). 

I actually see this as the most difficult problem in the AGI general alignment space: not how to align an AGI to anything at all (inner alignment), or what to align an AGI to ("wise" human values), but how to keep an AGI aligned to these values when so many people (both people with bad intent and intelligent but "naive" people) are going to be trying with all their might (and with the near-AGI's they have available to them) to "jail break" AGI's.[1] And the problem will be even harder if we need a mechanism to update the "wise" human values, which I think we really should have unless we make the AGI's "disposable."

  1. ^

    To be clear, I'm taking "inner alignment" as being "solved" when the AGI doesn't try to unalign itself from what its original creator wanted to align it to.

Sorry, I should've been more clear: I meant to say let's not give up on getting "value alignment" figured out in time, i.e., before the first real AGI's (ones capable of pivotal acts) come online. Of course, the probability of that depends a lot on how far away AGI's are, which I think only the most "optimistic" people (e.g., Elon Musk) put as 2 years or less. I hope we have more time than that, but it's anyone's guess.

I'd rather that companies/charities start putting some serious funding towards "artificial conscience" work now to try to lower the risks associated with waiting until boxed AGI or intent aligned AGI come online to figure it out for/with us. But my view on this is perhaps skewed by putting significant probability on being in a situation in which AGI's in the hands of bad actors either come online first or right on the heels of those of good actors (e.g., due to effective espionage), and there's just not enough time for the "good AGI's" to figure out how to minimize collateral damage in defending against "bad AGI's." Either way, I believe we should be encouraging people with moral psychology/philosophy backgrounds who aren't strongly suited to help make progress on "inner alignment" to be thinking hard about the "value alignment"/"artificial conscience" problem.


Thanks for writing this, I think it's good to have discussions around these sorts of ideas.

Please, though, let's not give up on "value alignment," or, rather, conscience guard-railing, where the artificial conscience is in line with human values.

Sometimes when enough intelligent people declare something's too hard to even try at, it becomes a self-fulfilling prophecy: most people may give up on it and then of course it's never achieved. We do want to be realistic, I think, but still put in effort in areas where there could be a big payoff when we're really not sure if it'll be as hard as it seems.

This article on work culture in China might be relevant: https://www.businessinsider.com/china-work-culture-differences-west-2024-6

If there's a similar work culture in AI innovation, that doesn't sound optimal for developing something faster than the U.S. when "outside the LLM" thinking might ultimately be needed to develop AGI.

Also, Xi has recently called for more innovation in AI and other tech sectors:

https://www.msn.com/en-ie/money/other/xi-jinping-admits-china-is-relatively-weak-on-innovation-and-needs-more-talent-to-dominate-the-tech-battlefield/ar-BB1oUuk1

Thanks for the reply.

Regarding your disagreement with my point #2 - perhaps I should’ve been more precise in my wording. Let me try again, with words added in bold: “Although pain doesn't directly cause suffering, there would be no suffering if there were no such thing as pain…” What that means is you don’t need to be experiencing pain in the moment that you initiate suffering, but you do need the mental imprint of having experienced some kind of pain in your lifetime. If you have no memory of experiencing pain, then you have nothing to avert. And without pain, I don’t believe you can have pleasure, so nothing to crave either.

Further, if you could abolish pain as David Pearce suggests, by bioengineering people to only feel different shades of pleasure (I have serious doubts about this), you’d abolish suffering at the same time. No person bioengineered in such a way would suffer over not feeling higher states of pleasure (i.e., “crave” pleasure) because suffering has a negative feeling associated with it - part of it feels like pain, which we supposedly wouldn’t have the ability to feel.

This gets to another point: one could define suffering as the creation of an unpleasant physical sensation or emotion (i.e., pain) through a thought process that we may or may not be aware of. Example: the sadness that we typically naturally feel when someone we love dies is pain, but if we artificially extend this pain out with thoughts of the future or past, not the moment, such as, “will this pain ever stop?,” or, “If only I’d done something different, they might still be alive,” then it becomes suffering. This first example thought, by the way, could be considered aversion to pain/craving for it to stop, while the second could be considered craving that the present were different (that you weren’t in pain and your loved one were still alive). The key distinctions for me are that pain can be experienced “in the moment” without a thought process on top of it, and it can’t be entirely avoided in life, while suffering ultimately comes from thoughts, falls away when one’s experiencing things in the moment, and can be avoided because it’s an optional thing one chooses to do for some reason. (A possible reason could be to give oneself an excuse to do something other than feel pain, such as to give oneself an excuse to stop exercising by amping up the pain with suffering.)

 

Regarding my point #4, I honestly don’t know what animals’ experiences are like or how much cognition they’re capable of. I do think, though, that if they aren’t capable of getting “out of the moment” with thoughts of the future or past, then they can’t suffer, they can only feel the pain/pleasure of the moment. For instance, do chickens suffer with thoughts of, “I don’t know how much longer I can take this,” or do they just experience the discomfort of their situation with the natural fight or flight mechanism and Pavlovian links of their body leading them to try to get away from it? Either way, pain by itself is an unpleasant experience and I think we should try to minimize imposing it on other beings.

 

It’s also interesting how much upvoted resistance you’ve gotten to the message of this post. Eckhart Tolle (“The Power of Now,” https://shop.eckharttolle.com/products/the-power-of-now) is a modern-day proponent of living in the moment to make suffering fall away, and he also encounters resistance: https://www.reddit.com/r/EckhartTolle/comments/sa1p4x/tolles_view_of_suffering_is_horrifying/
