All of alex.herwix's Comments + Replies

Thank you for an interesting post! I have only skimmed it so far and have not really dug into the mathematics section, but the way you are framing logic somewhat reminds me of Dewey, J. (1938). Logic: The Theory of Inquiry. Henry Holt and Company, Inc.

Are you by any chance familiar with this work and could elaborate on possible continuities and discontinuities?

1Kris Brown5mo

Hi, sorry I'm not directly familiar with that Dewey work. As far as classical American pragmatism goes, I can only point to Brandom's endorsement of Cheryl Misak's transformative new way of looking at it in Lecture 4 of this course, which might be helpful for drawing this connection.

Scaffolding for "Noticing Metacognition"

alex.herwix6mo10

Thanks for the reply. I wanted to get at something slightly different, though.

I think that a key insight of traditions that work with "judgmentless/reactionless noticing" is that we humans tend to be "obsessive" problem solvers that are prone to getting tangled up in their own attempts at problem solving. Sometimes trying to solve problems can actually become the problem. On some level, I appreciate that your techniques may actually help to guard against this but on another level I wonder if this may be bought at the price of becoming boxed into a restrictive problem solving mindset that is unable to notice its own limitations.

Just throwing this out there and wondering what reactions this turns up.

2Raemon6mo

Yeah I do concretely think one needs to guard against being an obsessive problem solver… but, also, there are some big problems that gotta get solved and while there are downsides and risks I mostly think "yep, I’m basically here to ~obsessive problem solve." (even if I'll try to be reasonable about it and encourage others to as well) (To be clear, psychologically unhealthy or counterproductive obsessions with problem solving are bad. But if I have to choose between accidentally veering towards that too much or too little, I'm choosing too much)

Scaffolding for "Noticing Metacognition"

alex.herwix6mo50

Thank you for an interesting post. I noticed some confusion while reading it and thought it might be worthwhile to share. When I think of "noticing", I think of meditation and cultivating awareness. One of the key aspects in those traditions is that they advocate for the value of avoiding automated reactions to experience by "simply" noticing it. Your approach to noticing seems to advocate the opposite of this, training automated reactions triggered by noticing. How do you think about the relationship between these different perspectives? Can it inform us about potential failure modes that your approach might hold?

2Raemon6mo

First, I totally think it's worth learning to notice things without having any particular response. I think some people find that intuitively or intrinsically valuable. For people who don't find "judgmentless/reactionless noticing" valuable, I would say: "The reason to do that is to develop a rich understanding of your mind. A problem you would run into if you have reactions/judgments is that doing so changes your mind while you're looking at it; you can only get sort of distorted data if you immediately jump into changing things. You may want this raw data from your mind a) because it helps you diagnose confusing problems in your psychology, b) because you might just intrinsically value getting to know your own mind with as close contact as possible – it's where you live, and in some sense, it's all the reality you have to interact with." I think all of that is pretty important for becoming a poweruser-rationalist. Now that you've drawn my attention to it, I probably will update the essay to include it somehow. But, I think all of that takes quite a while to pay off, and if it's not intuitively appealing, I don't think it's really worth trying until you've gotten some fluency with Noticing in the first place. ... And, that all said: I think the Buddhists-and-such are ultimately trying to achieve a different goal than I'm trying to achieve, so even though the methods are pretty similar in many places, they are just optimized pretty differently. The goal I'm trying to achieve is "solve confusing problems at the edge of my ability that feel impossible, but are nonetheless incredibly important." This post is exploring Noticing in that particular context, and furthermore, in the context of "what skills can you train that will quickly pay off, such that you'll get some indication they are valuable at all", either in a dedicated workshop I'm designing, or, on your own without any personalized guidance. ... It does seem like there will be other types of workshop

Finding the Wisdom to Build Safe AI

alex.herwix9mo30

We train an LLM to be an expert on AI design and wisdom. We might do this by feeding it AI research papers and "wisdom texts", like principled arguments about wise behavior and stories of people behaving wisely, over and above those base models already have access to, and then fine tuning to prioritize giving wise responses.
We simultaneously train some AI safety researchers to be wiser.
Our wise AI safety researchers use this LLM as an assistant to help them think through how to design a superintelligent AI that would embody the wisdom necessary to be safe.

... (read more)

Finding the Wisdom to Build Safe AI

alex.herwix9mo10

If we can train AI to be wise, it would imply an ability to automate training, because if we can train a wise AI, then in theory that AI could train other AIs to be wise in the same way wise humans are able to train other humans to be wise. We would only need to train a single wise AI in such a scheme who could pass on wisdom to other AIs.

I think this is way too optimistic. Having trained a wise person or AI once does not mean that we have fully understood what we have done to get there, which limits our ability to reproduce it. One can maybe make the argu... (read more)

2Gordon Seidoh Worley9mo

This is a place where my Zen bias is showing through. When I wrote this I was implicitly thinking about the way we have a system of dharma transmission that, at least as we practice Zen in the west, also grants teaching authorization, so my assumption was that if we feel confident certifying an AI as wise, this would imply also believing it to be wise and skilled enough to teach what it knows. But you're right, these two aspects, wisdom and teaching skill, can be separated, and in fact in Japan this is the case: dharma transmission generally comes years before teaching certification is granted, and many more people receive transmission than are granted the right to teach.

Rawls's Veil of Ignorance Doesn't Make Any Sense

alex.herwix1y22

To be honest, I am pretty confused by your argument and I tried to express one of those confusions with my reply. I think you probably also got what I wanted to express but chose to ignore the content in favor of patronizing me. As I don't want to continue to go down this road, here is a more elaborate comment that explains where I am coming from:

First, you again make a sweeping claim that you do not really justify: "Many (perhaps most) famous "highly recognized" philosophical arguments are nonsensical". What is your ground for this claim? Do you mean that... (read more)

Rawls's Veil of Ignorance Doesn't Make Any Sense

alex.herwix1y-40

Since a lot of arguments on internet forums are nonsensical, the fact that your comment doesn’t make sense to me means that it is far more likely that it doesn’t make sense at all than it is that I am missing something.

That’s pretty ironic.

-5Shankar Sivarajan1y

Rawls's Veil of Ignorance Doesn't Make Any Sense

alex.herwix1y-2-13

I downvoted this post because the whole setup is straw-manning Rawls's work. To claim that a highly recognized philosophical treatment of justice that has inspired countless discussions and professional philosophers doesn’t “make any sense” is an extraordinary claim that should ideally be backed by a detailed argument and evidence. However, to me the post seems hand-wavy and more like armchair philosophizing than detailed engagement. Don’t get me wrong, feel free to do that, but please make clear that this is what you are doing.

Regarding your claim that the ... (read more)

2Shankar Sivarajan1y

Many (perhaps most) famous "highly recognized" philosophical arguments are nonsensical (zombies, as an example). If one doesn't make sense to you, it is far more likely that it doesn't make sense at all than it is that you're missing something.

The Ideal Speech Situation as a Tool for AI Ethical Reflection: A Framework for Alignment

alex.herwix1y10

Hey Kenneth,

thanks for sharing your thoughts. I don't have much to say about the specifics of your post because I find it somewhat difficult to understand how exactly you want an AI (what kind of AI?) to internalize ethical reflection and what benefit the concept of the ideal speech situation (ISS) has here.

What I do know is that the ISS has often been characterized as an "unpractical" concept that cannot be put into practice because the ideal it seeks simply cannot be realized (e.g., Ulrich, 1987, 2003). This may be something to consider or dive dee... (read more)

1kenneth myers1y

The basic intuition I have, which I think is correct, is that if you build a Habermasian robot, it won't kill everyone. This is significant to my mind. Maybe it's impossible, but it seems like an interesting thing to pursue.

1kenneth myers1y

Hey, cool! Yes, I agree that the ideal speech situation is not achievable. That's what the "ideal" part is. However, neither is next word prediction in principle. It's an ideal that can be striven for. I'm going to try and unpack the details of what I'm proposing in future posts; I just wanted to introduce the idea here.

An Invitation to Refrain from Downvoting Posts into Net-Negative Karma

alex.herwix1y10

I see your point regarding different results depending on the order in which people see the post, but that’s also true the other way around. Given the assumption that fewer people are likely to view a post that has negative Karma, people who may actually turn out to like the post and upvote it never do so because of preexisting negative votes.

In fact, I think that’s the whole point of this scheme, isn’t it?

So, either way you never capture an "accurate" picture because the signal itself is distorting the outcome. The key question is then what outcome one prefers, n... (read more)

Thoughts on open source AI

alex.herwix1y10

I think this is a very contextual question that really depends on the design of the mechanisms involved. For example, if we are talking about high-risk use cases, the military could be involved as part of the regulatory regime. It’s really a question of how you set this up; the possible design space is huge if we look at this with an open mind. This is why I am advocating for engaging more deeply with the options we have here.

Thoughts on open source AI

alex.herwix1y3-1

I just wanted to highlight that there also seems to be an opportunity to combine the best traits of open and closed source licensing models in the form of a new regulatory regime that one could call: regulated source.

I tried to start a discussion about this possibility, but so far the uptake has been limited. I think that’s a shame. There seems to be so much that could be gained by “outside the box” thinking on this issue, since the alternatives both seem pretty bleak.

2the gears to ascension1y

enforceability of such things seems unlikely to be sufficient to satisfy those who want government intervention.

What is to be done? (About the profit motive)

alex.herwix2y31

That seems to downplay the fact that we will never be able to internalize all externalities simply because we cannot reliably anticipate all of them. So you are always playing catch up to some degree.

Also simply declaring an issue “generally” resolved when the current state of the world demonstrates it’s actually not resolved seems premature in my book. Breaking out of established paradigms is generally the best way to make rapid progress on vexing issues. Why would you want to close the door to this?

3dr_s2y

I don't think he's declaring it resolved, more arguing that it's been fought over to the death - quite literally - and yet no viable alternative seems to have emerged, so odds are doing it here would turn out similarly unproductive and possibly destructive to the community.

What is to be done? (About the profit motive)

Answer by alex.herwixSep 09, 202312

I ask myself the same question. I recently posted an idea about AI regulation to address such issues and start a conversation but there was almost no reaction and mostly just pushback. See: https://www.lesswrong.com/posts/8xN5KYB9xAgSSi494/against-the-open-source-closed-source-dichotomy-regulated

My take is that many people here are very worried about AI doom and think that for-profit work is necessary to get the best minds working on the issue. It also seems that governments in general are perceived to be incompetent, so the fear is that more regulation will scr... (read more)

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

alex.herwix2y*20

Thanks for engaging with the post and acknowledging that regulation may be a possibility we should consider and not reject out of hand.

I don't share your optimistic view that transnational agencies such as the IAEA will be all that effective. The history of the nuclear arms race is that those countries that could develop weapons did, leading to extremes such as the Tsar Bomba, a 50-megaton monster that was more of a dick-waving demonstration than a real weapon. The only thing that ended the unstable MAD doctrine was the internal collapse of the Sovie

alex.herwix2y10

Alright, it seems to me like the crux between our positions is that you are unwilling or unable to consider whether new institutions could create an environment that is more conducive to technical AI alignment work because you feel that this is a hopeless endeavor. Societies (in your view that seems to be just government) are simply worse at creating new institutions compared to the alternative of letting DeepMind do its thing. Moreover, you don't seem to acknowledge that it is worthwhile to consider how to avoid the dystopian failure mode because the cata... (read more)

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

alex.herwix2y10

So, I concede that the proposal is pretty vague and general and that this may make it difficult to get the gist of it but I think it's still pretty clear that the idea is broader than nationalizing. I refer specifically to the possible involvement of intergovernmental, professional, or civil society organizations in the regulating body. With regards to profit, the degree to which profit is allowed for could be regulated for each use case separately with some (maybe the more benign) use cases being more tailored to profit seeking companies than others. ... (read more)

4jessicata2y

Getting AI right is mainly a matter of technical competence and technical management competence. DeepMind is obviously much better at those than any government, especially in the AI domain. The standard AI risk threat is not that some company aligns AI to its own values, it's that everyone dies because AI is not aligned to anyone's values, because this is a technically hard problem, as has been argued on this website and in other writing extensively. If Google successfully allocates 99% of the universe to itself and its employees and their families and 1% to the rest of the people in the world, that is SO good for everyone's values compared with the default trajectory, due to a combination of default low chance of alignment, diminishing marginal utility in personal values, and similarity of impersonal values across humans. If a government were to nationalize AI development, I would think that the NSA was the best choice due to their technical competence, although they aren't specialized in AI, so this would still be worse than DeepMind. DeepMind founder Shane Legg has great respect for Yudkowsky's alignment work. Race dynamics are mitigated by AI companies joining the leader in the AI space, which is currently DeepMind. OpenAI agrees with "merge and assist" as a late-game strategy. Recent competition among AI firms, primarily in LLMs, is largely sparked by OpenAI (see Claude, Bard, Gemini). DeepMind appeared content to release few products in the absence of substantial competition. Google obviously has no need to sell anything to anyone if they control the world. This sentence is not a logical argument, it is rhetoric.

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

alex.herwix2y10

I think my intuition would be the opposite... The more room for profit, the more incentives for race dynamics and irresponsible gold rushing. Why would you think it's the other way around?

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

alex.herwix2y10

I think it could be possible to introduce more stringent security measures. We can generally keep important private keys from being leaked so if we treat weights carefully, we should be able to have at least a similar track record. We can also forbid the unregulated use of such software similar to the unregulated use of nuclear technology. Also in the limit, the problem still exists in a closed source world.

Llama is a special case because there are no societal incentives against it spreading… the opposite is the case! Because it was “proprietary”, it’s the... (read more)

2Chris_Leong2y

I'm not a fan of profit maximisation either. Although I'm much more concerned about the potential for us to lose control than for a particular corporation to make a bit too much profit.

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

alex.herwix2y*10

This is not necessarily true because resources and source code would be shared between all actors who pass the bar so to speak. So capabilities should be diffused more widely between actors who have demonstrated competence than in a closed source model. It would be a problem if the bars were too high and tailored to suit only the needs of very few companies. But the ideal would be strong competition because the required standards are appropriate and well-measured with the regulating body investing resources into the development and support of new responsib... (read more)

2jessicata2y

I don't see how this proposal substantially differs from nationalizing all AI work. Governments can have internal departments that compete, like corporations can. Removing the profit motive seems to only leave government and nonprofit as possibilities, and the required relationships with regulators make this more like a government project for practical purposes. Without specifying more about the governance structure, this is basically nationalization. It is understandable that people would oppose nationalization both due to historical bad results of communist and fascist systems of government, and due to specific reasons why current governments are untrustworthy, such as handling of COVID, lab leak, etc recently. In an American context, it makes very little sense to propose government expansion (as opposed to specific, universally-applied laws) without coming to terms with the ways that government has shown itself to be untrustworthy for handling catastrophic risks. I think, if centralization is beneficial, it is wiser to centralize around DeepMind than any government. But again, the justification for this proposal seems spurious if it is in effect tending towards more centralization than default closed source AI.

The environment as infrastructure

alex.herwix2y1512

Just to let you know that this overall framing is pretty common in sustainable development contexts. It’s often called blue and green infrastructure. See for example: https://iucn.org/news/europe/201911/building-resilience-green-and-blue-infrastructure

However, I think those people would be more focused on "giving nature space and letting it do its thing" rather than trying to upgrade nature. Given our track record, I would tend to agree with them. Let’s not put the cart before the horse and think that we can effectively design ecosystems just yet.

5AnthonyC2y

Yeah, there's a big ol' Chesterton's Fence between where we are now and redesigning nature. We're not ready. But we can intervene to undo damage we've already done, and stop doing more.

How to estimate a pre-aligned value for a common discussion ground?

alex.herwix2y10

False premise. You seem to be assuming that many people using symbols reliably in similar ways points to anything other than this convention being reliably useful in achieving some broadly desired end. It doesn't.

Your mathematics example is also misleading because it directs attention to "mathematical truths" which are generally only considered to be valid statements within the framework of mathematics and, thus, inherently relative to a particular framework and not "absolute".

As soon as you move to "real life" cases you are faced with the ques... (read more)

1EL_File41382y

This seems useful. Thanks!

How to estimate a pre-aligned value for a common discussion ground?

Answer by alex.herwixFeb 23, 20231-1

False premise. There is no “absolute truth”. I don’t want to come across as condescending but please have a look at any somewhat recent science textbook if you doubt this claim.

I would suggest reframing to: how can we establish common ground that a) all/most people can agree on and b) facilitates productive inquiry?

1EL_File41382y

That raises a question: if there's no "absolute truth", then how "relative" would the truth most people agree on (such as 1+1=2 mathematically) be? Sorry if this question seems too naive, as I'm at an early stage of exploring philosophy, and views other than objectivity under positivism do not seem convincing to me.

Why We Age, Part 1: What ageing is and is not

alex.herwix5y130

Hey Will,

looking forward to the rest of the series! Would be awesome if you could comment on the following development: https://joshmitteldorf.scienceblog.com/2020/05/11/age-reduction-breakthrough/

Is this just hype or how should one make sense of this?