Thanks for the reply. I wanted to get at something slightly different, though.
I think that a key insight of traditions that work with "judgmentless/reactionless noticing" is that we humans tend to be "obsessive" problem solvers who are prone to getting tangled up in our own attempts at problem solving. Sometimes trying to solve problems can itself become the problem. On one level, I appreciate that your techniques may actually help to guard against this, but on another level I wonder whether this comes at the price of becoming boxed into a restrictive problem-solving mindset that is unable to notice its own limitations.
Just throwing this out there and wondering what reactions this turns up.
Thank you for an interesting post. I noticed some confusion while reading it and thought it might be worthwhile to share. When I think of "noticing", I think of meditation and cultivating awareness. One of the key aspects in those traditions is that they advocate for the value of avoiding automated reactions to experience by "simply" noticing it. Your approach to noticing seems to advocate the opposite of this, training automated reactions triggered by noticing. How do you think about the relationship between these different perspectives? Can it inform us about potential failure modes that your approach might hold?
...
- We train an LLM to be an expert on AI design and wisdom. We might do this by feeding it AI research papers and "wisdom texts", like principled arguments about wise behavior and stories of people behaving wisely, over and above those that base models already have access to, and then fine-tuning it to prioritize giving wise responses.
- We simultaneously train some AI safety researchers to be wiser.
- Our wise AI safety researchers use this LLM as an assistant to help them think through how to design a superintelligent AI that would embody the wisdom necessary to be safe.
If we can train AI to be wise, it would imply an ability to automate training, because if we can train a wise AI, then in theory that AI could train other AIs to be wise in the same way wise humans are able to train other humans to be wise. We would only need to train a single wise AI in such a scheme who could pass on wisdom to other AIs.
I think this is way too optimistic. Having trained a wise person or AI once does not mean that we have fully understood what we have done to get there, which limits our ability to reproduce it. One can maybe make the argu...
To be honest, I am pretty confused by your argument and I tried to express one of those confusions with my reply. I think you probably also got what I wanted to express but chose to ignore the content in favor of patronizing me. As I don't want to continue to go down this road, here is a more elaborate comment that explains where I am coming from:
First, you again make a sweeping claim that you do not really justify: "Many (perhaps most) famous "highly recognized" philosophical arguments are nonsensical". What is your ground for this claim? Do you mean that...
Since a lot of arguments on internet forums are nonsensical, the fact that your comment doesn’t make sense to me means that it is far more likely that it doesn’t make sense at all than it is that I am missing something.
That’s pretty ironic.
I downvoted this post because the whole setup straw-mans Rawls’s work. To claim that a highly recognized philosophical treatment of justice that has inspired countless discussions and professional philosophers doesn’t “make any sense” is an extraordinary claim that should ideally be backed by a detailed argument and evidence. However, to me the post seems hand-wavy and more like armchair philosophizing than detailed engagement. Don’t get me wrong: feel free to do that, but please make clear that this is what you are doing.
Regarding your claim that the ...
Hey Kenneth,
thanks for sharing your thoughts. I don't have much to say about the specifics of your post because I find it somewhat difficult to understand how exactly you want an AI (what kind of AI?) to internalize ethical reflection and what benefit the concept of the ideal speech situation (ISS) has here.
What I do know is that the ISS has often been characterized as an "unpractical" concept that cannot be put into practice because the ideal it seeks simply cannot be realized (e.g., Ulrich, 1987, 2003). This may be something to consider or dive dee...
I see your point that results differ depending on the order in which people see the post, but that’s also true the other way around. Assuming that fewer people are likely to view a post that has negative karma, people who might actually like the post and upvote it never do so because of the preexisting negative votes.
In fact, I think that’s the whole point of this scheme, isn’t it?
So, either way you never capture an “accurate” picture because the signal itself is distorting the outcome. The key question is then what outcome one prefers, n...
I think this is a very contextual question that really depends on the design of the mechanisms involved. For example, if we are talking about high-risk use cases, the military could be involved as part of the regulatory regime. It’s really a question of how you set this up; the possible design space is huge if we look at this with an open mind. This is why I am advocating for engaging more deeply with the options we have here.
I just wanted to highlight that there also seems to be an opportunity to combine the best traits of open and closed source licensing models in the form of a new regulatory regime that one could call: regulated source.
I tried to start a discussion about this possibility, but so far the uptake has been limited. I think that’s a shame. There seems to be so much that could be gained by “outside the box” thinking on this issue, since the alternatives both seem pretty bleak.
That seems to downplay the fact that we will never be able to internalize all externalities, simply because we cannot reliably anticipate all of them. So you are always playing catch-up to some degree.
Also, simply declaring an issue “generally” resolved when the current state of the world demonstrates that it’s actually not resolved seems premature in my book. Breaking out of established paradigms is generally the best way to make rapid progress on vexing issues. Why would you want to close the door to this?
I ask myself the same question. I recently posted an idea about AI regulation to address such issues and start a conversation but there was almost no reaction and mostly just pushback. See: https://www.lesswrong.com/posts/8xN5KYB9xAgSSi494/against-the-open-source-closed-source-dichotomy-regulated
My take is that many people here are very worried about AI doom and think that for-profit work is necessary to get the best minds working on the issue. It also seems that governments in general are perceived to be incompetent, so the fear is more regulation will scr...
Thanks for engaging with the post and acknowledging that regulation may be a possibility we should consider and not reject out of hand.
...I don't share your optimistic view that transnational agencies such as the IAEA will be all that effective. The history of the nuclear arms race is that those countries that could develop weapons did, leading to extremes such as the Tsar Bomba, a 50-megaton monster that was more of a dick-waving demonstration than a real weapon. The only thing that ended the unstable MAD doctrine was the internal collapse of the Sovie
Alright, it seems to me like the crux between our positions is that you are unwilling or unable to consider whether new institutions could create an environment that is more conducive to technical AI alignment work because you feel that this is a hopeless endeavor. Societies (in your view that seems to be just government) are simply worse at creating new institutions compared to the alternative of letting DeepMind do its thing. Moreover, you don't seem to acknowledge that it is worthwhile to consider how to avoid the dystopian failure mode because the cata...
So, I concede that the proposal is pretty vague and general and that this may make it difficult to get the gist of it, but I think it's still pretty clear that the idea is broader than nationalizing. I refer specifically to the possible involvement of intergovernmental, professional, or civil society organizations in the regulating body. With regard to profit, the degree to which profit is allowed could be regulated for each use case separately, with some (maybe the more benign) use cases being more tailored to profit-seeking companies than others. ...
I think my intuition would be the opposite... The more room for profit, the more incentives for race dynamics and irresponsible gold rushing. Why would you think it's the other way around?
I think it could be possible to introduce more stringent security measures. We can generally keep important private keys from being leaked, so if we treat weights carefully, we should be able to have at least a similar track record. We can also forbid the unregulated use of such software, similar to the unregulated use of nuclear technology. Also, in the limit, the problem still exists in a closed source world.
Llama is a special case because there are no societal incentives against it spreading… the opposite is the case! Because it was “proprietary”, it’s the...
This is not necessarily true because resources and source code would be shared between all actors who pass the bar so to speak. So capabilities should be diffused more widely between actors who have demonstrated competence than in a closed source model. It would be a problem if the bars were too high and tailored to suit only the needs of very few companies. But the ideal would be strong competition because the required standards are appropriate and well-measured with the regulating body investing resources into the development and support of new responsib...
Just to let you know that this overall framing is pretty common in sustainable development contexts. It’s often called blue and green infrastructure. See for example: https://iucn.org/news/europe/201911/building-resilience-green-and-blue-infrastructure
However, I think those people would be more focused on “giving nature space and letting it do its thing” rather than trying to upgrade nature. Given our track record, I would tend to agree with them. Let’s not put the cart before the horse and think that we can effectively design ecosystems just yet.
False premise. You seem to be assuming that many people using symbols reliably in similar ways points to anything other than this convention being reliably useful in achieving some broadly desired end. It doesn't.
Your mathematics example is also misleading because it directs attention to "mathematical truths" which are generally only considered to be valid statements within the framework of mathematics and, thus, inherently relative to a particular framework and not "absolute".
As soon as you move to "real life" cases you are faced with the ques...
False premise. There is no “absolute truth”. I don’t want to come across as condescending but please have a look at any somewhat recent science textbook if you doubt this claim.
I would suggest reframing to: how can we establish common ground that a) all/most people can agree on and b) facilitates productive inquiry?
Hey Will,
looking forward to the rest of the series! Would be awesome if you could comment on the following development: https://joshmitteldorf.scienceblog.com/2020/05/11/age-reduction-breakthrough/
Is this just hype or how should one make sense of this?
Thank you for an interesting post! I have only skimmed it so far and not really dug into the mathematics section, but the way you are framing logic somewhat reminds me of Dewey, J. (1938). Logic: The Theory of Inquiry. Henry Holt and Company.
Are you by any chance familiar with this work and could elaborate on possible continuities and discontinuities?