All of Justin Bullock's Comments + Replies

Thanks for this comment. I agree there is some ambiguity here on the types of risks that are being considered with respect to the question of open-sourcing foundation models. I believe the report favors the term "extreme risks," which it defines as "risk of significant physical harm or disruption to key societal functions." I believe they avoid the terms "extinction risk" and "existential risk," but are implying something not too different with their choice of extreme risks.

I pose the question above as:

"How large are the risks from fully op

... (read more)

Thank you for this comment! 

I think your point that "The problem here is that fine-tuning easily strips any safety changes and easily adds all kinds of dangerous things (as long as capability is there)" is spot on. It maps to my intuitions about the weaknesses of fine-tuning and is one of the strongest points in favor of the view that open-sourcing foundation models carries significant risks.

I appreciate your suggestions for other methods of auditing that could possibly work, such as running a model within a protected framework or open-sourcing encrypted weights. ... (read more)

Thanks for the comment!

I think your observation that biological evolution is a slow, blind, and undirected process is fair. We try to make this point explicit in our section on natural selection (as a main evolutionary selection pressure for biological evolution) where we say "The natural processes for succeeding or failing in survival and reproduction – natural and sexual selection – are both blind and slow."

For our contribution here, we are not trying to dispute this. Instead, we're seeking to find analogies to the ways in which machine evolution, which we... (read more)

This is great! Thanks for sharing. I hope you continue to do these.

This discussion considers a relatively “flat”, dynamic organization of systems. The open-agency model[13] considers flexible yet relatively stable patterns of delegation that more closely correspond to current developments.

 

I have a question here that I'm curious about:

I wonder if you have any additional thoughts about the "structure" of the open agencies that you imagine here. Flexible and relatively stable patterns of delegation seem to be important dimensions. You mention here that the discussion focuses on "flat" organization of systems, but... (read more)

We want workflows that divide tasks and roles because of the inherent structure of problems, and because we want legible solutions. Simple architectures and broad training facilitate applying structured roles and workflows to complex tasks. If the models themselves can propose the structures (think of chain-of-thought prompting), so much the better. Planning a workflow is an aspect of the workflow itself.

 

I think this has particular promise, and it's an area I would be excited to explore further. As I mentioned in a previous comment on your The Open ... (read more)

Thanks for this post, and really, this series of posts. I had not been following along, so I started with the "Reframing Superintelligence" + LLMs + 4 years post and worked my way back to here.

I found your initial Reframing Superintelligence report very compelling back when I first came across it, and still do. I also appreciate your update post referenced above. 

The thought I'd like to offer is that your ideas here strike me as somewhat similar to what both Max Weber and Herbert Simon proposed we should do with human agents. After r... (read more)

Thanks for this post. As I mentioned to both of you, it feels a little bit like we have been ships passing one another in the night. I really like your idea here of loops and the importance of keeping humans within these loops, particularly at key nodes in the loop or system, to keep Moloch at bay.

I have a couple of scattered points for you to consider:

  • In my work in this direction, I've tried to distinguish between roles and tasks. You do something similar here, which I like. To me, the question often should be about what specific tasks should be automated as
... (read more)

I was interested in seeing what the co-writing process would create. I also wanted to tell a story about technology in a different way, which I hope complements the other stories in this part of the sequence. I also just think it's fun to retell a story that was originally told in 1968 from the point of view of future intelligent machines, and then to use a modern intelligent machine to write that story. I think it makes a few additional points about how stable our fears have been, how much the technology has changed, and the plausibility of the story itself.

I love that response! I'll be interested to see how quickly it strikes others. All the actual text that appears within the story was generated by ChatGPT with the 4.0 model. Basically, I asked ChatGPT to co-write a brief story. I had it pause throughout to ask for feedback and revisions. Then, at the end of the story it generated with my feedback along the way, I asked it to fill in some more details and examples, which it did. I asked for minor changes to the style and specifics of these as well.

I’d be happy to directly send you screenshots of the chat as well.

Thanks for reading!

2Richard_Kennaway
Why?

Thanks for the response! I appreciate the clarification on both point 1 and 2 above. I think they’re fair criticisms. Thanks for pointing them out.

Thank you for providing a nice overview of our paper, Frontier AI Regulation: Managing Emerging Risks to Public Safety, which was just released!

I appreciate your feedback, both the positive and critical parts. I'm also glad you think the paper should exist and that it is mostly a good step. And I think your criticism is fair. Let me also note that I do not speak for the authorship team. We are quite a diverse group from academia, labs, industry, nonprofits, etc. It was no easy task to find common ground among everyone involved.

I think the AI Governance space is d... (read more)

4Zach Stein-Perlman
Thanks for your reply. In brief response to your more specific points:

1. On government oversight, I think you're referring to the quote "providing a regulator the power to oversee model development could also promote regulatory visibility, thus allowing regulations to adapt more quickly." But the paper doesn't seem to mention the direct benefit of oversight: verifying compliance and enforcing the rules. Good oversight would result in licensing not being a one-time thing, but rather that labs could lose their licenses during a training run if they were noncompliant. (In my community, 'oversight of training runs' means government auditors verifying compliance and the government stopping noncompliant runs; maybe it means something weaker outside my community.)
2. I agree that "perfect compliance" is hard, but stand by my disappointment in the "particularly egregious instances" passage as not aiming high enough.

Edit: also, I get that finding consensus is hard, but after reading the consensus-y-but-ambitious Towards Best Practices in AGI Safety and Governance and Model evaluation for extreme risks, I was expecting consensus on something stronger.

As you likely know by now, I think the argument that “Technological Progress = Human Progress” is clearly more complicated than is sometimes assumed. AI is very much already embedded in society and the existing infrastructure makes further deployment even easier. As you say, “more capability dropped into parts of a society isn’t necessarily a good thing.”

One of my favorite quotes on the relationship between technological advancement and human advancement is from Aldous Huxley:

“Today, after two world wars and three major revolutions, we know that th... (read more)

Thanks for the comment, David! It also caused me to go back and read this post again, which sparked quite a few old flames in the brain.

I agree that a collection of different approaches to ensuring AI alignment would be interesting! This is something that I'm hoping (now planning!) to capture in part with my exploration of scenario modeling that's coming down the pipe. But a brief overview of the different analytical approaches to AI alignment would be helpful (if it doesn't already exist in an updated form that I'm unaware of).

I agree with your insight ... (read more)

Thank you for this post! As I may have mentioned to you both, I had not followed this line of research until the two of you brought it to my attention. I think the post does an excellent job describing the tradeoffs around interpretability research and why we likely want to push it in certain, less risky directions. In this way, the post is a success: it is accessible and lays out easy-to-follow reasoning, sources, and examples. Well done!

I have a couple of thoughts on the specific content as well where I think my intuitions converge or div... (read more)

Thank you for this post! As I mentioned to both of you, I like your approach here. In particular, I appreciate the attempt to provide some description of how we might optimize for something we actually want, something like wisdom.

I have a few assorted thoughts for you to consider:

  • I would be interested in additional discussion around the inherent boundedness of agents that act in the world. I think self-consistency and inter-factor consistency have some fundamental limits that could be worth exploring within this framework. For example, might different t

... (read more)

Thank you for the comment and for reading the sequence! I posted Chapter 7 Welcome to Analogia! (https://www.lesswrong.com/posts/PKeAzkKnbuwQeuGtJ/welcome-to-analogia-chapter-7) yesterday and updated the main sequence page just now to reflect that. I think this post starts to shed some light on ways of navigating this world of aligning humans to the interests of algorithms, but I doubt it will fully satisfy your desire for a call to action. 

I think there are both macro policies and micro choices that can help.

At the macro level, there is an over... (read more)

There is a growing academic field of "governance" that would variously be described as a branch of political science, public administration, or policy studies. It is a relatively small field, but it has several academic journals that fit the description of the literature you're looking for. The best of these journals, in my opinion, is Perspectives on Public Management & Governance (although it focuses on public governance structures to the point of ignoring corporate governance structures).

In addition to this, there is a 50-chapter OU... (read more)

Thank you! I’m looking forward to the process of writing it, synthesizing my own thoughts, and sharing them here. I’ll also be hoping to receive your insightful feedback, comments, and discussion along the way!

Thank you for this post, Kyoung-Cheol. I like how you have used Deep Mind's recent work to motivate the discussion of the consideration of "authority as a consequence of hierarchy" and that "processing information to handle complexity requires speciality which implies hierarchy." 

I think there is some interesting work on this forum that captures these same types of ideas, sometimes with similar language, and sometimes with slightly different language.

In particular, you may find the recent post from Andrew Critch on "Power dynamics as a blind spot or b... (read more)

2Kyoung-cheol Kim
Thank you very much for your valuable comments, Dr. Bullock! I agree that exploring various viewpoints and finding similarities and discrepancies can be crucial for advancing the philosophy of science and improving our understanding of complex systems like AI and organizations.

Your approach of considering the development of AI and its utilization within the configurations of societal works, lying somewhere between centralization and game theory situations, is indeed a nuanced and well-considered perspective. It acknowledges the complexity and discretion that hierarchical systems can have while incorporating game theory's relevance in multi-agent systems. Considering organizational frameworks in the context of AI-human interactions is essential, as it sheds light on how we can effectively work with AI agents in such systems. The concept of authority, being a cognitive phenomenon for humans, is indeed distinct from how machines perceive and handle information to process complexity.

I share your belief that organization theories have significant potential in contributing to these discussions and becoming crucial for governance experts. It's exciting to see how interdisciplinary perspectives can enrich our understanding of AI development and utilization. I look forward to further engaging with your ideas and seeing more valuable contributions from you in the future!

Thanks for this. I tabbed the Immoral Mazes sequence. On a cursory view it seems very relevant. I'll be working my way through it. Thanks again.

Thanks. I think your insight is correct that governance requires answers to the "how" and "what" questions, and that bureaucratic structure is one answer, but it leaves the "how" unanswered. I don't have a good technical answer, but I do like an interesting proposal called Complete Freedom Democracy, from Hannes Alfvén in the book "The End of Man?", which he published under the pseudonym Olof Johannesson. The short book is worth the read, but hard to find. The basic idea is a parliamentary system in which all humans, through something akin to a smartphone, rank-vote proposals. I'll write up the details some time!
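As a rough illustration of the mechanics, here is a minimal sketch of how such rank voting might aggregate ballots, assuming a simple Borda-style count; the book does not specify an aggregation rule, so this mechanism and the function below are hypothetical:

```python
from collections import defaultdict

def borda_count(ballots: list[list[str]]) -> dict[str, int]:
    """Aggregate ranked ballots with a simple Borda count.

    Each ballot ranks proposals from most to least preferred; a proposal
    at position i on a ballot of n proposals receives n - i - 1 points.
    """
    scores: dict[str, int] = defaultdict(int)
    for ballot in ballots:
        n = len(ballot)
        for position, proposal in enumerate(ballot):
            scores[proposal] += n - position - 1
    return dict(scores)

# Example: three citizens rank three proposals from their phones.
ballots = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
print(borda_count(ballots))  # {'A': 5, 'B': 3, 'C': 1} -> A wins
```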

Thank you for the comment. There are several interesting points I want to comment on. Here are my thoughts in no particular order of importance:

  • I think your insight on rigidity versus flexibility (rigid, predictable rules vs. innovation) is helpful more generally, and it is something that is not addressed well in my post. My own sense is that an ideal bureaucratic structure could be rationally constructed to balance the tradeoffs between rigidity and innovation. Here I would also take Weber's rule 6, which you highlight, as an example. As represented in th
... (read more)
3Logan Zoellner
I think this is the essential question that needs to be answered: Is the stratification of bureaucracies a result of the fixed limit on human cognitive capacity, or is it an inherent limitation of bureaucracy?

One way to answer such a question might be to look at the asymptotics of the situation. Suppose that the number of "rules" governing an organization is proportional to the size of the organization. The question would then be whether the complexity of the coordination problem also increases only linearly. If so, it is reasonable to suppose that humans (with a finite capacity) would face a coordination problem but AI would not.

Suppose instead that the complexity of the coordination problem increases with the square of organization size. In this case, as the size of an organization grows, AI might find the coordination harder and harder, but still tractable.

Finally, what if the AI must consider all possible interactions between all possible rules in order to resolve the coordination problem? In this case, the complexity of "fixing" a stratified bureaucracy is exponential in the size of the bureaucracy, and beyond a certain (slowly rising) threshold the coordination problem is intractable.

If weighted voting is indeed a solution to the problem of bureaucratic stratification, we would expect this to be true of both human and AI organizations. In this case, great effort should be put into discovering such structures, because they would be of use in the present and not only in our AI-dominated future.

Suppose the coordination problem is indeed intractable. That is to say, once a bureaucracy has become sufficiently complex, it is impossible to reduce the complexity of the system without unpredictable and undesirable side effects. In this case, the optimal solution may be the one chosen by capitalism (and revolutionaries): periodically replace the bureaucracy once it is no longer near the efficiency frontier. There is undoubtedly a cont
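To make these three regimes concrete, here is a minimal sketch (my own illustration, not part of the comment above) of how a toy coordination cost might scale with the number of rules under each assumption:

```python
import math

def coordination_cost(n_rules: int, regime: str) -> float:
    """Toy model of coordination cost as a function of rule count."""
    if regime == "linear":
        # Cost grows with the rules themselves.
        return float(n_rules)
    if regime == "quadratic":
        # Cost grows with pairwise interactions between rules.
        return n_rules * (n_rules - 1) / 2
    if regime == "exponential":
        # Cost grows with all possible subsets of interacting rules.
        return 2.0 ** n_rules
    raise ValueError(f"unknown regime: {regime!r}")

for n in (10, 100, 1000):
    print(f"n={n:>4}: linear={coordination_cost(n, 'linear'):.0f}, "
          f"quadratic={coordination_cost(n, 'quadratic'):.0f}, "
          f"exponential~10^{math.log10(coordination_cost(n, 'exponential')):.0f}")
```

Only the linear regime keeps coordination cost proportional to organization size; the exponential regime becomes astronomically expensive well before n = 100, which is the intractable case described above.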

I think this approach may have something to add to Christiano's method, but I need to give it more thought. 

I don't think it is yet clear how this structure could help with the big problem of superintelligent AI. The only contributions I see clearly enough at this point are redundant to arguments made elsewhere. For example, the notion of a "machine beamte" as one that can be controlled through (1) the appropriate training and certification, (2) various motivations and incentives for aligning behavior with the knowledge from training, and (3) nominate... (read more)

Thank you for this. I pulled up the thread. I think you're right that there are a lot of open questions to look into at the level of group dynamics. I'm still familiarizing myself with the technical conversation around the iterated prisoner's dilemma and other ways to look at these challenges from a game theory lens. My understanding so far is that some basic concepts of coordination and group dynamics like authority and specialization are not yet well formulated, but again, I don't consider myself up to date in this conversation yet.

From the thread you sh... (read more)

Thank you for the insights. I agree with your insight that "bureaucracies are notorious homes to Goodhart effects and they have as yet found no way to totally control them." I also agree with your intuition that "to be fair bureaucracies do manage to achieve a limited level of alignment, and they can use various mechanisms that generate more vs. less alignment."

I do, however, believe that an ideal type of bureaucratic structure helps with at least some forms of the alignment problem. If, for example, Drexler is right, and my conceptualization of the theo... (read more)

3Gordon Seidoh Worley
Yeah, I guess I should say that I'm often worried about the big problem of superintelligent AI and not much thinking about how to control narrow and not generally capable AI. For weak AI, this kind of prosaic control mechanism might be reasonable. Christiano thinks this class of methods might work on stronger AI.

My name is Justin Bullock. I live in the Seattle area after 27 years in Georgia and 7 years in Texas. I have a PhD in Public Administration and Policy Analysis, where I focused on decision making within complex, hierarchical public programs. For example, in my dissertation I attempted to model how errors (measured as improper payments) are built into the US Unemployment Insurance Program. I spent time looking at how agents are motivated within these complex systems, trying to develop general insights into how errors occur in these systems. Until about 2016... (read more)

3gilch
See the Group Rationality topic. The rationalists, as a culture, still haven't quite figured out how to coordinate groups very well, in my opinion. It's something we should work on.