
Comments

Do you have any thoughts on whether it would make sense to push for a rule that forces open-source or open-weight models to be released behind an API for a certain amount of time before they can be released to the public?

Would be very curious to know why people are downvoting this post.

Is it:
a) Too obvious
b) Too pretentious
c) Poorly written
d) Unsophisticated analysis
e) Promoting dishonesty

Or maybe something else.

"You say counterfactuals in CLDT should correspond to consistent universes"


That's not quite what I wrote in this article:

However, this now seems insufficient as I haven't explained why we should maintain the consistency conditions over comparability after making the ontological shift. In the past, I might have said that these consistency conditions are what define the problem and that if we dropped them it would no longer be Newcomb's Problem... My current approach now tends to put more focus on the evolutionary process that created the intuitions and instincts underlying these incompatible demands as I believe that this will help us figure out the best way to stitch them together.

I'll respond to the other component of your question later.

Just thought I'd add a second follow-up comment.

You'd have a much better idea of what made FHI successful than I would. At the same time, I would bet that in order to make this new project successful - and be its own thing - it'd likely have to break at least one assumption behind what made old FHI work well.

Then much later, when we ran the AI Alignment Prize here on LW, I also noticed that the prize by itself wasn't too important; the interactions between newcomers and old-timers were a big part of what drove the thing.

 

Could you provide more detail?

Reading your list, a bunch of it seems to be about decisions about what to work on or what locally to pursue.

 

I think my list appears more this way than I intended because I gave some examples of projects I would be excited by if they happened. I wasn't intending to stake out a strong position as to whether these should be projects chosen by the institute vs. examples of projects that it might be reasonable for a researcher to choose within that particular area.

I'd love your feedback on my thoughts on decision theory.

If you're trying to get a sense of my approach in order to determine whether it's interesting enough to be worth your time, I'd suggest starting with this article (3 minute read).

I'm also considering applying for funding to create a conceptual alignment course.

I strongly agree with Owen's suggestions about figuring out a plan grounded in current circumstances, rather than reproducing what was.

Here are some potentially useful directions to explore.

Just to be clear, I'm not claiming that it should adopt all of these. Indeed, attempting to adopt all of them would likely be incoherent, pursuing too many different directions at the same time.

These are just possibilities, some subset of which is hopefully useful:

  • Rationality as more of a focus area: Given that Lightcone runs Less Wrong, an obvious path to explore is whether rationality could be further developed by providing people either a fellowship or a permanent position to work on developing the art:
    • Being able to offer such paid positions might allow you to draw contributions from people with rare backgrounds. For example, you might decide it would be useful to understand anthropology as a way of better understanding other cultures and practices, and so you could directly hire an anthropologist to help with that.
    • It would also help with projects that would be valuable, but which would be a slog and require specific expertise. For example, it would be great to have someone update the sequences in light of more recent psychological research.
  • Greater focus on entrepreneurship:
    • You've already indicated your potential interest in taking it in this direction by adding it as one of the options on your form.
    • This likely makes sense given that Lightcone is located in the Bay Area, the region with the highest concentration of entrepreneurs and venture capitalists in the world.
    • Insofar as a large part of the impact of FHI was the projects it inspired elsewhere, it may make sense to more directly attempt this kind of incubation.
  • Response to the rise of AI:
    • One of the biggest shifts in the world since FHI was started has been the dramatic improvements in AI.
    • One response to this would be to focus more on the risks and impacts from AI. However, there are already a number of institutions focusing on this, so this might simply end up being a worse version of them:
      • You may also think you'd be able to find a unique angle. For example, given that Eliezer was motivated to develop the art of rationality in order to help people understand his arguments on AI safety, it might be valuable for there to be a research program which intertwines those two elements.
      • Or you might identify areas, such as AI consciousness, that are still massively neglected.
    • Another response would be to try to figure out how to leverage AI:
      • Would it make sense to train an AI agent on Less Wrong content?
      • As an example, how could AI be used to develop wisdom?
    • Another response would be to decide that other orgs are better positioned to pursue these projects.
  • Is there anything in the legacy of MIRI, CFAR, or FHI that is particularly ripe for further development?
    • For example, would it make sense for someone to try to publish an explanation of some of the ideas produced by MIRI on decision theory in a mainstream philosophical journal?
    • Perhaps some techniques invented by CFAR could be tested with a rigorous academic study?
  • Potential new sources of ideas:
    • There seems to have been a two-way flow of ideas between LW/EA and FHI.
    • While there may still be more ideas within these communities that are deserving of further exploration, it may also make sense to consider whether there are any new communities that could provide a novel source of ideas:
      • A few possibilities immediately come to mind: post-rationality, progress studies, sensemaking, meditation, longevity, predictions.
  • Less requirement for legibility than FHI:
    • While FHI leaned towards the speculative end of academia, there was still a requirement for projects to be at least somewhat academically legible. What is enabled by no longer having that kind of requirement?
  • Opportunities for alpha from philosophical rigour:
    • This was one of the strengths of FHI - bringing philosophical rigour to new areas. It may be worth exploring how this could be preserved or carried forward.
    • One of the strengths of academic philosophy - compared to the more casual writing that is popular on Less Wrong - is its focus on rigour and drawing out distinctions. If this institute were able to recruit people with strong philosophical backgrounds, are there any areas that would be particularly ripe for applying this style of thinking?
    • Pursuing this direction might be a mistake if you would struggle to recruit the right people. It may turn out that the placement of FHI within Oxford was vital for drawing the philosophical talent of the calibre that they drew. 

"The structure of synchronization is, in general, richer than the world model itself. In this sense, LLMs learn more than a world model" given that I expect this is the statement that will catch a lot of people's attention.

Just in case this claim caught anyone else's attention, what they mean by this is that it contains:
• A model of the world
• A model of the agent's process for updating its belief about which state the world is in

This strongly updates me towards expecting the institute to produce useful work.
