One of my favorite things about you, John, is that you are excellent at prompting me to direct my attention towards important questions which lead to promising insights. Thanks for that!
I answered your questions, originally in my private notes, but partway through decided to share my answers as a comment.
Imagine that your field achieved perfection - the ultimate theory, perfect understanding, building The Thing.
What has been achieved in the idealized version of the field, which has not yet been achieved today? What are the main barriers between here and there?
Barriers:
Often, in hindsight, a field turns out to have been bottlenecked on the development of some new measurement method, ranging from physical devices like the thermometer to abstract ideas like Shannon’s entropy and information channel capacity.
In what places does it look like your field is bottlenecked on the ability to usefully measure something? What are the main barriers to usefully measuring those things?
What are the places where your field is flailing around in the dark, trying desperate ideas and getting unhelpful results one after another? What are the places where it feels like the problem is formulated in the wrong language, and a shift to another frame might be required to ask the right question or state the right hypothesis?
Sometimes, we have a few different models, each of which works really well in different places. Maybe it feels like there should be some model which unifies them all, which could neatly account for all these phenomena at once - like the unification of electricity, magnetism and optics in the 19th century.
Are there different models in your field which feel like they point to a not-yet-known unified model?
One of the main ways we notice (usually implicit) false assumptions in our models is when they come into conflict with some other results, patterns or constraints. This may look like multiple models which cannot all be true simultaneously, or like a single model which seemingly cannot be true at all, yet nonetheless keeps matching reality quite well. This is a hint to reexamine the assumptions under which the models are supposedly incompatible/impossible, and especially to look for any hidden assumptions in that impossibility argument.
Are there places in your field where a few models look incompatible, or one model looks impossible, yet nonetheless the models match reality quite well?
The space of possible physical laws or theorems or principles is exponentially vast. Sometimes, the hard part is to figure out what the relevant factors are at all. For instance, to figure out how to reproducibly culture a certain type of cell, a biologist might need to provide a few specific signal molecules, a physical medium with the right elasticity or density, a particular temperature range, and/or some other factors which nobody even thought to test yet.
Are there places in your field where nobody even knows what key factors must be controlled for some important outcome to robustly occur?
Are there places in your field where some concept seems very central to understanding, but nobody knows its True Name yet?
At the social level, what are the barriers to solving the main problems in the previous two questions? Why aren’t they already solved? Why isn’t progress being made, or made faster?
Are there places where your field as a whole, or you personally, pursue things which won’t really help with the main problems (but might kind of “look like” they address the problems)?
Pick someone you know, or a few people, who are smart and have good judgment. What would their answers to these questions be?
John, I think you would not strongly disagree with most anything I said, but I feel like you would say that corrigibility isn’t as pragmatically important to understand. Or, you might say that True-Name-corrigibility is actually downstream of the True-Name-alignment concepts we need, and it’s the epiphenomenon. I don’t know. This prediction is uncertain and felt more like I queried my John model to give a mental speech which is syntactically similar to a real-John-speech, rather than my best prediction of what you would say.
I think it's important here to quote Hamming's definition of an "important problem":
I'm not talking about ordinary run-of-the-mill research; I'm talking about great research. I'll occasionally say Nobel-Prize type of work. It doesn't have to gain the Nobel Prize, but I mean those kinds of things which we perceive are significant [e.g. Relativity, Shannon's information theory, etc.] ...
Let me warn you, "important problem" must be phrased carefully. The three outstanding problems in physics, in a certain sense, were never worked on while I was at Bell Labs. By important I mean guaranteed a Nobel Prize and any sum of money you want to mention. We didn't work on (1) time travel, (2) teleportation, and (3) antigravity. They are not important problems because we do not have an attack. It's not the consequence that makes a problem important, it is that you have a reasonable attack.
This suggests to me that what counts as the important problem is, e.g., "AI alignment", not this particular approach to alignment. The latter is too small; it can be good work and impactful work, but not great work in the sense of relativity or information theory or causality. (I'd love to be proven wrong!)
I have this particular approach to alignment head-chunked as the reasonable attack, under Hamming's model. It looks like, if corrigibility or agent foundations do not count as reasonable attacks, then Hamming would not think alignment is an important problem.
But speaking of which, I haven't read, or seen discussed anywhere, whether he addresses the part about generating reasonable attacks.
Yes, I think that's the right chunking - and I broadly agree, though Hamming's schema is not quite applicable to pre-paradigmatic fields. For reasonable-attack generation, I'll just quote him again:
One of the characteristics of successful scientists is having courage. ... [Shannon] wants to create a method of coding, but he doesn't know what to do so he makes a random code. Then he is stuck. And then he asks the impossible question, "What would the average random code do?" He then proves that the average code is arbitrarily good, and that therefore there must be at least one good code. [Great scientists] go forward under incredible circumstances; they think and continue to think.
I give you a story from my own private life. Early on it became evident to me that Bell Laboratories was not going to give me the conventional acre of programming people to program computing machines in absolute binary. ... I finally said to myself, "Hamming, you think the machines can do practically everything. Why can't you make them write programs?" What appeared at first to me as a defect forced me into automatic programming very early.
And there are many other stories of the same kind; Grace Hopper has similar ones. I think that if you look carefully you will see that often the great scientists, by turning the problem around a bit, changed a defect to an asset. For example, many scientists when they found they couldn't do a problem finally began to study why not. They then turned it around the other way and said, "But of course, this is what it is" and got an important result. So ideal working conditions are very strange. The ones you want aren't always the best ones for you.
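The step where Shannon goes from "the average code is good" to "some code is good" is worth spelling out, since it is the entire trick. A minimal sketch of that averaging argument, in notation of my own choosing rather than anything from Hamming's talk:

```latex
% Averaging ("probabilistic method") step, sketched under simplified assumptions:
% C ranges over randomly drawn codebooks, P_e(C) is the decoding error
% probability of codebook C, and eps is the bound established on the average.
\mathbb{E}_{C}\!\left[ P_e(C) \right] \le \varepsilon
\quad\Longrightarrow\quad
\exists\, C^{*} \ \text{such that} \ P_e(C^{*}) \le \mathbb{E}_{C}\!\left[ P_e(C) \right] \le \varepsilon
% A nonnegative random variable cannot exceed its mean everywhere,
% so at least one concrete codebook is at least as good as the average one.
```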
Another technique I've seen in pre-paradigmatic research is to pick something that would be easy if you actually understood what was going on, and then try to solve it. The point isn't to get a solution, though it's nice if you do; the point is learning through lots of concretely-motivated contact with the territory. Agent foundations and efforts to align language models both seem to fit this pattern, for example.
In foundations of physics, a Hamming problem would be measuring gravity from entangled states, so-called "gravcats". It is one area where QM does not make obvious definite predictions, as the details depend on the quantum nature of gravity, if any.
In contrast, the prediction is obvious and clear for QM+Newtonian gravity: any measurement of gravitational effects results in rapid entanglement with the measuring apparatus and decoherence of the entangled state.
Sadly, the limiting factor right now is measurement accuracy: we currently cannot measure gravity from objects below the Planck mass (about 22 micrograms), or put large enough objects into a superposition (the current limit is equivalent to tens of thousands of hydrogen atoms, a gap of roughly 16 orders of magnitude).
A few for cell biology:
For social science, here are some I'd throw in:
The best social science follows George Orwell's dictum: it takes huge effort to see what is in front of your face.
We’ll start with Richard Hamming’s original question: what are the most important problems in your field?
(At this point, you should grab pencil and paper, or open a text file, or whatever. Set a timer for at least two minutes - or five to ten if you want to do a longer version of this exercise - and write down the most important problems in your field. The rest of the questions will be variations on this first one, all intended to come at it from different directions; I recommend setting the timer for each of them.)
Perfection
Imagine that your field achieved perfection - the ultimate theory, perfect understanding, building The Thing.
What has been achieved in the idealized version of the field, which has not yet been achieved today? What are the main barriers between here and there?
Measurement
Often, in hindsight, a field turns out to have been bottlenecked on the development of some new measurement method, ranging from physical devices like the thermometer to abstract ideas like Shannon’s entropy and information channel capacity.
In what places does it look like your field is bottlenecked on the ability to usefully measure something? What are the main barriers to usefully measuring those things?
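For a concrete sense of what an "abstract measurement idea" looks like once someone has found it, here are the textbook definitions in Shannon's case (standard notation, not taken from this post): entropy measures the uncertainty of a source, and channel capacity measures the best achievable rate of reliable communication.

```latex
% Standard definitions, included only as an illustration of an abstract measuring stick:
H(X) = -\sum_{x} p(x)\, \log_2 p(x)     % entropy: uncertainty of source X, in bits
\qquad
C = \max_{p(x)} I(X;Y)                  % capacity: maximum mutual information across the channel
```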
Framing
What are the places where your field is flailing around in the dark, trying desperate ideas and getting unhelpful results one after another? What are the places where it feels like the problem is formulated in the wrong language, and a shift to another frame might be required to ask the right question or state the right hypothesis?
Unification
Sometimes, we have a few different models, each of which works really well in different places. Maybe it feels like there should be some model which unifies them all, which could neatly account for all these phenomena at once - like the unification of electricity, magnetism and optics in the 19th century.
Are there different models in your field which feel like they point to a not-yet-known unified model?
Incompatible Assumptions
One of the main ways we notice (usually implicit) false assumptions in our models is when they come into conflict with some other results, patterns or constraints. This may look like multiple models which cannot all be true simultaneously, or like a single model which seemingly cannot be true at all, yet nonetheless keeps matching reality quite well. This is a hint to reexamine the assumptions under which the models are supposedly incompatible/impossible, and especially to look for any hidden assumptions in that impossibility argument.
Are there places in your field where a few models look incompatible, or one model looks impossible, yet nonetheless the models match reality quite well?
Giant Search Space
The space of possible physical laws or theorems or principles is exponentially vast. Sometimes, the hard part is to figure out what the relevant factors are at all. For instance, to figure out how to reproducibly culture a certain type of cell, a biologist might need to provide a few specific signal molecules, a physical medium with the right elasticity or density, a particular temperature range, and/or some other factors which nobody even thought to test yet.
Are there places in your field where nobody even knows what key factors must be controlled for some important outcome to robustly occur?
Finding The True Name
Sometimes, most people in the field have an intuition that some concept is important, but it’s not clear how to formulate the concept in a way that makes it robustly and generalizably useful. “Causality” was a good example of this, prior to Judea Pearl & co. Once we can pin down the right formulation of the idea, we can see arguments/theorems which follow the idea, and apply them in the wild. But before we have the right formulation, we have to make do with ad-hoc proxies, “leaky abstractions” which don’t quite consistently generalize in the ways we intuitively want/expect.
Are there places in your field where some concept seems very central to understanding, but nobody knows its True Name yet?
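As a concrete illustration of what a True Name buys you, Pearl's formulation of causality separates observing a variable from intervening on it. A minimal sketch in standard do-notation (again, standard material rather than anything specific to this post):

```latex
% Observation and intervention are different objects in Pearl's notation:
P(Y \mid X = x) \;\neq\; P\!\left(Y \mid \mathrm{do}(X = x)\right) \quad \text{in general.}

% If a set Z of observed variables satisfies the backdoor criterion, the
% interventional quantity becomes estimable from purely observational data:
P\!\left(Y \mid \mathrm{do}(X = x)\right) \;=\; \sum_{z} P(Y \mid X = x, Z = z)\, P(Z = z)
```

Before a formulation like this existed, "correlation is not causation" was about the best one could say; once the concept was pinned down, whole classes of theorems and adjustment formulas became available and applicable in the wild.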
Meta
Sometimes social problems in a field prevent the most important problems from being addressed - e.g. bad incentives, key ideas reaching too few people, people with complementary skillsets not running into each other, different groups using different language and tools, etc.
At the social level, what are the barriers to solving the main problems in the previous two questions? Why aren’t they already solved? Why isn’t progress being made, or made faster?
Pica
There’s a condition called pica, where someone has a nutrient deficiency (e.g. too little iron), and they feel strong cravings for some food which does not contain that nutrient (e.g. ice). The brain just doesn’t always manage to crave things which will actually address the real problem; for some reason things like ice will “look like” they address the problem, to the relevant part of the brain.
Are there places where your field as a whole, or you personally, pursue things which won’t really help with the main problems (but might kind of “look like” they address the problems)?
Other Peoples’ Answers
Pick someone you know, or a few people, who are smart and have good judgment. What would their answers to these questions be?
Closing
Hamming’s original question was not just “What are the most important problems in your field?”. He had two follow-up questions:
To further quote Hamming: if you do not work on important problems, it's unlikely you'll do important work.