Some Thoughts on Metaphilosophy

Wei Dai

LESSWRONG
LW

Some Thoughts on Metaphilosophy — LessWrong

82 Some Thoughts on Metaphilosophy

by Wei Dai

10th Feb 2019

AI Alignment Forum

5 min read

82 Ω 33

A powerful AI (or human-AI civilization) guided by wrong philosophical ideas would likely cause astronomical (or beyond astronomical) waste. Solving metaphilosophy is one way in which we can hope to avoid this kind of disaster. For my previous thoughts on this topic and further motivation see Metaphilosophical Mysteries, The Argument from Philosophical Difficulty, Three AI Safety Related Ideas, and Two Neglected Problems in Human-AI Safety.

Some interrelated ways of looking at philosophy

Philosophy as answering confusing questions

This was my starting point for thinking about what philosophy is: it's what we do when we try to answer confusing questions, or questions that we don't have any other established methodology for answering. Why do we find some questions confusing, or lack methods for answering them? This leads to my next thought.

Philosophy as ability to generalize / handle distributional shifts

ML systems tend to have a lot of trouble dealing with distributional shifts. (It seems to be a root cause of many AI as well as human safety problems.) But humans seem to have some way of (sometimes) noticing out-of-distribution inputs, and can feel confused instead of just confidently using their existing training to respond to it. This is perhaps most obvious in unfamiliar ethical situations like Torture vs Dust Specks or trying to determine whether our moral circle should include things like insects and RL algorithms. Unlike ML algorithms that extrapolate in an essentially random way when given out-of-distribution inputs, humans can potentially generalize in a principled or correct way, by using philosophical reasoning.

Philosophy as slow but general purpose problem solving

Philosophy may even be a fully general purpose problem solving technique. At least we don't seem to have reason to think that it's not. The problem is that it's painfully slow and resource intensive. Individual humans acting alone seem to have little chance of achieving justifiably high confidence in many philosophical problems even if they devote their entire lives to those problems. Humanity has been collectively trying to solve some philosophical problems for hundreds or even thousands of years, without arriving at final solutions. The slowness of philosophy explains why distributional shifts remain a safety problem for humans, even though we seemingly have a general way of handling them.

Philosophy as meta problem solving

Given that philosophy is extremely slow, it makes sense to use it to solve meta problems (i.e., finding faster ways to handle some class of problems) instead of object level problems. This is exactly what happened historically. Instead of using philosophy to solve individual scientific problems (natural philosophy) we use it to solve science as a methodological problem (philosophy of science). Instead of using philosophy to solve individual math problems, we use it to solve logic and philosophy of math. Instead of using philosophy to solve individual decision problems, we use it to solve decision theory. Instead of using philosophy to solve individual philosophical problems, we can try to use it to solve metaphilosophy.

Philosophy as "high computational complexity class"

If philosophy can solve any problem within a very large class, then it must have a "computational complexity class" that's as high as any given problem within that class. Computational complexity can be measured in various ways, such as time and space complexity (on various actual machines or models of computation), whether and how high a problem is in the polynomial hierarchy, etc. "Computational complexity" of human problems can also be measured in various ways, such as how long it would take to solve a given problem using a specific human, group of humans, or model of human organizations or civilization, and whether and how many rounds of DEBATE would be sufficient to solve that problem either theoretically (given infinite computing power) or in practice.

The point here is that no matter how we measure complexity, it seems likely that philosophy would have a "high computational complexity class" according to that measure.

Philosophy as interminable debate

The visible aspects of philosophy (as traditionally done) seem to resemble an endless (both in clock time and in the number of rounds) game of debate, where people propose new ideas, arguments, counterarguments, counter-counterarguments, and so on, and at the same time to try judge proposed solutions based on these ideas and arguments. People sometimes complain about the interminable nature of philosophical discussions, but that now seems understandable if philosophy is a "high computational complexity" method of general purpose problem solving.

In a sense, philosophy is the opposite of math: whereas in math any debate can be settled by producing a proof (hence analogous to the complexity class NP) (in practice maybe a couple more rounds is needed of people finding or fixing flaws in the proof), potentially no fixed number of rounds of debate (or DEBATE) is enough to settle all philosophical problems.

Philosophy as Jürgen Schmidhuber's General TM

Unlike traditional Turing Machines, a General TM or GTM may edit their previous outputs, and can be considered to solve a problem even if it never terminates, as long as it stops editing its output after a finite number of edits and the final output is the correct solution. So if a GTM solves a certain problem, you know that it will eventually converge to the right solution, but you have no idea when, or if what's on its output tape at any given moment is the right solution. This seems a lot of like philosophy, where people can keep changing their minds (or adjust their credences) based on an endless stream of new ideas, arguments, counterarguments, and so on, and you never really know when you've arrived at a correct answer.

What to do until we solve metaphilosophy?

Protect the trajectory?

What would you do if you had a GTM that could solve a bunch of really important problems, and that was the only method you had of solving them? You'd try to reverse-engineer it and make a bunch of copies. But if you couldn't do that, then you'd want to put layers and layers of protection around it. Applied to philosophy, this line of thought seems to lead to the familiar ideas of using global coordination (or a decisive strategic advantage) to stop technological progress, or having AIs derive their terminal goals from simulated humans who live in a safe virtual environment.

Replicate the trajectory with ML?

Another idea is to try to build a good enough approximation of the GTM by training ML on its observable behavior (including whatever work tapes you have read access to). But there are two problems with this: 1. This is really hard or impossible to do if the GTM has internal state that you can't observe. And 2. If you haven't already reverse engineered the GTM, there's no good way to know that you've built a good enough approximation, i.e., to know that the ML model won't end up converging to answers that are different from the GTM.

A three part model of philosophical reasoning

It may be easier to understand the difficulty of capturing philosophical reasoning with ML by considering a more concrete model. I suggest we can divide it into three parts as follows: A. Propose new ideas/arguments/counterarguments/etc. according to some (implicit) distribution. B. Evaluate existing ideas/arguments/counterarguments/etc. C. Based on past ideas/arguments/counterarguments/etc., update some hidden state that changes how one does A and B. It's tempting to think that building an approximation of B using ML perhaps isn't too difficult, and then we can just search for the "best" ideas/arguments/counterarguments/etc. using standard optimization algorithms (maybe with some safety precautions like trying to avoid adversarial examples for the learned model). There's some chance this could work out well, but without having a deeper understanding of metaphilosophy, I don't see how we can be confident that throwing out A and C won't lead to disaster, especially in the long run. But A and C seem very hard or impossible for ML to capture (A due to paucity of training data, and C due to the unobservable state).

Is there a way around this difficulty? What else can we do in the absence of a full white-box solution to metaphilosophy?

Meta-PhilosophyPhilosophyAI

Frontpage

82 Ω 33

New Comment

44 comments, sorted by

top scoring

Click to highlight new comments since: Today at 11:24 AM

[-]William_S7yΩ470

I guess it feels like I don't know how we could know that we're in the position that we've "solved" meta-philosophy. It feels like the thing we could do is build a set of better and better models of philosophy and check their results against held-out human reasoning and against each other.

I also don't think we know how to specify a ground truth reasoning process that we could try to protect and run forever which we could be completely confident would come up with the right outcome (where something like HCH is a good candidate but potentially with bugs/subtleties that need to be worked out).

I feel like I have some (not well justified and possibly motivated) optimism that this process yields something good fairly early on. We could gain confidence that we are in this world if we build a bunch of better and better models of meta-philosophy and observe at some point the models continue agreeing with each other as we improve them, and that they agree with various instantiations of protected human reasoning that we run. If we are in this world, the thing we need to do is just spend some time building a variety of these kinds of models and produce an action that looks good to most of them. (Where agreement is not "comes up with the same answer" but more like "comes up with an answer that other models think is okay and not disastrous to accept").

Do you think this would lead to "good outcomes"? Do you think some version of this approach could be satisfactory for solving the problems in Two Neglected Problems in Human-AI Safety?

Do you think there's a different kind of thing that we would need to do to "solve metaphilosophy"? Or do you think that working on "solving metaphilosophy" roughly caches out as "work on coming up with better and better models of philosophy in the model I've described here"?

[-]Wei Dai7yΩ480

I guess it feels like I don’t know how we could know that we’re in the position that we’ve “solved” meta-philosophy.

What I imagine is reaching a level of understanding of what we’re really doing (or what we should be doing) when we “do philosophy”, on par with our current understanding of what “doing math” or “doing science” consist of, or ideally a better level of of understanding than that. (See Apparent Unformalizability of “Actual” Induction for one issue with our current understanding of “doing science”.)

I also don’t think we know how to specify a ground truth reasoning process that we could try to protect and run forever which we could be completely confident would come up with the right outcome (where something like HCH is a good candidate but potentially with bugs/subtleties that need to be worked out).

Here I’m imagining something like putting a group of the best AI researchers, philosophers, etc. in some safe and productive environment (which includes figuring out the right rules of social interactions), where they can choose to delegate further to other reasoning processes, but don’t face any time pressure to do so. Obviously I don’t know how to specify this in terms of having all the details worked out, but that does not seem like a hugely difficult problem to solve, so I wonder what do you mean/imply by “don’t think we know how”?

It feels like the thing we could do is build a set of better and better models of philosophy and check their results against held-out human reasoning and against each other.

If that’s all we do, it seems like it would be pretty easy to miss some error in the models, because we didn’t know that we should test for it. For example there could be entire classes of philosophical problems that the models will fail on, which we won’t know because we won’t have realized yet that those classes of problems even exist.

Do you think this would lead to “good outcomes”? Do you think some version of this approach could be satisfactory for solving the problems in Two Neglected Problems in Human-AI Safety?

It could, but it seems much riskier than either of the approaches I described above.

Do you think there’s a different kind of thing that we would need to do to “solve metaphilosophy”? Or do you think that working on “solving metaphilosophy” roughly caches out as “work on coming up with better and better models of philosophy in the model I’ve described here”?

Hopefully I answered these sufficiently above. Let me know if there’s anything I can clear up further.

[-]cousin_it7y60 Response to previous version

But humans seem to have some way of (sometimes) noticing out-of-distribution inputs, and can feel confused instead of just confidently use their existing training to respond to it.

I think what you're describing can be approximated by a Bayesian agent having a wide prior, and feeling "confused" when some new piece of evidence makes its posterior more diffuse. Evolutionarily it makes sense to have that feeling, because it tells the agent to do more exploration and less exploitation.

For example, if you flip a coin 1000 times and always get heads, your posterior is very concentrated around "the coin always comes up heads". But then it comes up tails once, your posterior becomes more diffuse, you feel confused, and you change your betting behavior until you can learn more.

[-]romeostevensit7y42 Response to previous version

I think it is driven by a general heuristic of finding compressibility. If a distribution seems complex we assume we're accidentally conflating two variables and seek the decomposition that makes the two resultant distributions approximate-able by simpler functions.

[-]Shmi7y52 Response to previous version

As I said here countless times before, answering questions is not what philosophy is good at. It's good at asking questions, and figuring out how to slice a small manageable piece of a big question for some other science to work on. Sadly, most philosophers misunderstand what their job is. They absolutely suck at finding answers, even as they excel as debating the questions. The debate is important as it crystallizes how to slice the big question into smaller ones, but it does not provide answers. Sometimes it's the philosophers themselves that are polymaths enough to be able to both slice a question and to answer it, like Pierce/Russell/Wittgenstein with truth tables. Most of the time a good question is posed, or a non-obvious perspective is highlighted, like the oft-discussed here Searle's Chinese room argument, or Jackson's Mary's room setup, but the proposed solution itself is nowhere close to satisfactory.

Philosophy is NOT a general purpose problem solver, and NOT a meta problem solver, it is a (meta) problem problem asker and slicer.

[-]Gordon Seidoh Worley7y102 Response to previous version

I object rather strongly to this categorization. This feels strongly to me like a misunderstanding borne of having only encountered analytic philosophy in rather limited circumstances and having assumed the notion of the "separate magisterium" that the analytic tradition developed as it broke from the rest of Western philosophy.

Many people doing philosophy, myself included, think of it more as the "mother" discipline from which we might specialize into other disciplines once we have the ground well understood enough to cleave off a part of reality for a time being while we work with that small part so as to avoid constantly facing the complete, overwhelming complexity of facing all of reality at once. What is today philosophy is perhaps tomorrow a more narrow field of study, except it seems in those cases where we touch so closely upon fundamental uncertainty that we cannot hope to create a useful abstraction, like physics or chemistry, to let us manipulate some small part of the world accurately without worrying about the rest of it.

[-]Shmi7y20 Response to previous version

Many people doing philosophy, myself included, think of it more as the "mother" discipline from which we might specialize into other disciplines once we have the ground well understood enough to cleave off a part of reality for a time being while we work with that small part so as to avoid constantly facing the complete, overwhelming complexity of facing all of reality at once.

That's a great summary, yeah. I don't see any contradiction with what I said.

What is today philosophy is perhaps tomorrow a more narrow field of study, except it seems in those cases where we touch so closely upon fundamental uncertainty that we cannot hope to create a useful abstraction, like physics or chemistry, to let us manipulate some small part of the world accurately without worrying about the rest of it.

You have a way with words :) Yes, specific sciences study small slivers of what we experience, and philosophy ponders the big picture, helping to spawn another sliver to study. Still don't see how it provides answers, just helps crystallize questions.

[-]Gordon Seidoh Worley7y40 Response to previous version

Yes, specific sciences study small slivers of what we experience, and philosophy ponders the big picture, helping to spawn another sliver to study. Still don't see how it provides answers, just helps crystallize questions.

It sounds like a disagreement on whether A contains B means B is an A or B is not an A. That is, whether or not that, say, physics, which is contained within the realm of study we call philosophy, although carefully cordoned off with certain assumptions from the rest of it, is still philosophy or whether philosophy is the stuff that isn't broken down into a smaller part, because to my way of thinking physics is largely philosophy of the material and so by example we have a case where philosophy provides answers.

[-]Shmi7y20 Response to previous version

I don't see this as anything related to containment. Just interaction. Good philosophy provides a well-defined problem to investigate for a given science, and, once in a blue moon, an outline of methodology, like Popper did. In turn, the scientific investigation in question can give philosophy some new "big" problems to ponder.

[-]Signer7y52 Response to previous version

Jackson’s Mary’s room setup

Never understood why it is considered good - isn't just confusion between "being in a state" and "knowing about a state"? The same way there is a difference between knowing everything about axes and there being axe in your head.

[-]TAG7y20 Response to previous version

Physicalists sometimes respond to Mary's Room by saying that one can not expect Mary actually to actually instantiate Red herself just by looking at a brain scan. It seems obvious to then that a physical description of brain state won't convey what that state is like, because it doesn't put you into that state. As an argument for physicalism, the strategy is to accept that qualia exist, but argue that they present no unexpected behaviour, or other difficulties for physicalism.

If another version of Mary were shut up to learn everything about, say, nuclear fusion, the question "would she actually know about nuclear fusion" could only be answered "yes, of course....didn't you just say she knows everything"? The idea that she would have to instantiate a fusion reaction within her own body in order to understand fusion is quite counterintuitive. Similarly, a description of photosynthesis will make you photosynthesise, and would not be needed for a complete understanding of photosynthesis.

There seem to be some edge cases.: for instance, would an alternative Mary know everything about heart attacks without having one herself? Well, she would know everything except what a heart attack feels like, and what it feels like is a quale. the edge cases, like that one, are cases are just cases where an element of knowledge-by-acquaintance is needed for complete knowledge. Even other mental phenomena don't suffer from this peculiarity. Thoughts and memories are straightforwardly expressible in words, so long as they don't involve qualia.

So: is the response "well, she has never actually instantiated colour vision in her own brain" one that lays to rest and the challenge posed by the Knowledge argument, leaving physicalism undisturbed? The fact that these physicalists feel it would be in some way necessary to instantiate colour, but not other things, like photosynthesis or fusion, means they subscribe to the idea that there is something epistemically unique about qualia/experience, even if they resist the idea that qualia are metaphysically unique.

[-]Signer7y10 Response to previous version

The problem is that there in no other case does it seem necessary to instantiate a brain state in order to undertstand something.

The point is you either define "to understand" as "to experience", or it is not necessary to see red in order to understand experience. What part of knowledge is missing if Mary can perfectly predict when she will see red? It just that ability to invoke qualia from memory is not knowledge, just because it is also in the brain - the same way that reflexes are not additional knowledge. And even ability to transfer thoughts with words is just approximation... I mean it doesn't solve the Hard problem by itself (panpsychism does) - but I think bringing knowledge into it doesn't help. Maybe its intuitive, but it seems to be very easily disprovable intuition - not the kind of "I am certain that I am conscious".

[-]ChristianKl7y30 Response to previous version

Most people who rides bikes don't have explicit knowledge about how riding a bike works. They are relying on reflexes to ride a bike.

Would you say that most people who ride bikes don't know how to ride a bike?

[-]Signer7y30 Response to previous version

Basically, yes, I would like to use different words for different things. And if we don't accept that knowing how to ride a bike and being able to ride a bike are different, then what? A knowledge argument for unphysical nature of reflexes?

[-]ChristianKl7y20 Response to previous version

By that reasoning a native speaker of a language would often have less knowledge of a language then a person who learned it as a foreign language in a formal matter even when the native speaker speaks it much better for all practical purposes.

When we speak about whether Mary understanding Chinese, I think what we care about is to what extend she will be able to use the language the way a speaker of Chinese would.

A lot of most expert decision making is based on "unconscious competence" and you have to be very careful about how you use the term knowledge if you think that "unconscious competence" doesn't qualify as knowledge.

[-]Signer7y10 Response to previous version

Again, this seems to me like a pretty consistent way to look at things that also more accurately matches reality. Whether we use words "knowledge" and "ability" or "explicit knowledge" and "knowledge" doesn't matter, of course. And for what its worth, I much less sure of usefulness of being precise about such terms in practice. But if there is an obvious physical model of this thought experiment, where there are roughly two kinds of things in Mary's brain - one easily influenceable by words, and another not - and this model explains everything without introducing anything unphysical, then I don't see what's the point of saying "well, if we first group everything knowledge-sounding together, then that grouping doesn't make sense in Mary's situation".

[-]TAG9mo20

The point is you either define “to understand” as “to experience

It's not novel, we already have. That why we say this like "you had to be there".

What part of knowledge is missing if Mary can perfectly predict when she will see red?

What red looks like, as stated in all forms of the story.

… I mean it doesn’t solve the Hard problem by itself

It's not supposed to. It's supposed to indicate that there is a hard problem , ie. that that even a super scientist cannot come up with a reductive+predictive theory of qualia.

(panpsychism does)

If you don't think there is an HP, because of Mary's Room, why do you think there is an HP?

[-]Signer8mo10

If you don’t think there is an HP, because of Mary’s Room, why do you think there is an HP?

Because of the Zombies Argument. "What part of physical equations says our world is not a zombie-world?" is a valid question. The answer to "What part of physical equations says what red looks like?" is just "the part that describes brain".

It’s supposed to indicate that there is a hard problem , ie. that that even a super scientist cannot come up with a reductive+predictive theory of qualia.

It doesn't indicate it independently of other assumptions. Mary's situation only implies that you should track the difference between knowing and being (which I guess is a hint that panpsychism solves HP) - it doesn't say there is something wrong with "Mary has a reductive+predictive theory of qualia, which is confirmed by her ability to predict every pixel of her future experience".

What red looks like, as stated in all forms of the story.

Mary can say "it looks like <pixel array>".

[-]TAG8mo20

The answer to “What part of physical equations says what red looks like?” is just “the part that describes brain”.

Expand on the "says". If Mary looks at these equations ,in her monochrome room, does she go into the brain state that instantiates seeing something red? Does she somehow finds out what red looks like without that? Neither?

Mary’s situation only implies that you should track the difference between knowing and being

What does that mean? Are you saying Mary already knew what red looks like, and instantiating the brain state adds no new knowledge?

which I guess is a hint that panpsychism solves HP) -

Why?

Mary has a reductive+predictive theory of qualia, which is confirmed by her ability to predict every pixel of her future experience”.

Mary can "predict pixels" in some sense that bypasses her knowing what colour qualia look like. Just as a blind person can repeat, without understanding , that tomatoes look red, Mary can state that such and such a brain state would have an RGB value of #FF0000 at such and such a pixel. #FF0000 is a symbol for something unknown to here, just as much as r-e-d. So it's not a prediction of a quale in the relevant sense.

[-]Signer8mo10

If Mary looks at these equations ,in her monochrome room, does she go into the brain state that instantiates seeing something red?

No.

Does she somehow finds out what red looks like without that?

Yes.

What does that mean? Are you saying Mary already knew what red looks like, and instantiating the brain state adds no new knowledge?

She already knew what red looks like, the knowledge just was in a different representation. Just like with knowing how to ride a bike. "no new", like everything here, depends on definitions. But she definitely undergoes physical change, that may be viewed as her gaining new representation of knowledge, which may be valuable for her.

Mary can “predict pixels” in some sense that bypasses her knowing what colour qualia look like. Just as a blind person can repeat, without understanding , that tomatoes look red, Mary can state that such and such a brain state would have an RGB value of #FF0000 at such and such a pixel. #FF0000 is a symbol for something unknown to here, just as much as r-e-d. So it’s not a prediction of a quale in the relevant sense.

Mary can "predict muscles" in some sense that bypasses her knowing how to ride a bike. Just as a boring person can repeat, without understanding, that bikes are fast, Mary can state that such and such a muscle state would have an XYZ value of (1,2,3) at such and such a cell. (1,2,3) is a symbol for something unknown to her, just as much as b-i-k-e. So it’s not a prediction of riding in the relevant sense.

Without additional assumptions, predicting pixels doesn't bypasses anything more, than predicting atoms bypasses knowing fire. You can separately intuitively assume that there is a difference, but what's unique about Mary's situation is resolvable by just noting that there is a difference between being and knowing.

Why is red is "something unknown to her"? If to answer it you need additional arguments, like zombies, then you don't need Mary. And the answer is not "because she gains knowledge by instantiating" because it's the same way with bikes.

Why?

I mean that from qualia requiring you to exist in a specific state you can make a jump to "consciousness is existence".

[-]TAG8mo*20

She already knew what red looks like, the knowledge just was in a different representation.

What it looks like is the representation! A different representation just isn't a quale. #FF0000 just isnt a red quale!

Just like with knowing how to ride a bike. "no new",

But reading a book on riding a bike isn't knowing how to tide a bike...you get the knowledge from mounting a bike and trying!

like everything here, depends on definitions. But she definitely undergoes physical change, that may be viewed as her gaining new representation of knowledge, which may be valuable for her.

The knowledge of representation is the whole thing! Qualia are appearances!

Without additional assumptions, predicting pixels doesn't bypasses anything

It bypasses what you are calling representation ... you have admitted that.

there is a difference between being and knowing.

That doesn't mean there isn't a difference between different kinds of knowing.

I mean that from qualia requiring you to exist in a specific state you can make a jump to "consciousness is existence".

The physics equations representing a brain don't contain qualia then, since they don't exist as a brain.

[-]Signer8mo10

What it looks like is the representation! A different representation just isn’t a quale. #FF0000 just isnt a red quale!

But reading a book on riding a bike isn’t knowing how to tide a bike...you get the knowledge from mounting a bike and trying!

The knowledge of representation is the whole thing! Qualia are appearances!

If you want to define things that way, ok. So Mary's room implies that bikes are as unphysical as qualia.

It bypasses what you are calling representation … you have admitted that.

Mary also doesn't have all representations for all physical knowledge. She doesn't have to have a concept of fire, or equations in all possible notations, or riding skills.

That doesn’t mean there isn’t a difference between different kinds of knowing.

Mary's room doesn't provide motivation for there being a fundamental difference between knowing how to ride a bike and knowing what it is like to see red. And physicalism explains bikes, right?

The physics equations representing a brain don’t contain qualia then, since they don’t exist as a brain.

Yes, of course, like they don't contain atoms or fire or whatever. Reality they describe contains them. Well, except equations are physical objects, so you can write equations with brains or something, but it's not relevant.

And additional representation of red in Mary's brain after she sees it is also doesn't contain her being in a state of seeing red.

[-]TAG8mo20

The knowledge of representation is the whole thing! Qualia are appearances!

If you want to define things that way, ok.

As before , that's the standard definition.

So Mary’s room implies that bikes are as unphysical as qualia.

Qualia aren't defined as unphysical.

Bikes aren't appearances , so there is no analogy.

It bypasses what you are calling representation … you have admitted that.

Mary also doesn’t have all representations for all physical knowledge. She doesn’t have to have a concept of fire, or equations in all possible notations, or riding skills.

Of course she knows what fire is , she is a super scientist.

Know-how, such as riding kills, is not an appearance, or physical.knowledge.

That doesn’t mean there isn’t a difference between different kinds of knowing.

Mary’s room doesn’t provide motivation for there being a fundamental difference between knowing how to ride a bike and knowing what it is like to see red.

Nonetheless , there is a difference.

And physicalism explains bikes, right?

Riding bikes? How they work? How they appear?

The physics equations representing a brain don’t contain qualia then, since they don’t exist as a brain.

Yes, of course, like they don’t contain atoms or fire or whatever.

But in most cases, that doesn't matter, for the usual reason.

That is correct as stated but somewhat misleading: the problem is why is it necessary, in the case of experience, and only in the case of experience to instantiate it in order to fully understand it. Obviously, it is true a that a descirption of a brain state won't put you into that brain state. But that doesn't show that there is nothing unusual about qualia. The problem is that there in no other case does it seem necessary to instantiate a brain state in order to undertstand something. If another version of Mary were shut up to learn everything about, say, nuclear fusion, the question "would she actually know about nuclear fusion" could only be answered "yes, of course....didn't you just say she knows everything"? The idea that she would have to instantiate a fusion reaction within her own body in order to understand fusion is quite counterintuitive. Similarly, a description of photosynthesis will make you photosynthesise, and would not be needed for a complete understanding of photosynthesis.

[-]Signer8mo32

Bikes aren’t appearances , so there is no analogy.

The analogy is that they both need instantiation. That's the thing about appearances that is used in the argument.

Know-how, such as riding kills, is not an appearance, or physical.knowledge.

So physicalism is false, because physical knowledge is incomplete without know-how.

Nonetheless , there is a difference.

Sure, they are different physical processes. But what's the relevant epistemological difference? If you agree that Mary is useless we can discuss whether there are ontological differences.

Riding bikes? How they work? How they appear?

Yes.

in the case of experience, and only in the case of experience

The problem is that there in no other case does it seem necessary to instantiate a brain state in order to undertstand something.

Again, this is false - it is as much as necessary in case of riding. And differences between knowing about qualia and knowing about fusion are explained by preferences: humans just don't care about or need instantiating fusion, but care about instantiating red. In both cases you are physically affected and so you (can define knowledge in such a way that you) gain new representation of knowledge by instantiation.

[-]TAG8mo*20

The analogy is that they both need instantiation

Both need instantiation for what?

So physicalism is false, because physical knowledge is incomplete without know-how.

That's kind of munchkinning. Even if it's incomplete in that way, it doesn't have metaphysical implications.

Sure, they are different physical processes. But what’s the relevant epistemological difference

Mary doesn't know what colour qualms look.like, and therefore has an incomplete understanding of consciousness. As stayed in all versions of the story.

Riding bikes? How they work? How they appear?

Yes.

Unhelpful.

Again, this is false—it is as much as necessary in case of riding

Riding is doing, not understanding.

[-]Signer8mo10

Even if it’s incomplete in that way, it doesn’t have metaphysical implications.

Therefore Mary's incomplete knowledge about consciousness doesn't have metaphysical implications, because it is incomplete in fundamentally same way.

Mary doesn’t know what colour qualms look.like, and therefore has an incomplete understanding of consciousness.

Mary doesn't know how to ride, and therefore has incomplete understanding of riding. What's the difference?

Both need instantiation for what?

For gaining potential utility from specific knowledge representations, for knowledge that feels intuitively complete. I guess "you can't learn to ride in your room" requirement is not exactly and only instantiation? Anyway, the intended general category is "useful knowledge representations".

Unhelpful

I mean all of them: if physicalism explains riding a bike (physical equations give knowledge in some form and also predict you gaining new knowledge representation when you actually learn to ride), then it explains it's appearance in analogous way (physical equations give knowledge about bike's appearance in some form and also predict you gaining new knowledge representation when you actually see it).

[-]TAG8mo20

Therefore Mary’s incomplete knowledge about consciousness doesn’t have metaphysical implications, because it is incomplete in fundamentally same way.

No it isn't. Mary doesn't know what Red looks like. That's not know-how

Mary doesn’t know how to ride, and therefore has incomplete understanding of riding. What’s the difference?

Things can be incomplete in different ways

Both need instantiation for what?

For gaining potential utility from specific knowledge representations, for knowledge that feels intuitively complete.

Theoretical knowledge isn't about utility.

I mean all of them: if physicalism explains riding a bike

It doesn't , in the sense that the theoretical knowledge gives you the know-how. That's one of your own assumptions.

[-]bfinn7y30 Response to previous version

But philosophers are good at proposing answers - they all do that, usually just after identifying a flaw with an existing proposal.

What they're not good at is convincing everyone else that their solution is the right one. (And presumably this is because multiple solutions are plausible. And maybe that's because of the nature of proof - it's impossible to prove something definitively, and disproving typically involves finding a counterexample, which may be hard to find.)

I'm not convinced philosophy is much less good at finding actual answers than say physics. It's not as if physics is completely solved, or even particularly stable. Perhaps its most promising period of stability was specifically the laws of motion & gravity after Newton - though for less than two centuries. Physics seems better than philosophy at forming a temporary consensus; but that's no use (and indeed is counterproductive) unless the solution is actually right.

Cf a rare example of consensus in philosophy: knowledge was 'solved' for 2300 years with the theory that it's a 'true justified belief'. Until Gettier thought of counterexamples.

[-]RogerDearnaley2mo30

So, In philosophy of science terminology, pholosophers have plenty of hypothesis generation, but very little falsifiability (beyond, as Gettier did, demonstarting an internal logical inconsistency), so the tendency it to increase the number of credible candidate answers, rather than decreasing them.

[-]William_S7y40

Re: Philosophy as interminable debate, another way to put the relationship between math and philosophy:

Philosophy as weakly verifiable argumentation

Math is solving problems by looking at the consequences of a small number of axiomatic reasoning steps. For something to be math, we have to be able to ultimately cash out any proof as a series of these reasoning steps. Once something is cashed out in this way, it takes a small constant amount of time to verify any reasoning step, so we can verify given polynomial time.

Philosophy is solving problems where we haven't figured out a set of axiomatic reasoning steps. Any non-axiomatic reasoning step we propose could end up having arguments that we hadn't thought of that would lead us to reject that step. And those arguments themselves might be undermined by other arguments, and so on. Each round of debate lets us add another level of counter-arguments. Philosophers can make progress when they have some good predictor of whether arguments are good or not, but they don't have access to certain knowledge of arguments being good.

Another difference between mathematics and philosophy is that in mathematics we have a well defined set of objects and a well-defined problem we are asking about. Whereas in philosophy we are trying to ask questions about things that exist in the real world and/or we are asking questions that we haven't crisply defined yet.

When we come up with a set of axioms and a description of a problem, we can move that problem from the realm of philosophy to the realm of mathematics. When we come up with some method we trust of verifying arguments (ie. replicating scientific experiments), we can move problems out of philosophy to other sciences.

It could be the case that philosophy grounds out in some reasonable set of axioms which we don't have access to now for computational reasons - in which case it could all end up in the realm of mathematics. It could be the case that, for all practical purposes, we will never reach this state, so it will remain in the "potentially unbounded DEBATE round case". I'm not sure what it would look like if it could never ground out - one model could be that we have a black box function that performs a probabilistic evaluation of argument strength given counter-arguments, and we go through some process to get the consequences of that, but it never looks like "here is a set of axioms".

[-]avturchin7yΩ340 Response to previous version

All else equal, I prefer an AI which is not capable to philosophy, as I am afraid of completely alien conclusions which it could come to (e.g. insect are more important than humans).

More over, I am skeptical that going on meta-level simplifies the problem to the level that it will be solvable by humans (the same about meta-ethics and theory of human values). For example, if someone says that he is not able to understand math, but instead will work on meta-mathematical problems, we would be skeptical about his ability to contribute. Why meta-level would be simpler?

[-]jessicata7yΩ570 Response to previous version

More over, I am skeptical that going on meta-level simplifies the problem to the level that it will be solvable by humans (the same about meta-ethics and theory of human values).

This is also my reason for being pessimistic about solving metaphilosophy before a good number of object-level philosophical problems have been solved (e.g. in decision theory, ontology/metaphysics, and epistemology). If we imagine being in a state where we believe running computation X would solve hard philosophical problem Y, then it would seem that we already have a great deal of philosophical knowledge about Y, or a more general class of problems that includes Y.

More generally, we could look at the history difficulty of solving a problem vs. the difficulty of automating it. For example: the difficulty of walking vs. the difficulty of programming a robot to walk; the difficulty of adding numbers vs. the difficulty of specifying an addition algorithm; the difficulty of discovering electricity vs. the difficulty of solving philosophy of science to the point where it's clear how a reasoner could have discovered (and been confident in) electricity; and so on.

The plausible story I have that looks most optimistic for metaphilosophy looks something like:

Some philosophical community makes large progress on a bunch of philosophical problems, at a high level of technical sophistication.
As part of their work, they discover some "generators" that generate a bunch of the object-level solutions when translated across domains; these generators might involve e.g. translating a philosophical problem to one of a number of standard forms and then solving the standard form.
They also find philosophical reasons to believe that these generators will generate good object-level solutions to new problems, not just the ones that have already been studied.
These generators would then constitute a solution to metaphilosophy.

[-]Wei Dai7yΩ362 Response to previous version

I think our positions on this are pretty close, but I may put a bit more weight on other "plausible stories" for solving metaphilosophy relative to your "plausible story". (I'm not sure if overall I'm more or less optimistic than you are.)

If we imagine being in a state where we believe running computation X would solve hard philosophical problem Y, then it would seem that we already have a great deal of philosophical knowledge about Y, or a more general class of problems that includes Y.

It seems quite possible that understanding the general class of problems that includes Y is easier than understanding Y itself, and that allows us to find a computation X that would solve Y without much understanding of Y itself. As an analogy, suppose Y is some complex decision problem that we have little understanding of, and X is an AI that is programmed with a good decision theory.

More generally, we could look at the history difficulty of solving a problem vs. the difficulty of automating it. For example: the difficulty of walking vs. the difficulty of programming a robot to walk;

This does not seem like a very strong argument for your position. My suggestion in the OP is that humans already know the equivalent of "walking" (i.e., doing philosophy), we're just doing it very slowly. Given this, your analogies don't seem very conclusive about the difficulty of solving metaphilosophy or whether we have to make a bunch more progress on object-level philosophical problems before we can solve metaphilosophy.

[-]avturchin7y10 Response to previous version

Creating AI for solving hard philosophical problems is like passing hot potato from right hand to left.

For example, I want to solve the problem of qualia. I can't solve it myself, but may be I can create super-intelligent AI which will help me to solve it? Now I start to working on AI, and soon encounter the the control problem. Trying to solve the control problem, I would have to specify nature of human values, and soon I will find the need to tell something about existing and nature of qualia. Now the circle is done: I have the same problem of qualia, but packed inside the control problem. If I make some assumption about what qualia should be, they will probably affect the final answer by AI.

However, I still could use some forms of AI to solve qualia problem: if I use google search, I could quickly find all relevant articles, identify the most cited, newest, maybe create an argument map. This is where Drexler's CAIS may help.

[-]William_S7y40

Maybe one AI philosophy service could look like: would ask you a bunch of other questions that are simpler than the problem of qualia, then show you what those answers imply about the problem of qualia if you use some method of reconciling those answers.

[-]avturchin7y10

In fact, when I use Google Scholar to find new articles about e.g. qualia, I already use narrow AI to advance my understanding. So AI could be useful in thinking about philosophical problems. What I am afraid of is AI's decisions based on incomprehensible AI-created philosophy.

[-]Wei Dai7yΩ350 Response to previous version

More over, I am skeptical that going on meta-level simplifies the problem to the level that it will be solvable by humans

If I gave the impression in this post that I expect metaphilosophy to be solved before someone builds an AGI, that was far from my intentions. I think this is a small-chance-of-high-return kind of situation, plus I think someone has to try to attack the problem if only to generate evidence that it really is a hard problem, otherwise I don't know how to convince people to adopt costly social solutions like stopping technological progress. (And actually I don't expect the evidence to be highly persuasive either, so this amounts to just another small chance of high return.)

What I wrote in an earlier post still describes my overall position:

There is no strong empirical evidence that solving metaphilosophy is superhumanly difficult, simply because not many people have attempted to solve it. But I don’t think that a reasonable prior combined with what evidence we do have (i.e., absence of visible progress or clear hints as to how to proceed) gives much hope for optimism either.

[-]RogerDearnaley2mo31

we use [philosophy] to solve science as a methodological problem (philosophy of science)

That was true when Popper actually did that in the 1930s. But I think the Popperian "philosophy of science" (i.e. hypothesis generation, falsifiability, and paradigm shifts) is now "obvious strategy implications from the theory of approximate Bayesian reasoning" (which was already under development in the 1930s, but wasn't fully developed until about the 1950s), so IMO it has since become a matter of mathematics/logic (and since the rise of AI as a field of engineering, also engineering). So I see science as having now been put on a basis stronger, from a Naturalism point of view, than philosophy was able to provide.

In general, I'm a lot more optimistic about AI-assisted science and mathematics than I am about AI-assisted metaphilosophy. Partly because I think there are areas, such as ethics, where there are reasons (like Evolutionary Moral Psychology) to think that human moral intuitions might actually map to successful adaptive strategies for co-evolution of cooperation in positive-sum games — and I'm less clear why AI would necessarily have useful intuitions.

[-]TAG9mo20

A solution to metaphilosophy doesn't have to lead to a solution to philosophy. Your list of possible characterizations of metaphilosophy isn't exhaustive. These facts are related: philosophy can also be characterised as "where reduction and empiricism stop working".

It's not that philosophers weirdly and unreasonably prefer intuition to empirical facts and mathematical/logical reasoning, it is that those things either don't go far enough, or are themselves based on intuition.

"Just use empircism" doesn't work, because philosophy is about interpreting empirical data.

"Just use maths/logic" doesn't work , because those things are based on axioms justified by intuitive appeal.

"Just use reductionism" doesn't work , because its not clear what lies at the bottom of the stack, or if anything does. Logic, epistemology and ontology have all been held to be First Philosophy at different times. Logic, epistemology and ontology also seen to interact. Correct ontology depends on direct epistemology..but what minds are capable of knowing depends on ontology. Logic possibly depends on ontology too, since quantum.mechanics arguable challenges traditional bivalent logic.

Philosopher don't embrace intuitions because they think they are particularly reliable, they have reasoned that they can't do without them. That is the essence of the Inconvenient Ineradicability of Intuition. An unfounded foundation is what philosophers mean by "intuition

[-]Wei Dai5yΩ220

having AIs derive their terminal goals from simulated humans who live in a safe virtual environment.

There has been some subsequent discussion (expressing concern/doubt) about this at https://www.lesswrong.com/posts/7jSvfeyh8ogu8GcE6/decoupling-deliberation-from-competition?commentId=bSNhJ89XFJxwBoe5e

[-]Chris_Leong7y20

"The point here is that no matter how we measure complexity, it seems likely that philosophy would have a "high computational complexity class" according to that measure." - I disagree. The task of philosophy is to figure out how to solve the meta problem, not to actually solve all individual problems or the worst individual problem

[-]Benjamin Schmidt2y10

Because of the strange loopy nature of concepts/language/self/different problems metaphilosophy seems unsolvable?
Asking: What is good? already implies that there are the concepts "good", "what", "being" that there are answers and questions ... Now we could ask what concepts or questions to use instead ...

Similarly:
> "What are all the things we can do with the things we have and what decision-making process will we use and why use that process if the character of the different processes is the production of different ends; don't we have to know which end is desired in order to choose the decision-making process that also arrives at that result?"
> Which leads back to desire and knowing what you want without needing a system to tell you what you want.

It's all empty in the Buddhist sense. It all depends on which concepts or turing machines or which physical laws you start with.

[-]Seremonia2y-10

Metaphilosophy is about reasoning through logical consequences. It's the basic, foundation of causality

You can read here https://www.lesswrong.com/posts/Xnunj6stTMb4SC5Zg/metaphilosophy-a-philosophizing-through-logical-consequences

[-]Wei Dai2y43

I read your post but was unable to make any sense of it.

Moderation Log