When surveys on MTurk are designed to keep a single account occupied for longer than strictly necessary to fill out an answer that passes any surface-level validity checks, the obvious next step is for people to run multiple accounts on multiple devices, and you're back to people giving low-effort answers as fast as possible.
This bit was very interesting to me:
These models are “predictive” in the important sense that they perceive not just how things are at the moment but also anticipate how your sensory inputs would change under various conditions and as a consequence of your own actions. Thus:
- Red = would create a perception of a warmer color relative to the illuminant even if the illumination changes.
My current pet theory of qualia is that there is an illusion that they are a specific thing (e.g. the redness of red) when in reality there are only perceived relations between a quale and other qualia, and a perceived identity between that quale and memories of that quale. But the sense of identity (or constancy through time) is not caused by an actual specific thing (the "redness" that one erroneously tries to grasp but always seems just beyond reach), but by a recurrence of those relations.
I like the quoted part because it can be read as a predictive-processing-flavoured version of the same theory. The illusion (that there is a reified thing instead of only a jumble of relationships) is strengthened by the fact that we not only recognize the cluster of qualia relationships and can correctly identify it, but furthermore predict how it will behave. Framing a "quale" as an ability to predict how a sense impression will change under varying conditions (sensory or imagined) makes the definition both richer (it is not just an isolated flash of recognition of what's in front of your eyes, but a set of predictions about how it might behave) and more coherent (different experiences of "redness" are tied together by the same quale because they could be "transformed" into one another while adhering to the changes that the quale "redness" says are applicable to itself).
Using glossaries, indexes and other alphabetically ordered word listings to leverage the explicitly learned spellings in order to deduce beginnings of other words – e.g. if you knew how to spell the token 'the', and you kept seeing the token 'this' listed shortly after the token 'the' in alphabetic listings, you could reasonably guess that
'this' begins with a T, its second letter could well be H, and if so, its third letter comes from the set {E, F, ..., Z}. By spending an astronomical amount of time attempting to solve something akin to a 50,000-dimensional Sudoku puzzle, you might be able to achieve high confidence for your guesses as to the first three or four letters of most whole-word token strings.
I would hazard a guess this is the most significant mechanism here. Crossword or Scrabble dictionaries, lists of "animals/foods/whatever starting with [letter]", "words without E", "words ending in [letter]", anagrams, rhyming dictionaries, hyphenation dictionaries, palindromes, lists where both a word and its reverse are valid words, word snakes and other language games... It probably ingested a lot of those. And when it learns something about "mayonnaise" from those lists, that knowledge is useful for any words containing the tokens 'may', 'onna' and 'ise'.
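The constraint-propagation idea in the quoted passage can be sketched concretely. This is a minimal illustration I wrote, not the mechanism as actually studied: given that an unknown token is observed to sort between two tokens whose spellings are known, its spelling must fall lexicographically between them, which forces a prefix and bounds the next letter. The upper bound 'those' below is my invented example, standing in for whatever neighbour actually appears in the listings.

```python
# Hypothetical sketch of the "alphabetical listing" inference: a token
# observed between two tokens with known spellings must itself sort
# lexicographically between them, pinning down its first letters.

def prefix_constraints(lower: str, upper: str) -> dict:
    """Given that an unknown word sorts between `lower` and `upper`,
    return what that implies about its first letters."""
    shared = []
    for a, b in zip(lower, upper):
        if a == b:
            # Letters shared by both bounds are forced.
            shared.append(a)
        else:
            # The next letter lies somewhere in the closed range [a, b].
            return {"forced_prefix": "".join(shared),
                    "next_letter_range": (a, b)}
    return {"forced_prefix": "".join(shared), "next_letter_range": None}

# The example from the quoted text: 'this' observed after 'the' and
# (say) before 'those' in alphabetical listings.
print(prefix_constraints("the", "those"))
# forced prefix 'th'; third letter somewhere in [e, o]
```

Repeating this over many overlapping listings is exactly the "high-dimensional Sudoku" the quote describes: each observed neighbour pair tightens the candidate sets for some token's first few letters.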
Thanks for mentioning reflective stability, it's exactly what I've been wondering about recently and I didn't know the term.
However, using the formalism of utility functions, we are able to make decently convincing arguments that this self-improvement process will tend to preserve utility functions.
Can you point me to the canonical proofs/arguments for values being reflectively stable throughout self-improvement/reproduction towards higher intelligence? On the one hand, it seems implausible to me on the intuition that it's incredibly difficult to predict the behaviour of a complex system more intelligent than you from static analysis. On the other hand, if it is true, then it would seem to hold just as much for humans themselves as the first link in the chain.
Because if it foreseeably changed utility function from X to Y, then probably it would be calculated by the X-maximizing agent to harm, rather than help, its utility, and so the change would not be made.
Specifically, the assumption that this is foreseeable at all seems to deeply contradict the notion of intelligence itself.
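The quoted argument can be made concrete in a toy model (the outcomes and utility numbers below are invented for illustration): an agent scores each candidate successor by the expected value, under its *current* utility function, of the outcomes that successor would choose, so a utility swap foreseeably scores worse and is declined.

```python
# Toy model of the goal-preservation argument: the current X-maximizer
# evaluates successors by what THEY would do, scored by X.

# Two candidate utility functions over outcomes (illustrative numbers).
utility_x = {"paperclips": 10, "staples": 0}
utility_y = {"paperclips": 0, "staples": 10}

def successor_choice(successor_utility):
    """Which outcome a successor with this utility function would pick."""
    return max(utility_x.keys(), key=lambda o: successor_utility[o])

def value_of_modification(current_utility, successor_utility):
    """The current agent's score for running that successor."""
    return current_utility[successor_choice(successor_utility)]

keep = value_of_modification(utility_x, utility_x)  # successor still picks paperclips
swap = value_of_modification(utility_x, utility_y)  # successor would pick staples
assert keep > swap  # so the X-maximizer declines the swap
```

Note that the sketch makes the contested assumption explicit: `successor_choice` is computed by the current agent via static analysis of the successor, which is exactly the foreseeability being questioned above.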
The paper Galton - Visualised Numerals (1880) (there are several documents with this title floating around; this is the full paper and has the most pictures) contains a bunch of drawings of people's professed visualizations of the number line. Some are quite curled indeed!
Alternative explanation: everyone has qualia, but some people lack the mental mechanism that makes them feel like qualia require a special metaphysical explanation. Since qualia are almost always represented as requiring such an explanation (or at least as ineffable, mysterious and elusive), these latter people don't recognize their own qualia as that which is being talked about.
How can people lack such a mental mechanism? Either
I don't have a clue about the relative prevalences of these groups, nor do I mean to make a claim about which group you personally are in.
Your disagreement is mirrored almost exactly in Yudkowsky's post Zombies Redacted. The crucial point (as mentioned also in Hastings' sister comment) is that the thought experiment breaks down as soon as you consider the zombies making just the same claims about consciousness as we do, while not actually having any coherent reason for making such claims (as they are defined to not have consciousness in the first place). I guess you can imagine, in some sense, a scenario like that, but what's the point of imagining a hypothetical set of physical laws that lack internal coherence?
Suppose scientists one day solve the problem of consciousness according to one of these definitions, in a way that can be readily understood by any reasonably intelligent interested layman. I think it's quite likely that many adherents of other definitions would then come around and say, "ah, great job, they finally figured it out! This is exactly what I meant by 'consciousness' all along." For example, if an explanation of the first-person subjective experience of pleasure and pain (no. 6) were available, it would probably explain perception-of-perception (no. 7) as well, or at least explain why that seemed like a reasonable interpretation of consciousness.
If so, it would support the conclusion that most people are in fact referring to the same phenomenon (and it is this phenomenon that they find morally valuable), and that they refer to it in different terms mostly because the problems surrounding consciousness remain generally unsolved, the terminology unclear, etc.
Then there is an alliance not around the conflation of properly distinct concepts, but around the resistance to over-specification of something which is as yet too unclear to specify to such an extent. Very speculatively we might suppose this is an instance of a general social process that lets debate take place at the proper level of specificity, without hindering participants from forming and using their own tentative definitions.
Perhaps instead of a tree it would be better to have a directed acyclic graph, since IME even if the discussion splits off into branches, one often wants at some point to respond to multiple endpoints with one single comment. But I don't know if there's really a better UI for that shape of discussion than a simple flat thread with free linking/quoting of earlier parts of the discussion; 4chan's is the best implementation of that I have seen.
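To make the tree-vs-DAG distinction concrete, here is a minimal sketch (the field names and example comments are invented): the only structural change needed is that a comment carries a list of parent ids rather than a single one.

```python
# Hypothetical sketch of DAG-shaped threading: a comment may reply to
# several parents at once, unlike a tree where it has exactly one.

from dataclasses import dataclass, field

@dataclass
class Comment:
    cid: int
    text: str
    parents: list = field(default_factory=list)  # multiple parents allowed

thread = {
    1: Comment(1, "original post"),
    2: Comment(2, "branch A", parents=[1]),
    3: Comment(3, "branch B", parents=[1]),
    4: Comment(4, "reply addressing both branches", parents=[2, 3]),
}

def replies_to(cid):
    """All comments that list `cid` among their parents."""
    return [c.cid for c in thread.values() if cid in c.parents]

assert replies_to(2) == [4] and replies_to(3) == [4]  # comment 4 closes both branches
```

The data model is trivial; the hard part, as noted above, is rendering it: a comment with two parents has no single natural position in an indented tree view, which is why flat threads with quoting handle this shape more gracefully.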