As the internal ontology takes on reflective aspects (parts of the representation that mix with facts about the AI's own internals), I expect to find much larger differences.
It could be worth exploring reflection in transparency-based AIs, whose internals are observable. We can train a learning AI that only learns concepts by grounding them in its own internals (consider the example of a language-based AI learning a representation that links saying words to its own output procedure). Even if AI-learned concepts do not coincide with human concepts, becaus...
I like your arguments on AGI timelines, but the last section of your post feels like you are reflecting on something I would call "civilization improvement" rather than on a 20+ year plan for AGI alignment.
I am a bit confused by the way you conflate "civilization improvement" with a strategy for alignment (when you discuss enhanced humans solving alignment, or discuss empathy in communicating the message "If you and people you know succeed at what you're trying to do, everyone will die"). Yes, given longer timelines, civilization improveme...
I am also interested in interpretable ML. I am developing artificial semiosis, a human-like AI training process that can achieve aligned (transparency-based, interpretability-based) cognition. You can find an example of the algorithms I am making here: the AI runs a non-deep-learning algorithm, does some reflection, and forms a meaning for someone “saying” something, a meaning different from the usual human meaning, but perfectly interpretable.
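To give a flavour of the general idea (this is only a toy sketch of my own, not the algorithm linked above; the class and attribute names are invented for illustration), here is how an agent whose output procedure is fully observable could ground the meaning of “saying” in that very procedure. This also connects to the earlier point about a language-based AI linking saying words to its own output procedure.

```python
# Toy sketch (not the algorithm linked above): grounding the meaning of
# "saying" in the agent's own, fully observable output procedure.

class ReflectiveAgent:
    def __init__(self):
        self.trace = []    # observable log of internal events
        self.lexicon = {}  # word -> meaning grounded in those events

    def output(self, text):
        # The output procedure records its own activation, so the agent
        # can later reflect on what it did.
        self.trace.append(("output_procedure", text))
        print(text)

    def reflect(self):
        # Reflection: link the word "saying" to the internal procedure
        # that produced the most recent utterance.
        if self.trace:
            event, text = self.trace[-1]
            self.lexicon["saying"] = f"activation of {event} with argument {text!r}"

agent = ReflectiveAgent()
agent.output("hello")
agent.reflect()
print(agent.lexicon["saying"])
```

The resulting “meaning” is not the human one, but anyone inspecting agent.lexicon can see exactly what it is grounded in.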
I therefore support the case for differential technological development:
...There are two counter-arguments to this th
In his paper, Searle puts forward a number of arguments.
Early in his argumentation, referring to the Chinese room, Searle makes this argument (which I ask you not to conflate carelessly with his later arguments):
it seems to me quite obvious in the example that I do not understand a word of the Chinese stories. I have inputs and outputs that are indistinguishable from those of the native Chinese speaker, and I can have any formal program you like, but I still understand nothing. For the same reasons, Schank's computer understands nothing of any stories....
Uhm, an Aboriginal tends to see meaning in anything. The more regularities there are, the more meaning she will form. Semiosis is the dynamic process of interpreting these signs.
If you were put in a Chinese room with no input other than some incomprehensible scribbles, you would probably start to consider that what you are doing does indeed have a meaning.
Of course, a less intelligent human in the room or a human put under pressure would not be able to understand Chinese even with the right algorithm. My point is that the right algorithm enables the right human to understand Chinese. Do you see that?
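To make the point about regularities a bit more concrete, here is a toy sketch of my own (not part of the thought experiment; regularity_score and the symbol streams are invented for illustration). It computes a crude score for how much repeated structure a stream of otherwise incomprehensible symbols contains, which is exactly the kind of signal the person in the room could start hanging interpretations on.

```python
import random
from collections import Counter

def regularity_score(symbols, n=3):
    """Crude regularity measure: fraction of n-grams that occur more than once."""
    ngrams = [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / max(len(ngrams), 1)

structured = list("ABABCABABCABABC" * 4)               # scribbles generated by hidden rules
random.seed(0)
noise = random.choices("ABCDEFGH", k=len(structured))  # scribbles with no rules

print(regularity_score(structured))  # high: plenty of repeated patterns to interpret
print(regularity_score(noise))       # much lower: little structure to hang meaning on
```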
A more accurate summary would read as follows:
1. P is an instantiated algorithm that behaves as if it [x]. (Where [x] = “understands and speaks Chinese”.)
2. If we examine P, we can easily see that its inner workings cannot possibly explain how it could [x].
3. Therefore, the fact that humans can [x] cannot be explained by any algorithm.
I have a problem with your formulation. The fact that P does not understand [x] appears nowhere in it, not even in premise #1. Conclusion #3 is wrong and should be written as "the fact that humans can [x] cannot ...
SCA infers that "somebody wrote that", where the term "somebody" is used more generally than in English.
SCA does not infer that another human being wrote that, but rather that a causal agent wrote that, maybe the spirits of the caves.
If SCA enters two caves and observes natural patterns in cave A and the characters of "The Adventures of Pinocchio" in cave B, she may deduce that two different spirits wrote them. Although she may discover some patterns in what spirit A (natural phenomena) wrote, she won't be able to discover a g...
TruePath, you are mistaken: my argument addresses the main issue of explaining computer understanding (moreover, it seems that you are confusing the Chinese room argument with the “system reply” to it).
Let me clarify. I could write the Chinese room argument as the following deductive argument:
1) P is a computer program that does [x]
2) There is no computer program sufficient for explaining human understanding of [x]
=> 3) Computer program P does not understand [x]
In my view, assumption (2) is not demonstrated and the argument should be ...
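To make the structure of that deduction explicit, here is a minimal Lean sketch (my own reconstruction; the predicates and the bridging premise are modelling assumptions that the informal argument leaves implicit):

```lean
-- Minimal sketch (my own reconstruction, not in the original comment):
-- the deduction needs an implicit bridging premise, and its force rests
-- entirely on premise (2).
variable (Program : Type)
variable (understands : Program → Prop)  -- "Q understands [x]"
variable (explains : Program → Prop)     -- "Q suffices to explain human understanding of [x]"

example (P : Program)
    -- premise (2): no computer program suffices to explain human understanding of [x]
    (h2 : ∀ Q : Program, ¬ explains Q)
    -- implicit bridge: a program that understood [x] would suffice to explain it
    (bridge : ∀ Q : Program, understands Q → explains Q) :
    ¬ understands P :=
  fun h => h2 P (bridge P h)
```

Drop h2 (or the bridge) and the proof no longer goes through, so the whole weight of the argument rests on premise (2).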
Daniel, I'm curious too. What do you think about Fluid Construction Grammar? Can it be a good theory of language?
cousin_it, aren't you forgetting that the rules of the Chinese Room are different from those of Turing's imitation game? While Turing does not let you into the other test room, Searle grants you complete access to the code of the program. If you could really work out a (Chinese) brain digital upload, you could develop a theory of consciousness/intelligence/intentionality from it. Unfortunately, artificial neural networks bear no connection to the brain, just as ELIZA bears no connection to a human!
I am skeptical about your theory of impact for investigating the question of which concepts would be convergent across minds, specifically your expectation that concepts validated through linguistic conventions may assist in non-ad-hoc interpretability of deep learning networks. Yet, I am interested in investigating semantics for the purpose of alignment. Let me try to explain how my model differs from yours.
First, for productively studying semantics, I recommend keeping a distinction between a semantics for vision (as the prototypical sensory input) and o...