Thanks for your comments!
My sense is that you are writing this as someone without a lot of experience in writing and publishing scientific articles (correct me if I am wrong).
You're right that I haven't published any scientific articles -- my publication experience is entirely in academic philosophy, and my suggestions are based on my frustrations there. This may be a much more reasonable proposal for academic philosophy than for other disciplines, since philosophy deals more with conceptually nebulous issues and has fewer objective standards.
...linearly present
I don’t follow. How is it easier (or more special as an opportunity) to decide how to relate to an AI system than to a chicken or a distant human?
I think that our treatment of animals is a historical problem. If there were no animals, if everyone were accustomed to eating vegetarian meals, and then you introduced chickens into the world, I believe people wouldn't be inclined to stuff them into factory farms and eat their flesh. People do care about animals where they are not complicit in harming them (whaling, dog fighting), but it is hard for most people to extend that concern to practices they are already complicit in.
Turing test is sentient
I'm not sure why we should think that the Turing test provides any evidence regarding consciousness. Dogs can't pass the test, but that is little reason to think they're not conscious. Large language models might be able to pass the test before long, but it looks like they're doing something very different inside, and so the fact that they can hold conversations is little reason to think they're anything like us. There is a danger in being too conservative. Sure, assuming sentience may avoid causing unnecessary harm, but it comes at a real cost if the systems in question aren't sentient at all.
I'm not particularly worried that we may harm AIs that do not have valenced states, at least in the near term. The issue is more about precedent and expectations going forward. I would worry about a future in which we create and destroy conscious systems willy-nilly, because of how it might affect our understanding of our relationship to them, and ultimately how we act toward AIs that do have morally relevant states. These worries are nebulous, and I very well might be wrong to be so concerned, but it feels risky to rush into things.
We've been struggling with natural consciousnesses, both human and animal, for a long long time, and it's not obvious to me that artificial consciousness can avoid any of that pain.
You're right, but there are a couple of important differences:
Thanks for writing this up! It's great to see all of the major categories laid out after having thought about it for a while. Given the convergence, does this change your outlook on the problem?
If you try to give feedback during training, there is a risk you'll just reward it for being deceptive. One advantage to selecting post hoc is that you can avoid incentivizing deception.
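To make the contrast concrete, here is a minimal toy sketch (the names, numbers, and "evaluator" are entirely made up, not anyone's actual training setup) of the two regimes: an evaluator whose verdict feeds back into training, versus the same evaluator applied only as a post-hoc filter.

```python
import random

def looks_honest(model):
    """Hypothetical evaluator: returns True when the model *appears* honest.
    A deceptive model can satisfy this check without being honest."""
    return random.random() < model["apparent_honesty"]

def train_with_feedback(model, steps=100):
    """In-loop feedback: whatever satisfies the evaluator gets reinforced,
    so 'appearing honest' is optimized directly, deception included."""
    for _ in range(steps):
        if looks_honest(model):
            model["apparent_honesty"] = min(1.0, model["apparent_honesty"] + 0.01)
    return model

def select_post_hoc(candidates):
    """Post-hoc selection: training never saw the evaluator, so there was no
    pressure to game it; we only filter the finished candidates."""
    return [m for m in candidates if looks_honest(m)]

# Toy usage
trained = train_with_feedback({"apparent_honesty": 0.5})
survivors = select_post_hoc([{"apparent_honesty": random.random()} for _ in range(10)])
```

The point is just the incentive structure: in the first function the check shapes what gets reinforced, in the second it only decides what gets kept.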
Interesting proposal!
Is there a reason you have it switch to the human net just once in the middle?
I would worry that the predictor might switch ontologies as time goes on. Perhaps, to make the best use of the human compute time, it reasons in a human ontology up until n/2. Once the threat of translation is past, it might switch to its own ontology from n/2 to n. If so, the encoder that works up to n/2 might be useless thereafter. A natural alternative would be to have it switch back and forth some random number of times at random intervals.
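Here is a rough sketch of what I mean by that alternative (purely illustrative, with made-up function names and parameters):

```python
import random

def random_switch_schedule(n, max_switches=5):
    """Pick a random number of switch points at random positions in [1, n),
    so no prefix of the rollout is 'safe' from translation."""
    k = random.randint(1, max_switches)
    return sorted(random.sample(range(1, n), k))

def which_net(t, switches):
    """Which net is active at step t: start on the predictor's own net and
    flip at every switch point."""
    flips = sum(1 for s in switches if s <= t)
    return "human" if flips % 2 == 1 else "predictor"

# Example: a length-1000 rollout with a random back-and-forth schedule.
schedule = random_switch_schedule(1000)
print(schedule, which_net(750, schedule))
```

Since the number and timing of switches are unpredictable, reasoning in its own ontology at any point risks landing inside a segment that gets run through the human net.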
This is a very interesting change (?) from earlier models. I wonder if this is a poetry-specific mechanism given the amount of poetry in the training set, or the application of a more general capability. Do you have any thoughts either way?