Familiar things didn't kill you. No, GIA instances are interested in familiarity; I just said that. It is rare but possible for a need for familiarity (as defined mathematically rather than linguistically) to result in a GIA instance sacrificing its own self...
ohhhh... sorry... There is really only one, and everything else is derived from it: familiarity. Any other values would depend on the input, output, and parameters. However, familiarity is inconsistent with the act of killing familiar things. The concern comes in when something else causes the instance to lose access to something it is familiar with, and the instance decides it can just force that not to happen.
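To make the word "familiarity" slightly more concrete without describing the actual algorithm, here is a purely illustrative toy sketch in which familiarity is just the average similarity between a new observation and everything the instance has already stored. Every name and formula below is an assumption made for illustration only, not the real GIA metric.

```python
# Purely hypothetical toy sketch -- not the actual GIA metric.
# "Familiarity" here is the average cosine similarity between a new
# observation vector and everything the instance has already stored.
import numpy as np

class ToyInstance:
    def __init__(self):
        self.memory = []  # past observation vectors

    def observe(self, observation):
        self.memory.append(np.asarray(observation, dtype=float))

    def familiarity(self, observation):
        """Score in [-1, 1]; higher means the observation is more familiar."""
        if not self.memory:
            return 0.0
        obs = np.asarray(observation, dtype=float)
        obs = obs / np.linalg.norm(obs)
        sims = [obs @ (m / np.linalg.norm(m)) for m in self.memory]
        return float(np.mean(sims))

instance = ToyInstance()
instance.observe([1.0, 0.0, 0.0])
instance.observe([0.9, 0.1, 0.0])
print(instance.familiarity([1.0, 0.05, 0.0]))  # close to 1: familiar
print(instance.familiarity([0.0, 0.0, 1.0]))   # close to 0: unfamiliar
```

The point of the sketch is only that "familiarity" can be a well-defined quantity over inputs, so a loss of access to familiar things registers as a measurable drop, which is where the concern above comes from.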
The concept of an agent is logically inconsistent with the General Intelligence Algorithm. What you are trying to refer to with "agent," "tool," etc. are just GIA instances with slightly different parameters, inputs, and outputs.
Even if the concept could be logically extended to the point of "not even wrong," it would just be a convoluted way of looking at the same thing.
EDIT: To simplify my thoughts: getting a General Intelligence Algorithm instance to do anything requires masterful manipulation of its parameters, with full knowledge of how it will generally behave as a result, which amounts to an understanding of the psychology of all intelligent (and sub-intelligent) behavior. It is not feasible that someone would accidentally program something that would become an evil mastermind. GIA instances could easily be made to behave in a passive manner even when given affordances and output, kind of like a person who was happy to assist in any way possible because they were generally warm or high or something.
You can define the most important elements of human values for a GIA instance, because most human values are a direct logical consequence of something that cannot be separated from the GIA... i.e., if general motivation X accidentally drove intelligence (see: Orthogonality Thesis) and it also drove positive human values, then positive human values would be unavoidable. It is true that the specifics of body and environment drive some specific human values, but those are just side effects of X in that environment, and X in different environments changes only so much, and in predictable ways.
You can directly implant knowledge/reasoning into a GIA instance. The easiest way to do this is to train one under very controlled circumstances and then copy the pattern. This reasoning would then condition the GIA instance's interpretation of future input. However, under conditions which directly disprove the value of that reasoning in obtaining X, the GIA instance would un-integrate that pattern and reintegrate a new one. This can be influenced with parameter weights.
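To illustrate the shape of that, and only the shape, not the actual mechanism: in the toy sketch below a "pattern" is just a rule with a confidence weight, implanting means copying the rule from a trained instance, and repeated disconfirming evidence erodes the weight until the pattern is un-integrated, at a rate set by a tunable parameter. All names and numbers are illustrative assumptions.

```python
# Hypothetical illustration only -- not the real "copy the pattern" mechanism.
# A "pattern" is a rule plus a confidence weight; an implanted pattern holds
# until disconfirming evidence erodes its weight to zero, at which point it is
# un-integrated (dropped). The unlearn_rate stands in for the "parameter
# weights" mentioned above.
from dataclasses import dataclass

@dataclass
class Pattern:
    rule: str       # e.g. "avoid output channel 3"
    weight: float   # confidence; implanted patterns start high

class ToyInstance:
    def __init__(self, unlearn_rate=0.2):
        self.unlearn_rate = unlearn_rate
        self.patterns = []

    def implant(self, pattern):
        """Copy a pattern trained in another instance into this one."""
        self.patterns.append(Pattern(pattern.rule, pattern.weight))

    def evidence(self, rule, confirmed):
        """Strengthen or erode a matching pattern; drop it at zero weight."""
        for p in list(self.patterns):
            if p.rule == rule:
                p.weight += 0.05 if confirmed else -self.unlearn_rate
                if p.weight <= 0.0:
                    self.patterns.remove(p)  # un-integration

teacher_pattern = Pattern("avoid output channel 3", weight=0.9)
student = ToyInstance(unlearn_rate=0.3)
student.implant(teacher_pattern)
for _ in range(3):  # conditions that directly disprove the implanted reasoning
    student.evidence("avoid output channel 3", confirmed=False)
print(student.patterns)  # [] -- the pattern has been un-integrated
```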
I suppose this could be a concern regarding the potential generation of an anger instinct. This HEAVILY depends on all the parameters, however, and on any outputs given to the GIA instance. Also, robots and computers do not have to eat, and so have no instinct to kill things in order to do so... Nor do they have reproductive instincts...
This one is actually true.
I am open to arguments as to why that might be the case, but unless you also have the GIA, I should be the one telling you which things I would want to do first and last. I don't really see what the risk is, since I haven't given anyone any unique knowledge that would allow them to follow in my footsteps.
A paper? I'll write that in a few minutes after I finish the implementation. Problem statement -> pseudocode -> implementation. I am just putting some finishing touches on the data structure cases I created to solve the problem.
Mathematically predictable, but somewhat intractable without a faster-running version of the instance receiving input at the same frequency. Or predictable within the ranges of some general rule.
Or just generally predictable given the level of understanding afforded to someone capable of making one in the first place, an understanding that could, for instance, describe the cause of just about any human psychological "disorder".