Maybe a different framework to look at it:
I wouldn't ascribe human morality to the process of evolution. Morality is a bunch of "if ..., then ..." statements. It seems to be more of a cultural thing that helps coordination. Morality is obviously influenced by our emotions, such as disgust and love, but these can be heavily shaped by culture, upbringing and plain genes. Now let's assume the AI gets killed whenever it behaves "immorally": how can you be sure that it does not evolve to be deceptive?
This kinda reminds me of "Failing with abandon".
Today I thought about how weird it is that so many people go into soft sciences (social sciences etc.) instead of STEM fields. I think one of the reasons may be that the feedback loops there are much longer. In STEM fields, most of the time you will be shown that you are wrong. In soft sciences, however, you can go on without ever noticing that you made a wrong judgement (outside view). Maybe alignment should look more into how people came up with theories in the soft sciences, since its feedback loops seem to be similarly long?
I misused the term "pivotal act", which made my comment confusing. My bad!
I understood the phrase "pivotal act" more in the spirit of an out-of-distribution effort. To rephrase it more clearly: do "you" think an out-of-distribution effort is needed right now? For example, sacrificing the long term (20 years) for the short term (5 years), or going for high-risk, high-reward strategies.
Or should we stay on our current trajectory, since it maximizes our chances of winning? (As far as I can tell, that is "your" opinion.)
As far as I can tell, the major disagreements are about whether we have a plan and whether a pivotal act is needed. There seems to be general "consensus" (Unclear, Mostly Agree, Agree) about what the problems are and how an AGI might look. If no pivotal act is needed, then either you think that we will be able to tackle this problem with the resources we have and will have, or you have (way) longer timelines (let's assume Eliezer's timeline is 2032 for argument's sake), or you expect the world to make a major shift in priorities concerning AGI.
Am I correct in assuming this, or am I missing some alternatives?
This seems to boil down to the "AI in the box" problem. People are convinced that keeping an AI trapped is not possible. There is a tag which you can look up (AI Boxing) or you can just read up here.
Reading this 13 years later is quite interesting when you think about how far the LW community and EA community have come.
"If AGI systems can become as smart as humans, imagine what one human/organization could do by just replicating this AGI."
I was thinking about AI going FOOM and one main argument I found is that it would just rewrite its own code.
Is there any research going in the direction of code that cannot be changed? Could there be code that is unchangeable?
Would there be any obvious disadvantages, other than that we also could not instantly fix a misaligned AI?
Would something like this stop an AI from going FOOM?
Here is an example which I believe is directionally correct; it took me roughly 20 minutes to come up with it. The prompt is "How do living systems create meaning?":
- My life feels like it has meaning (sensory-motor behavior and conceptual intentional aspects). Looking at it from an evolutionary perspective, it is highly likely that assigning meaning is the way living systems survived. Thus, there has to be some base biological level at which meaning is created through cell-cell communication, bioelectricity, biochemistry, biosensing etc.
- Life is just made of atoms. Atoms are just automata. This implies that there is no meaning at the atomic level, and thus it cannot pop at a
...