With self-reflective abilities, the system can also consider paths including self-modification in reaching its goal. Some of those paths may be highly unintuitive for humans, so we wouldn't notice some possible dangers. Self-modification may also remove some safety mechanisms.
A system that explores many paths can find a solutions humans woudln't notice. Such "creativity" at object level is relatively harmless. Google Maps may find you a more efficient path to your work than the one you use now, but that's okay. Maybe the path is wrong for some reasons that Google Maps does not understand (e.g. it leads through a neighborhood with high crime), but at least on general level you understand that such is the risk of following the outputs blindly. However, similar "creativity" at self-modification level can have unexpected serious consequences.
"the system can also", "some of those paths may be", "may also remove". Those are some highly conditional statements. Quantify, please, or else this is no different than "the LHC may destroy us all with a mini black hole!"
In the spirit of "satisficers want to become maximisers" here is a somewhat weaker argument (growing out of a discussion with Daniel Dewey) that "tool AIs" would want to become agent AIs.
The argument is simple. Assume the tool AI is given the task of finding the best plan for achieving some goal. The plan must be realistic and remain within the resources of the AI's controller - energy, money, social power, etc. The best plans are the ones that use these resources in the most effective and economic way to achieve the goal.
And the AI's controller has one special type of resource, uniquely effective at what it does. Namely, the AI itself. It is smart, potentially powerful, and could self-improve and pull all the usual AI tricks. So the best plan a tool AI could come up with, for almost any goal, is "turn me into an agent AI with that goal." The smarter the AI, the better this plan is. Of course, the plan need not read literally like that - it could simply be a complicated plan that, as a side-effect, turns the tool AI into an agent. Or copy the AI's software into a agent design. Or it might just arrange things so that we always end up following the tool AIs advice and consult it often, which is an indirect way of making it into an agent. Depending on how we've programmed the tool AI's preferences, it might be motivated to mislead us about this aspect of its plan, concealing the secret goal of unleashing itself as an agent.
In any case, it does us good to realise that "make me into an agent" is what a tool AI would consider the best possible plan for many goals. So without a hint of agency, it's motivated to make us make it into a agent.