Sometimes, people propose "experiments" with new norms, policies, etc. that don't have any real means of evaluating whether the policy actually succeeded.
This should be viewed with deep skepticism -- it often seems to me that such an "experiment" isn't really an experiment at all, but rather a means of sneaking a policy in by implying that it will be rolled back if it doesn't work, while making no real provision for evaluating whether it actually worked.
In the worst cases, the process of running the experiment can involve taking measures that prevent the experiment from actually being rolled back!
Here are some examples of the sorts of thing I mean:
- Management at a company decides that it's going to "experiment with" an open floor plan at a new office. The office layout and space chosen make it so that even if the open floor plan proves detrimental, it will be very difficult to switch back to a standard office configuration.
- The administration of an online forum decides that it is going to "experiment with" a new set of rules in the hopes of improving the quality of discourse, but doesn't set any clear criteria or timeline for evaluating the "experiment", or say what measures might actually indicate "improved discourse quality".
- A small group that gathers for weekly chats decides to "experiment with" adding a few new people to the group, but doesn't have any probationary period, method for evaluating whether someone's a good fit or removing them if they aren't, etc.
Now, I'm not saying that one should have to register a formal plan for evaluation with timelines, metrics, etc. for every change you make or program you want to try out -- but you should have at least some idea of what it would look like for the experiment to succeed and what it would look like for it to fail, and for things that are enough of a shakeup, more formal or established metrics might well be justified.
Indeed, these aren't controlled experiments at all, but sometimes they are also not policy-sneaking. Sometimes people are just using the phrase "experimenting with" in place of "trying out" to frame policy-implementation. At that point, the decision has already been made to try (not necessarily to assess whether trying is a good idea; it's already been endorsed as such), and presumably the conditions for going back to the original version are: 1) it leads to obviously bad results on the criteria "management" was looking at to motivate the change in the first place, or 2) it leads to complaints among the underlings.
The degree of skepticism, then, really just depends on your prior for whether the change will be effective, just like anything else. Whether there should have been more robust discussion depends either on the polarity of those priors (imagine a boardroom where someone raises the change and no one really objects vs. one where another person suggests forming an exploratory committee to discuss it further), or on whether you believe more people should have been included in the discussion ("you changed the bullpen without asking any of the bulls?!"). It has little to do with the fact that it was labeled an experiment, since again, the word is likely being used as business-speak rather than as a premeditated ploy. I would love to have data on that, though: do people who specifically refer to experimentation when they could just use a simpler word tend to use it innocuously or in a sneaky way?