I'm not surprised my submission did badly, since it was the easiest thing I could quickly come up with after seeing that I was already late. I wasn't quite expecting to be unable to come up with anything better, though. After looking at other people's comments, I'm particularly disappointed that it never once crossed my mind to try analyzing single-soldier combats. I was explicitly trying to figure out the effect of one soldier of each weapon, and I had a histogram of the number of soldiers per combat from which I could easily have gleaned that there were lots of single-soldier combats to investigate, had I thought to do so. Instead, I tried to analyze the win rates of (some combination of weapons) vs (some combination of weapons) + (1 more of the weapon I'm trying to investigate). I ran into trouble because that extra soldier is also correlated with an increased alien threat, and I didn't know how to tease the two effects apart.
I misremembered the May 6 date as May 9, but luckily other people have been asking for more time, so it seems I might not be late.
The average number of soldiers the Army sends looks linear in the number of aliens. A linear regression gives the coefficients: 0.40 soldiers by default + 0.66 per Abomination + 0.32 per Crawler + 0.16 per Scarab + 0.81 per Tyrant + 0.49 per Venompede. From here, the log-odds of victory looks like a linear function of the difference between the actual number of soldiers and the expected number of soldiers.
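To make the regression concrete, here is a minimal sketch that plugs the coefficients quoted above into a small calculator. The function names and the example encounter are made up for illustration. No slope for the log-odds relationship is given, so the sketch stops at the soldier surplus (actual minus expected) and does not try to turn it into a win probability.

```python
# Coefficients from the linear regression quoted above.
BASE_SOLDIERS = 0.40
SOLDIERS_PER_ALIEN = {
    "Abomination": 0.66,
    "Crawler": 0.32,
    "Scarab": 0.16,
    "Tyrant": 0.81,
    "Venompede": 0.49,
}


def expected_soldiers(aliens):
    """Average number of soldiers the Army sends against this alien mix.

    `aliens` maps species name -> count, e.g. {"Tyrant": 2}.
    """
    return BASE_SOLDIERS + sum(
        SOLDIERS_PER_ALIEN[species] * count for species, count in aliens.items()
    )


def soldier_surplus(actual_soldiers, aliens):
    """Actual soldiers minus the expected number; the quantity the
    log-odds of victory appears to be linear in."""
    return actual_soldiers - expected_soldiers(aliens)


# Hypothetical encounter, not taken from the dataset.
example = {"Tyrant": 2, "Scarab": 5}
print(expected_soldiers(example), soldier_surplus(5, example))
```

A positive surplus means more soldiers than the Army would typically send against that alien mix, which under this model should raise the odds of victory.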
Based on no evidence at all, I will assume this generalizes to the individual weapon types and that each additional soldier of each weapon type increases the odds of victory by some fixed amount depending on the composition of the aliens, but not dependent on the other soldiers already present.
Here's a guess that can definitely be improved upon, though I don't know if I will:
7 Thermo-Torpedos
It's definitely wrong because weapon diversity clearly helps but my model makes that impossible. I'm pretty sure my assumption that the effectiveness of each marginal soldier is the same and only depends on the aliens is wrong, even though it does look true when averaged over all weapon and alien types.
Initial observations characterizing the data
The PGFDA seems to treat all weapon types completely interchangeably. All weapon types appear equally often and with the same distribution, and there are no correlations between different weapon types or between weapon types and alien species in the past missions. The only tactical decision they make is to send more soldiers when there are more aliens.
The alien species also seem to be acting independently of each other. They each have different distributions in the number of individuals per encounter, but each species shows up in about 100,000 encounters and there are no correlations between the presence of any alien species and that of any other.
Victory is somewhat correlated with the number of soldiers, which makes sense, but isn't correlated with specific weapon or alien types. I would guess that each weapon is strong against certain aliens and weak against others, or maybe some weapon combinations synergize and others interfere with each other, such that they all come out to the same average effectiveness when chosen at random like the PGFDA and AM are doing.
Tentative guesses:
Nothing else is standing out to me so I just threw some linear regression at it and added 1.22 times the standard deviation of the residual to be safe.
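For anyone who wants to reproduce the "regression plus a safety margin" idea, here is a minimal sketch. It assumes a single predictor and uses the population standard deviation of the residuals; the comment doesn't say which predictors or which standard deviation were used, so both are assumptions, and the function names and data are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def padded_prediction(xs, ys, x_new, k=1.22):
    """Regression prediction at x_new plus k residual standard deviations."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return slope * x_new + intercept + k * pstdev(residuals)


# Hypothetical data: some feature of each turtle vs. its weight.
feature = [1, 2, 3, 4, 5]
weight = [2.1, 3.9, 6.2, 7.8, 10.1]
print(padded_prediction(feature, weight, x_new=6))
```

Padding upward makes sense when underestimating costs more than overestimating; the multiplier `k` controls how cautious the estimate is.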
I completely ignored the greenish-gray turtles because His Malevolence didn't have any and there weren't that many of them in the data; I hope that wasn't a mistake. It bothers me that I can't figure out anything regarding nostril size. From a meta perspective, I feel like there wouldn't be two irrelevant columns given that fangs was already redundant. Everything else was at least somewhat correlated with weight.
I've been loving reading these for a while and figured I'd give it a shot for once.
Random early observations
Edit:
There are way too many green turtles with 6 shell segments, and they all have no wrinkles, normal nostril size, no miscellaneous abnormalities, and weight 20.4.
Interesting. I agree with all your reasoning, but my plausibility judgements of the implications seem opposite to yours and I came away with the opposite conclusion that well-being is clearly capped.
I think you and the linked post might have mismatching definitions of reward. It seems like your definition is that reward is what the AI values, but the linked post uses reward to mean the reward function specified by the programmers that is used to train the AI.
As for using FLOP as a plural noun, that's how other units work. We use 5 m for 5 meters, 5 s for 5 seconds, 5 V for 5 volts, etc. so it's not that weird.
If human behaviour is fully determined by the laws of the universe, then you have no choice in whether you assign moral blame or not so it doesn't make sense to discuss whether we should or shouldn't do that.
From looking at every ingredient-result pair and picking the ones which appear with Barkskin Potion more often than would be expected if they were completely independent, I'm going to suggest the ingredients:
1. Crushed Onyx
2. Demon Claw
3. Ground Bone
4. Quicksilver
5. Troll Blood
I'm very certain about Crushed Onyx and Ground Bone which appear in every successful Barkskin Potion, less so about the rest.
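The "more often than expected under independence" check can be sketched as a lift calculation: the observed rate of (ingredient, result) pairs divided by the rate you'd get if the two were independent. This is only an illustration of the idea, not necessarily the exact computation I ran; the function name, the row format, and the tiny dataset below are all hypothetical.

```python
from collections import Counter


def lift_by_ingredient(rows, target):
    """rows: list of (set_of_ingredients, result) pairs.

    Returns {ingredient: P(ingredient and target) / (P(ingredient) * P(target))}.
    Values above 1 mean the ingredient co-occurs with the target result
    more often than independence would predict.
    """
    n = len(rows)
    p_target = sum(1 for _, result in rows if result == target) / n
    if p_target == 0:
        return {}  # target never appears, so lift is undefined
    used = Counter()
    used_with_target = Counter()
    for ingredients, result in rows:
        for ingredient in ingredients:
            used[ingredient] += 1
            if result == target:
                used_with_target[ingredient] += 1
    return {
        ingredient: (used_with_target[ingredient] / n)
        / ((used[ingredient] / n) * p_target)
        for ingredient in used
    }


# Hypothetical rows, not the real dataset.
rows = [
    ({"Crushed Onyx", "Ground Bone", "Troll Blood"}, "Barkskin Potion"),
    ({"Crushed Onyx", "Ground Bone", "Demon Claw"}, "Barkskin Potion"),
    ({"Quicksilver", "Giant's Toe"}, "Other"),
    ({"Ground Bone", "Vampire Fang"}, "Other"),
]
lift = lift_by_ingredient(rows, "Barkskin Potion")
for ingredient, value in sorted(lift.items(), key=lambda kv: -kv[1]):
    print(ingredient, round(value, 2))
```

Lift only looks at one ingredient at a time, so it can't tell a genuinely required ingredient from one that merely tends to be used alongside a required one.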
Edit: if I filter to only successful Barkskin Potions without any of the unavailable ingredients, that leaves 42 rows. None of them use Quicksilver, and instead Giant's Toe and Vampire Fang are looking pretty good. Not sure how much weight to put on this but something to investigate later.