Thank you for posting this.
The need to join to other records was trivial for me.
I think the data model was too complex to fully decipher with a reasonable amount of effort, but this wasn't a problem as it wasn't necessary to get a decent answer (I might actually have got the optimal one if I hadn't blundered and missed that Italia suffered a famine in the previous year - though I was uncertain on a number of points and wasn't expecting to do as well as I did). In particular the wealth/population dependency completely passed me by.
Overall in terms of difficulty it felt OK.
If you might have played it but decided not to, what drove you away?
I set up the same kind of thing that abstractapplic did:
I created a sub-df for each province, reset their indices, then recombined; and for "does this predict that with a lag of N years?" investigations, I shifted one of the sub-dfs back by N before recombining.
then bounced off because while I had decent ideas of what I wanted to look for, I never got excited enough to get past the activation energy of trying to look for it.
My guesses about why include:
Not certain any of this is necessarily bad, but it's where my friction was.
I found this challenge difficult and awkward due to the high number of possible response-predictor pairs (disaster A in province B is predicted by disaster/omen X in province Y with a Z-year delay), low number of rows (if you look at each province separately there are only 1080 records to play with), and probabilistic linkages (if events had predicted each other more reliably, the shortage of data would have been less of an issue).
This isn't necessarily a criticism - sometimes reality is difficult and awkward, and it's good to prepare for that - and I get that it's incongruous to hear "it's too hard!" from the person who took second place out of a cohort that all did much better than random. Still, I think this problem would have been more approachable if we'd had fewer predictors and/or more data.
Misc other thoughts:
Rain of Fish is random. Sometimes fish just fall out of the sky. This is a thing that happens. It has a 2% chance of happening any year in any province.
Example of the problems caused by too few rows per column: I managed to convince myself there was a weak but solid connection between Rain of Fish and Plague in some provinces. (In my defense, it made intuitive sense that having rapidly-decaying fish all over your territories might make people sick.)
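To put a rough number on how little data that is: a minimal sketch, assuming each of the 1080 yearly records for a province is an independent draw at the stated 2% rate. The helper name `expected_count_and_sd` is made up for illustration.

```python
import math

def expected_count_and_sd(n_rows: int, p: float) -> tuple:
    """Mean and standard deviation of a Binomial(n_rows, p) event count."""
    mean = n_rows * p
    sd = math.sqrt(n_rows * p * (1 - p))
    return mean, sd

# 1080 records per province and a 2% yearly chance, both taken from the thread.
mean, sd = expected_count_and_sd(1080, 0.02)
print(f"Rain of Fish per province: about {mean:.1f} +/- {sd:.1f} events")
```

With only a couple of dozen events per province, a handful of chance co-occurrences with another rare disaster is enough to look like a pattern.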
When the Titans rage against the bars of their prisons, two things happen:
- There is an Earthquake in the province of their prison.
- One or more fragments of them escape (the first into the province where they are imprisoned, additional ones into adjacent provinces). These fragments look to mortals like black doves, but carry fragments of Titanic malice.
. . . so my joke answer of "earthquakeproof every province, including the ones that don't belong to you" would actually have been a good idea longterm? That's delightful.
Was the need to use joins to analyze the data too large a barrier to entry?
I did my analysis without using joins. I created a sub-df for each province, reset their indices, then recombined; and for "does this predict that with a lag of N years?" investigations, I shifted one of the sub-dfs back by N before recombining. Joins would have made more sense in retrospect, but not knowing about them wouldn't have stopped me cold.
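The shift-and-recombine step can be sketched in plain Python as a lookup keyed on (province, year + N). This is only an illustration under stated assumptions: the toy records and the helper `lagged_pairs` are hypothetical, not the real dataset or the code actually used.

```python
# Hypothetical toy records: (province, year, event). NOT the real dataset.
records = [
    ("Italia", 1078, "Famine"),
    ("Italia", 1079, "Plague"),
    ("Grecia", 1078, "Rivers of Blood"),
    ("Grecia", 1079, "Fire"),
]

def lagged_pairs(rows, lag):
    """Pair each event with events in the same province `lag` years later."""
    # Index events by (province, year) so the lagged lookup is a dict access.
    by_key = {}
    for province, year, event in rows:
        by_key.setdefault((province, year), []).append(event)

    pairs = []
    for province, year, event in rows:
        # Matching on (province, year + lag) is the "shift by N, then recombine" step.
        for later_event in by_key.get((province, year + lag), []):
            pairs.append((province, year, event, later_event))
    return pairs

pairs = lagged_pairs(records, lag=1)
for row in pairs:
    print(row)
```

Counting how often each (event, later event) pair shows up in `pairs` then gives the raw material for "does this predict that with a lag of N years?".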
due to the high number of possible response-predictor pairs
My hope was that people would figure out the existence of the Population and Wealth sub-variables, at which point I think figuring out what effects omens had would have been much much easier. Sadly it seems I illusion-of-transparencied myself on how hard that would be to work out. People figured out a lot of the intermediate correlations I expected to be useful there (enough to get some very good answers), but no-one seems to have actually drawn the link that would have connected them.
My hope was that you would start with sub-results like:
and eventually arrive at the conclusion of 'maybe there is an underlying Population variable that many different things interact with'.
(I even tried to drop a hint about the Population and Wealth variables in the problem statement. I guess it's just much harder than I expected to make deductions like that.)
for "does this predict that with a lag of N years?" investigations, I shifted one of the sub-dfs back by N before recombining
That...is in fact a join?
it's just much harder than I expected to make deductions like that
This is something I noticed from some earlier .scis! I forget which, now. My hypothesis was that finding underlying unmentioned causes was really hard without explicitly using causal machinery in your exploration process, and I don't know how to, uh, casually set up causal inference, and it's something I would love to try learning at some point. Like, my intuition is something akin to "try a bunch of autogenerated causal graphs, see if something about correlations says [these] could work and [those] probably don't, inspect them visually, notice that all of [these] have a commonality". No idea if that would actually pan out or if there's a much better way. There's a lot of friction in "guess maybe there's an underlying cause, do a lot of work to check that one specific guess, anticipate you'd go through many false guesses and maybe even there isn't such a thing on this problem".
That...is in fact a join?
What I was (haphazardly, inarticulately) getting at is that I never used any built-in functions with 'join' in the name, or for that matter thought anything along the lines of "I will Do a Join now". In other words, I don't think needing to know about joins was a barrier to entry, because I never explicitly used that information when working on this problem.
*simon's comments on the scenario listed only 40,000 denarii of interventions. His score here reflects only those. Sorry, simon. At least you saved the Emperor money while still hitting most of the valuable interventions!
So people were only able to use each type of protection once for a given province? (Like, no extra grain shipments?)
This is a follow-up to last week's D&D.Sci scenario: if you intend to play that, and haven't done so yet, you should do so now before spoiling yourself.
There is a web interactive here you can use to test your answer, or you can read on.
RULESET
Map
Provinces are laid out as follows:
This matters for war, and for the spread of plague (and of black doves). Provinces are at risk of being pillaged by adjacent provinces of different empires (e.g. Italia is at risk of being pillaged only by Germania), and are at risk of plague spreading from any adjacent province, friend or foe (e.g. Italia can contract plague from Germania, Hispania or Grecia).
Congratulations to simon, who I believe was the first to identify the map and the connection to Plague and Pillaging.
Population and Wealth
The main factors driving a province's exposure to disasters were not the omens directly, but its Population and Wealth. The majority of omens were relevant only insofar as they had relationships to Population and Wealth. Population and Wealth are both represented as integer values that are almost always between 1 and 10.
Both Population and Wealth go up gradually but then decrease with certain disasters. Every year, three things happen in order:
Fire and Famine
These are the two main negative-feedback disasters, keeping Population and Wealth growth in check by occurring when they grow too high:
Secular Omens
Most things the existing diviners categorize as Omens have nothing supernatural about them. They are still potentially useful to you, however, due to their relationship to Population and Wealth:
This means that, for example, a Two-Headed Baby indicates high risk of Fire (since Wealth must be high), but also indicates low risk of Famine for the same reason.
Plague
Plague has a 1% chance of starting in any given province in any given year (though see Black Doves below for another way it can arise).
Once Plague has started, however, it is contagious. If a province has a neighbor that had Plague last year, this year it has a 50% chance to get Plague. If it has two (or more) such neighbors, it has a 100% chance.
A province that suffers from Plague cannot suffer it again for the next 5 years.
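The Plague rules above can be sketched as a single function. This is a rough reading, not the scenario's actual generator: the name `plague_chance` is hypothetical, Black Doves are ignored, and it assumes the 1% spontaneous chance is not stacked on top of the contagion chance (the text does not say how the two combine).

```python
from typing import Optional

BASE_CHANCE = 0.01      # spontaneous outbreak, per province per year
IMMUNITY_YEARS = 5      # no repeat Plague for the next 5 years

def plague_chance(infected_neighbors_last_year: int,
                  years_since_last_plague: Optional[int] = None) -> float:
    """Chance that a province gets Plague this year (Black Doves ignored)."""
    # Assumption: "the next 5 years" means years 1 through 5 after an outbreak.
    if years_since_last_plague is not None and years_since_last_plague <= IMMUNITY_YEARS:
        return 0.0
    if infected_neighbors_last_year >= 2:
        return 1.0
    if infected_neighbors_last_year == 1:
        # Assumption: the base 1% is not added on top of the 50%.
        return 0.5
    return BASE_CHANCE

print(plague_chance(0))                              # no infected neighbors
print(plague_chance(1))                              # one infected neighbor
print(plague_chance(2, years_since_last_plague=3))   # still immune
```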
Earthquakes
Earthquakes have a 1% chance of happening in any given province in any given year (though see Black Doves below for another way they can occur).
Great Leaders and Pillaging
If a province is more militarily strong than its neighbors, it can Pillage them.
The Strength of a province is given by its Population, plus a bonus if its empire is ruled by a Great Leader.
The birth and death of Great Leaders is heralded by omens sent by the gods: a Flaming Comet in one of an empire's provinces indicates that a Great Leader was just born there, while Sky Dark at Noon in the capital indicates that the Great Leader died.
While an Empire has a Great Leader whose age is at least 15, all provinces in that empire receive a +4 bonus to Strength (making them usually stronger than their neighbors unless there are very large Population differences). At the time of your scenario, the only empire with a Great Leader is Germania (...maybe don't tell the Emperor that).
If a province can pillage its neighbors, the odds of doing that are given by the Wealth of the neighbor: 5% * Wealth of neighbor. Nations are less likely to bother pillaging a very poor neighbor, even if they're strong enough to do so.
Some Omens are taken to herald victory, and while they don't have any effect on Strength they cause provinces to invade more readily:
The year after one of these omens appears, the province it appeared in will be twice as likely to invade its neighbor(s) if possible.
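As a sketch of how the Strength and Pillaging rules fit together: the function names below are hypothetical, and the code assumes that "more militarily strong" means strictly greater Strength, that the omen doubling multiplies the probability directly, and that the result is capped at 100%. None of those three details is spelled out above.

```python
GREAT_LEADER_BONUS = 4         # Strength bonus for the whole empire
GREAT_LEADER_MIN_AGE = 15      # the leader must be at least this old
PILLAGE_RATE_PER_WEALTH = 0.05 # 5% per point of the neighbor's Wealth

def strength(population, leader_age=None):
    """Population plus the Great Leader bonus, if the empire has an adult one."""
    has_bonus = leader_age is not None and leader_age >= GREAT_LEADER_MIN_AGE
    return population + (GREAT_LEADER_BONUS if has_bonus else 0)

def pillage_chance(attacker_strength, defender_strength, defender_wealth,
                   victory_omen_last_year=False):
    """Chance that the attacker pillages this neighbor in a given year."""
    # Assumption: ties in Strength do not allow pillaging.
    if attacker_strength <= defender_strength:
        return 0.0
    chance = PILLAGE_RATE_PER_WEALTH * defender_wealth
    if victory_omen_last_year:
        chance *= 2  # assumption: "twice as likely" doubles the probability
    return min(chance, 1.0)

attacker = strength(population=5, leader_age=20)   # 5 + 4
defender = strength(population=6)                  # no Great Leader
print(pillage_chance(attacker, defender, defender_wealth=4))
```

Note how the +4 bonus lets a Population-5 province outmatch a Population-6 neighbor, which is why an adult Great Leader matters so much for pillage risk.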
Black Doves
Long ago, the gods bound the Titans in prisons beneath the earth:
When the Titans rage against the bars of their prisons, two things happen:
Black Doves have various effects on the province they are in:
The black doves then attempt to travel across the world to spread Titanic evil everywhere they can before they fade away. Each year, they will try to move to an adjacent province that has not had a fragment of that Titan in this breakout - if they succeed, that province will have them next year.
Multiple Titans can break out at the same time. When this happens, each breakout happens independently, and the doves move independently. To you, however, they all look the same: one or more Titan fragments in a region will be recorded as one instance of 'Black Doves' in that province.
Over time, breakouts have been becoming more frequent and powerful as the Titans gather their strength:
In 1079, the Titan of Plague in Anatolia attempted to break out. It caused an Earthquake in Anatolia, and deposited Black Doves in Anatolia, Parthia and Grecia. This year, those black doves may have faded (~20%) or may have spread to Italia and Scythia (~80%).
In 1080, the Titans of War, Famine and Fire all attempted to break out. This caused Earthquakes in Britannia, Italia and Parthia, and deposited Black Doves in many places.
As of the most recent data, while only 8 Black Doves were listed, there are actually at least 9, and possibly as many as 11, because some provinces contain more than one:
STRATEGY
Current province statuses are:
Some of these Population and Wealth values could be worked out exactly (for example, Grecia had the Low-Population omen 'Rivers of Blood' in 1079 but the High-Population omen 'Moon turns Red' in 1080, uniquely identifying its Population as 5).
Others could only be estimated: for example, Italia's Wealth can be identified as necessarily very low, but it's plausible it could be 1, and theoretically possible it could even be 2.
Fortunately, which interventions were valuable was robust to small variations in Population and Wealth. The probabilities of various disasters were:
Given these probabilities, the optimal strategy was to buy:
LEADERBOARD
*simon's comments on the scenario listed only 40,000 denarii of interventions. His score here reflects only those. Sorry, simon. At least you saved the Emperor money while still hitting most of the valuable interventions!
Congratulations to all players, particularly Yonge, who managed to correctly protect Italia from Plague without also protecting Hispania.
FEEDBACK REQUEST
As usual, I'm interested to hear feedback on what people thought of this scenario. If you played it, what did you like and what did you not like? If you might have played it but decided not to, what drove you away? What would you like to see more of/less of in future? Do you think the underlying data model was too complicated to decipher? Or too simple to feel realistic? Or both at once? Was the need to use joins to analyze the data too large a barrier to entry?
Thanks for playing!