I have no experience with data science, but D&D Sci seems fun and I would like to get better at it. Where can/should I start?
By imitating other players
As Jay Bailey mentioned, you can look at how other players approached challenges, and copy the approaches that worked. Pablo Repetto’s playthroughs of three early .scis seem particularly worthwhile given your situation, both because of how comprehensive & well-written they are, and because they were made by someone in the process of learning to use code on data science problems (the first playthrough was done in pure Excel, the other two were handled in Python).
By following a sensible strategy
Below is my standard plan for investigating a dataset, synthetic or otherwise (cribbed from an otherwise-mediocre Udacity course I took most of a decade ago, and still worth following).
1. Univariate Analysis: How is each feature distributed when considered in isolation? You should probably make a histogram for each column.
2. Bivariate Analysis: Construct and check the correlation matrix between all features. Are there clusters? Create scatterplots (or equivalent) for any pair of features which correlate unusually strongly, any pair of features where at least one is a response variable, and any pair of features you find yourself curious about.
3. Feature Derivation: Based on what you’ve seen so far, and/or common sense, are there meaningful features you can create from what you’ve been provided? (e.g., if you’re given "Number of Wizards", "Number of Sorcerers" and "Number of Druids" for each row, it might be worth creating a "Total Number of Magic Users" column.) Investigate how these features interact with others.
4. ML Modelling: If you can, and it seems like a good idea, build an ML model predicting the important/unknown features from those you have. If constructed successfully, this becomes an oracle you can ask about the outcome of any possible choice you could make. (XGBoost and similar tools are extremely versatile, and have pretty good performance on most problems.)
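For anyone who has only used Excel so far, the first three steps of the plan above might look roughly like this in pandas. This is a minimal sketch, not a prescribed method: the dataset is a made-up toy generated in the code, and the column names (`wizards`, `sorcerers`, `druids`, `gold_found`, `magic_users`) are hypothetical stand-ins for whatever a real scenario provides.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset standing in for a scenario's CSV.
# With real data you would load the provided file with pd.read_csv instead.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "wizards": rng.integers(0, 5, n),    # integers from 0 to 4
    "sorcerers": rng.integers(0, 5, n),
    "druids": rng.integers(0, 5, n),
})
# Made-up response variable: driven by the total number of magic users, plus noise.
df["gold_found"] = (
    10 * (df["wizards"] + df["sorcerers"] + df["druids"]) + rng.normal(0, 5, n)
)

# Step 1, univariate: summary statistics for every column, and the raw
# counts behind a histogram for one column.
# (df.hist() would draw the histograms themselves if matplotlib is installed.)
summary = df.describe()
wizard_counts = df["wizards"].value_counts().sort_index()

# Step 2, bivariate: pairwise correlation matrix between all columns.
corr = df.corr()

# Step 3, feature derivation: add a "total magic users" column and see
# how strongly each feature correlates with the response.
df["magic_users"] = df[["wizards", "sorcerers", "druids"]].sum(axis=1)
corr_with_response = (
    df.corr()["gold_found"].drop("gold_found").sort_values(ascending=False)
)

print(summary)
print(wizard_counts)
print(corr_with_response)
```

In this toy data the derived column correlates with the response much more strongly than any single original column does, which is the kind of signal the Feature Derivation step is looking for.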
(The above is just a rough guide for what to do when you don’t know what to do. If you follow it, you should pretty quickly find yourself with a list of rabbitholes to fall down; you should probably err on the side of dropping everything and deviating from the path as soon as you find something interesting.)
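The "oracle" idea from the ML Modelling step can be sketched as follows. Two assumptions to flag: scikit-learn's `GradientBoostingRegressor` is used here as a stand-in for XGBoost (a similar family of model, not the same library), and the data, column names and candidate choices are all invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Hypothetical toy data: three made-up features and a made-up response.
rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "wizards": rng.integers(0, 5, n),
    "sorcerers": rng.integers(0, 5, n),
    "druids": rng.integers(0, 5, n),
})
y = 10 * X.sum(axis=1) + rng.normal(0, 5, n)

# Hold some rows back so the model is checked on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Gradient-boosted trees (stand-in for XGBoost, as noted above).
model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)
holdout_r2 = model.score(X_test, y_test)  # R^2 on the held-out rows

# Use the fitted model as an "oracle": score each choice you could make.
candidates = pd.DataFrame({
    "wizards": [4, 0, 2],
    "sorcerers": [0, 4, 2],
    "druids": [0, 0, 2],
})
candidates["predicted_gold"] = model.predict(candidates)
best = candidates.sort_values("predicted_gold", ascending=False).iloc[0]

print(f"held-out R^2: {holdout_r2:.2f}")
print(candidates)
```

The held-out score matters: an oracle that only does well on the rows it was trained on is not one you should trust about choices it has never seen.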
By playing easier D&D.Scis
Difficulty of D&D.Sci games tends to be both high and high-variance; it’s usually assumed that players will have both data-manipulation and model-building skills. For what it’s worth, I can confirm that two relatively-approachable scenarios where not-using-ML won't put you at a disadvantage are (spoilered because this technically leaks information about them):
To provide the obvious advice first:
I assume you have some programming experience here; if not, that seems like a prerequisite to learn. Or maybe you can get away with using LLMs to write the Python for you.
That sounds like a pretty good basic method. I do have some (minimal) programming experience, but I didn't use it for D&D Sci; I literally just opened the data in Excel and tried looking at it and manipulating it that way. I don't know where I would start as far as using code to try and synthesize info from the dataset. I'll definitely look into what other people did though.
If you want to be generally skilled at the type of challenges D&D Sci provides, putting some points into the data science and statistics proficiencies would be a good way to start.
In particular, some related skills:
Lol just the last few days I was running through Leetcode's SQL 50 problems to refresh myself. They're some good, fun puzzles.
I'll look into R and basic statistical methods as well.