I have no experience with data science, but D&D Sci seems fun and I would like to get better at it. Where should I start?

abstractapplic

92

By imitating other players

As Jay Bailey mentioned, you can look at how other players approached challenges, and copy the approaches that worked. Pablo Repetto’s playthroughs of three early .scis seem particularly worthwhile given your situation, both because of how comprehensive & well-written they are, and because they were made by someone in the process of learning to use code on data science problems (the first playthrough was done in pure Excel, the other two were handled in Python).

By following a sensible strategy

Below is my standard plan for investigating a dataset, synthetic or otherwise (cribbed from an otherwise-mediocre Udacity course I took most of a decade ago, and still worth following).

-

Univariate Analysis: How is each feature distributed when considered in isolation? You should probably make a histogram for each column.

Bivariate Analysis: Construct and check the correlation matrix between all features. Are there clusters? Create scatterplots (or equivalent) for any pair of features which correlate unusually strongly, any pair of features where at least one is a response variable, and any pair of features you find yourself curious about.

Feature Derivation: Based on what you’ve seen so far – and/or common sense – are there meaningful features you can create from what you’ve been provided? (e.g., if you’re given "Number of Wizards", "Number of Sorcerers" and "Number of Druids" for each row, it might be worth creating a "Total Number of Magic Users" column.) Investigate how these features interact with others.

ML Modelling: If you can, and it seems like a good idea, build an ML model predicting the important/unknown features from those you have. If constructed successfully, this becomes an oracle you can ask about the outcome of any possible choice you could make. (XGBoost and similar tools are extremely versatile, and have pretty good performance on most problems.)

-
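For anyone who wants to see the first three steps as code, here is a minimal sketch using pandas and matplotlib. The dataset is synthetic and every column name (`n_wizards`, `n_sorcerers`, `n_druids`, `gold`) is made up for illustration; a real scenario would load its own CSV with `pd.read_csv` instead.

```python
import matplotlib
matplotlib.use("Agg")  # render plots to files; no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for a scenario dataset. All column names and the
# formula generating "gold" are invented for this example.
rng = np.random.default_rng(0)
n_rows = 500
df = pd.DataFrame({
    "n_wizards": rng.integers(0, 4, n_rows),
    "n_sorcerers": rng.integers(0, 4, n_rows),
    "n_druids": rng.integers(0, 4, n_rows),
})
total = df["n_wizards"] + df["n_sorcerers"] + df["n_druids"]
df["gold"] = 5 * total + rng.normal(0, 2, n_rows)

# 1. Univariate analysis: one histogram per column.
df.hist(bins=20, figsize=(8, 6))
plt.tight_layout()
plt.savefig("histograms.png")
plt.close("all")

# 2. Bivariate analysis: correlation matrix, then a scatterplot
#    for a pair that involves the response variable.
corr = df.corr()
print(corr.round(2))
df.plot.scatter(x="n_wizards", y="gold")
plt.savefig("wizards_vs_gold.png")
plt.close("all")

# 3. Feature derivation: add a "total magic users" column and see
#    how strongly each feature correlates with the response.
df["n_magic_users"] = df[["n_wizards", "n_sorcerers", "n_druids"]].sum(axis=1)
corr_with_gold = df.corr()["gold"].drop("gold").sort_values(ascending=False)
print(corr_with_gold)
```

In this toy data the derived column correlates with `gold` far more strongly than any single original column, which is the kind of signal the feature-derivation step is looking for.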

(The above is just a rough guide for what to do when you don’t know what to do. If you follow it, you should pretty quickly find yourself with a list of rabbitholes to fall down; you should probably err on the side of dropping everything and deviating from the path as soon as you find something interesting.)
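The "oracle" idea from the ML Modelling step could look roughly like the sketch below. It uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost (a related gradient-boosted-trees method, not the same library), and the data, column names, and payoff formula are all invented for the example.

```python
import itertools

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data; column names and the formula are hypothetical.
rng = np.random.default_rng(0)
n_rows = 1000
df = pd.DataFrame({
    "n_wizards": rng.integers(0, 5, n_rows),
    "n_druids": rng.integers(0, 5, n_rows),
})
df["gold"] = 10 * df["n_wizards"] - 3 * df["n_druids"] + rng.normal(0, 0.5, n_rows)

# Fit a model predicting the outcome from the features we control.
features = ["n_wizards", "n_druids"]
model = GradientBoostingRegressor(random_state=0)
model.fit(df[features], df["gold"])

# Enumerate every choice available to us and ask the model about each one.
candidates = pd.DataFrame(
    list(itertools.product(range(5), range(5))), columns=features
)
candidates["predicted_gold"] = model.predict(candidates[features])

# Pick the choice with the highest predicted outcome.
best = candidates.loc[candidates["predicted_gold"].idxmax()]
best_choice = (int(best["n_wizards"]), int(best["n_druids"]))
print(best_choice)
```

On real data it is worth holding some rows back and checking the model's predictions against them before trusting its answers.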

By playing easier D&D.Scis

Difficulty of D&D.Sci games tends to be both high and high-variance; it’s usually assumed that players will have both data-manipulation and model-building skills. For what it’s worth, I can confirm that two relatively-approachable scenarios where not-using-ML won't put you at a disadvantage are (spoilered because this technically leaks information about them):

Jay Bailey

42

To provide the obvious advice first:

  • Attempt a puzzle.
  • If you didn't get the answer, check the comments of those who did.
  • Ask yourself how you could have thought of that, or what common principle that answer has. (e.g., I should check for X and Y)
  • Repeat.

I assume you have some programming experience here - if not, that seems like a prerequisite to learn. Or maybe you can get away with using LLMs to write the Python for you.

That sounds like a pretty good basic method. I do have some (minimal) programming experience, but I didn't use it for D&D Sci; I literally just opened the data in Excel and tried looking at it and manipulating it that way. I don't know where I would start as far as using code to try and synthesize info from the dataset. I'll definitely look into what other people did though.

Jay Bailey

2

pandas is a good library for this - it takes CSV files and turns them into Python objects you can manipulate. plotly and matplotlib let you visualise data, which is also useful. GPT-4 / Claude could help you with this. I would recommend starting by getting a language model to help you create plots of the data according to relevant subsets. For example, if you think that the season matters for how much gold is collected, give the model a couple of examples of the data format and simply ask it to write a script to plot gold per season.
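A script for the gold-per-season example might look something like this minimal sketch. The column names `season` and `gold` and the inline data are assumptions made up for illustration; with a real scenario you would pass the downloaded CSV's path to `pd.read_csv` instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render the plot to a file; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Tiny inline CSV so the sketch is self-contained (hypothetical data).
csv_text = """season,gold
spring,10
spring,14
summer,20
summer,24
autumn,7
winter,3
"""
df = pd.read_csv(io.StringIO(csv_text))

# Average gold collected in each season.
gold_per_season = df.groupby("season")["gold"].mean()
print(gold_per_season)

# Bar chart of the averages, saved as an image.
ax = gold_per_season.plot.bar()
ax.set_ylabel("mean gold")
plt.tight_layout()
plt.savefig("gold_per_season.png")
```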

ErioirE

20

If you want to be generally skilled at the type of challenges D&D Sci provides, putting some points into the data science and statistics proficiencies would be a good way to start.
In particular, some related skills:

  • SQL - Easy to pick up for someone with good technical skills. Challenging to master. Before going too deep on relational databases I also recommend learning the theory and good practices behind them, such as the different normal forms of database design and why they're important.
  • R programming language
  • Familiarization with various statistical analysis methods and what use cases they are intended for

Lol just the last few days I was running through Leetcode's SQL 50 problems to refresh myself. They're some good, fun puzzles.

I'll look into R and basic statistical methods as well.

Fellow not-at-all-a-data-scientist-but-wait-actually-that-sounds-fun here! I don’t know more about it than you do, but I’m glad you asked, since I hope to also benefit from the answers you’ll get :-)