This is a "live" Fermi estimate, where I expect to make mistakes and haven't done much editing. If you're working on a simple mathematical program, you can estimate the number of floating-point or integer operations, compare that to statistics on CPU latency and throughput, and use that to get...
At my work, we run experiments – we specify some set of input parameters, run some code, and get various metrics as output. Since we run so many of these, it's important for them to be fast and cheap. Recently I was working on an experiment type that took about...
Tasks are how Julia handles parallelism & concurrency. Tasks are defined at the program level and Julia's scheduler maps them to hardware/OS threads. Tasks have many names in other languages: "symmetric coroutines, lightweight threads, cooperative multitasking, or one-shot continuations". They're particularly similar to the coroutines used by Go and Cilk....
Here's a concrete example of two approaches to a software problem, each with different advantages, and circumstances that would lead me to choose one over the other. Recently I wrote a job to create and save datasets that are used for downstream work. The code is run through a scheduler...
Data locality is a key part of writing fast data science code. The core idea is simple: your data starts out in RAM (or on disk), and to actually do anything useful with it, you need to move it to the CPU. That transfer is pretty slow, so you want to...
Summary: the Kalman filter is Bayesian updating applied to systems that change over time, assuming all our distributions are Gaussian and all our transformations are linear. Preamble - the general Bayesian approach to estimation: the Kalman filter is an approach to estimating moving quantities. When I think about a...
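
As a rough illustration of the Kalman filter summary above, here is a minimal one-dimensional sketch. It is not taken from the post itself: the function names (`kalman_step`, `run_filter`), the identity dynamics (the state is assumed to drift only through added noise), and the default noise variances are all assumptions chosen for illustration.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict + update cycle for a scalar state.

    Assumption: identity dynamics, so the prediction keeps the mean
    and only widens the variance by the process noise.
    """
    # Predict: the quantity may have moved, so uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var

    # Update: multiply the Gaussian prior by the Gaussian likelihood.
    # The gain says how much to trust the measurement over the prior.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


def run_filter(measurements, mean=0.0, var=1.0,
               process_var=0.01, measurement_var=0.25):
    """Apply kalman_step to each measurement; return (mean, var) pairs."""
    estimates = []
    for z in measurements:
        mean, var = kalman_step(mean, var, z, process_var, measurement_var)
        estimates.append((mean, var))
    return estimates
```

With a prior of mean 0 and variance 1, no process noise, and a measurement of 2 with variance 1, `kalman_step(0.0, 1.0, 2.0, 0.0, 1.0)` lands halfway between prior and measurement with half the variance: `(1.0, 0.5)`. That is the ordinary Bayesian update for two equally confident Gaussians.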