All of ClimateDoc's Comments + Replies

I have been spending around 10%-20% of my time over the past 5 years working as a fund manager on the Long Term Future Fund. As a result, Lightcone has never applied to the LTFF or the EA Infrastructure Fund, as my involvement with EA Funds would pose too tricky a COI in evaluating our application. But I am confident that both the LTFF and the EAIF would evaluate an application by Lightcone quite favorably.

 

Would it make sense then for Lightcone people to not have positions at funding orgs like these?

Coming here from your email. It's great to see so much thought and effort go into trying to make an evidence-based case for asking for public donations, which I think is rare to see.

A question that occurs to me is that the donations chart seems to show that <$1M/year of donations were needed up to 2020 (I can't read off precise figures) - let's say it would be $1M/yr today after adding inflation. The site metrics seem to have been on solid upward trends by then, so it seems like it was at least possible to run LessWrong well on that budget. Would it be fair to...

Why's that? They seem to be going for AGI, can afford to invest billions if Zuckerberg chooses, their effort is led by one of the top AI researchers and they have produced some systems that seem impressive (at least to me). If you wanted to cover your bases, wouldn't it make sense to include them? Though 3-5% may be a bit much (but I also think it's a bit much for the listed companies besides MS and Google). Or can a strong argument be made for why, if AGI were attained in the near term, they wouldn't be the ones to profit from it?

  • Invest like 3-5% of my portfolio into each of Nvidia, TSMC, Microsoft, Google, ASML and Amazon

 

Should Meta be in the list? Are the big Chinese tech companies considered out of the race?

2Jonas V
I personally would not put Meta on the list.

Do you mean you'd be adding the probability distribution with that covariance matrix on top of the mean prediction from f, to make it a probabilistic prediction? I was talking about deterministic predictions before, though my text doesn't make that clear. For probabilistic models, yes, adding an uncertainty distribution may result in non-zero likelihoods. But if we know the true dynamics are deterministic (pretend there are no quantum effects, which are largely irrelevant for our prediction errors for systems in the classical physics domain...

2johnswentworth
Ah, that's where we need to apply more Bayes. The underlying system may be deterministic at the macroscopic level, but that does not mean we have perfect knowledge of all the things which affect the system's trajectory. Most of the uncertainty in e.g. a weather model would not be quantum noise; it would be things like initial conditions, measurement noise (e.g. how close is this measurement to the actual average over this whole volume?), approximation errors (e.g. from discretization of the dynamics), driving conditions (are we accounting for small variations in sunlight or tidal forces?), etc.

The true dynamics may be deterministic, but that doesn't mean that our estimates of all the things which go into those dynamics have no uncertainty. If the inputs have uncertainty (which of course they do), then the outputs also have uncertainty. The main point of probabilistic models is not to handle "random" behavior in the environment; it's to quantify uncertainty resulting from our own (lack of) knowledge of the system's inputs/parameters.

Yeah, you're pointing to an important issue here, although it's not actually likelihoods which are the problem - it's point estimates. In particular, that makes linear approximations a potential issue, since they're implicitly approximations around a point estimate. Something like a particle filter will do a much better job than a Kalman filter at tracing out an attractor, since it accounts for nonlinearity much better.

Anyway, reasoning with likelihoods and posterior distributions remains valid regardless of whether we're using point estimates. When the system is chaotic but has an attractor, the posterior probability of the system state will end up smeared pretty evenly over the whole attractor. (Although with enough fine-grained data, we can keep track of roughly where on the attractor the system is at each time, which is why Kalman-type models work well in that case.)
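(To make the Kalman-vs-particle contrast concrete, here's a minimal bootstrap particle filter sketch. None of it is from the thread: the Lorenz-63 dynamics, noise scales, and particle count are all illustrative assumptions.)

```python
# A minimal bootstrap particle filter on the Lorenz-63 system.
# Illustrative sketch only: dynamics, noise scales, and particle
# count are assumptions, not values from the discussion above.
import numpy as np

rng = np.random.default_rng(0)

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the (deterministic) Lorenz-63 dynamics."""
    x, y, z = state[..., 0], state[..., 1], state[..., 2]
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.stack([dx, dy, dz], axis=-1)

# Simulate a "true" trajectory and noisy observations of x only.
T, obs_noise = 500, 1.0
truth = np.zeros((T, 3))
truth[0] = [1.0, 1.0, 25.0]
for t in range(1, T):
    truth[t] = lorenz_step(truth[t - 1])
obs = truth[:, 0] + rng.normal(0.0, obs_noise, size=T)

# Particle filter: propagate, weight by likelihood, resample.
N = 2000
particles = truth[0] + rng.normal(0.0, 5.0, size=(N, 3))  # uncertain init
for t in range(1, T):
    particles = lorenz_step(particles)
    particles += rng.normal(0.0, 0.05, size=particles.shape)  # process noise
    logw = -0.5 * ((obs[t] - particles[:, 0]) / obs_noise) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    particles = particles[rng.choice(N, size=N, p=w)]  # multinomial resample

print("posterior mean:", particles.mean(axis=0), "truth:", truth[-1])
```

Because every particle is pushed through the full nonlinear dynamics before reweighting, the cloud can bend along the attractor, whereas a Kalman filter is stuck summarizing its belief as a single Gaussian around a point estimate.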

Yes, I'd selected that because I thought it might get it to work. And now I've unselected it, it seems to be working. It's possible this was a glitch somewhere, or me just being dumb before, I guess.

2habryka
Huh, okay. Sorry for the weird experience! 

I wonder whether the models are so coarse that the cyclones that do emerge are in a sense the minimum size.

It's not my area, but I don't think that's the case. My impression is that part of what drives very high wind speeds in the strongest hurricanes is convection on the scale of a few km in the eyewall, so models with that sort of spatial resolution can generate realistically strong systems. But that's ~20x finer than typical climate model resolutions at the moment, so it will be a while before we can simulate those systems routinely (though some argue we could do it if we had a computer costing a few billion dollars).

1Kenny
Thanks! That's very interesting to me. It seems like it might be an example of relatively small structures having potentially arbitrarily large long-term effects on the state of the entire system. It could be the case tho that the overall effects of cyclones are still statistical at the scale of the entire planet's climate. Regardless, it's a great example of the kind of thing for which we don't yet have good general learning algorithms.

do you know what discretization methods are typically used for the fluid dynamics?

There's a mixture: finite differencing used to be used a lot but seems to be less common now; semi-Lagrangian advection seems to have taken over in the models that used it; and some models work by doing most of the computations in spectral space and neglecting the smallest spatial scales. Recently, newer methods have been developed to work better on massively parallel computers. It's not my area, though, so I can't give a very expert answer - but I'm pretty sure the people...

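(For readers unfamiliar with the semi-Lagrangian idea mentioned above, here's a rough 1-D sketch. The grid size, wind speed, and linear interpolation are illustrative choices, not details from the comment.)

```python
# 1-D semi-Lagrangian advection sketch for dq/dt + u*dq/dx = 0 (periodic).
# Rather than differencing fluxes, trace each grid point back along the
# flow and interpolate the old field at the departure point.
import numpy as np

nx, L, u, dt = 200, 1.0, 0.3, 0.02    # grid points, domain length, wind, timestep
dx = L / nx
x = np.arange(nx) * dx
q = np.exp(-((x - 0.5) / 0.05) ** 2)  # initial tracer: a Gaussian blob

def semi_lagrangian_step(q):
    x_dep = (x - u * dt) % L                   # departure point of each grid point
    j = np.floor(x_dep / dx).astype(int)       # cell to the left of the departure point
    frac = x_dep / dx - j                      # fractional position within the cell
    return (1 - frac) * q[j] + frac * q[(j + 1) % nx]  # linear interpolation

for _ in range(100):
    q = semi_lagrangian_step(q)    # blob advects 100*u*dt = 0.6 to the right
```

Part of the appeal, as I understand it, is that tracing back along the flow stays stable even when the Courant number u·dt/dx exceeds 1 (it's 1.2 in this sketch), so models can take longer timesteps than explicit finite differencing would allow.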

I'm using Chrome 80.0.3987.163 in Mac OSX 10.14.6. But I also tried it in Firefox and didn't get formatting options. But maybe I'm just doing the wrong thing...

2habryka
You do currently have the markdown editor activated, which gets rid of all formatting options, so you not getting it right now wouldn't surprise me. But you should have gotten them before you activated the markdown editor.

Thanks, yes, this is very relevant to thinking about climate modelling, where the dominant paradigm is that we can separately model phenomena above and below the resolved scale. There's an ongoing debate, though, about whether a different approach would work better, and it gets tricky when the resolved scale gets close to the size of important types of weather system.

climate models are already "low-level physics" except that "low-level" means coarse aggregates of climate/weather measurements that are so big that they don't include tropical cyclones!

Just as an aside, a typical modern climate model will simulate tropical cyclones as emergent phenomena from the coarse-scale fluid dynamics, albeit not enough of the most intense ones. Though, much smaller tropical thunderstorm-like systems are much more crudely represented.

1Kenny
I would have hoped that was the case, but it's interesting that both large and small ones are apparently not so easily emergent. I wonder whether the models are so coarse that the cyclones that do emerge are in a sense the minimum size. That would readily explain the lack of smaller emergent cyclones. Maybe larger ones don't emerge because the 'next larger size' is too big for the models. I'd think 'scaling' of eddies in fluids might be informative: what's the smallest eddy possible in some fluid? What other eddy sizes are observed (or can be modeled)?
3johnswentworth
Tangential, but now I'm curious... do you know what discretization methods are typically used for the fluid dynamics? I ask because insufficiently-intense cyclones sound like exactly the sort of thing APIC methods were made to fix, but those are relatively recent and I don't have a sense for how much adoption they've had outside of graphics.

Thanks again.

I think I need to think more about the likelihood issue. I still feel like we might be thinking about different things - when you say "a deterministic model which uses fundamental physics", this would not be in the set of models that we could afford to run to make predictions for complex systems. For the models we could afford to run, it seems to me that no choice of initial conditions would lead them to match the data we observe, except by extreme coincidence (analogous to a simple polynomial just happening to pass through all the datapoints...

2johnswentworth
Ok, let's talk about computing with error bars, because it sounds like that's what's missing from what you're picturing.

The usual starting point is linear error - we assume that errors are small enough for linear approximation to be valid. (After this we'll talk about how to remove that assumption.) We have some multivariate function $f(x)$ - imagine that $x$ is the full state of our simulation at some timestep, and $f$ calculates the state at the next timestep. The value $\bar{x}$ of $x$ in our program is really just an estimate of the "true" value $x$; it has some error $\Delta x = x - \bar{x}$. As a result, the value $\bar{f}$ of $f$ in our program also has some error $\Delta f = f - \bar{f}$. Assuming the error is small enough for linear approximation to hold, we have

$$\Delta f = f - \bar{f} = f(\bar{x} + \Delta x) - f(\bar{x}) \approx \left(\left.\frac{df}{dx}\right|_{\bar{x}}\right)\Delta x$$

where $\frac{df}{dx}$ is the Jacobian, i.e. the matrix of derivatives of every entry of $f(x)$ with respect to every entry of $x$.

Next, assume that $\Delta x$ has covariance matrix $\Sigma_x$, and we want to compute the covariance matrix $\Sigma_f$ of $\Delta f$. We have a linear relationship between $\Delta x$ and $\Delta f$, so we use the usual formula for a linear transformation of covariance:

$$\Sigma_f = \frac{df}{dx}\,\Sigma_x\,{\frac{df}{dx}}^T$$

Now imagine iterating this at every timestep: we compute the timestep itself, then differentiate that timestep, and matrix-multiply our previous uncertainty on both sides by the derivative matrix to get the new uncertainty:

$$\bar{x}(t+1) = f(\bar{x}(t))$$
$$\Sigma_x(t+1) = \left(\left.\frac{df}{dx}\right|_{\bar{x}(t)}\right)\Sigma_x(t)\left(\left.\frac{df}{dx}\right|_{\bar{x}(t)}\right)^T$$

Now, a few key things to note:

* For most systems of interest, that uncertainty is going to grow over time, usually exponentially. That's correct: in a chaotic system, if the initial conditions are uncertain, then of course we should become more and more uncertain about the system's state over time.
* Those formulas only propagate uncertainty in the previous state to uncertainty in the next state. Really, there's also new uncertainty introduced at each timestep, e.g. from error in $f$ itself (i.e. due to averaging) or from whatever's driving the system. Typically, such...
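(A minimal numeric sketch of the update rule above. The Lorenz-63 dynamics, initial covariance, and finite-difference Jacobian are my own illustrative assumptions, not anything specified in the comment.)

```python
# Propagate a state estimate and its covariance through a nonlinear
# timestep via the linearization above: x <- f(x), S <- J S J^T.
import numpy as np

def f(x, dt=0.01):
    """One Euler step of the Lorenz-63 dynamics (stand-in for a simulator)."""
    sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
    dx = np.array([sigma * (x[1] - x[0]),
                   x[0] * (rho - x[2]) - x[1],
                   x[0] * x[1] - beta * x[2]])
    return x + dt * dx

def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian df/dx at x (autodiff would also work)."""
    n = len(x)
    J = np.empty((n, n))
    fx = f(x)
    for i in range(n):
        xp = x.copy()
        xp[i] += eps
        J[:, i] = (f(xp) - fx) / eps
    return J

x_bar = np.array([1.0, 1.0, 25.0])   # mean estimate of the state
Sigma = 0.01 * np.eye(3)             # initial uncertainty (covariance)

for t in range(1000):
    J = jacobian(f, x_bar)
    x_bar = f(x_bar)
    Sigma = J @ Sigma @ J.T          # linearized covariance propagation
    # (A real filter would also add process noise Q here: Sigma += Q.)

print("largest eigenvalue of Sigma:", np.linalg.eigvalsh(Sigma).max())
```

Running this, the largest eigenvalue of $\Sigma$ grows roughly exponentially, which is the first bullet point above in concrete form.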

Thanks again. OK, I'll try using Markdown...

I think 'algorithm' is an imprecise term for this discussion.

Perhaps I used the term imprecisely - I basically meant it in a very general sense, of being some process, set of rules, etc., that a computer or other agent could follow to achieve the goal.

We need good decision theories to know when to search for more or better bottom-up models. What are we missing? How should we search? (When should we give up?)

The name for 'algorithms' (in the expansive sense) that can do what you're asking is 'general intelligence'...

1Kenny
There are some pretty general learning algorithms, and even 'meta-learning' algorithms in the form of tools that attempt to more or less automatically discover the best model (among some number of possibilities). Machine learning hyper-parameter optimization is an example in that direction.

My outside view is that a lot of scientists should focus on running better experiments. According to a possibly apocryphal story told by Richard Feynman in a commencement address, one researcher discovered (at least some of) the controls one had to employ to be able to effectively study mice running mazes. Unfortunately, no one else bothered to employ those controls (let alone look for others)! Similarly, a lot of scientific studies or experiments are simply too small to produce even reliable statistical info. There's probably a lot of such low-hanging fruit available. Tho note that this is often a 'bottom-up' contribution for 'modeling' a larger complex system.

But as you demonstrate in your last two paragraphs, searching for a better 'ontology' for your models, e.g. deciding what else to measure, or what to measure instead, is a seemingly open-ended amount of work! There probably isn't a way to avoid having to think about it more (beyond making other kinds of things that can think for us); until you find an ontology that's 'good enough', anyways. Regardless, we're very far from being able to avoid even small amounts of this kind of work.

Thanks for your reply. (I repeat my apology from below for apparently not being able to use formatting options in my browser in this.)

"I think it's an open question whether we can generally model complex systems at all – at least in the sense of being able to make precise predictions about the detailed state of entire complex systems."

I agree modelling the detailed state is perhaps not possible. However, there are at least some complex systems we can model and get substantial positive skill at predicting particular variables, without needing to model all th...

2habryka
[Meta] Curious what browser you are using, so I can figure out whether anyone else has this problem.
3Kenny
(I'm not sure if there are formatting options anymore in the site UI – formatting is (or can be) done via Markdown syntax. In your user settings, there's an "Activate Markdown Editor" option that you might want to test changing if you don't want to use Markdown directly.)

I think 'algorithm' is an imprecise term for this discussion. I don't think there are any algorithms similar to a prototypical example of a computational 'algorithm' that could possibly do a better job, in general, than human minds. In the 'expansive' sense of 'algorithm', an AGI could possibly do better, but we don't know how to build those yet! There might be algorithms that could indicate whether, or how likely, it is that a model is 'missing' something, but solving that problem generally would require access to the 'target' system like we have (i.e. by almost entirely living inside of it).

If you think about something like using an (AI) 'learning' algorithm, you wouldn't expect it to be able to learn about aspects of the system that aren't provided to it as input. But how could we feasibly, or even in principle, provide the Earth's climate as input, i.e. what would we measure (and how would we do it)?

What I was sketching was something like how we currently model complex systems. It can be very helpful to model a system top-down, e.g. statistically, by focusing on relatively simple global attributes. The inputs for fluid mechanics models of the climate are an example of that. Running those models is a mix of top-down and bottom-up. The model details are generated top-down, but studying the dynamics of those models in detail is more bottom-up.

Any algorithm is in effect a decision theory. A general algorithm for modeling arbitrary complex systems would effectively make a vast number of decisions. I suspect finding or building a feasible and profitable algorithm like this will also effectively require "a complete epistemology and decision theory". We already have a lot of fantastically effective...

Thanks for your detailed reply. (And sorry I couldn't format the below well - I don't seem to get any formatting options in my browser.)

"It is rarely too difficult to specify the true model...this means that "every member of the set of models available to us is false" need not hold"

I agree we could find a true model to explain the economy, climate etc. (presumably the theory of everything in physics). But we don't have the computational power to make predictions of such systems with that model - so my question is about how we should make predictions when t...

2johnswentworth
Side note: one topic I've been reading about recently which is directly relevant to some of your examples (e.g. thunderstorms) is multiscale modelling. You might find it interesting.
2johnswentworth
The trick here is that the data on which the model is trained/fit has to include whatever data the scientists used to learn about that feedback loop in the first place. As long as that data is included, the model which accounts for it will have lower minimum description length. (This fits in with a general theme: the minimum-complexity model is simple and general-purpose; the details are learned from the data.)

... I'm responding as I read. Yup, exactly. As the Bayesians say, we do need to account for all our prior information if we want reliably good results. In practice, this is "hard" in the sense of "it requires significantly more complicated programming", but not in the sense of "it increases the asymptotic computational complexity". The programming is more complicated mainly because the code needs to accept several qualitatively different kinds of data, and custom code is likely needed for hooking up each of them. But that's not a fundamental barrier; it's still the same computational challenges which make the approach impractical.

Again, we need to include whatever data allowed scientists to connect it to the climate in the first place. (In some cases this is just fundamental physics, in which case it's already in the model.)

Picture a deterministic model which uses fundamental physics, and models the joint distribution of position and momentum of every atom comprising the Earth. The unknown in this model is the initial conditions - the initial position and momentum of every particle (also particle identity, i.e. which element/isotope each is, but we'll ignore that). Now, imagine how many of the possible initial conditions are compatible with any particular high-level data we observe. It's a massive number! Point is: the deterministic part of a fundamental physical model is the dynamics; the initial conditions are still generally unknown. Conceptually, when we fit the data, we're mostly looking for initial conditions which match. So zero likelihoods...
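(To make that last paragraph concrete, here's a toy sketch of fitting the initial condition of a deterministic model to noisy data. Everything in it is my own illustrative assumption - a chaotic logistic map standing in for the physics, Gaussian measurement noise, and a brute-force grid search.)

```python
# The dynamics are fully deterministic; the only unknown is the initial
# condition. We score candidate initial conditions by the likelihood of
# noisy observations. All specifics are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

def simulate(x0, steps=10):
    """Deterministic chaotic dynamics: the logistic map at r = 4."""
    x = np.asarray(x0, dtype=float)
    out = []
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        out.append(x)
    return np.stack(out, axis=-1)        # shape (..., steps)

# A "true" trajectory, observed through Gaussian measurement noise.
true_x0, noise = 0.3141, 0.05
obs = simulate(true_x0) + rng.normal(0.0, noise, size=10)

# Brute-force grid over initial conditions; Gaussian log-likelihood of each.
candidates = np.linspace(1e-4, 1 - 1e-4, 200_000)
resid = (obs - simulate(candidates)) / noise   # shape (200000, 10)
loglik = -0.5 * np.sum(resid ** 2, axis=-1)
best = candidates[np.argmax(loglik)]
print(f"true x0 = {true_x0}, maximum-likelihood x0 = {best:.4f}")
```

The dynamics contribute nothing random; all the uncertainty lives in which initial condition we started from, which is the sense in which fitting the data means "looking for initial conditions which match".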

OK, I made some edits. I left the "rational" in the last paragraph because it seemed to me to be the best word to use there.