All of technicalities's Comments + Replies

I hear that you and your band have sold your technical agenda and bought suits. I hear that you and your band have sold your suits and bought Gemma Scope rigs.

 

(riff on this tweet, which is a riff on the original)

As of two years ago, the evidence for this was sparse. Looked like parity overall, though the pool of "supers" has improved over the last decade as more people got sampled.

There are other reasons to be down on XPT in particular.

I like Hasselt and Meyn (extremely friendly, possibly too friendly for you)

Maybe he dropped the "c" because it changes the "a" phoneme from æ to ɑː and gives a cleaner division in sounds: "brac-ket" pronounced together collides with "bracket" where "braa-ket" does not. 

It's under "IDA". It's not the name people use much anymore (see scalable oversight and recursive reward modelling and critiques) but I'll expand the acronym.

1Joey KL
Iterated Amplification is a fairly specific proposal for indefinitely scalable oversight, which doesn't involve any human in the loop (if you start with a weak aligned AI). Recursive Reward Modeling is imagining (as I understand it) a human assisted by AIs to continuously do reward modeling; DeepMind's original post about it lists "Iterated Amplification" as a separate research direction.

"Scalable Oversight", as I understand it, refers to the research problem of how to provide a training signal to improve highly capable models. It's the problem which IDA and RRM are both trying to solve. I think your summary of scalable oversight: is inconsistent with how people in the industry use it. I think it's generally meant to refer to the outer alignment problem, providing the right training objective. For example, here's Anthropic's "Measuring Progress on Scalable Oversight for LLMs" from 2022: It references "Concrete Problems in AI Safety" from 2016, which frames the problem in a closely related way, as a kind of "semi-supervised reinforcement learning". In either case, it's clear what we're talking about is providing a good signal to optimize for, not an AI doing mechanistic interpretability on the internals of another model. I thus think it belongs more under the "Control the thing" header.

I think your characterization of "Prosaic Alignment" suffers from related issues. Paul coined the term to refer to alignment techniques for prosaic AI, not techniques which are themselves prosaic. Since prosaic AI is what we're presently worried about, any technique to align DNNs is prosaic AI alignment, by Paul's definition. My understanding is that AI labs, particularly Anthropic, are interested in moving from human-supervised techniques to AI-supervised techniques, as part of an overall agenda towards indefinitely scalable oversight via AI self-supervision. I don't think Anthropic considers RLAIF an alignment endpoint itself.

The story I heard is that Lightspeed are using SFF's software and SFF jumped the gun in posting them and Lightspeed are still catching up. Definitely email.

4Victor Levoso
So, update on this: I got busy with applications this last week and forgot to mail them about this, but I just got a mail from Lightspeed saying I'm going to get a grant because Jaan Tallinn has increased the amount he is distributing through Lightspeed Grants. (Though they say that "We have not yet received the money, so delays of over a month or even changes in amount seem quite possible".) (Edit: for the record, I did end up getting funded.)

d'oh! fixed

no, probably just my poor memory to blame

Yep, no idea how I forgot this. Concept erasure!

3Victor Levoso
Yeah, Stag told me that's where they saw it. But I'm confused about what that means? I certainly didn't get money from Lightspeed; I applied but got mail saying I wouldn't get a grant. I still have to read up on what that is, but it says "recommendations", so it might not necessarily mean those people got money or something? I might have to just mail them to ask, I guess, unless after reading their FAQ more deeply about what this S-process is it becomes clear what's up with that.

Not speaking for him, but for a tiny sample of what else is out there, ctrl+F "ordinary"

If the funder comes through I'll consider a second review post I think

Being named isn't meant as an honorific btw, just a basic aid to the reader orienting.

Ta! 

I've added a line about the ecosystems. Nothing else in the umbrella strikes me as direct work (Public AI is cool but not alignment research afaict). (I liked your active inference paper btw, see ACS.)

A quick look suggests that the stable equilibrium things aren't in scope - not because they're outgroup but because this post is already unmanageable without handling policy, governance, political economy and ideology. The accusation of site bias against social context or mechanism was perfectly true last year, but no longer, and my personal scoping ... (read more)

5Roman Leventov
I'm talking about science of governance, digitalised governance, and theories of contracting, rather than the not-so-technical object-level policy and governance work that is currently done at institutions. And this is absolutely not to the detriment of that work, but just as a selection criterion for this post, which could decide to focus on technical agendas that technical visitors of LW may contribute to. The view that there is a sharp divide between "AGI-level safety" and "near-term AI safety and ethics" is itself controversial, e.g., Scott Aaronson doesn't share it. I guess this isn't a justification for including all AI ethics work that is happening, but of the NSF projects, definitely more than one (actually, most of them) appear to me upon reading abstracts as potentially relevant for AGI safety. Note that this grant program of NSF is in a partnership with Open Philanthropy and OpenPhil staff participate in the evaluation of the projects. So, I don't think they would select a lot of projects irrelevant for AGI safety.

I like this. It's like a structural version of control evaluations. Will think where to put it in

3LawrenceC
Expanding on this -- this whole area is probably best known as "AI Control", and I'd lump it under "Control the thing" as its own category. I'd also move Control Evals to this category as well, though someone at RR would know better than I. 

One big omission is Bengio's new stuff, but the talk wasn't very precise. Sounds like Russell:

With a causal and Bayesian model-based agent interpreting human expressions of rewards reflecting latent human preferences, as the amount of compute to approximate the exact Bayesian decisions increases, we increase the probability of safe decisions.

Another angle I couldn't fit in is him wanting to make microscope AI, to decrease our incentive to build agents.

I care a lot! Will probably make a section for this in the main post under "Getting the model to learn what we want", thanks for the correction.

I'm not seeing anything here about the costs of data collection (for licensed stuff) or curation (probably hundreds of thousands of cheap hours?), apart from one bullet on OAI's combined costs. As a total outsider I would guess this could move your estimates by 20-100%.

2YafahEdelman
I talk about this in the Granular Analysis subsection, but I'll elaborate a bit here.

* I think that hundreds of thousands of cheap labor hours for curation is a reasonable guess, but this likely comes to under a million dollars in total, which is less than 1% of the total.
* I have not seen any substantial evidence of OpenAI paying for licenses before the training of GPT-4, much less the sort of expenditures that would move the needle on the total cost.
* After training GPT-4 we do see things like a deal between OpenAI and the Associated Press (also see this article on that, which mentions a first mover clause) with costs looking to be in the millions - more than 1% of the cost of GPT-4, but notably it seems that this came after GPT-4. I expect GPT-5, which this sort of deal might be relevant for, to cost substantially more.

It's possible I'm wrong about the timing and substantial deals of this sort were in fact made before GPT-4, but I have not seen substantive evidence of this.
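
(For concreteness, a back-of-envelope version of that first bullet. Every number below is an illustrative assumption, not a figure from the post.)

```python
# Back-of-envelope: does data-curation labor move the needle on total cost?
# All inputs are illustrative assumptions, not reported figures.
curation_hours = 300_000       # "hundreds of thousands of cheap labor hours"
hourly_rate    = 2.50          # USD, outsourced annotation-style labor
curation_cost  = curation_hours * hourly_rate            # 750,000 USD

assumed_total_cost = 100e6     # assumed total training-run cost, USD (illustrative)
share = curation_cost / assumed_total_cost

print(f"curation ~ ${curation_cost:,.0f}, i.e. {share:.2%} "
      f"of an assumed ${assumed_total_cost:,.0f} total")
# -> curation ~ $750,000, i.e. 0.75% of an assumed $100,000,000 total
```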

ICF is the only such mental viz whizz technique that has ever worked for me, and I say that having done CFAR, a dedicated focussing retreat, a weekend vipassana retreat, and a dedicated circling retreat. 

From context I think he meant not fibre laser but "free-space optics", a then-hyped application of lasers to replace radio. I get this from him mentioning it in the same sentence as satellites and then comparing lasers to radio: "A continuing advance of communications satellites, and the use of laser beams for communication in place of electric currents and radio waves. A laser beam of visible light is made up of waves that are millions of times shorter than those of radio waves". So I don't think this rises above the background radiation (ha) of Asimov... (read more)

1simon
While it's not our main communications method, infrared communication is a thing, and it's a lot closer to visible than radio. Also, Elon Musk claims that SpaceX is going to enable laser links for inter-satellite communications between Starlink satellites soon (admittedly, not within the 2020 target year, but this is still pretty close!)

My reading of the context is that screens are supposed to be the predominant form, and cube 3d is a prototype. This seems to be a correct prediction: see "crystal cube" here.

Good reason to apply this with nearly equal intensity to mainstream medical arguments, though. (Applies to a lesser extent to evidence-based places like Cochrane, but sadly still applies.)

Good catch! The book is generally written as the history of the world leading up to 2000, and most of its predictions are about that year. But this is clearly an exception and the section offers nothing more precise than "By the year 3000, then, it may well be that Earth will be only a small part of the human realm." I've moved it to the "nonresolved" tab.

DM me for your bounty ($10)! I added your comment to the changelog. Thanks! 

Data collector here. Strongly agree with your general point: most of these entries are extremely far from modern "clairvoyant" (cleanly resolving) forecasting questions. 
 

Space travel. Disagree. In context he means mass space travel. The relevant lead-up is this: 

"According to her, the Moon is a great place and she wants us to come visit her."

"Not likely!" his wife answers. "Imagine being shut up in an air - conditioned cave."    

"When you are Aunt Jane's age, my honey lamb, and as frail as she is, with a bad heart thrown in,

... (read more)
3simon
I would call it a full miss myself.

I still strongly disagree on the commercial interplanetary travel meaning. If "Cash on Delivery" has that old-timey meaning, it could push a bit to your interpretation, but not enough IMO.

My reasoning: Actual interplanetary travel, or say a trip on a spaceship, cannot literally be waiting at your front door. So clearly, a metaphorical meaning is intended. Here he extends the metaphor. But, in your view, that means it's cheap. I disagree: if it was cheap he wouldn't need to say "It's yours when you pay for it". Everything has to be paid for. If he meant it was cheap, he would just stop at C.O.D. and not say "It's yours when you pay for it." IMO, the "It's yours when you pay for it" clearly means that he expected it to cost enough that it would be a significant barrier to progress (and the prediction is that it is in effect the only barrier to interplanetary travel). I do suspect though that he did intend the reader to pick up your connotation first, for the shock value, and the "It's yours when you pay for it" is intended to shift the reader to the correct interpretation of what he means by C.O.D., i.e., it's meant to be taken literally within the metaphorical context (and by Gricean implicature a large cost is meant) and not as an additional layer of metaphor.

I suppose the 1965 comments could have been written to retroactively support an interpretation that would make the prediction correct, but I would bet most 1950 readers would have interpreted it as I did. Also, I note that John C. Wright agrees with my interpretation (in your link to support Heinlein being a "dishonest bugger") (I didn't notice anything in that link about him being a dishonest bugger, though - could you elaborate?). Wright also agrees with me on the central piloting prediction; looking briefly through Wright's comments I didn't see any interpretation of Wright's that I disagreed with (I might quibble with some of Wright's scoring, though pr

Is the point that 1) AGI specifically is too weird for normal forecasting to work, or 2) that you don't trust judgmental forecasting in general, or 3) that respectability bias swamps the gains from aggregating a heavily selected crowd, spending more time, and debiasing in other ways?

The OpenPhil longtermists' respectability bias seems fairly small to me; their weirder stuff is comparable to Asimov (but not Clarke, who wrote a whole book about cryptids). 

And against this, you have to factor in the Big Three's huge bias towards being entertaining instea... (read more)

7johnswentworth
The third: respectability bias easily swamps the gains. (I'm not going to try to argue that case here, just give a couple examples of what such tradeoffs look like.) This is much more about the style of analysis/reasoning than about the topics; OpenPhil is certainly willing to explore weird topics.

As an example, let's look at the nanotech risk project you linked to. The very first thing in that write-up is:

So right at the very beginning, we're giving an explicit definition. That's almost always an epistemically bad move. It makes the reasoning about "nanotech" seem more legible, but in actual fact the reasoning in the write-up was based on an intuitive notion of "nanotech", not on this supposed definition. If the author actually wanted to rely on this definition, and not drag in intuitions about nanotech which don't follow from the supposed definition, then the obvious thing to do would be to make up a new word - like "flgurgle" - and give "flgurgle" the definition. And then the whole report could talk about risks from flgurgle, and not have to worry about accidentally dragging in unjustified intuitions about "nanotech".

... of course that would be dumb, and not actually result in a good report, because using explicit definitions is usually a bad idea. Explicit definitions just don't match the way the human brain actually uses words. But a definition does sound very Official and Respectable and Defendable. It's even from an Official Government Source. Starting with a definition is a fine example of making a report more Respectable in a way which makes its epistemics worse.

(The actual thing one should usually do instead of give an explicit definition is say "we're trying to point to a vague cluster of stuff like <list of examples>". And, in fairness, the definition used for nanotech in the report does do that to some extent; it does actually do a decent job avoiding the standard pitfalls of "definitions". But the US National Nanotechnology Initiative's defin

Bentham apparently had a nonzero discount rate (fn6). (He used 5%, but only as an example.)

Mill thought about personal time preference (and was extremely annoyed by people's discount there). Can't see anything about social rate of discounting.
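
(For concreteness, a minimal sketch of what a Bentham-style 5% annual discount does to a util received t years out; the 5% is his illustrative rate, the horizons are my arbitrary picks.)

```python
# Exponential discounting: present value of one util received t years from now,
# at Bentham's illustrative 5% annual rate.
rate = 0.05
for t in (1, 10, 50, 100):
    print(f"t = {t:3d} years: PV = {1 / (1 + rate) ** t:.4f}")
# t =   1 years: PV = 0.9524
# t =  10 years: PV = 0.6139
# t =  50 years: PV = 0.0872
# t = 100 years: PV = 0.0076
```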

3Owain_Evans
I didn't follow the links, but how did Bentham and Mill think about future utility?

Ooh, that's more intense than I realised. There might be plugins for yEd, but I don't know 'em. Maybe Tetrad?

I love Sketchviz for 10 second prototypes, but it requires the DOT language, and if you need very specific label placements it's a nightmare.
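
(If you do end up in DOT-land, generating the source programmatically takes some of the sting out; a minimal sketch with the Python graphviz package, node names made up.)

```python
# Generate DOT source programmatically, then paste it into Sketchviz
# (or render locally if the Graphviz binaries are installed).
import graphviz

g = graphviz.Digraph("sketch", graph_attr={"rankdir": "LR"})
g.edge("idea", "draft")
g.edge("draft", "feedback")
g.edge("feedback", "draft", label="revise")

print(g.source)                    # raw DOT, ready for Sketchviz
# g.render("sketch", format="png") # needs the Graphviz binaries on PATH
```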

For using a mouse, yEd is good. Exports to GraphML for version control.

2JenniferRM
Does yEd have the ability to: (1) treat nodes as having "states" with a default prior probability and then  (2) treat directional node-to-node links as "relevant to reasoning about the states" and then  (3) put in some kind of numbers or formulas inspired by Bayes Rule for each link and then (4) later edit the graph on the fly (with "do()" or "observe()" basically) to clamp some nodes to definite states and then  (5) show all the new state probabilities across all other nodes in the graph?
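
(For reference, the workflow being asked for (priors on nodes, links with conditional probabilities, Bayes-rule updates, observe() vs do(), reading off new states) looks roughly like the hand-rolled sketch below; as far as I know, yEd itself only draws the diagram. All numbers are made up.)

```python
# Minimal hand-rolled Bayes-net sketch: two parents with priors, one child with a
# CPT, then "observe" (condition + renormalise) vs "do" (cut incoming links).
from itertools import product

P_rain = {0: 0.8, 1: 0.2}            # (1) default prior over node states
P_sprinkler = {0: 0.6, 1: 0.4}
P_wet = {                             # (2)+(3) link strengths: P(wet | rain, sprinkler)
    (0, 0): 0.01, (0, 1): 0.90, (1, 0): 0.90, (1, 1): 0.99,
}

def joint(rain, sprinkler, wet):
    p_w = P_wet[(rain, sprinkler)]
    return P_rain[rain] * P_sprinkler[sprinkler] * (p_w if wet else 1 - p_w)

# (4)+(5) observe(): clamp wet=1 by conditioning, propagate via Bayes rule
def p_rain_given_observe_wet():
    num = sum(joint(1, s, 1) for s in (0, 1))
    den = sum(joint(r, s, 1) for r, s in product((0, 1), (0, 1)))
    return num / den

# (4) do(): graph surgery; clamping wet cuts its incoming links, so an
# intervention on the downstream node tells us nothing new about rain
def p_rain_given_do_wet():
    return P_rain[1]

print(f"P(rain | observe wet) = {p_rain_given_observe_wet():.3f}")  # ~0.39
print(f"P(rain | do(wet))     = {p_rain_given_do_wet():.3f}")       # 0.200
```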

Givewell's fine! 

Thanks again for caring about this.

Sounds fine. Just noticed they have a cloth and a surgical treatment. Take the mean?

5Mike Harris
Sure. My current belief state is that cloth masks will reduce case load by ~15% and surgical masks by ~20%. Without altering the bet I'm curious as to what your belief state is.

Great! Comment below if you like this wording and this can be our bond:

"Gavin bets 100 USD to GiveWell, to Mike's 100 USD to GiveWell that the results of NCT04630054 will show a median reduction in Rt > 15.0 % for the effect of a whole population wearing masks [in whatever venues the trial chose to study]."

5Mike Harris
I can't accept the wording because the masking study is not directly measuring Rt. I would prefer this wording "Gavin bets 100 USD to GiveWell, to Mike's 100 USD to GiveWell that the results of NCT04630054 will show a median reduction in cumulative cases > 15.0 % for the effect of a whole population wearing masks [in whatever venues the trial chose to study]."

This is an interesting counterpoint (though I'd like to see a model of CO2 cost vs thinning cost if you have one), and it's funny we happen to have such a qualified person on the thread. But your manner is needlessly condescending and - around here - brandishing credentials as a club will seriously undermine you rather than buttressing you. 

1J Mann
Hah! That is definitely a weakness of my "What does Gelman have to say" strategy.

Stretching the definition of 'substantial' further:

Beth Zero was an ML researcher and Sneerclubber with some things to say. Her blog is down unfortunately but here's her collection of critical people. Here's a flavour of her thoughtful Bulverism. Her post on the uselessness of Solomonoff induction and the dishonesty of pushing it as an answer outside of philosophy was pretty good.

Sadly most of it is against foom, against short timelines, against longtermism, rather than anything specific about the Garrabrant or Demski or Kosoy programmes.

Nostalgebraist (2019) sees it as equivalent to solving large parts of philosophy: a noble but quixotic quest. (He also argues against short timelines but that's tangential here.)

Here is what this ends up looking like: a quest to solve, once and for all, some of the most basic problems of existing and acting among others who are doing the same. Problems like “can anyone ever fully trust anyone else, or their future self, for that matter?” In the case where the “agents” are humans or human groups, problems of this sort have been wrestled with for a long

... (read more)

Huh, works for me. Anyway I'd rather not repeat his nasty slander but "They're [just] a sex cult" is the gist.

1TAG
All very spicy, but it doesn't address the intellectual content of LW at all.
2habryka
(That link just goes to the Google Books page for "The AI Does Not Hate You". Based on the query parameters there is something else you probably wanted to link to, but at least for me it isn't working.)

The received view of him is as just another heartless Conservative with an extra helping of tech fetishism and deceit. In reality he is an odd accelerationist just using the Tories (Ctrl+F "metastasising"). Despite him quoting Yudkowsky in that blog post, and it getting coverage in all the big papers, people don't really link him to LW or rationality, because those aren't legible, even in the country's chattering classes. We are fortunate that he is such a bad writer, so that no one reads his blog.

Here's a speculative rundown of things he probably got impl... (read more)

1TAG
Say more!

Great post. Do you have a sense of

  1. how much of tree success can be explained / replicated by interpretable models;
  2. whether a similar analysis would work for neural nets?

You suggest that trees work so well because they let you charge ahead when you've misspecified your model. But in the biomedical/social domains ML is most often deployed, we are always misspecifying the model. Do you think your new GLM would offer similar idiotproofing?
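
(To gesture at what I mean by (1), here's the kind of head-to-head I have in mind: gradient-boosted trees vs a plain regularised GLM on a tabular dataset. Dataset and settings below are arbitrary illustrative choices, not from the post.)

```python
# Sketch: how much of the tree ensemble's performance does a plain GLM recover?
# Dataset and hyperparameters are arbitrary; only the comparison pattern matters.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

models = {
    "gradient-boosted trees": GradientBoostingClassifier(random_state=0),
    "regularised GLM (logistic)": make_pipeline(
        StandardScaler(), LogisticRegression(C=1.0, max_iter=5000)
    ),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:28s} AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```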

Yeah, the definition of evidence you use (that results must single out only one hypothesis) is quite strong, what people call "crucial" evidence.

https://en.m.wikipedia.org/wiki/Experimentum_crucis
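
(Concretely, on the standard Bayesian reading a result counts as evidence for a hypothesis whenever the hypotheses assign it different likelihoods, even if nothing gets ruled out. Toy numbers below are made up.)

```python
# Toy Bayesian update: the result eliminates no hypothesis, yet it is still
# evidence, because the hypotheses assign it different likelihoods.
priors      = {"H1": 0.5, "H2": 0.3, "H3": 0.2}
likelihoods = {"H1": 0.8, "H2": 0.4, "H3": 0.1}   # P(result | H), all nonzero

normaliser = sum(priors[h] * likelihoods[h] for h in priors)
posteriors = {h: priors[h] * likelihoods[h] / normaliser for h in priors}
for h in priors:
    print(f"{h}: prior {priors[h]:.2f} -> posterior {posteriors[h]:.2f}")
# H1: 0.50 -> 0.74   H2: 0.30 -> 0.22   H3: 0.20 -> 0.04
# A "crucial" result would instead need likelihood ~0 for all but one hypothesis.
```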

1misabella16
THANK YOU!

I suspect there is no general way. ): Even the academic reviews tend to cherry-pick one or two flaws and gesture at the rest.

Partial solutions:

  1. Invest the time to follow the minority of Goodreads users who know their stuff. (Link is people I follow.)
  2. See if Stuart Ritchie has reviewed it for money.

The Economist ($) for non-Western events and live macroeconomics. They generally foreground the most important thing that happens every week, wherever it happens to occur. They pack the gist into a two-page summary, "The World this Week". Their slant is pro-market, pro-democracy, pro-welfare, pro-rights; it rarely gets in the way. The obituaries are often extremely moving.

https://www.economist.com/the-world-this-week/

Raised in the old guard, Chalmers doesn't understand...

This amused me, given that in the 90s he was considered an outsider and an upstart, coming round here with his cognitive science, shaking things up. (" 'The Conscious Mind' is a stimulating, provocative and agenda-setting demolition-job on the ideology of scientific materialism. It is also an erudite, urbane and surprisingly readable plea for a non-reductive functionalist account of mind. It poses some formidable challenges to the tenets of mainstream materialism and its cognitivist offshoots" )

Not

... (read more)

I did a full accounting, including vague cost-benefit ranking:

https://www.gleech.org/stuff

Ignoring the free ones, which you should just go and get now, I think the best are:

  • Sweet Dreams Contoured sleep mask. Massively improved sleep quality, without having to alter the room, close the windows, whatever. 100:1.

  • Bowflex SelectTech dumbbells. A cheap gym membership is £150 a year; using these a couple times a week for 2 years means I’ve saved hundreds of pounds and dozens of hours commuting. They should last 15 years, so maybe total 30:1. (During the pre

... (read more)
1Brendan Long
I have three of the things you mention and would immediately buy them again if necessary:

* A sleep mask (I've actually bought several of these because I lost them). Mine is the "Alaska Bear Natural Silk Sleep Mask" since I sleep on my sides sometimes and find the flat kind more comfortable.
* Adjustable dumbbells. I have the "Ironmaster 45 lb Quick-Lock Adjustable Dumbbell System", which gets points for durability, but I actually wish I had bought the Bowflex ones since they can be adjusted much more easily.
* Bose Quietcomfort headphones. I have the bluetooth version and actually really like it, since I only need to recharge them every few days and they're nicer to use when I'm walking around. Note that you can sometimes get these for massive discounts on ebay. I bought my Bose Quietcomfort 35's for $120.
4Mark Xu
For the people who don't know acronyms (me), RSI stands for repetitive strain injury.