I think more leaders of orgs should be trying to shape their organizations' incentives and cultures around the challenges of "crunch time". Examples of this include:
I have more questions than answers, but the background level of stress and disorientation for employees and managers will be rising, especially in AI Safety orgs, and starting to come up w/ contextually true answers (I doubt there's a universal answer) will be important.
Some kind of payment for training data from applications like MSFT rewind does seem fair. I wonder if there will be a lot of growth in jobs where your main task is providing or annotating training data.
I've seen Reddit ads from multiple companies recruiting people to do freelance annotation / high-quality text-data generation.
Related: Reason as memetic immune disorder
I like the idea that having some parts of you protected from yourself makes them indirectly protected from people or memes who have power over you (and want to optimize you for their benefit, not yours). Being irrational is better than being transparently rational when someone is holding a gun to your head. If you could do something, you would be forced to do it (against your interests), so it's better for you if you can't.
But, what now? It seems like rationality and introspection are a bit like defusing a bomb -- great if you can do it perfectly, but it kills you if you only do it halfway.
It reminds me of a fantasy book which had a system of magic where wizards could achieve 4 levels of power. Being known as a 3rd-level wizard was a very bad thing, because all the 4th-level wizards were trying to magically enslave you -- to get rid of a potential competitor, and to gain a powerful slave (I suppose the magical cost of enslaving someone didn't grow proportionally to the victim's level).
To use an analogy, being biologically incapable of reaching the 3rd level of magic might be an evolutionary advantage. But at the same time, it would prevent you from ever reaching the 4th level.
Thanks for including that link - seems right, and reminded me of Scott's old post Epistemic Learned Helplessness
The only difference between their presentation and mine is that I’m saying that for 99% of people, 99% of the time, taking ideas seriously is the wrong strategy
I kinda think this is true, and it's not clear to me from the outset whether you should "go down the path" of getting access to level 3 magic given the negatives.
Probably good heuristics are: proceeding with caution when encountering new or out-there ideas, remembering you always have the right to say no, finding trustworthy guides, etc.
Recently learned about Acquired savant syndrome. https://en.wikipedia.org/wiki/Jason_Padgett
After the attack, Padgett felt "off." He assumed it was an effect of the medication he was prescribed; but it was later found that, because of his traumatic brain injury, Padgett had signs of obsessive–compulsive disorder and post-traumatic stress disorder.[5] He also began viewing the world through a figurative lens of mathematical shapes.
"Padgett is one of only 40 people in the world with “acquired savant syndrome,” a condition in which prodigious talents in math, art or music emerge in previously normal individuals following a brain injury or disease.
This makes it seem more likely to me that bio interventions to increase IQ in adult humans are possible, though likely Algernon's law holds and there's a cost.
h/t @Jesse Hoogland
Previous discussion, comment by A.H.:
Sorry to be a party pooper, but I find the story of Jason Padgett (the guy who 'banged his head and became a math genius') completely unconvincing. From the video that you cite, here is the 'evidence' that he is a 'math genius':
- He tells us, with no context, 'the inner boundary of pi is f(x)=x sin(pi/x)'. Ok! (See the note on this formula after the list.)
- He makes 'math inspired' drawings (some of which admittedly are pretty cool but they're not exactly original) and sells them on his website
- He claims that a physicist (who is not named or interviewed) saw him drawing in the mall, and, on the basis of this, suggested that he study physics.
- He went to 'school' and studied math and physics. He says he started with basic algebra and calculus and apparently 'aced all the classes', but doesn't tell us what level he reached. Graduate? Post-graduate?
- He was 'doing integrals with triangles instead of integrals with rectangles'
- He tells us 'every shape in the universe is a fractal'
- Some fMRI scans were done on his brain which found 'he had conscious access to parts of the brain we don't normally have access to'.
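For what it's worth, the 'inner boundary of pi' formula looks like the classic inscribed-polygon approximation of pi: a regular n-gon inscribed in a circle of diameter 1 has perimeter n sin(pi/n), which approaches the circumference pi from below; this is a textbook calculus limit rather than a novel result:

$$\lim_{n \to \infty} n \sin\!\left(\frac{\pi}{n}\right) \;=\; \lim_{n \to \infty} \pi \cdot \frac{\sin(\pi/n)}{\pi/n} \;=\; \pi$$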
Why are there so few third-party auditors of algorithms? For instance, you could have an auditing agency make specific assertions about what the Twitter algorithm is doing, or whether Community Notes is 'rigged'.
If Elon wanted to spend a couple hundred thousand on insanely committed, high-integrity auditors, it'd be a great experiment.
Community notes is open source. You have to hope that Twitter is actually using the implementation from the open source library, but this would be easy to whistleblow on.
So maybe the general explanation is that most of the time, when the trustworthiness of an algorithm is really important, you open source it?
Your second option seems likely. E.g., did you know Community Notes is open source? Given that information, are you going to even read the associated whitepaper or the issues page?
Even if you do, I think we can still confidently infer very few others reading this will (I know I’m not).
I think the biggest reason is that companies (especially Twitter, but this applies to other places) are currently lying about their algorithms, and thus intentionally don't do third-party audits, to avoid the deception becoming known. (Like another comment mentioned, regarding whether Community Notes' open-source repo is the code actually being used.)
AI for improving human reasoning seems promising; I'm uncertain whether it makes sense to invest in new custom applications, as maybe improvements in models are going to do a lot of the work.
I'm more bullish on investing in exploration of promising workflows and design patterns. As an example, a series of YouTube videos and writeups on using o3 as a forecasting aid for grantmaking, with demonstrations. Or a set of examples of using LLMs to aid in productive meetings, with a breakdown of the tech used and the social norms that the participants agreed to.
- I think these are much cheaper to do in terms of time and money.
- A lot of epistemics seems to be HCI bottlenecked.
- Good design patterns are easily copyable, which also means they're probably underinvested in relative to their returns.
- Social diffusion of good epistemic practices will not necessarily happen as fast as AI improvements.
- Improving the AIs themselves to be more truth seeking and provide good advice - with good benchmarks - is another avenue.
I imagine a fellowship for prompt engineers and designers, prize competitions, or perhaps retroactive funding for people who have already developed good patterns.
Happy to see thinking on this.
I like the idea of getting a lot of small examples of clever uses of LLMs in the wild, especially by particularly clever/experimental people.
I recently made this post to try to gather some of the techniques common around this community.
One issue that I have though is that I'm really unsure what it looks like to promote neat ideas like these, outside of writing long papers or making semi-viral or at least [loved by a narrow community] projects.
The most obvious way is via X/Twitter. But this often requires building an X audience, which few people are good at. Occasionally particularly neat images/clips by new authors go viral, but it's tough.
I'd also flag:
- It's getting cheaper to make web applications.
- I think EA has seen more success with blog posts and web apps than we have with things like [presenting neat ideas in videos/tweets].
- Often, [simple custom applications] are pretty crucial for actually testing out an idea. You can generate wireframes, but this only tells you a very small amount.
I guess what I'm getting at is that I think [web applications] are likely a major part of the solution - but that we should favor experimenting with many small ones, rather than going all-in on 2-4 ideas or so.
Good points! I agree that actual prototyping is necessary to see if an idea works, and as a demo it can be far more convincing. Especially w/ the decreased cost of building web apps, leveraging them for fast demos of techniques seems valuable.
https://www.commerce.gov/sites/default/files/migrated/reports/y2k_1.pdf
I think people should write a bunch of their own vignettes set in the AI 2027 universe. Small slice-of-life snippets as things get crazy, predictions about specific projects that may or may not bend the curve, etc.
epistemic status: thought about this for like 15 minutes + two deep research reports
A contrarian pick for an underrated technology area is lie detection through brain imaging. It seems like it will become much more robust and ecologically valid through compute-scaled AI techniques, and it's likely to be much better at lie detection than humans, because we didn't have access to images of the internals of other people's brains in the ancestral environment.
On the surface this seems like it would be transformative - brain-scan key employees to make sure they're not leaking information! test our leaders for dark-triad traits (ok, that's a bit different from specific lies, but still) - however, there's a cynical part of me that sounds like some combo of @ozziegooen and Robin Hanson, which notes that we have methods now (like significantly increased surveillance and auditing) which we could use for greater trust and which we don't employ.
So perhaps this won't be used except for the most extreme natsec cases, where there are already norms of investigations and reduced privacy.
Related quicktake: https://www.lesswrong.com/posts/hhbibJGt2aQqKJLb7/shortform-1#25tKsX59yBvNH7yjD
however, there's a cynical part of me that sounds like some combo of @ozziegooen and Robin Hanson, which notes that we have methods now (like significantly increased surveillance and auditing) which we could use for greater trust and which we don't employ.
Quick note: I think Robin Hanson is more on the side of "we're not doing this because we don't actually care". I'm more on the side of, "The technology and infrastructure just isn't good enough."
What I mean by that is that I think it's possible to get many of the benefits of surveillance with minimal costs, using a combination of Structured Transparency and better institutions. This would be a software+governance challenge.
Depression as a concept doesn't make sense to me. Why on earth would it be fitness-enhancing to have a state of withdrawal, retreat, and collapse, where a lack of energy prevents you from trying new things? I've brainstormed a number of explanations:
I'm partial to the explanation offered by the Predictive Processing Model, that depression is an extreme form of low confidence. As SSC writes:
imagine the world’s most unsuccessful entrepreneur. Every company they make flounders and dies. Every stock they pick crashes the next day. Their vacations always get rained-out, their dates always end up with the other person leaving halfway through and sticking them with the bill.
What if your job is advising this guy? If they’re thinking of starting a new company, your advice is “Be really careful – you should know it’ll probably go badly”.
if sadness were a way of saying “Things are going pretty badly, maybe be less confident and don’t start any new projects”, that would be useful...
Depression isn’t normal sadness. But if normal sadness lowers neural confidence a little, maybe depression is the pathological result of biological processes that lower neural confidence.
But I still don't understand why the behaviors we often see with depression - isolation, lack of energy - are 'longterm adaptive'. If a particular policy isn't working, I'd expect to see more energy going into experimentation.
[TK. Unfinished because I accidentally clicked submit and haven't finished editing the full comment]
I think you're asking too much of evolutionary theory here. Human bodies do lots of things that aren't longterm adaptive -- for example, if you stab them hard enough, all the blood falls out and they die. One could interpret the subsequent shock, anemia, etc. as having some fitness-enhancing purpose, but really the whole thing is a hard-to-fix bug in body design: if there were mutant humans whose blood more reliably stayed inside them, their mutation would quickly reach fixation in the early ancestral environment.
We understand blood and wound healing well enough to know that no such mutation can exist: there aren't any small, incrementally-beneficial changes which can produce that result. In the same way, it shouldn't be confusing that depression is maladaptive; you should only be confused if it's both maladaptive and easy to improve on. Intuitively it feels like it should be -- just pick different policies -- but that intuition isn't rooted in fine-grained understanding of the brain and you shouldn't let it affect your beliefs.
On a group selection level it might make lots more sense to have certain people get into states where they're very unlikely to procreate.
One of the findings of data-driven models of evolution from the last few decades is that group selection mostly isn't strong enough to create effects.
My views come more from listening to experts than from looking at specifics. When I studied bioinformatics, that's basically what they told us about the results of researching genetics with computer models. Afterwards, when talking to experts, I heard the same sentiment: that most claims of group selection shouldn't be trusted.
I too have heard that group selection is not widely accepted; it just seems so out of sync with my understanding of systems theory that I'm skeptical about taking people's word for it.
Since we can sequence genomes, we know how many changes need to happen to account for the differences between organisms. We know that genetic drift destroys features for which there isn't selection pressure to keep them, like our ability to make our own vitamin C.
It seems to me like the moving pieces needed for computer models are there, so I would trust expert opinion on the topic more strongly than would have been warranted 30 years ago, when opinions were mostly based on intellectual arguments.
Is the clearest "win" of a LW meme the rise of the term "virtue signaling"? On the one hand I'm impressed w/ how dominant it has become in the discourse, on the other... maybe our comparative advantage is creating really sharp symmetric weapons...
Do I understand it correctly that you believe the words "virtue signaling", or at least their frequent use, originated on LW? What is your evidence for this? (Do you have a link to what appears to be the first use?)
In my opinion, Robin Hanson is a more likely suspect, because he talks about signaling all the time. But I would not be surprised to hear that someone else used that idiom first, maybe decades ago.
In other words, is there anything more than "I heard about 'virtue signaling' first on LW"?
https://twitter.com/esyudkowsky/status/910941417928777728
I remember seeing other claims/analysis of this but don't remember where
When EY says 'our community' he means more than just LW; he means the whole rationalist diaspora as well, to which Robin Hanson can be counted as belonging.
What are the bottlenecks preventing 10x-100x scaling of Control Evaluations?
Are there others?
Re the acquired savant syndrome quick take above: sounds like synesthesia.