I did! And I have in fact read - well, some of :) - the whitepaper. But it still seems weird that it's not possible to increase trust in the third party through financial means or dramatic PR stunts (e.g., the auditor promises to commit seppuku if they are found to have lied).

bgold's Shortform

Ben Goldhaber, 23d

Source needed, but I recall someone on the Community Notes team saying the open source version is very similar to prod, with some small differences (it's difficult to maintain exact compatibility). For the point of the comment and context, I agree open source does a good job of this, though given the number of people on Twitter who still allege it's being manipulated, I think you need some additional juice (a whistleblower prize?).

bgold's Shortform

Ben Goldhaber, 26d

Why are there so few third-party auditors of algorithms? For instance, you could have an auditing agency make specific assertions about what the Twitter algorithm is doing, or whether Community Notes is 'rigged'.

- It could be that the codebase is too large, too many people can make changes, and it's too hard to verify that the algorithm in production is stable. This seems unlikely to me with most modern devops stacks.
- It could be that no one will trust the third-party agency. I guess this seems most likely... but really, have we even tried? Could we not have some group of monk-like auditors who would rather die than lie? (My impression is some cyber professionals have this ethos already.)

If Elon wanted to spend a couple hundred thousand on insanely committed, high-integrity auditors, it'd be a great experiment.

bgold's Shortform

Ben Goldhaber, 1mo

epistemic status: thought about this for like 15 minutes + two deep research reports

A contrarian pick for an underrated technology area is lie detection through brain imaging. It seems like it will become much more robust and ecologically valid through compute-scaled AI techniques, and it's likely to be much better at lie detection than humans are, because we didn't have access to images of the internals of other people's brains in the ancestral environment.

On the surface this seems like it would be transformative: brain-scan key employees to make sure they're not leaking information! Test our leaders for dark triad traits (OK, that's a bit different from specific lies, but still). However, there's a cynical part of me, sounding like some combo of @ozziegooen and Robin Hanson, which notes that we already have methods (like significantly increased surveillance and auditing) that we could use for greater trust and that we don't employ.

So perhaps this won't be used except for the most extreme natsec cases, where there are already norms of investigations and reduced privacy.

Related quicktake: https://www.lesswrong.com/posts/hhbibJGt2aQqKJLb7/shortform-1#25tKsX59yBvNH7yjD

bgold's Shortform

Ben Goldhaber, 1mo

Good points! I agree that actual prototyping is necessary to see if an idea works, and as a demo it can be far more convincing. Especially w/ the decreased cost of building web apps, leveraging them for fast demos of techniques seems valuable.

bgold's Shortform

Ben Goldhaber, 1mo

AI for improving human reasoning seems promising; I'm uncertain whether it makes sense to invest in new custom applications, as maybe improvements in models are going to do a lot of the work.

I'm more bullish on investing in exploration of promising workflows and design patterns. As an example, a series of YouTube videos and writeups on using O3 as a forecasting aid for grantmaking, with demonstrations. Or a set of examples of using LLMs to aid in productive meetings, with a breakdown of the tech used and the social norms that the participants agreed to.
- I think these are much cheaper to do in terms of time and money.
- A lot of epistemics seems to be HCI bottlenecked.
- Good design patterns are easily copyable, which also means they're probably underinvested in relative to their returns.
- Social diffusion of good epistemic practices will not necessarily happen as fast as AI improvements.
- Improving the AIs themselves to be more truth seeking and provide good advice - with good benchmarks - is another avenue.

I imagine a fellowship for prompt engineers and designers, prize competitions, or perhaps retroactive funding for people who have already developed good patterns.

bgold's Shortform

Ben Goldhaber, 1mo

I think people should write a bunch of their own vignettes set in the AI 2027 universe. Small snippets of life predictions as things get crazy, on specific projects that may or may not bend the curve, etc.

Provably Safe AI: Worldview and Projects

Ben Goldhaber, 2mo

FYI @Zac Hatfield-Dodds, my probability has fallen below 10%. I expected at least one relevant physical<>cyber project to have started in the past six months; since it hasn't, I doubt this will make the timeline. While I'm not conceding (because I'm still unsure how far AI uplift alone gets us), it seems right to note the update.

bgold's Shortform

Ben Goldhaber, 2mo

Good to know, thanks for flagging!

bgold's Shortform

Ben Goldhaber, 2mo

Recently learned about acquired savant syndrome. https://en.wikipedia.org/wiki/Jason_Padgett

"After the attack, Padgett felt "off." He assumed it was an effect of the medication he was prescribed; but it was later found that, because of his traumatic brain injury, Padgett had signs of obsessive–compulsive disorder and post-traumatic stress disorder. He also began viewing the world through a figurative lens of mathematical shapes."
"Padgett is one of only 40 people in the world with “acquired savant syndrome,” a condition in which prodigious talents in math, art or music emerge in previously normal individuals following a brain injury or disease."

This makes it seem more likely to me that bio interventions for increasing IQ in adult humans are possible, though likely Algernon's law holds and there's a cost.

h/t @Jesse Hoogland
