in case i forgot last month, here's a link to july
One proof of concept for the GSAI stack would be a well-understood mechanical engineering domain automated to the next level and certified to boot. How about locks? Needs a model of basic physics, terms in some logic for all the parts and how they compose, and some test harnesses that simulate an adversary. Can you design and manufacture a provably unpickable lock?
Zac Hatfield-Dodds (of hypothesis/pytest and Anthropic, was offered and declined authorship on the GSAI position paper) challenged Ben Goldhaber to a bet after Ben coauthored a post with Steve Omohundro. It seems to resolve in 2026 or 2027, and the comment thread should get cleared up once Ben gets back from Burning Man. The arbiter is Raemon from LessWrong.
Zac says you can’t get a provably unpickable lock on this timeline. Zac gave (up to) 10:1 odds, so recall that the bet can be a positive expected value for Ben even if he thinks the event is most likely not going to happen.
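To make the expected-value point concrete, here is a minimal sketch. It assumes the simplest reading of 10:1 odds, with Ben staking $1,000 against a $10,000 payout; the stake and the function names are assumptions for illustration, not the actual terms of the bet.

```python
from fractions import Fraction


def expected_value(p_win, payout, stake):
    """Expected profit for a bettor who wins `payout` with probability
    `p_win` and otherwise loses `stake`."""
    return p_win * payout - (1 - p_win) * stake


def break_even_probability(payout, stake):
    """Probability of winning at which the bet has zero expected value."""
    return Fraction(stake, payout + stake)


PAYOUT = 10_000  # what Zac pays if the lock gets built
STAKE = 1_000    # assumed stake for Ben at 10:1 (hypothetical)

threshold = break_even_probability(PAYOUT, STAKE)
ev_at_20_percent = expected_value(Fraction(1, 5), PAYOUT, STAKE)
```

Under these assumed stakes, any credence above the break-even threshold makes the bet worth taking, even though that credence can sit far below 50%.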
For funsies, let’s map out one path of what has to happen for Zac to pay Ben $10k. This is not the canonical path, but it is a path:
Arguments take place in 5 parts.
- Claim: What do you want me to believe?
- Reasons: Why should I agree?
- Evidence: How do you know? Can you back it up?
- Acknowledgment and Response: But what about ... ?
- Warrant: How does that follow?
This can be modeled as a conversation with readers, where the reader prompts the writer to take the next step on the list.
Claims ought to be supported with reasons. Reasons ought to be based on evidence. Arguments are recursive: a part of an argument is an acknowledgment of an anticipated response, and another argument addresses that response. Finally, when the distance between a claim and a reason grows large, we draw connections with something called warrants.
The logic of warrants proceeds in generalities and instances. A general circumstance predictably leads to a general consequence, and if you have an instance of the circumstance you can infer an instance of the consequence.
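Read as logic, a warrant looks like a universally quantified implication, and using it is just instantiation plus modus ponens. A small Lean 4 sketch of that reading follows; the names `Circumstance`, `Consequence`, and `warrant` are illustrative labels, not terms from the book.

```lean
-- The warrant: whenever the general circumstance holds, so does the general consequence.
-- Given one instance of the circumstance, we infer that instance of the consequence.
example {α : Type} (Circumstance Consequence : α → Prop)
    (warrant : ∀ x, Circumstance x → Consequence x)
    (a : α) (h : Circumstance a) : Consequence a :=
  warrant a h
```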
Arguing in real life papers is complexified from the 5 steps, because
Thinking about a top-level post on FOMO and research taste
Idk maybe this shortform is most of the value of the top level post
A trans woman told me
I get to have all these talkative blowhard traits and no one will punish me for it cuz I'm a girl. This is one major reason detrans would make my life worse. Society is so cruel to men, it sucks so much for them
And another trans woman had told me almost the exact same thing a couple months ago.
My take is that roles have upsides and downsides, and that you'll do a bad job if you try to say one role is better or worse than another on net or say that a role is more downside than upside. Also, there are versions of "women talk too much" as a stereotype in many subcultures, but I don't have a good inside view about it.
Getting the easy things right shows respect for your readers and is the best training for dealing with the hard things.
If they don't believe the evidence, they'll reject the reasons and, with them, your claim.
We saw previously that claims ought to be supported with reasons, and reasons ought to be based on evidence. Now we will look closer at reasons and evidence.
Reasons must be in a clear, logical order. Atomically, readers need to buy each of your reasons, but compositionally they need to buy your logic. Storyboarding is a useful technique for arranging reasons into a logical order: physical arrangements of index cards, or some DAG-like syntax. Here, you can list evidence you have for each reason or, if you're speculating, list the kind of evidence you would need.
When storyboarding, you want to read out the top level reasons as a composite entity without looking at the details (evidence), because you want to make sure the high-level logic makes sense.
...Readers will not accept a reason until they see it anchored in what they consider to be a bedrock of established fact. ... To count as evidence, a statement must report something tha
Primary sources provide you with the "raw data" or evidence you will use to develop, test, and ultimately justify your hypothesis or claim. Secondary sources are books, articles, or reports that are based on primary sources and are intended for scholarly or professional audiences. Tertiary sources are books and articles that synthesize and report on secondary sources for general readers, such as textbooks, articles in encyclopedias, and articles in mass-circulation publications.
The distinction between primary and secondary sources comes from 19th century historians, and the idea of tertiary sources came later. The boundaries can be fuzzy, and are certainly dependent on the task at hand.
I want to reason about what these distinctions look like in the alignment community, and whether or not they're important.
The rest of chapter five is about how to use libraries and information technologies, and evaluating sources for relevance and reliability.
Chapter 6 starts off with the kind of thing you should be looking for while you read
Yesterday I quit my job for direct work on epistemic public goods! Day one of direct work trial offer is April 4th, and it'll take 6 weeks after that to know if I'm a fulltime hire.
I'm turning down
for
did anyone draw up an estimate of how much the proportion of code written by LLMs will increase? or even what the proportion is today
I think this is a crucial part of a lot of psychological maladaptation and social dysfunction, very salient to EAs. If you're way more trait xyz than anyone you know for most of your life, your behavior and mindset will be massively affected, and depending on when in life / how much inertia you've accumulated by the time you end up in a different room where suddenly you're average on xyz, you might lose out on a ton of opportunities for growth.
In other words, the concept of "big fish small pond" is deeply insightful a...
I think a property of my theory of change is that academic and commercial speed is a bottleneck. I recently realized that my mass assignment for timelines synchronized with my mass assignment for the prosaic/nonprosaic axis. The basic idea is that let's say a radical new paper that blows up and supplants the entire optimization literature gets pushed to the arxiv tomorrow, signaling the start of some paradigm that we would call nonprosaic. The lag time for academics and industry to figure out what's going on, fi...
I feel like 7 years from AlexNet to the world of PyTorch, TPUs, tons of ML MOOCs, billion-parameter models, etc. is strong evidence against what you're saying, right? Or were deep neural nets already a big and hot and active ecosystem even before AlexNet, more than I realize? (I wasn't paying attention at the time.)
Moreover, even if not all the infrastructure of deep neural nets transfers to a new family of ML algorithms, much of it will. For example, the building up of people and money in ML, the building up of GPU / ASIC servers and the tools to use them, the normalization of the idea that it’s reasonable to invest millions of dollars to train one model and to fab ASICs tailored to a particular ML algorithm, the proliferation of expertise related to parallelization and hardware-acceleration, etc. So if it took 7 years from AlexNet to smooth turnkey industrial-scale deep neural nets and billion-parameter models and zillions of people trained to use them, then I think we can guess <7 years to get from a different family of learning algorithms to the analogous situation. Right? Or where do you disagree?
Cope isn't a very useful concept.
For every person who has a bad reason that they catch because you say "sounds like cope", there are 10x as many people who find their reason actually compelling. Saying "if that was my reason it would be a sign I was in denial of how hard I was coping" or "I don't think that reason is compelling" isn't really relevant to the person you're ostensibly talking to, who's trying to make the best decisions for the best reasons. Just say you don't understand why the reason is compelling.
I asked a friend whether I should TA for a codeschool called ${{codeschool}}.
You shouldn't hang around ${{codeschool}}. People at ${{codeschool}} are not pursuing excellence.
A hidden claim there that I would soak up the pursuit of non-excellence by proximity or osmosis isn't what's interesting (though I could see that turning out either way). What's interesting is the value of non-excellence, which I'll call adequacy.
${{codeschool}} in this case is effective and impactful at putting butts in seats at companies, and is thereby re...
I used to think "community builder" was a personality trait I couldn't switch off, but once I moved to the bay I realized that I was just desperate for serendipity and knew how to take it from 0% to 1%. Since the bay is constantly humming at 70-90% serendipity, I simply lost the urge to contribute.
Benefactors are so over / beneficiaries are so back / etc.
Let FairBot be the player that sends an opponent to Cooperate (C) if it is provable that they cooperate with FairBot, and sends them to Defect (D) otherwise.
Let FairBot_k be the player that searches for proofs of length <= k that its input cooperates with FairBot_k, and cooperates if it finds one, returning defect if all the proofs of length <= k are exhausted without one being valid.
Critch writes that "100%" of the time, mathematicians and computer scientists report believing that FairBot_k(FairBot_k) = D, owing to the basic vision of a stack overf...
I find myself, just as a random guy, deeply impressed at the operational competence of airports and hospitals. Any good books about that sort of thing?
Consider politics. You should take your political preferences/aesthetics, go to the tribes that are based on them, and help them be more sane. In the politics example, everyone's favorite tribe has failure modes, and it is sort of the responsibility of the clearest-headed members of that tribe to make sure that those failure modes don't become the dominant force of that tribe.
Speaking for myself, having been deeply in an activist tribe before I was a rat/EA, I regret I wasn't there to hel...
Broadly, the two kinds of claims are conceptual and practical.
Conceptual claims ask readers not to act, but to understand. The flavors of conceptual claim are as follows:
There's essentially one flavor of practical claim
If you read between the lines, you might notice that a kind of claim of fact or cause/consequence is that a policy work...
Training for alignment research is one part competence (at math, cs, philosophy) and another part having an inside view / gears-level model of the actual problem. Competence can be outsourced to universities and independent study, but inside view / gears-level model of the actual problem requires community support.
A background assumption I'm working with is that training as a longtermist is not always synchronized with legible-to-academia training. It might be the case that jr researchers oug...
I'm excited for language model interpretability to teach us about the difference between compilers and simulations of compilers. In the sense that chatgpt and I can both predict what a compiler of a suitably popular programming language will do on some input, what's going on there---- surely we're not reimplementing the compiler on our substrate, even in the limit of perfect prediction? Will be an opportunity for a programming language theorist in another year or two of interp progress
In the Safeguarded AI programme thesis[1], proof certificates or certifying algorithms are relied upon in the theory of change. Let's discuss!
From the thesis:
Proof certificates are a quite broad concept, introduced by[33] : a certifying algorithm is defined as one that produces enough metadata about its answer that the answer's correctness can be checked by an algorithm which is so simple that it is easy to understand and to formally verify by hand.
The abstract of citation 33 (McConnell et al)[2]
...A certifying algorithm is an algorithm
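A small illustration of the idea, offered as a sketch rather than anything taken from the thesis or the cited paper: the extended Euclidean algorithm returns a gcd together with coefficients that act as a certificate, and the checker is a few lines of arithmetic. The function names are hypothetical and the sketch assumes positive integer inputs.

```python
def certifying_gcd(a, b):
    """Extended Euclid. Returns (g, x, y) with a*x + b*y == g.
    The pair (x, y) is the certificate that accompanies the answer g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def check_certificate(a, b, g, x, y):
    """The simple checker. If g divides both inputs and g == a*x + b*y,
    then every common divisor of a and b divides g, so g is the gcd."""
    return g > 0 and a % g == 0 and b % g == 0 and a * x + b * y == g


g, x, y = certifying_gcd(240, 46)
accepted = check_certificate(240, 46, g, x, y)
```

The point of the split is that trust moves from the solver to the checker: the loop in `certifying_gcd` could be arbitrarily clever or buggy, and a wrong answer would still be rejected by `check_certificate`.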
any interest in a REMIX study group? in meatspace in berkeley, a few hours a week. https://github.com/redwoodresearch/remix_public/
Any tips for getting out of a "rise to the occasion mindset" and into a "sink to your training" mindset?
I'm usually optimizing for getting the most out of my A-game bursts. I want to start optimizing for my baseline habits, instead. I should cover the B-, C-, and Z-game; the A-game will cover itself.
Mathaphorically, "rising to the occasion" is taking a max of a max, whereas "sinking to the level of your habits" looks like a greatest lower bound.
Yall, this is a rant. It will be sloppy.
I'm really tired of high functioning super smart "autism" like ok we all have madeup diagnoses--- anyone with an IQ slightly above 90 knows that they can learn the slogans to manipulate gatekeepers to get performance enhancement, and they decide not to if they think they're performing well enough already. That doesn't mean "ADHD" describes something in the world. Similarly, there's this drift of "autism" getting more and more popular. It's obnoxious because labels and identities are obnoxious, but I only find it repuls...
For the record, to mods: I waited till after petrov day to answer the poll because my first guess upon receiving a message on petrov day asking me to click something is that I'm being socially engineered. Clicking the next day felt pretty safe.
messy, jotting down notes:
Methods, famously, includes the line "I am a descendant of the line of Bacon", tracing empiricism to either Roger (13th century) or Francis (16th century) (unclear which).
Though a cursory wikiing shows an 11th century figure providing precedents for empiricism! Alhazen or Ibn al-Haytham worked mostly on optics apparently but had some meta-level writings about the scientific method itself. I found this shockingly excellent quote
...The duty of the man who investigates the writings of scientists, if learning the truth is his goal, is to make himself an enemy of a
DM me for invite if you're at all interested in multipolar scenarios, cooperative AI, ARCHES, social applications & governance, computational social choice, heterogeneous takeoff, etc.
(side note I'm also working on figuring out what unipolar worlds and/or homogeneous takeoff worlds imply for MMD research).
Last time we discussed the difference between information and a question or a problem, and I suggested that the novelty-satisfied mode of information presentation isn't as good as addressing actual questions or problems. In chapter 3, which I have not typed up thoughts about, a three step procedure is introduced
Event for the paper club: https://calendar.app.google/2a11YNXUFwzHbT3TA
blurb about the paper in last month's newsletter:
...... If you’re wondering why you just read all that, here’s the juice: often in GSAI position papers there’ll be some reference to expectations that capture “harm” or “safety”. Preexpectations and postexpectations with respect to particular pairs of programs could be a great way to cash this out, cuz we could look at programs as interventions and simulate RCTs (labeling one program
Does anyone use a vim / mouse-minimal browser? I like Tridactyl better than the other one I tried, but it's not great: when there's a vim mode inside a browser window, everything starts to step on each other (like in jupyter, colab, leetcode, codesignal)
I'm halfway through how to measure anything: cybersecurity, which doesn't have a lot of specifics to cybersecurity and mostly reviews the first book. I never finished the first one, and it was about four years ago that I read the parts that I did.
I think for top of the funnel EA recruiting it remains the best and most underrated book. Basically anyone worried about any kind of problem will do better if they read it, and most people in memetically adaptive / commonsensical activist or philanthropic mindsets probably aren't measuring enough.
However, the mate...
We can say "a monotonic map $f : X \to Y$ is a phenomenon of $X$ as observed by $Y$"; then emergence is simply the impreservation of joins.
Given preorders $(X, \le_X)$ and $(Y, \le_Y)$, we say a map $f$ in $X \to Y$ "preserves" joins (which, recall, are least upper bounds) iff $f(a \vee b) \cong f(a) \vee f(b)$, where by "$x \cong y$" we mean $x \le y$ and $y \le x$.
Suppose $f(a)$ is a measurement taken from a particle. We would like for our measurement system to be robust against emergence, which is literally operationalized by measuring one particle, measuring another, t...
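A toy instance of a monotone observation that fails to preserve joins, as a sketch. Everything here is an illustrative assumption: systems are finite sets of particles ordered by inclusion (so the join is union), observations are booleans with False <= True (so the join is `or`), and `observe` is a made-up measurement.

```python
def join(a, b):
    """Join (least upper bound) of two sets ordered by inclusion."""
    return a | b


def observe(system):
    """Monotone map into booleans: is there at least one pair of particles?"""
    return len(system) >= 2


def preserves_join(f, a, b):
    """Compare observing the joined system with joining the observations.
    Booleans form a partial order, so equivalence is plain equality."""
    return f(join(a, b)) == (f(a) or f(b))


particle_1 = frozenset({"p1"})
particle_2 = frozenset({"p2"})

# Neither particle alone shows a pair, but the joined system does.
emergent = not preserves_join(observe, particle_1, particle_2)
```

Measuring each particle separately and combining the results misses what shows up when the combined system is measured, which is the failure the definition is meant to name.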
Jotted down some notes about the law of mad science on the EA Forum. Looks like some pretty interesting open problems in the global priorities, xrisk strategy space. https://forum.effectivealtruism.org/posts/r5GbSZ7dcb6nbuWch/quinn-s-shortform?commentId=DqSh6ifdXpwHgXnCG
Two premises of mine are that I'm more ambitious than nearly everyone I meet in meatspace and normal distributions. This implies that in any relationship, I should expect to be the more ambitious one.
I do aspire to be a nagging voice increasing the ambitions of all my friends. I literally break the ice with acquaintances by asking "how's your master plan going?" because I try to create vibes like we're having coffee in the hallway of a supervillain conference, and I like to also ask "what harder project is your current project a war...
I'm not aware of a literature or a dialogue on what I think is a very crucial divide in longtermism.
In this shortform, I'm going to take a polarity approach. I'm going to bring each pole to its extreme, probably each beyond positions that are actually held, because I think median longtermism or the longtermism described in the Precipice is a kind of average of the two.
Negative longtermism is saying "let's not let some bad stuff happen", namely extinction. It wants to preserve. If nothing gets better for the poor or the an...
Writers can't avoid creating some role for themselves and their readers, planned or not
Before considering the role you're creating for your reader, consider the role you're creating for yourself. Your broad options are the following
He had become so caught up in building sentences that he had almost forgotten the barbaric days when thinking was like a splash of color landing on a page.
an $R$-valued quantifier is any function $(X \to R) \to R$, so when $R$ is bool, quantifiers are the functions that take predicates as input and return bool as output (same for prop). the standard max and min functions on arrays count as real-valued quantifiers for some index set $X$.
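A quick sketch of that typing, treating an array as a function from its index set to values. The helper names and the sample data are made up for illustration.

```python
def max_quantifier(index_set):
    """Real-valued quantifier: takes a function from indices to reals
    and returns a real."""
    return lambda p: max(p(i) for i in index_set)


def min_quantifier(index_set):
    return lambda p: min(p(i) for i in index_set)


def forall(index_set):
    """bool-valued quantifier: takes a predicate and returns a bool."""
    return lambda p: all(p(i) for i in index_set)


def exists(index_set):
    return lambda p: any(p(i) for i in index_set)


scores = [3.0, 1.5, 4.0, 1.0, 5.5]
indices = range(len(scores))


def lookup(i):
    """The array viewed as a function from the index set to the reals."""
    return scores[i]


biggest = max_quantifier(indices)(lookup)
smallest = min_quantifier(indices)(lookup)
all_positive = forall(indices)(lambda i: scores[i] > 0)
some_above_five = exists(indices)(lambda i: scores[i] > 5)
```

All four share one shape: fix the index set, take a function out of it, and return a single value in the codomain.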
I thought I had seen forall as the max of the Prop-valued quantifiers, and exists as the min somewhere, which has a nice mindfeel since forall has this "big" feeling (if you determined for that (of which is just syntax sugar since the variable name is irrelevant) by exhaustive ...
claude and chatgpt are pretty good at ingesting textbooks and papers and making org-drill cards.
here's my system prompt https://chat.openai.com/g/g-rgeaNP1lO-org-drill-card-creator though i usually tune it a little further per session.
Here are takes on the idea from the anki ecosystem
I tried a little ankigpt and it was fine, i haven't tried the direct plugin from ankiweb. I'm opting for org-drill here cuz I really like plaintext.
consider how our nonconstructive existence proof of nash equilibria creates an algorithmic search problem, which we then study with computational complexity. For example, 2-player 0-sum games are in P (via linear programming), but finding an equilibrium of a general-sum game is PPAD-complete already for two players (rather than NP-hard). I wonder if every nonconstructive existence proof is like this? In the sense of inducing a computational complexity exercise to find what class it's in, before coming up with greedy heuristics to accomplish an approximate example in practice.
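To make "existence proof creates a search problem" tangible, here is a sketch of the most naive search: brute force over pure strategy profiles. This is an illustrative assumption, not a general equilibrium solver; the existence theorem is about mixed strategies, so this search can come back empty, and it is exponential in the number of players.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """payoffs maps each action profile (a tuple, one action per player)
    to a tuple of payoffs (one per player). Returns every profile where
    no single player gains by deviating unilaterally."""
    n_players = len(next(iter(payoffs)))
    actions = [sorted({profile[i] for profile in payoffs}) for i in range(n_players)]
    equilibria = []
    for profile in product(*actions):
        stable = True
        for i in range(n_players):
            for alternative in actions[i]:
                deviation = profile[:i] + (alternative,) + profile[i + 1:]
                if payoffs[deviation][i] > payoffs[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


# Actions: 0 = cooperate, 1 = defect.
prisoners_dilemma = {
    (0, 0): (3, 3), (0, 1): (0, 5),
    (1, 0): (5, 0), (1, 1): (1, 1),
}
# Actions: 0 = heads, 1 = tails. Zero-sum, no pure equilibrium.
matching_pennies = {
    (0, 0): (1, -1), (0, 1): (-1, 1),
    (1, 0): (-1, 1), (1, 1): (1, -1),
}

pd_result = pure_nash_equilibria(prisoners_dilemma)
mp_result = pure_nash_equilibria(matching_pennies)
```

Matching pennies is the useful case here: the theorem guarantees an equilibrium exists, but exhaustive search over pure profiles does not find it, so the proof hands you a search problem without handing you the search.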
I like thinking about "what it feels like to write computer programs if you're a transformer".
Does anyone have a sense of how to benchmark or falsify Nostalgebraist's post on the subject?
Quick version of conversations I keep having, might be worth a top level effortpost.
whistleblower protections at large firms, dating, project management and internal company politics--- all userbases with underserved opinions about transparency. Manifold could pivot to this but have a lot of other stuff they could do instead.
Think about how slack admins are confused about how to prevent some usergroups from @channel and discord admins aren't.
There's a somewhat niche CS subtopic that a friend wants to learn, I'm really well positioned to teach her. More discussion on the manifold bounty:
When you see a new intricate discipline, and you're reticent to invest in navigating it, asking to be convinced that your attention has been earned is fine, but I don't recall seeing a valid or interesting complaint about jargon that deviates from this.
Some elaboration here
There's a remarkable TNG episode about enfeeblement and paul-based threatmodels, if I recall correctly.
There's a post-scarcity planet with some sort of Engine of Prosperity in the townsquare, and it doesn't require maintenance for enough generations that engineering itself is a lost oral tradition. Then it starts showing signs of wear and tear...
If paul was writing this story, they would die. I think in the actual episode, there's a disagreeable autistic teenager who expresses curiosity about the Engine mechanisms, and the grownups basically shame him, lik...
We need a cool one-word snappy thing to say for "just what do you think you know and how do you think you know it" or like "I'm requesting more background about this belief you've stated, if you have time".
I want something that has the same mouthfeel as "roll to disbelieve" for this.
Is there an EV monad? I'm inclined to think there is not, because EV(EV(X)) is a way simpler structure than a "flatmap" analogue.
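One possible framing, offered as a sketch rather than an answer: the flatmap lives on finite probability distributions, and expected value is a function out of that structure which collapses nested layers (the tower property). The dict-based representation and the two-stage example below are assumptions for illustration.

```python
from fractions import Fraction


def unit(x):
    """Point distribution on x."""
    return {x: Fraction(1)}


def bind(dist, f):
    """flatmap for finite distributions: draw x from dist, then draw from f(x)."""
    out = {}
    for x, px in dist.items():
        for y, py in f(x).items():
            out[y] = out.get(y, Fraction(0)) + px * py
    return out


def ev(dist):
    """Expected value of a finite distribution over numbers."""
    return sum(x * p for x, p in dist.items())


coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}


def second_stage(c):
    """Hypothetical second draw whose distribution depends on the coin."""
    if c == 0:
        return {1: Fraction(1, 2), 2: Fraction(1, 2)}
    return {10: Fraction(1, 4), 20: Fraction(3, 4)}


# EV after flattening the two stages into one distribution.
ev_flat = ev(bind(coin, second_stage))
# EV of the inner EVs, i.e. the EV(EV(X)) shape.
ev_of_evs = ev(bind(coin, lambda c: unit(ev(second_stage(c)))))
```

In this sketch `bind` does the monadic work and `ev` never needs a flatmap of its own, which is at least consistent with the intuition that EV(EV(X)) is the simpler structure.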
Would there be a way of estimating how many people within the amazon organization are fanatical about same day delivery ratio against how many are "just working a job"? Does anyone have a guess? My guess is that an organization of that size with a lot of cash only needs about 50 true fanatics, the rest can be "mere employees". What do yall think?
We need a name for the following heuristic. I think of it as one of those "tribal knowledge" things that gets passed on like an oral tradition without being citeable in the sense of being a part of a literature. If you come up with a name I'll certainly credit you in a top level post!
I heard it from Abram Demski at AISU'21.
Suppose you're either going to end up in world A or world B, and you're uncertain about which one it's going to be. Suppose you can pull lever which will be 100 valuable if you end up in world A, or you can pull lever whi...
I think one of the most crucial meta skills i've developed is honing my sense of who's criticizing me vs. who's complaining.
A criticism is actionable, and often it's implicitly from someone who wants you to win. A complaint is when you can't figure out how you'd actionably fix something or improve based on what you're being told.
This simple binary story is problematic. It can empower you to ignore criticism you don't like by providing a set of excuses, if you're not careful. Sometimes it's operationally impossible to parse out a critic...
hmu for a haskell job in decentralized finance. Super fun zero knowledge proof stuff, great earning to give opportunity.
In game theory, a focal point (or Schelling point) is a solution that people tend to choose by default in the absence of communication. (wikipedia)
Intuitively I think simplicity is a good explanation for a solution being converged upon.
Does anyone have any crisp examples that violate the schelling point - occam's razor correspondence?
My deontologist friend just told me that treating people like investments is no way to live. The benefits of living by that take are that your commitments are more binding, you actually do factor out uncertainty, because when you treat people like investments you always think "well someday I'll no longer be creating value for this person and they'll drop me from their life". It's hard to make long term plans, living like that.
I've kept friends around out of loyalty to what we shared 5-10 years ago w...
I may refine this into a formal bounty at some point.
I'm curious if censorship would actually work in the context of blocking deployment of superpowerful AI systems. Sometimes people will mention "matrix multiplication" as a sort of goofy edge case, which isn't very plausible, but that doesn't mean there couldn't be actual political pressure to censor it. A more plausible example would be attention. Say the government threatens soft power against arxiv if they don't pull attention is all you need, or threatens soft power against harvard if their linguistic...
any literature on estimates of social impact of businesses divided by their valuations?
the idea that dollars are a proxy for social impact is neat, but leaves a lot of room for goodhart and I think it's plausible that they diverge entirely in cases. It would be useful to know, if possible to know, what's going on here.
Why have I heard about Tyson investing into lab grown, but I haven't heard about big oil investing in renewable?
Tyson's basic insight here is not to identify as "an animal agriculture company". Instead, they identify as "a feeding people company". (Which happens to align with doing the right thing, conveniently!)
It seems like big oil is making a tremendous mistake here. Do you think oil execs go around saying "we're an oil company"? When they could instead be going around saying "we're a powering stuff company". Being a powering stuff company means you hav...
I've had a background assumption in my interpretation of and beliefs about reward functions for as long as I can remember (i.e. since first reading the sequences), that I suddenly realized I don't believe is written down. Over the last two years I've gained experience writing coq sufficient to inspire a convenient way of framing it.
A proof engineer calls a proposition computational if its proof can be broken down into parts.
For example, a + (b + c) = (a + b) + c i...
I come to you with a dollar I want to spend on AI. You can allocate p pennies to go to capabilities and 100-p pennies to go to alignment, but only if you know of a project that realizes that allocation. For example, we might think that GAN research sets p = 98 (providing 2 cents to alignment) while interpretability research sets p = 10 (providing 90 cents to alignment).
Is this remotely useful? This is a really rough model (you might think it's more of a venn diagram and that this model doesn't provide a way of reasoning about t...
Three predictable disagreements are
There are roughly two kinds of queries readers will have about your argument
There's a gap in my inside view of the problem: part of me thinks that capabilities progress such as out-of-distribution robustness or the 4 tenets described in open problems in cooperative ai is necessary for AI to be transformative, i.e. a prereq of TAI, and another part of me thinks AI will be xrisky and unstable if it progresses along other aspects but not along the axis of those capabilities.
There's a geometry here of transformative / not transformative cross product with dangerous / not dangerous.
To have an inside view I must be able to adequately navigate between the quadrants with respect to outcomes, interventions, etc.
I was reminiscing about my prediction market failures. The clearest "almost won a lot of mana dollars" (if manifold markets had existed back then) was this executive order. The campaign speeches made it fairly obvious, and I'm still salty about a few idiots telling me "stop being hysterical" when, pre-inauguration, I accused him of being exactly what it says on the tin, even though overall I remember that as a time when my epistemics were way worse than they are now.
However, there d...