CEO at AE Studio
Yes, I am hopeful we have enough time before superintelligent AI systems are created to implement effective alignment approaches. I don't know if that is possible or not, but I think it is worth trying.
Given uncertainty about timelines and currently accelerating capabilities, it would be preferable to live in a world where we make sure alignment advances more than it otherwise would.
I think this is precisely the reason that you’d want to make sure the agent is engineered such that its utility function includes the utility of other agents, i.e., so that the ‘alignment goals’ are its goals rather than ‘goals other than [its] own.’ We suspect that this exact sort of architecture could actually exhibit a negative alignment tax insofar as many other critical social competencies may require this as a foundation.
I think this risks getting into a definitions dispute about what concept the words ‘alignment tax’ should point at. Even if one grants the point about resource allocation being inherently zero-sum, our whole claim here is that some alignment techniques might indeed be the most cost-effective way to improve certain capabilities and that these techniques seem worth pursuing for that very reason.
Thanks for this comment! Definitely take your point that it may be too simplistic to classify entire techniques as exhibiting a negative alignment tax when tweaking the implementation of that technique slightly could feasibly produce misaligned behavior. It does still seem like there might be a relevant distinction between:
We are definitely supportive of approaches that fall under both 1 and 2 (and acknowledge that 1-like approaches would not inherently have negative alignment taxes), but it does seem very likely that there are more undiscovered approaches out there with the general 2-like effect of “technique X got invented for safety reasons—and not only does it clearly help with alignment, but it also helps with other capabilities so much that, even as greedy capitalists, we have no choice but to integrate it into our AI’s architecture to remain competitive!” This seems like a real and entirely possible circumstance where we would want to say that technique X has a negative alignment tax.
Overall, we’re also sensitive to this all becoming a definitions dispute about what exactly is meant by terminology like ‘alignment taxes,’ ‘capabilities,’ etc., and the broader point that, as you put it, “you can advance capabilities and alignment at the same time, and should think about differentially advancing alignment” is indeed a good key general takeaway.
Interesting relevant finding from the alignment researcher + EA survey we ran:
We also find in both datasets—but most dramatically in the EA community sample, plotted below—that respondents vastly overestimate (≈2.5x) how much high intelligence is actually valued, and underestimate other cognitive features like having strong work ethics, abilities to collaborate, and people skills. One potentially clear interpretation of this finding is that EAs/alignment researchers actually believe that high intelligence is necessary but not sufficient for being impactful—but perceive other EAs/alignment researchers as thinking high intelligence is basically sufficient. The community aligning on these questions seems of very high practical importance for hiring/grantmaking criteria and decision-making.
Interesting. I wouldn't totally rule number 1 out though. Depending on how fast things go, the average time to successful IPO may decrease substantially.
Yes, excellent point, and thanks for the callout.
Note, though, that a fundamental part of this is that we at AE Studio do intend eventually to incubate alignment-driven startups as part of our skunkworks program.
We've seen that we can take excellent people, have them grow on client projects for some amount of time, get better at stuff they don't even realize they need to get better at in a very high-accountability way, and then be well positioned to found startups we incubate internally.
We've not yet turned our attention to internally incubated startups for alignment specifically, but we hope to by later this year or early next.
Meanwhile, there are not many orgs like us, and for various reasons it's easier to start a startup than to start something doing what we do.
If you think you can start something like what we do, I'd generally recommend it. You're probably more likely to succeed by starting with something more focused, though.
Also, to start, we flailed a bit till we figured out we should get good at one thing at a time before doing more and more.
We plan to announce further details in a later post.
Reading this post, my immediate hunch is that the decline in sentence lengths has a lot to do with the historical role of Latin grammar and how deeply it influenced educated English writers. Latin inherently facilitates longer, complex sentences due to its use of grammatical inflections, declensions, and verb conjugations, significantly reducing reliance on prepositions and conjunctions. This syntactic flexibility allowed authors to naturally craft extensive yet smooth-flowing sentences. Latin's liberating lack of fixed word order and its fun little rhetorical devices combine to support nuanced, flexible thinking. From my own experience studying Latin from 7th through 12th grade, I find this sort of stuff contributes significantly to freer, more expansive expression when writing or speaking in English, and I can often tell immediately when I am speaking with, or reading something written by, someone else who studied Latin. An easy "tell" is when they say "having done x."
Educated English writers historically learned Latin as a foundational part of their education, internalizing this syntactic complexity. As a result, English prose from authors like Chaucer, Samuel Johnson, and Henry James shows a clear preference for hypotaxis, complex sentences with nested subordinate clauses, rather than simpler paratactic structures consisting of shorter, sequential clauses.
The practical advantage of these complex sentence structures is the precise communication of nuanced and sophisticated ideas. Longer sentences enabled authors to maintain coherent, detailed arguments and descriptions within a single cohesive thought. I see this as reflecting "transcription fluency," where authors aim for fidelity in translating their complex internal thought processes directly into prose, trusting readers’ intelligence and attention span to engage deeply.
Here's a fun example from Thoreau’s "Walden," which makes it clear that such elaborate writing was intended to be understood even by poorer and less formally educated readers. Consider the following (just) two sentences: