cherrvak
cherrvak has not written any posts yet.

David Chapman actually uses social media recommendation algorithms as a central example of AI that is already dangerous: https://betterwithout.ai/apocalypse-now
I shared a review in some private channels; might as well share it here:
The book positions itself as a middle ground between optimistic capabilities researchers striding blithely into near-certain catastrophe and pessimistic alignment researchers too concerned with dramatic abstract doom scenarios to address more realistic harms that can still be averted. When addressing the latter, Chapman constructs a hypothetical "AI goes FOOM and unleashes nanomachine death" scenario and argues that while alignment researchers are correct that we have no capacity to prevent this awful scenario, it relies on many leaps (very fast bootstrapped self-optimization, solving physics in seconds, nanomachines) that provoke skepticism. I'm inclined to agree: I know that the common line is...
It is possible that the outlier dimensions are related to the LayerNorms, since the LayerNorm gain and bias parameters often also have outlier dimensions and depart quite strongly from Gaussian statistics.
This reminds me of a LessWrong comment that I saw a few months ago:
I think at least some GPT2 models have a really high-magnitude direction in their residual stream that might be used to preserve some scale information after LayerNorm.
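If I wanted to poke at this myself, a rough sketch of how one might check both claims on GPT-2 Small with Hugging Face transformers could look like the following (the layer index, the prompt, and the cutoff for calling something an "outlier" are arbitrary choices of mine, not anything from the paper or the quoted comment):

```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

model = GPT2Model.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# 1. Outlier dimensions in the LayerNorm gain/bias parameters
#    (ln_1 and ln_2 in each block, plus the final ln_f).
for name, param in model.named_parameters():
    if "ln_" in name:
        p = param.detach()
        ratio = (p.abs().max() / p.abs().mean()).item()
        if ratio > 10:  # arbitrary cutoff for "outlier"
            print(f"{name}: max/mean |value| ratio = {ratio:.1f}")

# 2. A high-magnitude direction in the residual stream on an arbitrary prompt.
inputs = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)
resid = out.hidden_states[6][0]          # residual stream after block 6, shape (seq, 768)
per_dim = resid.abs().max(dim=0).values  # largest magnitude each dimension reaches
top = per_dim.topk(5)
print("top residual-stream dimensions:", top.indices.tolist())
print("their magnitudes:", [round(v, 1) for v in top.values.tolist()])
```

If the stories are related, I'd expect the dimensions flagged in step 1 to overlap with the high-magnitude residual dimensions from step 2, but that overlap is a guess, not something I've verified.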
I am surprised that these issues would apply to, say, Google Translate. Google appears unconstrained by cost or shortage of knowledgeable engineers. If Google developed a better translation model, I would expect to see it quickly integrated into the current translation interface. If some external group developed better translation models, I would expect to see them quickly acquired by Google.
Why haven't they switched to newer models?
I thoroughly enjoyed this paper, and would very much like to see the same experiment performed on language models in the billion-parameter range. Would you expect the results to change, and how?
AIs that are superhuman at just about any task we can (or simply bother to) define a benchmark for
Something that I’m really confused about: what is the state of machine translation? It seems like there is massive incentive to create flawless translation models. Yet when I interact with Google Translate or Twitter’s translation feature, results are not great. Are there flawless translation models that I’m not aware of? If not, why is translation lagging behind other text analysis and generation tasks?
Thank you for clarifying your intended point. I agree with the argument that playful thinking is intrinsically valuable, but still hold that the point would have been better-reinforced by including some non-mathematical examples.
I literally don’t believe this
Here are two personal examples of playful thinking without obvious applications to working on alignment:
I agree. It seems awfully convenient that all of the “fun” described in this post involves the legibly impressive topics of physics and mathematics. Most people, even highly technically competent people, aren’t intrinsically drawn to play with intellectually prestigious tasks. They find fun in sports, drawing, dancing, etc. Even when they adopt an attitude of intellectual inquiry to their play, the insights generated from drawing techniques or dance moves are far less obviously applicable to working on alignment than the insights generated from studying physics. My concern is that presenting such a safely “useful” picture of fun undercuts the message to “follow your playful impulses, even if they’re silly”, because the implicit signal is “and btw, this is how smart people like us should be having fun B)”
A paper from 2023 exploits differences in full-precision and int8 inference to create a compromised model which only activates its backdoor post-quantization.
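To make the mechanism concrete, here's a toy numeric sketch (my own illustration under assumed symmetric per-tensor int8 quantization, not the paper's actual construction): the weights are hand-picked so the neuron's pre-activation is slightly negative in float32, but every small weight rounds up during quantization, pushing the sum past the threshold only in the quantized model.

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> np.ndarray:
    """Symmetric per-tensor int8 quantization, then dequantization."""
    scale = np.abs(w).max() / 127.0
    return np.round(w / scale) * scale

# 64-dim weight vector: one large weight pins the quantization scale at 0.01,
# and the remaining 63 weights (0.006) each round up to 0.01, gaining +0.004 apiece.
w = np.full(64, 0.006)
w[0] = 1.27
bias = -1.70
x = np.ones(64)  # the "trigger" input (illustrative)

full_precision = w @ x + bias             # 1.648 - 1.70 = -0.052 -> stays off
post_quant = quantize_int8(w) @ x + bias  # 1.900 - 1.70 = +0.200 -> switches on

for name, val in [("float32", full_precision), ("int8", post_quant)]:
    print(f"{name:>7} pre-activation: {val:+.3f} -> backdoor {'ACTIVE' if val > 0 else 'dormant'}")
```

The actual attack presumably optimizes the full model within the quantization-error budget rather than hand-placing weights like this, but the sign-flip-at-threshold idea is the same.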