While I participated in a previous edition, and somewhat enjoyed it, I couldn't bring myself to support it now considering Remmelt is the organizer, between his anti-AI-art crusades and his overall "stop AI" activism. It's unfortunate, since technical AI safety research is very valuable, but promoting those anti-AI initiatives makes it a probable net negative in my eyes.
Maybe it's better to let AISC die a hero.
Because the same argument could have been made earlier in the "exponential curve". I don't think we should have paused AI (or more broadly CS) in the 50's, and I don't think we should do it now.
Modern misaligned AI systems are good, actually. There's some recent news about Sakana AI developing a system where the agents tried to extend their own runtime by editing their code/config.
This is amazing for safety! Current systems are laughably incapable of posing x-risks. Now, thanks to capabilities research, we have a clear example of behaviour that would be dangerous in a more "serious" system. So we can proceed with empirical research, creating and evaluating methods to deal with this specific risk, so that future systems do not have this failure mode.
The future of AI and AI safety has never been brighter.
Expert opinion is an argument for people who are not themselves particularly informed about the topic. For everyone else, it basically turns into an authority fallacy.
And how would one go about procuring such a rock? Asking for a friend.
The ML researchers saying stuff like AGI is 15 years away have either not carefully thought it through, or are lying to themselves or the survey.
Ah yes, the good ol' "If someone disagrees with me, they must be stupid or lying"
For what it's worth, I think you're approaching this in good faith, which I appreciate. But I also think you're approaching the whole thing from a very, uh, lesswrong.com-y perspective, quietly making assumptions and using concepts that are common here, but not anywhere else.
I won't reply to every individual point, because there's lots of them, so I'm choosing the (subjectively) most important ones.
This is the actual topic. It's the Black Marble thought experiment by Bostrom,
No it's not, and obviously so. The actual topic is AI safety. It's not...
So I genuinely don't want to be mean, but this reminds me why I dislike so much of philosophy, including many chunks of rationalist writing.
This whole proposition is based on vibes, and is obviously false - just for the sake of philosophy, we decide to ignore the "obvious" part and roll with it for fun.
The chair I'm sitting on is finite. I may not be able to draw a specific boundary, but I can have a bounding box the size of the planet, and that's still finite.
My life as a conscious being, as far as I know, is finite. It started some years ago, it will ...
What are the actual costs of running AISC? I participated in it some time ago, and I'm kinda participating this year again (it's complicated). As far as I can tell, the only things required are some amount of organization and maybe a paid Slack workspace. Is this just about salaries for the organizers?
The answer seems to be yes.
On the manifund page it says the following:
Virtual AISC - Budget version
Software etc: $2K
Organiser salaries, 2 ppl, 4 months: $56K
Stipends for participants: $0
Total: $58K
In the Budget version, the organisers do the minimum job required to get the program started, but no continuous support to AISC teams during their projects and no time for evaluations and improvement for future versions of the program. Salaries are calculated based on $7K per person per month.
Based on the minimum threshold of $28k, that woul...
Huh, whaddayaknow, turns out Altman's return was rebuffed in the end, the new interim CEO is someone who is pretty safety-focused, and you were entirely wrong.
Normalize waiting for more details before dropping confident hot takes.
You're not taking your own advice. Since your message, Ilya has publicly backed down, and Polymarket has Sam coming back as CEO at coinflip odds: Polymarket | Sam back as CEO of OpenAI?
The board has backed down after Altman rallied staff into a mass exodus
[citation needed]
I've seen rumors and speculations, but if you're that confident, I hope you have some sources?
(for the record, I don't really buy the rest of the argument either on several levels, but this part stood out to me the most)
I'm never a big fan of this sort of... cognitive rewiring? Juggling definitions? This post reinforces my bias, since it's written from a point of very strong bias itself.
AI optimists think AI will go well and be helpful.
AI pessimists think AI will go poorly and be harmful.
It's not that deep.
The post itself is bordering on insulting anyone who has a different opinion than the author (who, no doubt, would prefer the label "AI strategist" to "AI extremists"). I was thinking about going into the details of why, but honestly... this is unlikely to be pro...
In what sense do you think it will (might) not go well? My guess is that it will not go at all -- some people will show up in the various locations, maybe some local news outlets will pick it up, and within a week it will be forgotten
There's a pretty significant difference here in my view -- "carnists" are not a coherent group, not an ideology, they do not have an agenda (unless we're talking about some very specific industry lobbyists who no doubt exist). They're just people who don't care and eat meat.
Ideological vegans (i.e. not people who just happen to not eat meat, but don't really care either way) are a very specific ideological group, and especially if we qualify them like in this post ("EA vegan advocates"), we can talk about their collective traits.
TBF, the meat/dairy/egg industries are specific groups of people who work pretty hard to increase animal product consumption, and are much better resourced than vegan advocates. I can understand why animal advocacy would develop some pretty aggressive norms in the face of that, and for that reason I consider it kind of beside the point to go after them in the wider world. It would basically be demanding unilateral disarmament from the weaker side.
But the fact that the wider world is so confused there's no point in pushing for truth is the point. EA needs to stay better than that, and part of that is deescalating the arms race when you're inside its boundaries.
Is this surprising though? When I read the title I was thinking "Yea, that seems pretty obvious"
Speaking for myself, I would have confidently predicted the opposite result for the largest models.
My understanding is that LLMs work by building something like a world-model during training by compressing the data into abstractions. I would have expected something like "Tom Cruise's mother is Mary Lee Pfeiffer" to be represented in the model as an abstract association between the names that could then be "decompressed" back into language in a lot of different ways.
The fact that it's apparently represented in the model only as that exact phrase (or maybe a...
Often academics justify this on the grounds that you're receiving more than just monetary benefits: you're receiving mentorship and training. We think the same will be true for these positions.
I don't buy this. I'm actually going through the process of getting a PhD at ~40k USD per year, and one of the main reasons why I'm sticking with it is that after that, I have a solid credential that's recognized worldwide, backed by a recognizable name (i.e. my university and my supervisor). You can't provide either of those things.
This offer seems to take the worst of both worlds between academia and industry, but if you actually find someone good at this rate, good for you I suppose
My point is that your comment was extremely shallow, with a bunch of irrelevant information, and in general plagued with the annoying ultra-polite ChatGPT style - in total, not contributing anything to the conversation. You're now defensive about it and skirting around answering the question in the other comment chain ("my endorsed review"), so you clearly intuitively see that this wasn't a good contribution. Try to look inwards and understand why.
It's really good to see this said out loud. I don't necessarily have a broad overview of the funding field, just my experience of trying to get into it - whether joining established orgs, seeking funding for individual research, or for alignment-adjacent stuff - and ending up in a capabilities research company.
I wonder if this is simply the result of the generally bad SWE/CS market right now. People who would otherwise be in big tech/other AI stuff will be more inclined to do something with alignment. Similarly, if there's less money in tech overall (maybe outside of LLM-based scams), there may be less money for alignment.
Is it a thing now to post LLM-generated comments on LW?
If Orthogonal wants to ever be taken seriously, by far the most important thing is improving the public-facing communication. I invested a more-than-fair amount of time (given the strong prior for "it won't work" with no author credentials, proof-of-concepts, or anything that would quickly nudge that prior) trying to understand QACI, and why it's not just gibberish (both through reading LW posts and interacting with authors/contributors on the discord server), and I'm still mostly convinced there is absolutely nothing of value in this direction.
And n...
When you say "X is not a paradox", how do you define a paradox?
Does the original paper even refer to x-risk? The word "alignment" doesn't necessarily imply that specific aspect.
I feel like this is one of the cases where you need to be very precise about your language, and be careful not to use an "analogous" problem which actually changes the situation.
Consider the first "bajillion dollars vs dying" variant. We know that right now, there are about 8B humans alive. What happens if the exponential increase exceeds that number? We probably have to assume there's an infinite number of humans, fair enough.
What does it mean that "you've chosen to play"? This implies some intentionality, but due to the structure of the game, where th...
Counterpoint: this is needlessly pedantic and a losing fight.
My understanding of the core argument is that "agent" in alignment/safety literature has a slightly different meaning than "agent" in RL. It might be the case that the difference turns out to be important, but there's still some connection between the two meanings.
I'm not going to argue that RL inherently creates "agentic" systems in the alignment sense. I suspect there's at least a strong correlation there (i.e. an RL-trained agent will typically create an agentic system), but that's honestly be...
I would be interested in some advice going a step further -- assuming a roughly sufficient technical skill level (in my case, soon-to-be PhD in an application of ML), as well as an interest in the field, how to actually enter the field with a full-time position? I know independent research is one option, but it has its pros and cons. And companies which are interested in alignment are either very tiny (=not many positions), or very huge (like OpenAI et al., =very selective)
Isn't this extremely easy to directly verify empirically?
Take a neural network $f$ trained on some standard task, like ImageNet or something. Evaluate $|f(kx) - kf(x)|$ on a bunch of samples $x$ from the dataset, and $|f(x+y) - f(x) - f(y)|$ on samples $x, y$. If it's "almost linear", then the difference should be very small on average. I'm not sure right now how to define "very small", but you could compare it e.g. to the distance distribution $|f(x) - f(y)|$ of independent samples, also depending on what the head is.
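Something like the following sketch is what I have in mind, assuming PyTorch/torchvision; the pretrained ResNet-18 and the random image-shaped tensors are just stand-ins for "a trained network" and "dataset samples":

```python
import torch
import torchvision.models as models

torch.manual_seed(0)
# Any trained network would do; a pretrained ResNet-18 is just a convenient stand-in.
f = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

# Random image-shaped tensors as a stand-in for actual dataset samples.
x = torch.randn(32, 3, 224, 224)
y = torch.randn(32, 3, 224, 224)
k = 2.0

with torch.no_grad():
    homogeneity_gap = (f(k * x) - k * f(x)).norm(dim=1)    # |f(kx) - k f(x)|
    additivity_gap = (f(x + y) - f(x) - f(y)).norm(dim=1)  # |f(x+y) - f(x) - f(y)|
    baseline = (f(x) - f(y)).norm(dim=1)                   # |f(x) - f(y)| as a scale reference

print("homogeneity gap:", homogeneity_gap.mean().item())
print("additivity gap: ", additivity_gap.mean().item())
print("baseline:       ", baseline.mean().item())
```

If the gaps come out on the same order as the baseline, "almost linear" seems like a stretch, at least at the input-output level.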
FWIW my opinion is that all this "...
At least how I would put this -- I don't think the important part is that NNs are literally almost linear, when viewed as input-output functions. More like, they have linearly represented features (i.e. directions in activation space, either in the network as a whole or at a fixed layer), or there are other important linear statistics of their weights (linear mode connectivity) or activations (linear probing).
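As a toy illustration of the "linearly represented features" reading (purely my own framing, with synthetic activations standing in for a real model's):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "hidden activations": 1000 samples, 64 dimensions, where one
# direction (index 7) weakly encodes a binary feature of the input.
feature = rng.integers(0, 2, size=1000)
acts = rng.normal(size=(1000, 64))
acts[:, 7] += 2.0 * feature

# A linear probe recovers the feature from the activations if (and roughly
# only if) it is linearly represented.
probe = LogisticRegression(max_iter=1000).fit(acts[:800], feature[:800])
print("probe accuracy:", probe.score(acts[800:], feature[800:]))
```

The claim in that case isn't that the whole network is a linear function, just that this kind of probe works surprisingly often.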
Maybe beren can clarify what they had in mind, though.
"Overall, it continually gets more expensive to do the same amount of work"
This doesn't seem supported by the graph? I might be misunderstanding something, but it seems like research funding essentially followed inflation, so it didn't get more expensive in real terms. The trend even seems to be slightly downward in real value.
Looking for research idea feedback:
Learning to manipulate: consider a system with a large population of agents working on a certain goal, either learned or rule-based, but at this point - fixed. This could be an environment of ants using pheromones to collect food and bring it home.
Now add another agent (or some number of them) which learns in this environment, and tries to get other agents to instead fulfil a different goal. It could be ants redirecting others to a different "home", hijacking their work.
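For concreteness, a minimal skeleton of the setup I have in mind; every class and number here is a placeholder, and the environment, observations, and learning rule would all need to be filled in:

```python
import numpy as np

class ToyEnv:
    """Stub for the shared environment (e.g. an ant/pheromone gridworld)."""
    def __init__(self, n_workers):
        self.n_workers = n_workers
    def reset(self):
        obs = {i: np.zeros(4) for i in range(self.n_workers)}
        obs["manipulator"] = np.zeros(4)
        return obs
    def step(self, actions):
        # A real env would move agents, update pheromones, and reward the
        # manipulator by how much worker behaviour now serves its goal.
        return self.reset(), {"manipulator": 0.0}, False

class FixedWorker:
    """Agent with a frozen (learned or rule-based) policy."""
    def act(self, obs):
        return np.random.randint(4)  # placeholder for the fixed rule

class Manipulator:
    """Learning agent rewarded for redirecting the workers."""
    def act(self, obs):
        return np.random.randint(4)  # placeholder for a learned policy
    def update(self, obs, action, reward):
        pass  # plug in any RL algorithm here

env = ToyEnv(n_workers=10)
workers = [FixedWorker() for _ in range(10)]
manip = Manipulator()

obs = env.reset()
for _ in range(100):
    actions = {i: w.act(obs[i]) for i, w in enumerate(workers)}
    actions["manipulator"] = manip.act(obs["manipulator"])
    obs, rewards, done = env.step(actions)
    manip.update(obs["manipulator"], actions["manipulator"], rewards["manipulator"])
    if done:
        break
```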
Does this sound interesting? If it works, would it potentially be publishable as a research paper? (or at least a post on LW) Any other feedback is welcome!
But isn't the whole point that the hotel is full initially, and yet can accept more guests?
Has anyone tried to work with neural networks predicting the weights of other neural networks? I'm thinking about that in the context of something like subsystem alignment, e.g. in an RL setting where an agent first learns about the environment, and then creates the subagent (by outputting the weights or some embedding of its policy) who actually obtains some reward
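As far as I understand, this is roughly the "hypernetwork" setup. A minimal sketch of what I mean in PyTorch, with arbitrary sizes and placeholder names:

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Maps an environment embedding to the weights of a small subagent policy net."""
    def __init__(self, embed_dim=32, obs_dim=8, act_dim=4, hidden=16):
        super().__init__()
        self.obs_dim, self.act_dim, self.hidden = obs_dim, act_dim, hidden
        n_params = obs_dim * hidden + hidden + hidden * act_dim + act_dim
        self.net = nn.Sequential(nn.Linear(embed_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_params))

    def forward(self, embedding, obs):
        p = self.net(embedding)
        # Slice the flat parameter vector into the subagent's weights.
        i = 0
        w1 = p[i:i + self.obs_dim * self.hidden].view(self.hidden, self.obs_dim)
        i += self.obs_dim * self.hidden
        b1 = p[i:i + self.hidden]; i += self.hidden
        w2 = p[i:i + self.hidden * self.act_dim].view(self.act_dim, self.hidden)
        i += self.hidden * self.act_dim
        b2 = p[i:i + self.act_dim]
        # Run the generated subagent policy on an observation.
        h = torch.relu(obs @ w1.T + b1)
        return h @ w2.T + b2

hyper = HyperNet()
embedding = torch.randn(32)   # what the "parent" agent learned about the environment
obs = torch.randn(8)
action_logits = hyper(embedding, obs)
print(action_logits.shape)    # torch.Size([4])
```

In the RL framing above, the embedding would be whatever the first agent learned about the environment, and the generated weights would define the subagent's policy that actually goes and collects the reward.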
This reminds me of an idea bouncing around my mind recently, admittedly not aiming to solve this problem, but possibly exhibiting it.
Drawing inspiration from human evolution, then given a sufficiently rich environment where agents have some necessities for surviving (like gathering food), they could be pretrained with something like a survival prior which doesn't require any specific reward signals.
Then, agents produced this way could be fine-tuned for downstream tasks, or in a way obeying orders. The problem would arise when an agent is given an ord...
Trump "announces" a lot of things. It doesn't matter until he actually does them.