lc

If you plot a line, does it plateau or does it get to professional human level (i.e. reliably doing all the things you are trying to get it to do as well as a professional human would)?

It plateaus before professional human level, both in a macro sense (comparing what ZeroPath can do vs. human pentesters) and in a micro sense (comparing the individual tasks ZeroPath does when it's analyzing code). At least, the errors the models make are not ones I would expect a professional to make; I haven't actually hired a bunch of pentesters, asked them to do the same tasks we expect of the language models, and compared the results. One thing our tool has over people is breadth, but that's because we can parallelize inspection of different pieces and not because the models are doing tasks better than humans.
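To be concrete about what the breadth advantage looks like, it's mostly just fan-out; something like the sketch below, where the function names and chunking are illustrative rather than our actual pipeline:

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_chunk(chunk: str) -> list[str]:
    # Hypothetical stand-in for a single model call that inspects one slice
    # of a codebase and returns suspected issues; in practice this would hit
    # an LLM API with a task-specific prompt.
    return []

def scan_repo(chunks: list[str], workers: int = 16) -> list[str]:
    # Fan the chunks out across workers; each slice is inspected
    # independently, so wall-clock time scales with the number of workers,
    # not with the size of the repo.
    findings: list[str] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(analyze_chunk, chunks):
            findings.extend(result)
    return findings
```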

What about 4.5? Is it as good as 3.7 Sonnet but you don't use it for cost reasons? Or is it actually worse?

We have not yet tried 4.5 as it's so expensive that we would not be able to deploy it, even for limited sections. 

lc100

We use different models for different tasks for cost reasons. The primary workhorse model today is 3.7 Sonnet, whose improvement over 3.6 Sonnet was smaller than 3.6's improvement over 3.5 Sonnet. When given the job of this workhorse model, o3-mini and the rest of the recent o-series models were strictly worse than 3.6.
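In practice the cost-driven split amounts to a routing table from task type to model. A minimal sketch of the idea, where the task categories and model identifiers are illustrative and not our real configuration:

```python
# Illustrative task-to-model routing; the categories and model names are
# examples, not an actual production configuration.
MODEL_BY_TASK = {
    "triage": "claude-3-5-haiku",           # cheap, high-volume filtering
    "deep_analysis": "claude-3-7-sonnet",   # the workhorse described above
    "report_summary": "gpt-4o-mini",        # cheap summarization/formatting
}

def pick_model(task_type: str) -> str:
    # Fall back to the workhorse model for anything unclassified.
    return MODEL_BY_TASK.get(task_type, "claude-3-7-sonnet")
```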

lc*40

I haven't read the METR paper in full, but from the examples given I'm worried the tests might be biased in favor of an agent with no capacity for long-term memory, or at least one that isn't hitting the thresholds where context limitations become a problem:

[image: example tasks from the METR paper]

For instance, task #3 here is at the limit of current AI capabilities (takes an hour). But it's also something that could plausibly be done with very little context; if the AI just puts all of the example files in its context window it might be able to write the rest of the decoder from scratch. It might not even need to have the example files in memory while it's debugging its project against the test cases.

Whereas a task to fix a bug in a large software project, while it might take an engineer associated with that project "an hour" to finish, requires stretching the limits of how much information a model can fit inside a context window, or recall beyond what current models seem capable of today.
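To make the context point concrete, here's the kind of back-of-the-envelope check I have in mind; the 200k-token window and the ~4 characters-per-token ratio are rough assumptions:

```python
# Rough check of whether a set of files plausibly fits in one context window.
# Assumes ~4 characters per token and a 200k-token window; both are rough.
def fits_in_context(file_sizes_bytes: list[int],
                    context_tokens: int = 200_000,
                    chars_per_token: float = 4.0) -> bool:
    estimated_tokens = sum(file_sizes_bytes) / chars_per_token
    return estimated_tokens <= context_tokens

# A handful of example files for a decoder-style task fits easily...
print(fits_in_context([40_000, 60_000, 80_000]))  # ~45k tokens -> True
# ...while the relevant source of a large project can blow far past the window.
print(fits_in_context([5_000_000, 3_000_000]))    # ~2M tokens -> False
```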

lc

There was a type of guy circa 2021 who basically said that GPT-3 etc. was cool, but that we should be cautious about assuming everything was going to change, because the context limitation was a key bottleneck that might never be overcome. That guy's take was briefly "discredited" in subsequent years when LLM companies increased context lengths to 100k or 200k tokens.

I think that was premature. The context limitations (in particular the lack of an equivalent to human long-term memory) are the key deficit of current LLMs, and we haven't really seen much improvement at all.

lc

If AI executives really are as bullish as they say they are on progress, then why are they willing to raise money anywhere in the ballpark of current valuations?

The story is that they need the capital to build the models that they think will do that.

lc

Moral intuitions are odd. The current government's gutting of the AI safety summit is upsetting, but somehow less upsetting to my hindbrain than its order to drop the corruption charges against a mayor. I guess the AI safety thing is worse in practice but less shocking in terms of abstract conduct violations.

lc

It helps, but this could be solved with increased affection for your children specifically, so I don't think it's the actual motivation for the trait.

The core is probably several things, but note that this bias is also part of a larger package of traits that makes someone less disagreeable. I'm guessing that the same selection effects that made men more disagreeable than women are also probably partly responsible for this gender difference.

lc

I suspect that the psychopath's theory of mind is not "other people are generally nicer than me", but "other people are generally stupid, or too weak to risk fighting with me".

That is true, and it is indeed a bias, but it doesn't change the fact that their assessment of whether others are going to hurt them seems basically well calibrated. The anecdata that needs to be explained is why nice people do not seem to be able to tell when others are going to take advantage of them, but mean people do. The post's offered reason is that generous impressions of others are advantageous for trust-building.

Mr. Portman probably believed that some children forgot to pay for the chocolate bars, because he was aware that different people have different memory skills.

This was the explanation he offered, yeah.

lc

This post is about a suspected cognitive bias and why I think it came to be. It's not trying to justify any behavior, as far as I can tell, unless you think the sentiment "people are pretty awful" justifies bad behavior in and of itself.

The game theory is mostly an extended metaphor rather than a serious model. Humans are complicated.

lc

Elon already has all of the money in the world. I think he and his employees are ideologically driven, and as far as I can tell they're making sensible decisions given their stated goals of reducing unnecessary spend/sprawl. I seriously doubt they're going to use this access to either raid the treasury or turn it into a personal fiefdom. It's possible that in their haste they're introducing security risks, but I also think the tendency of media outlets and their sources will be to exaggerate those security risks. I'd be happy to start a prediction market about this if a regular feels very differently.

If Trump himself were spearheading this effort I would be more worried.
