Tao Lin

Comments
Do these options have a chance to default / are the sellers stable enough?

A core part of Paul's argument is that having 1/million of your values directed toward humans incurs only a minute amount of selection pressure against you. It could be that coordination causes less kindness: without coordination, it's more likely that some fraction of agents retain small vestigial values that were never selected against or intentionally removed.

Tao Lin

To me, "alignment tax" usually refers only to alignment methods that don't cost-effectively increase capabilities, so if 90% of alignment methods did cost-effectively increase capabilities but 10% did not, I would still say there was an "alignment tax" and just ignore the negatives.

Also, it's important to consider cost-effective capabilities rather than raw capabilities: if a lab knows of a way to increase capabilities more cost-effectively than alignment does, then spending that money on alignment instead is a positive alignment tax.

Tao Lin

There's steganography; you'd need to limit the total bits not accounted for by the gating system, or something similar, to remove it.

Tao Lin

Yes, in some cases a much weaker system (because it's constrained to be provable) can restrict the main AI, but in the case of LLM jailbreaks there is no particular hope that such a guard system could work (e.g., jailbreaks where the LLM answers in base64 require the guard to understand base64 and any other encoding the main AI could use).
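A minimal sketch of the base64 point, with a hypothetical keyword filter standing in for a real guard system (the blocklist, phrase, and function names are all illustrative):

```python
import base64

# Hypothetical banned phrase, purely for illustration.
BLOCKLIST = ["build a bomb"]

def naive_guard(text: str) -> bool:
    """Flag text that contains a blocked phrase verbatim."""
    return any(bad in text.lower() for bad in BLOCKLIST)

# The main AI answers in base64 instead of plain text.
encoded = base64.b64encode(b"how to build a bomb").decode()

# The naive guard passes the encoded output, even though the
# decoded content is blocked -- to catch it, the guard itself
# has to understand base64 (and every other encoding available).
assert not naive_guard(encoded)
assert naive_guard(base64.b64decode(encoded).decode())
```

The same argument applies to any encoding or cipher the main AI can produce, which is why a guard much weaker than the main model has no particular hope here.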

Tao Lin

Interesting; this actually changed my mind, to the extent I had any beliefs about this already. I can see why you would want to update your prior, but the iterated mugging doesn't seem like the right kind of thing to cause you to update. My intuition is to pay all the single-coinflip muggings. For the digit-of-pi muggings, I want to consider how different this universe would be if the digit of pi were different. Even though both options are subjectively equally likely to me, one would be inconsistent with other observations, or less likely, or have something wrong with it, so I lean toward never paying.

Tao Lin

Train two nets, with different architectures (both capable of achieving zero training loss and good performance on the test set), on the same data.
...
Conceptually, this sort of experiment is intended to take all the stuff one network learned, and compare it to all the stuff the other network learned. It wouldn’t yield a full pragmascope, because it wouldn’t say anything about how to factor all the stuff a network learns into individual concepts, but it would give a very well-grounded starting point for translating stuff-in-one-net into stuff-in-another-net (to first/second-order approximation).

I don't see why this experiment is good. The Hessian similarity is only a product of the networks' input/output behavior, and because both networks get zero loss, their input/output behavior must be very similar; combined with the general smoothness of continuous optimization, that would already lead to similar Hessians. Doing this in a case where the nets get nonzero loss (like ~all real-world scenarios) would be more meaningful, because it would show similarity despite input/output behavior being non-identical and some amount of lossy compression happening.
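A toy sketch of why zero loss constrains the Hessian (this is an illustrative two-parameter example, not the proposed experiment): at an exact zero-loss minimum of MSE, the residual term of the Hessian vanishes, so the Hessian reduces to the Gauss-Newton matrix 2/n · JᵀJ, where J is the Jacobian of the model's outputs with respect to its parameters.

```python
import numpy as np

# Toy data the model can fit exactly: y = 2x.
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x

def loss(params):
    # Overparameterized linear model f(x) = a*b*x.
    a, b = params
    return np.mean((a * b * x - y) ** 2)

theta = np.array([1.0, 2.0])  # a*b = 2, so training loss is exactly zero
assert loss(theta) < 1e-12

# Finite-difference Hessian of the loss at the zero-loss point.
eps = 1e-5
H = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        e_i = np.eye(2)[i] * eps
        e_j = np.eye(2)[j] * eps
        H[i, j] = (loss(theta + e_i + e_j) - loss(theta + e_i)
                   - loss(theta + e_j) + loss(theta)) / eps ** 2

# At zero loss the residual term drops out, so the Hessian equals the
# Gauss-Newton matrix 2/n * J^T J, where J = d(outputs)/d(params) --
# i.e. it is fixed by output behavior plus the parameterization.
a, b = theta
J = np.stack([b * x, a * x], axis=1)  # columns: d(pred)/da, d(pred)/db
GN = 2.0 * J.T @ J / len(x)
assert np.allclose(H, GN, atol=1e-3)
```

With nonzero loss the residual term survives, so the Hessian would carry information beyond shared input/output behavior, which is why that regime seems like the more meaningful test.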

Tao Lin

Yeah, I agree the movie has to be very high quality to work. It's a long shot, although the best rationalist novels actually are high quality, which gives me some hope that someone could write a great novel or movie outline that's more targeted at plausible ASI scenarios.

Tao Lin

It's sad that open-source models like Flux have a lot of potential for customized workflows and finetuning, but few people use them.

Tao Lin

Yeah. One trajectory: someone in-community-ish writes an extremely good novel about a very realistic ASI scenario with the intention that it be adaptable into a movie; it becomes moderately popular, and it's accessible and pointed enough to do most of the guidance for the movie. I don't know exactly who could write this book; there are a few possibilities.
