Mo Putera

Long-time lurker (c. 2013), recent poster. I also write on the EA Forum.

Comments
Balsa Update: Springtime in DC
Mo Putera · 15h · 30

Out of curiosity I asked o3 to BOTEC cost-effectiveness of Jennifer's efforts. It gave me 2 answers; the one I preferred happened to be the lower one, which was still spectacularly high even after the 1-3% credit attribution: "Roughly $1.7–5 billion in expected annual export losses averted for two months of work".

adamzerner's Shortform
Mo Putera · 16h · 80

I like Quote Investigator for memetic quotes like this. It begins with

The earliest relevant evidence located by QI appeared in a 1966 collection of articles about manufacturing. An employee of the Stainless Processing Company named William H. Markle wrote a piece titled “The Manufacturing Manager’s Skills” which included a strong match for the saying under investigation. However, the words were credited to an unnamed professor at Yale University and not to Einstein. Also, the hour was split into 40 vs. 20 minutes instead of 55 vs. 5 minutes. Boldface has been added to excerpts:

Some years ago the head of the Industrial Engineering Department of Yale University said, “If I had only one hour to solve a problem, I would spend up to two-thirds of that hour in attempting to define what the problem is.”

Albert Einstein died in 1955, and by 1973 a version of the saying had been assigned to him in an article in the journal “Invention Intelligence” based in New Delhi, India. Interestingly, the hour was split into three parts instead of two.

and then traces the rest of the meme's evolution at length.

Neuroscientist survey says P(brain preservation works) is substantial
Mo Putera · 4d · 90

Sharing the infographics from Ariel's tweet thread you linked to and also the paper, to save folks the effort of clicking through:

[Three infographic images from the tweet thread and paper; preview: https://ndownloader.figstatic.com/files/55616191/preview/55616191/preview.jpg]
"What's my goal?"
Mo Putera · 8d · 00

I wasn't sure I understood the difference, so I asked Sonnet 4 and it replied:

Yes, there's a meaningful distinction here that's worth understanding. The difference lies in timeframe, specificity, and the nature of desire versus direction.

"What do I want?" often captures immediate desires, feelings, or impulses. It's more about what feels appealing or satisfying right now. For example, you might want to sleep in, eat pizza, avoid a difficult conversation, or buy something that catches your eye.

"What's my goal?" is more about intentional direction and longer-term outcomes. Goals are typically more structured and forward-looking. They represent what you're working toward, even if it requires doing things you don't particularly want to do in the moment.

I suppose this is related to, if not exactly the same as, wanting vs liking? Or am I even more confused than I realise?

Love Island USA Season 7 Episode 20: What Could The Producers Be Thinking
Mo Putera · 14d · -32

FWIW I'm glad you posted it here, albeit mainly because it's by you.

A case for courage, when speaking of AI danger
Mo Putera · 14d · 15-9

Full tweet for anyone curious: 

i'm reminded today of a dinner conversation i had once w one of the top MIRI folks...

we talked AI safety and i felt he was playing status games in our conversation moreso than actually engaging w the substance of my questions- negging me and implying i was not very smart if i didn't immediately react w fear to the parable of the paperclip, if i asked questions about hardware & infrastructure & connectivity & data constraints...

luckily i don't define myself by my intelligence so i wasn't cowed into doom but instead joined the budding e/acc movement a few weeks later.

still i was unsettled by the attempted psychological manipulation and frame control hiding under the hunched shoulders and soft ever so polite voice.

Mo Putera's Shortform
Mo Putera · 14d · 20

Ben Evans' Are better models better? (from a business/consumer perspective, not LW/AF etc):

Part of the concept of ‘Disruption’ is that important new technologies tend to be bad at the things that matter to the previous generation of technology, but they do something else important instead. Asking if an LLM can do very specific and precise information retrieval might be like asking if an Apple II can match the uptime of a mainframe, or asking if you can build Photoshop inside Netscape. No, they can’t really do that, but that’s not the point and doesn’t mean they’re useless. They do something else, and that ‘something else’ matters more and pulls in all of the investment, innovation and company creation. Maybe, 20 years later, they can do the old thing too - maybe you can run a bank on PCs and build graphics software in a browser, eventually - but that’s not what matters at the beginning. They unlock something else. 

What is that ‘something else’ for generative AI, though? How do you think conceptually about places where that error rate is a feature, not a bug? 

Machine learning started working as image recognition, but it was much more than that, and it took a while to work out that the right way to think about it was as pattern recognition. You could philosophise for a long time about the ‘right way’ to think about what PCs, the web or mobile really were. What is that for generative AI? I don’t think anyone has really worked it out yet, but using it as a new set of API calls within traditional patterns of software feels like using the new thing to do the old things. 

By analogy:

These kinds of puzzles also remind me of a meeting I had in February 2005, now almost exactly 20 years ago, with a VP from Motorola, at the MWC mobile conference in Cannes. The iPod was the hot product, and all the phone OEMs wanted to match it, but the micro-HDD that Apple was using would break very reliably if you dropped your device. The man from Motorola pointed out that this was partly a problem of expectation and perception: if you dropped your iPod and it broke, you blamed yourself, but if you dropped your phone and it broke, you blamed the phone maker, even though it was using the same hardware. 

Six months later Apple switched from HDDs to flash memory with the Nano, and flash doesn’t break if you drop it. But two years later Apple started selling the iPhone, and now your phone does break if you drop it, but you probably blame yourself. Either way, we adopted a device that breaks if you drop it, with a battery that lasts a day instead of a week, in exchange for something new that came with that. We moved our expectations. This problem of expectation and perception seems to apply right now to generative AI.

This seems loosely reminiscent of his other essay How to lose a monopoly (emphasis mine):

... what is ‘power’? When we talk about ‘power’ and ‘dominance’ and perhaps ‘monopoly’ in tech, we actually mean two rather different things, and we generally conflate them: 

  • There is having power or dominance or a monopoly around your own product in that product’s own market…
  • but then there is whether that position also means you control the broader industry. 

In the 1970s dominating mainframes meant dominating tech, and in the 1990s dominating PC operating systems (and productivity software) meant dominating tech. Not any more. IBM still dominates mainframes, and Microsoft still dominates PCs, but that isn’t where broader dominance of the tech industry comes from. Once upon a time, IBM, and then Microsoft, could make people do things they didn’t want to do. Not today. Being rich is not the same as being powerful. ... 

Today, it’s quite common to hear the assertion that our own dominant tech companies - Google, Facebook et al - will easily and naturally transfer their dominance to any new cycle that comes along. This wasn’t true for IBM or Microsoft, the two previous generations of tech dominance, but then there’s another assertion - that this was because of anti-trust intervention, especially for Microsoft. This tends to be said as though it can be taken for granted, but in fact it’s far from clear that this is actually true. 

The end of Microsoft’s dominance of tech actually came in two phases. First, as above, it lost the development environment to the web, but it still had the client (the Windows PC) and it then provided lots and lots of clients to access the web and so became a much bigger company. But second, a decade or so later, Apple proposed a better client model with the iPhone, and Google picked that up and made a version for every other manufacturer to use. Microsoft lost dominance of development to the web, and then lost dominance of the client to smartphones. 

As we all know, there were major anti-trust cases around what Microsoft tried to do with the web, and specific regulatory interventions, and so you can at least argue for some direct connection to Microsoft’s failure to take the lead online, although this can be disputed. But those cases ended in 2001 and none of them said anything about mobile, and yet Microsoft lost that as well. So what happened? 

Here, the argument for anti-trust as the decisive factor generally acknowledges that nothing in the actual judgement or remedies that were imposed had any specific effect on Microsoft’s mobile efforts, but instead says that Microsoft somehow became less good at execution or aggression as a result. 

There are two problems with this. The first is that it wasn’t remotely apparent in 2007 that Microsoft wasn’t being aggressive in mobile. After all, Microsoft didn’t ‘miss’ mobile -  it had started with the launch of Windows CE in 1996, and accelerated with PocketPC in 2001, and it had a whole bunch of ‘Windows’ smartphones on the market when the iPhone launched. 

Rather, the iPhone created such a radical change in every assumption about how you would make a ‘smartphone’ that everyone else had to start again from scratch. It’s important to remember that none of the smartphone companies who’d been building things since the late 1990s - Nokia/Symbian, Palm, RIM and Microsoft - managed the transition. None of the others had anti-trust issues. But, they all had platforms, and just as importantly cultures and assumptions, that were based on the constraints of hardware and networks in 2000, whereas the iPhone was based on what hardware and networks would look like in 2010. The only way to compete was with a totally new platform and totally new assumptions about how it would work, and ‘dump our platform and build an entirely new one’ is always a near-death experience in technology. Failing to make it isn’t about a lack of aggression or execution - it’s that it’s really hard. 

Indeed, even knowing quite what to do is hard. For Microsoft, we know now that the answer would have been to create an entirely new operating system, with no cross-compatibility with Windows apps, and make it open source, and give it away for free. Imagine saying that to Bill Gates in 2007 - he’d have looked at you as though you’d grown a third arm.

which segued into a discussion on 'moats' (emphasis mine):

The tech industry loves to talk about ‘moats’ around a business - some mechanic of the product or market that forms a fundamental structural barrier to competition, so that just having a better product isn’t enough to break in. But there are several ways that a moat can stop working. Sometimes the King orders you to fill in the moat and knock down the walls. This is the deus ex machina of state intervention - of anti-trust investigations and trials. But sometimes the river changes course, or the harbour silts up, or someone opens a new pass over the mountains, or the trade routes move, and the castle is still there and still impregnable but slowly stops being important. This is what happened to IBM and Microsoft. The competition isn’t another mainframe company or another PC operating system - it’s something that solves the same underlying user needs in very different ways, or creates new ones that matter more. The web didn’t bridge Microsoft’s moat - it went around, and made it irrelevant. Of course, this isn’t limited to tech - railway and ocean liner companies didn’t make the jump into airlines either. But those companies had a run of a century - IBM and Microsoft each only got 20 years.

Foom & Doom 1: “Brain in a box in a basement”
Mo Putera · 16d · 52

The main nuance that your description

LLM are already good at solving complicated, Ph.D. level mathematical problems

misses out on is that these are very specific kinds of problems:

Problems must be novel and unpublished, with answers that can be automatically verified through computation—either as exact integers or mathematical objects like matrices and symbolic expressions in SymPy. 

That excludes nearly all of research math. 
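
To make "automatically verified through computation" concrete, here's a minimal sketch of the kind of checker that criterion implies; the problem and official answer below are made up for illustration, not taken from the benchmark:

```python
# Hypothetical illustration of automated answer verification with SymPy.
# The criterion above requires answers checkable like this: an exact integer,
# or a symbolic object that can be compared programmatically.
import sympy as sp

x = sp.symbols('x')

# Suppose a problem's official answer is the closed form of an integral.
official_answer = sp.exp(x) * (x - 1)

# A model submits its own candidate expression.
submitted_answer = sp.integrate(x * sp.exp(x), x)

# Verification is a mechanical equivalence check; no human grader is needed.
assert sp.simplify(official_answer - submitted_answer) == 0
print("verified:", submitted_answer)
```

This only works when an answer has a canonical, machine-checkable form, which is exactly why open-ended research math falls outside it.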

sunwillrise's Shortform
Mo Putera · 19d · 20

(Tangent: I had no idea what that sentence meant; Sonnet 4 says

This is Gen Z/Gen Alpha internet slang, often called "brainrot" language. Here's the translation:

  • "mogged" = dominated/outperformed (from "AMOG" - Alpha Male of Group)
  • "sigma" = a supposed personality type above "alpha" in internet masculinity hierarchies
  • "rizz" = charisma, particularly with romantic interests (from "charisma")
  • "gyatt" = exclamation expressing attraction (corruption of "goddamn")
  • "skibidi" = meaningless word from viral YouTube videos, often used as filler
  • "What the sigma?" = "What the hell?" but using sigma slang
  • "Sus" = suspicious
  • "No cap" = "no lie" or "I'm serious"
  • "fanum tax" = taking someone's food (from streamer Fanum)

in case anyone else was as confused)

AI #121 Part 1: New Connections
Mo Putera · 20d · 20

Benjamin Todd: Dropping the error rate from 10% to 1% (per 10min) makes 10h tasks possible.

In practice, the error rate has been halving every 4 months(!).

In fact we can’t rule out that individual humans have a fixed error rate – just one that’s lower than current AIs.
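
To spell out the arithmetic behind that first claim - my own rough sketch, treating each 10-minute chunk as an independent pass/fail step, which is obviously a simplification:

```python
# Rough sketch: why the per-step error rate limits task horizon,
# assuming each 10-minute step must succeed independently (a simplification).
steps_in_10_hours = 60  # 10 hours / 10 minutes

for error_rate in (0.10, 0.01):
    p_full_task = (1 - error_rate) ** steps_in_10_hours
    print(f"error rate {error_rate:.0%} per step -> P(10h task) ~ {p_full_task:.1%}")

# error rate 10% per step -> P(10h task) ~ 0.2%
# error rate 1% per step  -> P(10h task) ~ 54.7%
```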

Ever since I read Sarah Constantin's Errors vs. Bugs and the End of Stupidity, I find myself immediately skeptical of claims like "humans have a fixed error rate".

A common mental model for performance is what I'll call the "error model."  In the error model, a person's performance of a musical piece (or performance on a test) is a perfect performance plus some random error.  You can literally think of each note, or each answer, as x + c*epsilon_i, where x is the correct note/answer, and epsilon_i is a random variable, iid Gaussian or something.  Better performers have a lower error rate c.  Improvement is a matter of lowering your error rate.  This, or something like it, is the model that underlies school grades and test scores. Your grade is based on the percent you get correct.  Your performance is defined by a single continuous parameter, your accuracy.

But we could also consider the "bug model" of errors.  A person taking a test or playing a piece of music is executing a program, a deterministic procedure.  If your program has a bug, then you'll get a whole class of problems wrong, consistently.  Bugs, unlike error rates, can't be quantified along a single axis as less or more severe.  A bug gets everything that it affects wrong.  And fixing bugs doesn't improve your performance in a continuous fashion; you can fix a "little" bug and immediately go from getting everything wrong to everything right.  You can't really describe the accuracy of a buggy program by the percent of questions it gets right; if you ask it to do something different, it could suddenly go from 99% right to 0% right.  You can only define its behavior by isolating what the bug does.

Often, I think mistakes are more like bugs than errors.  My clinkers weren't random; they were in specific places, because I had sub-optimal fingerings in those places.  A kid who gets arithmetic questions wrong usually isn't getting them wrong at random; there's something missing in their understanding, like not getting the difference between multiplication and addition.  Working generically "harder" doesn't fix bugs (though fixing bugs does require work). 
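
A toy way to see the difference between the two models (my own sketch, with made-up numbers): an "error model" performer misses items at random, so more care smoothly lowers the miss rate, while a "bug model" performer deterministically fails exactly the class of items the bug touches, so their measured accuracy depends entirely on how the test samples that class:

```python
# Toy contrast between the "error model" and the "bug model" (made-up numbers).
import random

random.seed(0)
questions = list(range(100))

# Error model: every question has an independent 5% chance of a random slip.
error_model_score = sum(1 for q in questions if random.random() > 0.05)

# Bug model: a deterministic procedure that gets one whole class of questions
# wrong (here, every 4th question) and everything else right.
def buggy_solver_is_correct(q):
    return q % 4 != 0  # the bug affects a whole class of questions, consistently

bug_model_score = sum(1 for q in questions if buggy_solver_is_correct(q))
print(error_model_score, bug_model_score)  # similar-looking scores...

# ...but change the question mix and only the buggy solver collapses:
bug_heavy_mix = [q for q in questions if q % 4 == 0]
print(sum(1 for q in bug_heavy_mix if buggy_solver_is_correct(q)))  # 0 out of 25
```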

Posts

Mo Putera's Shortform · 5 karma · 6mo · 140 comments
Non-loss of control AGI-related catastrophes are out of control too · 2 karma · 2y · 3 comments
How should we think about the decision relevance of models estimating p(doom)? [Question] · 12 karma · 2y · 1 comment