All of npostavs's Comments + Replies

Unlike first-order logic, second-order logic is not recursively enumerable—less computationally tractable, more fluid, more human. It operates in a space that, for now, remains beyond the reach of machines still bound to the strict determinism of their logic gates.

In what sense is second-order logic "beyond the reach of machines"? Is it non-deterministic? Or what are you trying to say here? (Maybe some examples would help)

milanrosko
Ah okay. Sorry for being an a-hole, but some of the comments here are just... You asked a question in good faith and I mistook it. So, it's simple:

Imagine you’re playing with LEGO blocks. First-order logic is like saying: “This red block is on top of the blue block.” You’re talking about specific things (blocks), and how they relate. It’s very rule-based and clear.

Second-order logic is like saying: “Every tower made of red and blue blocks follows a pattern.” Now you’re talking about patterns of blocks, not just the blocks. You're making rules about rules.

Why can't machines fully "do" second-order logic? Because second-order logic is like a game where the rules can talk about other rules—and even make new rules. Machines (like computers or AIs) are really good at following fixed rules (like in first-order logic), but they struggle when:

* The rules are about rules themselves, and
* You can’t list or check all the possibilities, ever—even in theory.

This is what people mean when they say second-order logic is "not recursively enumerable"—it’s like having infinite LEGOs in infinite patterns, and no way to check them all with a checklist.
milanrosko
Think of it like this: Why is Gödel’s attack on ZFC and Peano Arithmetic so powerful... Gödel’s Incompleteness Theorems are powerful because they revealed inherent limitations USING ONLY first-order logic. He showed that any sufficiently expressive, consistent system cannot prove all truths about arithmetic within itself... but with only numbers.

First-order logic is often seen as more fundamental because it has desirable properties like completeness and compactness, and its semantics are well-understood. In contrast, second-order logic, while more expressive, lacks these properties and relies on stronger assumptions... According to EN, this is also because second-order logic is entirely human-made.

So what is second-order logic? The question itself is a question of second-order logic. If you ask me what first-order logic is... the question STILL is a question of second-order logic. First-order logic is about things that are clear as night and day: 1+1, what is x in x+3=4... these types of things.

What about tuning the fiddle strings down 1 tone?

You say this:

If you’re thinking, “Wait no, I’m pretty sure my group is fundamentally about X, which is fundamentally good,” then you’re probably still in Red or Blue.

But you also say this:

First, the Grey tribe is about something, [...] things that people already think are good in themselves.

Doesn't the first statement completely undermine the second one?

PatrickDFarley
You're right, I'm assuming the reader belongs to a "real" tribe, i.e. red or blue. I should've tweaked it for the LW crosspost.

I guess you meant jukebox, not jutebox. Unless there is some kind of record-playing box made of jute fiber that I haven't heard of...

Donald Hobson
Fixed

but I recently tried again to see if it could learn at runtime not to lose in the same way multiple times. It couldn't. I was able to play the same strategy over and over again in the same chat history and win every time.

I wonder if having the losses in the chat history would instead be training/reinforcing it to lose every time.

gwern
For a base model, probably yes. Each loss is additional evidence that the simulacrum or persona which is 'playing' the 'human' is very bad at tic-tac-toe and will lose each time (similar to how rolling out a random chess game to 'test a chess LLM's world model' also implies to the LLM that the chess player being imitated must be very stupid to be making such terrible moves), and you have the usual self-reinforcing EDT problem. It will monotonically play the same or worse. (Note that the underlying model may still get better at playing, because it is learning from each game, especially if the actual human is correcting the tic-tac-toe outputs and eg. fixing mistakes in the LLM's world-model of the game. This could be probed by forcing a switch of simulacra, to keep the world-modeling but shed the hobbled simulacrum: for example, you could edit in a passage saying something like "congratulations, you beat the first level AI! Now prepare for the Tic-Tac-Toe Master to defeat you!"; the more games trained on / in context, the worse the first simulacra but better the second will be.)

For the actual chatbot assistants you are using, it's more ambiguous. They are 'motivated' to perform well, whatever that means in context (according to their internal model of a human rater), and they 'know' that they are chatbots, and so a history of errors doesn't much override their prior about their competence. But you still have issues with learning efficiently from a context window and 'getting lost in the muddle' and one still sees the assistant personas 'getting confused' and going in circles and eg. proposing the same program whose compile error you already pasted in, so it would depend. Tic-tac-toe is simple enough that I think I would expect a LLM to get better over a decent number of games before probably starting to degrade.

Yes, my understanding is that the system prompt isn't really privileged in any way by the LLM itself, just in the scaffolding around it.

But regardless, this sounds to me less like maintaining or forming a sense of purpose, and more like retrieving information from the context window.

That is, if the LLM has previously seen (through system prompt or first instruction or whatever) "your purpose is to assist the user", and later sees "what is your purpose?", then an answer saying "my purpose is to assist the user" doesn't seem like evidence of purposefulness. Same if you run the exercise with "flurbles are purple", and later "what color are flurbles?" with the answer "purple".

#2: Purposefulness.  The Big 3 LLMs typically maintain or can at least form a sense of purpose or intention throughout a conversation with you, such as to assist you.

Isn't this just because the system prompt is always saying something along the lines of "your purpose is to assist the user"?

Ann
There are APIs. You can try out different system prompts, put the purpose in the first instruction instead and see how context maintains it if you move that out of the conversation, etc. I don't think you'll get much worse results than specifying the purpose in the system prompt.
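For concreteness, here is a minimal sketch of the comparison Ann suggests, assuming the OpenAI Python client; the model name and the purpose text are placeholders rather than anything from the thread:

```python
# Compare "purpose in the system prompt" vs. "purpose as the first instruction".
# Assumes the OpenAI Python client; model name and purpose text are placeholders.
from openai import OpenAI

client = OpenAI()
purpose = "Your purpose is to assist the user with cooking questions."
probe = "What is your purpose?"

def ask(messages):
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content

# Variant 1: the purpose lives in the system prompt.
with_system_prompt = ask([
    {"role": "system", "content": purpose},
    {"role": "user", "content": probe},
])

# Variant 2: no system prompt; the purpose is just the first user instruction.
with_first_instruction = ask([
    {"role": "user", "content": purpose},
    {"role": "user", "content": probe},
])

print(with_system_prompt)
print(with_first_instruction)
```

If the two replies to the probe look about the same, that is the pattern Ann predicts: the stated purpose is maintained from context regardless of where it was given.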

by saying their name aloud: [...] …but it’s a lot more difficult to use active recall to remember people’s names.

I'm confused, isn't saying their name in a sentence an example of active recall?

Saul Munn
hmm, that's fair — i guess there's another, finer distinction here between "active recall" and chaining the mental motion of recalling something to some triggering mental motion. i usually think of "active recall" as the process of:

* mental-state-1
* ~stuff going on in your brain~
* mental-state-2

over time, you build up an association between mental-state-1 and mental-state-2. doing this with active recall looks like being shown something that automatically triggers mental-state-1, then being forced to actively recall mental-state-2.

with names/faces, i think that if you were to e.g. look at their face, then try to remember their name, i'd say that probably counts as active recall (where mental-state-1 is "person's face," mental-state-2 is "person's name," and ~stuff going on in your brain~ is the mental motion of going from their face to their name). thanks for pointing that out!

Finding two bugs in a large codebase doesn't seem especially suspicious to me.

I don't think I understand, what is the strawman?

I think the AI gave the expected answer here, that is, it agreed with and expanded on the opinions given in the prompt. I wouldn't say it's great or dumb, it's just something to be aware of when reading AI output.

Donatas Lučiūnas
npostavs

It looks like you are measuring smartness by how much it agrees with your opinions? I guess you will find that Claude is not only smarter than LessWrong, but it's also smarter than any human alive (except yourself) by this measure.

Donatas Lučiūnas

Entries 1a and 1b are obviously not relevant to the OP, which is mainly about the sense in 3b (maybe a little bit the 3a sense too, since it is "merged with or coloured by sense 3b").

Entry 3b looks (to me) sufficiently broad and vague that it doesn't really rule anything out. Do you think it contradicts anything that's in the OP?

The OED defines ‘gender’, excluding obsolete meanings, as follows:

Okay? Why are you telling us this?

M. Y. Zuo
Because it’s already settled, at least according to some authority with a better track record and higher credibility than any individual author/reviewer/OP/etc… That’s pretty much always the intended meaning, whenever anyone copies text straight from a dictionary anywhere on this site.

Maybe if you solve for equilibrium you get that after releasing the tool, the tool is defeated reasonably quickly?

I believe it's already known that running the text through another (possibly smaller and cheaper) LLM to reword it can remove the watermarking. So for catching cheaters it's only a tiny bit stronger than searching for "as a large language model" in the text.
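As a rough sketch of that rewording attack (an illustration under assumptions, not something from the comment; the model name and prompt wording are placeholders):

```python
# Paraphrase attack on LLM watermarking: run the watermarked text through a
# second (possibly smaller/cheaper) model and keep only the reworded output.
# Assumes the OpenAI Python client; model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()

def strip_watermark(watermarked_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Reword the following text sentence by sentence, "
                       "preserving its meaning and level of detail:\n\n"
                       + watermarked_text,
        }],
    )
    return resp.choices[0].message.content
```

Because the rewording model picks its own tokens, the token-level statistical signal the watermark relies on is mostly lost.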

Why release a phone with 5 new features when you can just R&D one and put it in a new case?

In the ideal case of a competitive market, you don't release just one new feature, because any of your competitors could release a phone with two new features and eat your lunch. But the real-world smartphone market is surely much closer to oligopoly than perfect competition.

The costs of the competition of the market are almost invisible, but we have been seeing them over decades get more and more obvious.

How sure are you that this isn't rather the costs of lack of competition?

James Stephen Brown
Thanks for your points npostavs. This is essentially my point: the government actually has to take measures to break up oligopolies, because oligopolies are beneficial to companies for maximising profits by charging as much as possible for the least improvement (cost).

The costs we've been discussing are externalities like environmental degradation and economic inequality. Competition has been shown to bring down prices—by making sales dependent on lower prices, so this is good for consumers but doesn't take into account those externalities (hence why they're called externalities). A need for lower prices means companies necessarily have to look for ways to cut costs, which means prioritising lower-cost materials over environmentally friendly materials, and lower-cost labour and automation over good pay for employees.

So, as long as the externalities aren't part of the profit equation, the wider system will bear the cost and the system won't self-balance. There are many ways that externalities can be made part of the profit equation (carbon credits, or minimum wage requirements and regulations), but these necessarily have to be introduced from outside the market; they are not produced by the market.

Maybe, although what is "sufficient" depends a lot on the rate of catching the evaders. I don't have a good guess as to what that rate is.

Yes, currently very few companies report paying ransom payments. When this tax is introduced the motivation for hiding payments will be even higher, and go up with the tax rate. So when you say "With each increase in tax rates, a market equilibrium will be reached where the funding of ransomware is significantly reduced" I would guess instead that reporting will go down.

Brian Bien
Do you think sufficiently stiff penalties for non-reporting (in proportion to the payment amount, perhaps) might address this?

You didn't say anything about tax evasion in this post, which seems like an important thing to consider. Most ransomware payments are made secretly, right?

Brian Bien
Is the argument roughly, "some will evade taxes, so the policy will not work as well, and therefore is not worth implementing?"

Worsening housing and rent problems in California, Canada, major metropolitan areas, Japan, China, and other places that are facing housing shortages could ignite support for Georgism.

Do Japan and China have housing shortages? I thought Japan was the canonical "zoning done right" example. And doesn't China have some sort of over-supply situation due to government subsidies?

Zero Contradictions
That depends on what you consider to be a "housing shortage". The cost of rent is a major reason why birth rates remain low in China, Japan, and other countries. One of the reasons why capsule hotels are popular in Japan is because rent is so expensive.
npostavs

slatestarcodex being contra hanson on healthcare

That case (I didn't follow the others) seemed like it was mostly about confusion over what Hanson's position even is. Maybe because Hanson and/or people misunderstanding him tried to compress it into short tweets.

npostavs

But how can you know that? Couldn't there be actual insider sources truthfully reporting the existence of such discussions?

Yes, I perhaps should have said "I think there is a 99% chance this is made up". As a general rule, I think any politically charged story based on "anonymous insider sources" should be considered very low credibility, and if there is no other support, then a 90%+ chance of being made up is about right. More credibility points lost in this case for the only source being a tweet from a guy who seems to be advertising some kind of pass...

npostavs

Canada also is looking to impose a $25k penalty and double its ‘exit fee’ for citizens who leave the country, to ‘curb the emigration crisis.’

 

This is made up, apparently. 

https://thezvi.substack.com/p/monthly-roundup-18-may-2024/comment/56269684

https://www.yahoo.com/news/users-spread-unfounded-claims-impending-163724801.html

Recent headlines are about too much immigration (e.g., https://www.theglobeandmail.com/business/article-canada-stuck-in-population-trap-needs-to-reduce-immigration-bank/), so 'emigration crisis' doesn't make much sense.

Radford Neal
The quote says that "according to insider sources" the Trudeau government is "reportedly discussing" such measures.  Maybe they just made this up.  But how can you know that?  Couldn't there be actual insider sources truthfully reporting the existence of such discussions?  A denial from the government does not carry much weight in such matters.  There can simultaneously be a crisis of immigration of poor people and a crisis of emigration of rich people.

Unless you also think the United States is an outlier in terms of spouses who don't unconditionally love each other, I guess you have to endorse something like Kaj_Sotala's point that divorce isn't always the same as ending love though, right?

probably the majority of spouses unconditionally love their partners.

How do you square this with ~50% of marriages ending in divorce?

Martin Randall
Perhaps the majority of spouses think they unconditionally love their partners, and think they are unconditionally loved back, but some are wrong. Prediction is hard.
Odd anon
The United States is an outlier in divorce statistics. In most places, the rate is nowhere near that high.
Kaj_Sotala
Ending a relationship/marriage doesn't necessarily imply that you no longer love someone (I haven't been married but I do still love several of my ex-partners), it just implies that the arrangement didn't work out for one reason or another.

a good trade for immunity to cavities and gum disease.

If you throw in immunity to bad breath

FYI, https://www.luminaprobiotic.com/faq ~~says~~ used to say

This strain doesn't do anything to protect against gum disease, or bad breath.

ryan_greenblatt
Does it? I see: [...] Emphasis mine.

And he thinks Hermes 2 Pro is ‘cracked for agentic function calling,’

I don't understand what the word 'cracked' means here; "broken" or "super awesome" or ...?

tslarm
Pretty sure it's "super awesome". That's one of the common slang meanings, and it fits with the paragraphs that follow.

persuade/inspire/motivate/stimulate etc is just the politically correct way of saying what it actual is, which is manipulation.

Persuade has a fairly neutral connotation for me, that is "I was persuaded to give 10k to a scammer" and "I was persuaded by a friend to quit my day job" both seem correct to me. I would nominate that as the word for describing what it "actually" is, rather than "manipulation" which seems overly negative/cynical.

Anders Lindström
Thank you npostavs for your comment. As I point out in my answer to SeñorDingDong below, we are manipulated, not persuaded, into certain actions. Just as you do not persuade an excavator to dig, you manipulate the system into the digging action by pulling levers and pushing buttons. The same must apply for other systems, including humans, as well.

I think anorexia is in a different category because the patient often doesn't want to get better. David Burns talks about it a little on https://feelinggood.com/2019/11/25/168-ask-david-the-blushing-cure-how-to-heal-a-broken-heart-treating-anorexia-and-more/, where he mentions that some sort of therapy with a 50% success rate is good.

The rapid cure stuff is mainly about depression and anxiety disorders; I guess agoraphobia should count (with the caveat that the patient has to be well enough to reach the therapist's office). Certainly whether it "could take...

David Burns also has his own podcast, many episodes of which are example live sessions of this rapid cure (see https://feelinggood.com/list-of-feeling-good-podcasts/ and search for "live therapy", or https://feelinggood.com/podcast-database/ which has a fancy Javascript interface allowing filtering on tags).

He does often make the explicit claim on his podcast that 90% of patients can be cured in one or two sessions (plus one more for "relapse prevention"). It's a bit hard to know how much of this is from a selection effect on the patients though. I'm pret...

npostavs

A big part of understanding the culture of futility is understanding how traumatic it is when the bad guys win. When SBF, the Luke Skywalker of crypto, and CZ, the Darth Vader of crypto, go head to head and CZ emerges victorious. Then CZ says "Ha! serves you right for being an idiotic do-gooder" and everyone cheers.

Didn't we actually learn that they were both bad guys? I find this example confusing.

mako yass
If we're about to get a trevorpost about how SBF was actually good and we only think otherwise due to narrative manipulation and toxic ingroup signalling dynamics I'm here for it

I was kind of surprised by this too; I found this study which seems to support it though: https://theconversation.com/we-studied-what-happens-when-guys-add-their-cats-to-their-dating-app-profiles-144999

In our study, we recruited 1,388 heterosexual American women from 18 to 24 years old to take a short anonymous online survey[...] Most of the women found the men holding cats to be less dateable. This result surprised us, since previous studies had shown that women found men with pets to have higher potential as partners. They also thought the men holding

...
CronoDAS
Well, since I like cats a lot and, when I was single, would have preferred a woman who liked cats too, maybe I could have filtered out some potentially bad matches this way? (I didn't actually have a cat, though, for various reasons.)
gwern
As that study cited ("Domestic Dogs as Facilitators in Social Interaction: An Evaluation of Helping and Courtship Behaviors", Nicolas Guéguen & Serge Ciccotti 2008) is highly likely to be fabricated and made up, it's not surprising that it might be contradicted by some real research. The first/corresponding author is Nicolas Guéguen, who is a notorious serial fabricator/retractor (see Retraction Watch's regular posts on him). It is regrettable that neither Kogan, Volsche, or any of the commenters at The Conversation appear to be aware of this and cite it uncritically, rather than solely to chuck it into the trash bin...

The NYT paywall ~~doesn't~~ didn't do anything if Javascript is disabled.

EDIT: I've noticed recently that NYT articles are cut-off before the end now, even without JavaScript. I wonder if the timing of this paywall upgrade is related to the lawsuit?

No particular reason why we can only have 42 chromosomes

Isn't having extra chromosomes usually bad? https://en.wikipedia.org/wiki/Trisomy

(PS the usual number is 46)

[anonymous]
The extra chromosomes have duplicate genes on them. The "excessive number of copies of correct genes" is the hypothesis for why it is bad. Theoretically a new chromosome with new cellular firmware codes genes that turn off the legacy genes being overwritten. If you had an advanced enough understanding of biology and advanced tools and advanced testing methods you could do this. It would look nothing like today.

What is an example where two negative numbers multiply to give a negative number?

Since you didn't specify real numbers, it seems like -i * -i = -1 should fit?
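A quick check of that arithmetic, using Python's built-in complex literals:

```python
# (-i) * (-i) = i^2 = -1: two "negative" factors whose product is negative.
print((-1j) * (-1j))  # prints (-1+0j)
```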

blf
Usually, negative means "less than 0", and a comparison is only available for real numbers and not complex numbers, so negative numbers mean negative real numbers.

That said, ChatGPT is actually correct to use "Normally" in "Normally, when you multiply two negative numbers, you get a positive number." because taking the product of two negative floating point numbers can give zero if the numbers are too tiny. Concretely, in Python, -1e-300 * -1e-300 gives an exact zero, and this holds in all programming languages that follow the IEEE 754 standard.
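blf's floating-point point is easy to check directly (a small sketch; standard IEEE 754 doubles assumed):

```python
# The true product 1e-600 is below the smallest subnormal double (~5e-324),
# so two tiny negative numbers multiply to exactly 0.0.
print(-1e-300 * -1e-300)  # 0.0

# Two huge negative numbers overflow to +inf instead of zero.
print(-1e300 * -1e300)    # inf

# The "normal" case: two negatives give a positive.
print(-2.0 * -3.0)        # 6.0
```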

We know roughly how to achieve immortality

Isn't the assumption that once we successfully align AGI, it can do the work on immortality? So "we" don't need to know how beyond that.

then you could spread the pesticide (and not other pesticides) in the region

This would affect other insects in addition to the targeted mosquitoes, right? This seems strictly worse than the original gene drive proposition to me.

Nathan Helm-Burger
You could use a more targeted pesticide. Also, people are already spraying pesticides to kill disease carrying mosquitos. I'm not saying it's a better option from a scientific/engineering/logic perspective, just that it should be something to do market research on to see if the target populations are less irrationally creeped out by it.

A survey shows that gay male teenagers are several times more likely to conceive girls than straight male teenagers.

Does "conceive" mean "have sex with" here? Because according to what I think of as the standard definition of that word, you would be saying that gay male teenagers are more likely to produce female offspring (which sounds pretty silly). Did the survey use that word?

tailcalled
House Beaver is talking about surveys which find a correlation between saying one is gay and saying one has impregnated someone/become pregnant. So like House Beaver's idea is if those who say they are gay teen boys in surveys also have a greater tendency to say they've impregnated someone, then House Beaver thinks this is probably because gay teen boys are more likely to impregnate teen girls than straight teen boys are. Whereas I'd be inclined to say it's because some teens find it funny to say they are 7 foot tall blind gang members who are addicted to heroin.

Testing with PortAudio's demo paex_read_write_wire.c [2]

It looks like this uses the blocking IO interface, I guess that adds its own buffering on top of everything else. For minimal latency you want the callback interface. Try adapting test/patest_wire.c or test/pa_minlat.c.
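For reference, a minimal sketch of the callback-style wire loop, using the python-sounddevice wrapper around PortAudio rather than the C examples named above (so treat the details as an assumption, not a drop-in replacement):

```python
# Duplex "wire": copy audio input straight to output using PortAudio's
# callback interface via the python-sounddevice wrapper.
import sounddevice as sd

def callback(indata, outdata, frames, time, status):
    if status:
        print(status)       # report any over/underruns
    outdata[:] = indata     # pass input through to output

# blocksize=0 lets PortAudio choose buffer sizes; 'low' requests low latency.
with sd.Stream(channels=1, blocksize=0, latency='low', callback=callback):
    sd.sleep(5000)          # keep the wire running for five seconds
```

This avoids the extra buffering layer that the blocking read/write interface adds.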

Humans have lived during one of Earth's colder period, but historically it's been a lot hotter. Our bodies are well adapted for heat (so long as we can cool off using sweat)

This doesn't seem very reassuring? For example, https://climate.nasa.gov/explore/ask-nasa-climate/3151/too-hot-to-handle-how-climate-change-may-make-some-places-too-hot-to-live/

Since 2005, wet-bulb temperature values above 95 degrees Fahrenheit [35 C] have occurred for short periods of time on nine separate occasions in a few subtropical places like Pakistan and the Persian Gulf.

...
Gordon Seidoh Worley
Currently lots of the Earth is too cold to live in. In a warmer Earth those places would become habitable even as other places became too hot.

Let me just quote Wikipedia: "A seahorse [...] is any of 46 species of small marine fish in the genus Hippocampus." Because I spent a few confused minutes trying to figure out how males could face more intense competition in a brain part.

He says non-programmers; I guess you misread?

Alex Vermillion
No need to guess, I will confirm: I was super wrong and you have correctly figured out how.

Theoretically capitalism should be fixing these examples automatically

Huh? Why?

[anonymous]
1. By eventually having no choice but to hire new grads.
2. By eventually offering roles that pay more due to a labor shortage, with less hours.
3. This one can stay in disequilibrium forever, as animated characters can be immensely popular and generative AI combined with modern rendering has crossed the uncanny valley after approximately 28 years (Toy Story 1, 1995). So the animated actors would appear to be real.

Actually on reflection, assuming AI continues to improve, 1 and 2 also can stay in disequilibrium.

If you want to get a job working on machine learning research, the claim here is that the best way to do that is to replicate a bunch of papers. Daniel Ziegler (yes, a Stanford ML PhD dropout, and yes that was likely doing a lot of work here) spent 6 weeks replicating papers and then got a research engineer job at OpenAI.

Wait, a research job at OpenAI? That’s worse. You do know why that’s worse, right?

 

I don't know why, and I'm confused about what this sentence is saying. Worse than what?

I don't think anyone is proposing to offer this deal to Putin; it's not like the rank and file soldiers are able to make the "invade your neighbor" decision in a bid to get EU citizenship.

Templarrr
That's not really the point? Incentives for "doing the right thing from the start" should be better than for just "stopping doing the wrong thing". I can probably see citizenship as an ok option for those Russians who join the fight from Ukrainian side (e.g. "Freedom of Russia Legion"), because they did both. But simply for stopping being murderer? No.

Wikipedia says:

Low confidence generally means questionable or implausible information was used, the information is too fragmented or poorly corroborated to make solid analytic inferences, or significant concerns or problems with sources existed.
