All of duck_master's Comments + Replies

Sorry I'm arriving late

Make base models great again. I'm still nostalgic for GPT-2 and GPT-3. I can understand why RLHF was invented in the first place, but it seems to me that you could still train a base model so that, if it's about to say something dangerous, it prematurely cuts off the generation by emitting the <endoftext> token instead.
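A toy sketch of what that could look like at sampling time (purely illustrative; `dummy_model` is a made-up stand-in for a base model that was actually trained to emit the cutoff token on its own):

```python
# Toy sketch (hypothetical, not a real training recipe): a sampling loop
# where the model itself cuts off generation by emitting end-of-text.
EOT = "<endoftext>"

def sample_with_early_stop(next_token_fn, max_tokens=20):
    """Generate until the model emits EOT, which a suitably trained
    base model would do on its own before a risky continuation."""
    out = []
    for _ in range(max_tokens):
        tok = next_token_fn(out)
        if tok == EOT:
            break
        out.append(tok)
    return out

# Dummy next-token function standing in for such a base model: it stops
# itself before continuing an unsafe completion.
def dummy_model(prefix):
    script = ["how", "to", "make", EOT]
    return script[len(prefix)] if len(prefix) < len(script) else EOT

print(sample_with_early_stop(dummy_model))  # ['how', 'to', 'make']
```

The point is just that no separate moderation layer is needed: the cutoff is part of the model's own next-token behavior.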

Alternatively, make models natively emit structured data. LLMs in their current form emit free-form arbitrary text, which needs to be parsed in all sorts of annoying ways to be useful for downstream applications anyway. Also, structured output could help prevent misaligned behavior.
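For illustration (the JSON schema here is made up, not any particular model's API): structured output would let downstream code consume model responses directly instead of scraping free text.

```python
import json

# Hypothetical model output in a structured format.
raw_model_output = '{"action": "summarize", "target": "document_3"}'

parsed = json.loads(raw_model_output)       # fails loudly if malformed
assert set(parsed) == {"action", "target"}  # minimal schema check
print(parsed["action"])  # -> summarize
```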

(I'm less confident in this idea than the previous one.)

Try to wean people off excessive reliance on LLMs. This is probably the biggest source of AI-related negative effects today. I am trying to do this myself (I formerly alternated between claude, chatgpt, and lmarena.ai several times a day), but it is hard.

(By AI-related negative effects I mean things like people losing their ability to write originally or think independently.)

When I visited Manhattan, I realized that "Wall Street" and "Broadway" are not just overused clichés, but the names of actual streets (you can walk on them!)

I will probably have to leave by 6:30pm at the latest :|

2mishka
the meetup page says 7:30pm, but actually the building asks people to leave by 9pm

I am a bit sick today but the meetup will happen regardless.

Actually, not going at all. Scheduling conflict.

(To organizer: Sorry for switching to "Can't Go" and back; I thought this was on the wrong day. I might be able to make this.)


The single biggest question I have is "what is Dirichlet?"

I might come if the venue wasn't a bar

To be fair, there is no evidence requirement for upvoting, either.

I can see why someone would want this (e.g. Reddit's upvote/downvote system seems to be terrible), but I think LW is small and homogeneous-ish enough that it works okay here.

"AI that can verify itself" seems likely doable for reasons wholly unrelated to metamathematics (unlike what you claim offhandedly) since AIs are finite objects that nevertheless need to handle a combinatorially large space. This has the flavor of "searching a combinatorial explosion based on a small yet well-structured set of criteria" (ie the relatively easy instances of various NP problems), which has had a fair bit of success with SAT/SMT solvers and nonconvex optimizers and evolutionary algorithms and whatnot. I don't think constructing a system that ... (read more)

2ACrackedPot
The issue arises specifically in the situation of recursive self-improvement: you can't prove self-consistency in mathematical frameworks of "sufficient complexity" (that is, containing the rules of arithmetic in a provable manner). What this cashes out to is that, considering AI as a mathematical framework, and the next generation of AI (designed by the first) as a secondary mathematical framework, you can't actually prove that there are no contradictions in an umbrella mathematical framework that comprises both of them, if they are of "sufficient complexity". Which means an AI cannot -prove- that a successor AI has not experienced value drift - that is, that the combined mathematical framework does not contain contradictions - if they are of sufficient complexity.

To illustrate the issue, suppose the existence of a powerful creator AI, designing its successor; the successor, presumably, is more powerful than the creator AI in some fashion, and so there are areas of the combinatorially large space that the successor AI can explore (in a reasonable timeframe) but that the creator AI cannot. If the creator can prove there are no contradictions in the combined mathematical framework - then, supposing its values are embedded in that framework in a provable manner, it can be assured that the successor has not experienced value drift.

Mind, I don't particularly think the above scenario is terribly likely; I have strong doubts about basically everything in there, in particular the idea of provable values. I created the post for the five or six people who might still be interested in ideas I haven't seen kicked around on Less Wrong for over a decade.

Speaking of MathML are there other ways for one to put mathematical formulas into html? I know Wikipedia uses <math> and its own template {{math}} (here's the help page), but I'm not sure about any others. There's also LaTeX (which I think is the best program for putting mathematical formulas into text in general), as well as some other bespoke things in Google Docs and Microsoft Word that I don't quite understand.
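As a quick illustration of what raw MathML markup looks like (this uses the standard MathML namespace; the Python wrapper is only there to check that the fragment is well-formed XML):

```python
import xml.etree.ElementTree as ET

# A hand-written MathML fragment for x^2 + 1: the kind of markup that
# goes inside an HTML <math> element.
mathml = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    '<msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>1</mn>'
    '</math>'
)

root = ET.fromstring(mathml)  # raises ParseError if not well-formed
print(root.tag)  # namespace-qualified tag name
```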

5jefftk
In terms of what browsers support, MathML is the best way to do it in a modern browser. In an older browser you could do canvas, images, or something with custom fonts. Most users, though, are in authoring environments that offer something else, usually a way to write LaTeX-style math and have it automatically converted into something the browser can handle.

Thank you for placing the limit orders! (You are "Martin Randall" if I understand correctly? I didn't know you were a LessWronger!)

2Lorenzo
No I'm Lorenzo!

Thank you for building this! I have just signed up for it.

I've noticed that two of the three Manifold markets (Will a nuclear weapon detonate in New York City by end of 2023? and Will a nuclear weapon cause over 1,000 deaths in 2023?) could use a few thousand mana in subsidies to reduce the chance of a false alarm, even though both are moderately well-traded already. (I've just bet both of them down, but I personally don't have enough mana to feel comfortable subsidizing both.)

4Lorenzo
I've placed some limit orders on both. It's cheaper than subsidies and should work the same way (if there is no nuclear war)

I think this issue could be fixed by lengthening the phone-call message (if it ever gets sent out) to also quote all the comments on the sentinel markets from the last ~week before the trigger time. The reason is that if there were ever legitimate signs of an impending nuclear war, I expect people would leave plenty of comments about those signs on the relevant markets.

Update: I have tested negative for COVID-19 twice with self-tests, but since I still feel ill, I recommend that participants mask up anyways (it could be the common cold or flu, for all I know). 

Thank you for the comment! I will not attend this since you stated that they check IDs at the door.

Two recent things that will likely affect this meetup:

  • Firstly, according to the weather forecast, it will rain on Saturday around the time of the meetup, particularly towards the planned end. Please bring umbrellas.
  • Secondly, I might have COVID-19 (which I suspect I caught on Thursday night). As such, I will wear a mask throughout the meetup, and I encourage all of you to do the same.

Thanks for your attention!


Question: I’m not old enough to drink alcohol, and I think this place is a bar - but would I even be allowed in the bar?

2Carl Feynman
It’s a food court, with several stalls dispensing food and one stall dispensing beer. There’s a little fence around the beer-dispensing booth, which I think is the legal barrier. I would imagine that it has the same legal status as a restaurant that serves drinks, i.e. it is OK for young people. However, I’ve only been there once, and am myself old, so I am not certain of the legal status.

Here's a manually sorted list of meetup places in the USA, somewhat arbitrarily/unscientifically grouped by region for even greater convenience. I spent the past hour on this, so please make good use of it. (Warning: this is a long comment.)

NEW ENGLAND

  • Connecticut: Hartford
  • Massachusetts: Cambridge/Boston, Newton, Northampton
  • Vermont: Burlington

MID-ATLANTIC

  • DC: Washington
  • Maryland: Baltimore, College Park
  • New Jersey: Princeton
  • New York State: Java Village/Buffalo, Manhattan/New York City, Massapequa, Rochester
  • Pennsylvania: Harrisburg, Philadelphia, Pittsburgh
  • Virg
...
1derikk
There's also the meetup in Cavendish, VT!

This is a good suggestion! I'll plan on walking in addition to talking during my upcoming meetup.

I’m in the park now; how can I identify you?

@Screwtape I can make this but there is a different thing I also want to go to at 7:30pm.

3Screwtape
That's fine! I usually try and have people shift partners periodically anyway.

This is an excellent tip! I plan on using it from now on in my day-to-day life.

I haven't used GPT-4 (I'm no accelerationist, and don't want to bother with subscribing), but I have tried ChatGPT for this use. In my experience it's useful for finding small cosmetic changes to make and fixing typos/small grammar mistakes, but I tend to avoid copy-pasting the result wholesale. Also I tend to work with texts much shorter than posts, since ChatGPT's shortish context window starts becoming an issue for decently long posts.

2ChristianKl
ChatGPT doesn't have a fixed context window size. GPT-4's context window is much bigger.

Hello LessWrong! I'm duck_master. I've lurked around this website since roughly the start of the SARS-CoV-2/COVID-19 pandemic but I have never really been super active as of yet (in fact I wrote my first ever post last month). I've been around on the AstralCodexTen comment section and on Discord, though, among a half-dozen other websites and platforms. Here's my personal website (note: rarely updated) for your perusal.

I am a lifelong mathematics enthusiast and a current MIT student. (I'm majoring in mathematics and computer science; I added the latter part...

Thank you for creating this website! I’ve signed up and started contributing.

One tip I have for other users: many of the neurons are not about vague sentiments or topics (as in most of the auto-suggested explanations), but are rather about very specific keywords or turns of phrase. I’d even guess that many of the neurons are effectively regexes.
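A hypothetical sketch of that guess (the phrase and the whole setup are made up for illustration): a "neuron" that fires only on a specific turn of phrase behaves just like a regex over the input text.

```python
import re

# Made-up example: a "neuron" whose activation is effectively a regex
# match on a specific turn of phrase, rather than a vague topic.
neuron_as_regex = re.compile(r"\bnot only\b.*\bbut also\b")

def activation(text):
    return 1.0 if neuron_as_regex.search(text) else 0.0

print(activation("not only fast but also cheap"))  # 1.0
print(activation("fast and cheap"))                # 0.0
```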

Also, Neuronpedia apparently cut me off for the day after I did ~20 neuron puzzles. Raising this limit for power users, or something like that, could be beneficial.

6Johnny Lin
Hi duck_master, thank you for playing, and I appreciate the tip. Maybe it's worth compiling these tips and putting them under a "tips" popup/page on the main site. Also, please consider joining the Discord if you're willing to offer more feedback and suggestions: https://discord.gg/kpEJWgvdAx

Apologies for the limit. It currently costs ~$0.24 to do each explanation score, and it's coming from my personal funds, so I'm capping it daily until I can hopefully get approved for a grant. A few hours ago I raised the limit from 3 new explanations per day to 10.

This text shows another key point: not only should your posts be a surprise, but they should be the kind of surprise that causes good actions.

Exactly what it says on the tin.

Thoughts I want to expand on for later:

  • Rationality/philosophical tip: stop being surprised by the passage of time
  • Possible confusingness of the Sequences?
  • People * infrastructure = organization (both factors need to exist)
  • No "intro to predicting" guide so far; writing a good one would decrease the activation energy to predict well
  • Impurity (as in impure functions) as a source of strength
    • Contrastingly, the ills of becoming too involved (eg internet dramas, head overflowing with thoughts)
  • Writing a personal diary more frequently (which I really want to do)
    • Also, meditating and playing piano more

This is an extremely important point. (I remember thinking a long time ago that Wikipedia just Exists, and that although random people are allowed to edit it, doing it is generally Wrong.) FWIW I'm an editor now - User:Duckmather.

8ChristianKl
That's great to hear. Effects like this are what I was hoping for.

In fact, organized resources like Wikipedia, LW sequences, SEP, etc. are basically amortized scholarship. (This is particularly true for Wikipedia; its entire point is that we find vaguely-related content from around - or beyond - the web and then paraphrase it into a mildly-coherent article. Source: am wikipedia editor.)

I also agree that, for the purpose of previewing the content, this post is poorly titled (maybe it should be titled something like "Having bad names makes you open the black box of the name", except more concise?), although for me it wasn't so much that I stuck to a particular wrong interpretation as that the entire title seemed unclear.

1Ericf
Saying poor naming instead of bad names would be clearer, since it wouldn't call up the idea of "bad names" = swear words. Saying "look in" instead of "open" would also distance from the AI concept.

Thanks for the reply. I take it that not only are you interested in the idea of knowledge, but that you are particularly interested in the idea of actionable knowledge. 

Upon further reflection, I realize that all of the examples and partial definitions I gave in my earlier comment can in fact be summarized in a single, simple definition: a thing X has knowledge of a fact Y iff it contains some (sufficiently simple) representation of Y. (For example, a rock knows about the affairs of humans because it has a representation of those affairs in the form o...

2Alex Flint
I very much agree with the emphasis on actionability. But what is it about a physical artifact that makes the knowledge it contains actionable? I don't think it can be simplicity alone. Suppose I record the trajectory of the moon over many nights by carving markings into a piece of wood. This is a very simple representation, but it does not contain actionable knowledge in the same way that a textbook on Newtonian mechanics does, even if the textbook were represented in a less simple way (say, as a PDF on a computer).

I think knowledge as a whole cannot be absent, but knowledge of a particular fact can definitely be absent (if there's no relationship between the thing-of-discourse and the fact).

1TAG
So rocks have non zero knowledge?

Since this is literally a question soliciting predictions, it should have one of those embedded-interactive-predictions-with-histograms gadgets* to make predicting easier. Also, it might be worth having two prediction gadgets, since this is basically a prediction about predictions: one gadget to predict what Recognized AI Safety Experts (tm) predict about how much damage unsafe AIs will do, and one gadget to predict how much damage unsafe AIs will actually do (to mitigate weird second-order effects having to do with predicting a prediction).

*I'm not sure what they're supposed to be called.

2Rob Bensinger
I think it might be more interesting to sketch what you expect the distribution of views to look like, as opposed to just giving a summary statistic. I can add probability Qs, but I avoided it initially so as not to funnel people into doing the less informative version of this exercise.

Au contraire, I think that "mutual information between the object and the environment" is basically the right definition of "knowledge", at least for knowledge about the world (as it correctly predicts that all four attempted "counterexamples" are in fact forms of knowledge), but that the knowledge of an object also depends on the level of abstraction of the object which you're considering.

For example, for your rock example: A rock, as a quantum object, is continually acquiring mutual information with the affairs of humans by the imprinting of subatomic in...

1TAG
Then how can it ever be absent?
3Alex Flint
Thank you for this comment duck_master. I take your point that it is possible to extract knowledge about human affairs, and about many other things, from the quantum structure of a rock that has been orbiting the Earth.

However, I am interested in a definition of knowledge that allows me to say what a given AI does or does not know, insofar as it has the capacity to act on this knowledge. For example, I would like to know whether my robot vacuum has acquired sophisticated knowledge of human psychology, since if it has, and I wasn't expecting it to, then I might choose to switch it off. On the other hand, if I merely discover that my AI has recorded some videos of humans then I am less concerned, even if these videos contain the basic data necessary to construct sophisticated knowledge of human psychology, as in the case with the rock. Therefore I am interested not just in information, but something like action-readiness. I am referring to that which is both informative and action-ready as "knowledge", although this may be stretching the standard use of this term.

Now you say that we might measure more abstract kinds of knowledge by looking at what an AI is willing to bet on. I agree that this is a good way to measure knowledge if it is available. However, if we are worried that an AI is deceiving us, then we may not be willing to trust its reports of its own epistemic state, or even of the bets it makes, since it may be willing to lose money now in order to convince us that it is not particularly intelligent, in order to make a treacherous turn later. Therefore I would very much like to find a definition that does not require me to interact with the AI through its input/output channels in order to find out what it knows, but rather allows me to look directly at its internals. I realize this may be impossible, but this is my goal.

So as you can see, my attempt at a definition of knowledge is very much wrapped up with the specific problem I'm trying to solve, and

I think this applies to every wiki ever, and also to this very site. There are probably a lot of others that I'm missing but this is a start.

I agree with you (meaning G Gordon Worley III) that Wikipedia is reliable, and I too treat it as reliable. (It's so well-known as a reliable source that even Google uses it!) I also agree that an army of bots and humans undo any defacing that may occur, and that Wikipedia having to depend on other sources helps keep it unbiased. I also agree with the OP that Wikipedia's status as not-super-reliable among the Powers that Be does help somewhat.

So I think that the actual secret of Wikipedia's success is a combination of the two: Mild illegibility prevents ram...

2Gordon Seidoh Worley
One bit of nuance my original comment leaves out is how flexible the citation policy is. Yes citations are required to include content on Wikipedia if it's not considered common knowledge, but also it's not that hard to produce something that Wikipedia can then cite, even if it must be referenced obliquely like "some people say X is true about Y". This is generally how Wikipedia deals with controversial topics today: cite sources expressing views in order to acknowledge the existence of disagreements and also keep disputed facts quarantined in "controversy" sections.

@Diffractor: I think I got a MIRIxDiscord invite in a way somehow related to this event. Check your PMs for details. (I'm just commenting here to get attention because I think this might be mildly important.)

1duck_master
Bumping this.

Don't worry, it was kind of a natural stopping point anyways, as the discussion was winding down.

2Ben Pace
Oh woops, I realize I ended the call for everyone when I left. I'm sorry.

"Mixture of infra-distributions" as in convex set, or something else? If it's something else then I'm not sure how to think about it properly.

2Diffractor
"mixture of infradistributions" is just an infradistribution, much like how a mixture of probability distributions is a probability distribution. Let's say we've got a prior ζ∈ΔN, a probability distribution over indexed hypotheses. If you're working in a vector space, you can take any countable collection of sets in said vector space, and mix them together according to a prior ζ∈ΔN giving a weight to each set. Just make the set of all points which can be made by the process "pick a point from each set, and mix the points together according to the probability distribution ζ" For infradistributions as sets of probability distributions or a-measures or whatever, that's a subset of a vector space. So you have a bunch of sets Ψi, and you just mix the sets together according to ζ, that gives you your set Ψζ. If you want to think about the mixture in the concave functional view, it's even nicer. You have a bunch of ψi:(X→R)→R which are "hypothesis i can take a function and output what its worst-case expectation value is". The mixture of these, ψζ, is simply defined as ψζ(f):=Ei∼ζ[ψi(f)]. This is just mixing the functions together! Both of these ways of thinking of mixtures of infradistributions are equivalent, and recover mixture of probability distributions as a special case.

Me too. I currently only have a very superficial understanding of infraBayesianism (all of which revolves around the metaphysical, yet metaphorical, deity Murphy).

More specifically: if two points are in a convex set, then the entire line segment connecting them must also be in the set.
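A quick numeric check of that definition for the probability simplex in R^2 (a standard example of a convex set):

```python
# The probability simplex in R^2: points (x, y) with x, y >= 0, x + y = 1.
def in_simplex(p, tol=1e-9):
    return abs(sum(p) - 1) < tol and all(x >= -tol for x in p)

# Any convex combination of two member points is again a member.
a, b = (0.2, 0.8), (0.7, 0.3)
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    point = tuple(t * x + (1 - t) * y for x, y in zip(a, b))
    assert in_simplex(point)
print("segment stays inside")
```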

Here's an ELI5: The evil superintelligent deity Murphy, before you were ever conceived, picked the worst possible world that you could live in (meaning the world where your performance is worst), and you have to use fancy math tricks to deal with that.
