All of Joey Marcellino's Comments + Replies

Model at https://docs.google.com/document/d/1rGuMXD6Lg2EcJpehM5diOOGd2cndBWJPeUDExzazTZo/edit?usp=sharing.

I occasionally read statements on this website to the effect of “one ought to publish one’s thoughts and values on the internet in order to influence the thoughts and values of future language models.” I wondered “what if you wanted to do that at scale?” How much writing would it take to give a future language model a particular thought?

Suppose, for instance, that this contest were judged by a newly trained frontier model, and that I had the opportunity...

7gwern
Note that text in pretraining may even be an expensive way to go about it: one of the most dramatic demonstrations MS gave us with Sydney was the incredible speed & efficiency of web-search-powered adversarial attacks on LLMs. You don't need to dump a lot of samples onto the Internet and pray they make it into the training data and don't get forgotten, if you can set up a single sample with good SEO and the LLM kindly retrieves it for you and attacks itself with your sample. This is something to think about: it's not just making it into the training data, it's making it into the agent's prompt or context that can matter. People are currently talking about how Deep Research is an example of the AI trend which will drive paywalls everywhere... which may happen, but consider the positives for people who don't put up paywalls.

I see the two main arguments of the book as 1) we should understand "gender identity" as a bunch of subjective feelings about various traits, which may or may not cohere into an introspectively accessible "identity"; 2) we can understand gender categories as a particular kind of irreducible category (namely historical lineages) to which membership is granted by community consensus, the categories being "irreducible" in that they are not defined by additional facts about their members. These stand or fall independently of whether we accept gender self-ID, a...

That's a good question. I think BG's way of thinking about gender categories is potentially useful for racial/ethnic categories as well, particularly the bit about category membership as a conferred status. I think they'd probably agree with this. They don't really argue that we ought to have gender self-ID; they explicitly assume this to be the case, and are more trying to show that it's coherent. I suspect if you asked them they would probably say that we ought not to have racial self-ID, or that it ought to be much more limited than in the case of gende...

1Ediz Ucar
Perhaps I've missed the point of your post, but to me the whole confusion around gender is not about internal validity; after all, circular definitions are valid, just not convincing to the outside view.

Sure, one can always embed a game inside another one and so alter the overall expectation values however one likes. That said, we still only want to play the meta-game if it has positive expected value, no? A toy calculation of that point, with entirely made-up payoffs, is sketched below: the embedding can flip the sign, and it's the outermost game's expectation that decides whether to play.
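
```python
# Toy example with made-up payoffs: a negative-EV game embedded in a
# meta-game that adds a side payment, flipping the overall sign.

def expected_value(lottery):
    """lottery: list of (probability, payoff) pairs."""
    return sum(p * x for p, x in lottery)

# Inner game alone: 50% lose 2, 50% win 1 -> EV = -0.5, so don't play it.
inner = [(0.5, -2.0), (0.5, 1.0)]

# Meta-game: someone pays us 1 up front to play the inner game -> EV = +0.5.
meta = [(p, x + 1.0) for p, x in inner]

print(expected_value(inner))  # -0.5
print(expected_value(meta))   #  0.5
```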

1Nathaniel Monson
Minor semantic quibble: I would say we always want positive expected utility, but how that translates into money/time/various intangibles can vary tremendously both situationally and from person to person.

The conclusion seems rather to be "human metabolism is less efficient than solar panels," which, while perhaps true, has limited bearing on the question of whether or not the brain is thermodynamically efficient as a computer when compared to current or future AI. The latter is the question that recent discussion has been focused on, and to which the "No - " in the title makes it seem like you're responding.

Moreover, while a quick Google search turns up 100W as the average resting power output of a person, another search suggests the brain is only responsi...

2Maxwell Clarke
Well, yes, the point of my post is just to point out that the number that actually matters is the end-to-end energy efficiency, and that it is completely comparable to humans. The per-flop efficiency is obviously worse, but that's irrelevant if AI is already cheaper for a given task in real terms. I admit the title is a little clickbaity, but I am responding to a real argument (that humans are still "superior" to AI because the brain is more thermodynamically efficient per flop).
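
A back-of-envelope version of this end-to-end comparison, as a sketch only: the ~100 W figure comes from the comment above, the ~20 W brain share is the commonly cited figure, and the GPU power and task times are placeholder assumptions.

```python
# Energy per task, not energy per FLOP. Task times and GPU power below
# are placeholder assumptions; only the human wattages are standard figures.

WHOLE_HUMAN_W = 100.0   # resting metabolic power of a person
BRAIN_ONLY_W = 20.0     # roughly the brain's share of that budget
GPU_W = 700.0           # assumed draw of one high-end accelerator

def task_energy_kj(power_w, duration_s):
    return power_w * duration_s / 1000.0

# Hypothetical task: a human needs an hour, the AI needs ten seconds.
human_s, ai_s = 3600.0, 10.0

print(task_energy_kj(WHOLE_HUMAN_W, human_s))  # 360.0 kJ, whole-body accounting
print(task_energy_kj(BRAIN_ONLY_W, human_s))   # 72.0 kJ, brain-only accounting
print(task_energy_kj(GPU_W, ai_s))             # 7.0 kJ
```

Whether the human is charged 100 W or 20 W changes the ratio fivefold, which is exactly the accounting choice the parent comment flags.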

What does quantum entanglement mean for causality? Due to entanglement, there can be spacelike-separated measurements such that there exists a reference frame where it looks like measurement A precedes and has a causal influence on the outcomes of measurement B, and also a reference frame where it looks like measurement B precedes and has a causal influence on the outcomes of measurement A.

"Causality" is already a somewhat fraught notion in fundamental physics irrespective of quantum mechanics; it's not clear that one needs to have some sort ... (read more)

Just to (hopefully) make the distinction a bit more clear:

A true copying operation would take |psi1>|0> to |psi1>|psi1>; that's to say, it would take as input one qubit in an arbitrary quantum state and a second qubit in |0>, and output two qubits in the same arbitrary quantum state that the first qubit was in. For our example, we'll take |psi1> to be an equal superposition of 0 and 1: |psi1> = |0> + |1> (ignoring normalization).

If CNOT is a copying operation, it should take (|0> + |1>)|0> to (|0> + |1>)(|0> + |...
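
A minimal numpy sketch of where this computation ends up (states normalized here, whereas the prose ignores normalization; cf. the reply below): CNOT sends (|0> + |1>)|0> to the Bell state |00> + |11>, not to the product (|0> + |1>)(|0> + |1>) that a true copier would have to produce.

```python
import numpy as np

# CNOT acting on (|0> + |1>)|0>, versus the state a copier would produce.

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

plus = np.array([1, 1]) / np.sqrt(2)   # |psi1> = |0> + |1>, normalized
zero = np.array([1, 0])                # |0>

input_state = np.kron(plus, zero)      # (|0> + |1>)|0>
output = CNOT @ input_state
print(output)                          # [0.707, 0, 0, 0.707] ~ |00> + |11>

would_be_copy = np.kron(plus, plus)    # (|0> + |1>)(|0> + |1>)
print(would_be_copy)                   # [0.5, 0.5, 0.5, 0.5], a different state
```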

4Viliam
Thank you! Some context: I am a "quantum autodidact", and I am currently reading the book Q is for Quantum, which is a very gentle, beginner-friendly introduction to quantum computing. I was thinking about how it relates to the things I have read before, and then I noticed that I was confused. I looked at Wikipedia, which said that CNOT does not violate the no-cloning theorem... but I didn't understand the explanation of why. I think I get it now: |00> + |11> is not a copy (looking at one qubit collapses the other), whereas |00> + |01> + |10> + |11> would be a copy (looking at one qubit would still leave the other as |0> + |1>).
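
A quick numerical check of that last distinction, in the same plain-numpy style as the sketch above:

```python
import numpy as np

# Condition on qubit 1 reading 0. In |00> + |11>, qubit 2 is then forced
# to |0>; in the would-be copy |00> + |01> + |10> + |11>, qubit 2 is
# still |0> + |1>.

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
product = np.array([1, 1, 1, 1]) / 2

def qubit2_given_q1_is_0(state):
    sub = state.reshape(2, 2)[0]       # amplitudes with qubit 1 = 0
    return sub / np.linalg.norm(sub)   # renormalized state of qubit 2

print(qubit2_given_q1_is_0(bell))      # [1, 0]          -> collapsed to |0>
print(qubit2_given_q1_is_0(product))   # [0.707, 0.707]  -> still |0> + |1>
```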