All of Darklight's Comments + Replies

Darklight164

It seems like it would depend pretty strongly on which side you view as having a closer alignment with human values generally. That probably depends a lot on your worldview and it would be very hard to be unbiased about this.

There was actually a post about almost this exact question on the EA Forums a while back. You may want to peruse some of the comments there.

8Caleb Biddulph
Side note - it seems there's an unofficial norm: post about AI safety in LessWrong, post about all other EA stuff in the EA Forum. You can cross-post your AI stuff to the EA Forum if you want, but most people don't. I feel like this is pretty confusing. There was a time that I didn't read LessWrong because I considered myself an AI-safety-focused EA but not a rationalist, until I heard somebody mention this norm. If we encouraged more cross-posting of AI stuff (or at least made the current norm more explicit), maybe we wouldn't get near-duplicate posts like these two.
Darklight*1-1

Back in October 2024, I tried to test various LLM Chatbots with the question:

"Is there a way to convert a correlation to a probability while preserving the relationship 0 = 1/n?"

Years ago, I came up with an unpublished formula that does just that:

p(r) = (n^r * (r + 1)) / (2^r * n)

So I was curious if they could figure it out. Alas, back in October 2024, they all made up formulas that didn't work.

Yesterday, I tried the same question on ChatGPT and, while it didn't get it quite right, it came very, very close. So, I modified the question to be more specific:... (read more)

Darklight1-1

The most I've seen people say "whataboutism" has been in response to someone trying to deflect criticism by pointing out apparent hypocrisy, as in the aforementioned Soviet example (I used to argue with terminally online tankies a lot). 

I.e. 

(A): "The treatment of Uyghurs in China is appalling. You should condemn this."

(B): "What about the U.S. treatment of Native Americans? Who are you to criticize?"

(A): "That's whataboutism!"

The thing I find problematic with this "defence" is that both instances are ostensibly examples of clear wrongdoing, and ... (read more)

Ok fair. I was assuming real world conditions rather than the ideal of Dath Ilan. Sorry for the confusion.

1Greenless Mirror
To be fair, before publishing I thought this currency could be implemented in a real-world environment with less improbability. My current main doubt is "how can the market cap be so volatile with a stable GDP, and would they be closer to each other in a more adequate equilibrium?". And I've basically switched to "okay oops, but under what conditions could this theoretically work, if it could work at all, and could you imagine better theoretical peak conditions?" mode. Deflation seems like a reasonable danger, I just can't see how it could be avoided if everyone used market fractions at least to store their money if not to exchange. Because, like, you don't introduce a random money-making machine into the system to solve your psychological problems at the cost of 2% of your money, there's no place for it, so I'm guessing that people would adapt to that, and the fictional example of dath ilan suggests that such adaptation is possible.

Why not? Like, the S&P 500 can vary by tens of percent, but as Google suggests, global GDP only fell 3% in 2021, and it usually grows, and the more stocks are distributed, the more stable they are.

Increases in the value of the S&P 500 are basically deflation relative to other units of account. When an asset appreciates in value, that is, when its price goes up, it is deflating relative to the currency the price is in. Like, when the price of bread increases, that means dollars are inflating, and bread is deflating. Remember, your currency is based on a perce... (read more)

1Greenless Mirror
In dath ilan, inflation and deflation are not used as macroeconomic tools because people are rational enough to accept wage reductions if their purchasing power remains unchanged, or to voluntarily pay the government to prevent crises without the need for a hidden tax that dilutes money by printing more during a crisis. Interest rates on loans could be lower if you expect returns that outpace deflation. If people can afford not to work, they are expected to do so, and they would spend more "shares", redistributing them in favor of those who are more eager to work. Perhaps you aren't really interested in having people work that much or that often, especially if you're aiming for a utopia with a four-hour workday, or something similar. How relevant are these issues in a world where "every person is economist in the same way every earthling is a scribe by medieval standards"?

I guess I'm just not sure you could trade in "hundred-trillionths of global market cap". Like, fractions of a thing assume there is still an underlying quantity or unit of measure that the fraction is a subcomponent of. If you were to range it from 0 to 1, you'd still need a way to convert a 0.0001% into a quantity of something, whether it's gold or grain or share certificates or whatever.

I can sorta imagine a fractional shares of global market cap currency coming into existence alongside other currencies that it can be exchanged for, but having all tradit... (read more)

1Greenless Mirror
Why not? Like, the S&P 500 can vary by tens of percent, but as Google suggests, global GDP only fell 3% in 2021, and it usually grows, and the more stocks are distributed, the more stable they are. If you imagine that the world's capitalization was once measured in dollars, but then converted to "0 to 1" proportionally to dollars, and everyone used that system, and there is no money printing anymore, what would be wrong with that? Of course you can still express money in gold if you want, it's just that not so many people store their money in it, and that would require exchanging money for money. If dath ilan heard a plan to ban all currencies, they would quickly come up with Something Which Is Not This. It might seem like deflation would make you hold off on buying, but not if you thought you could get more out of buying than from your money passively growing by a few percent a year, and in that case, you would reasonably buy it. If it made people do nothing, the economy would slow down enough for deflation to stop, for them to start doing things again, and so they wouldn't get to that point in the first place. Every transaction you make is an investment of your knowledge into the global market in the area where you believe you are smarter than the market and can outpace it in some sense.

The idea of labour hours as a unit of account isn't that new. Labour vouchers were actually tried by some utopian anarchists in the 1800s and early experiments like the Cincinnati Time Store were modestly successful. The basic idea is not to track subjective exchange values but instead a more objective kind of value, the value of labour, or a person's time, with the basic assumption that each person's time should be equally valuable. Basically, it goes back to Smith and Ricardo and the Labour Theory of Value that was popular in classical economics before m... (read more)

1Greenless Mirror
I wasn’t surprised by the idea of using labor hours itself, but rather by the assumption that people in a system with free choice would naturally settle on it as the ideal solution. Sure, I’ll try to clarify. You seem comfortable with the idea that global market capitalization can be expressed in a single currency, like the dollar. Let’s assume the world’s total market cap is $100 trillion. Let's say Apple’s market cap is $3.5 trillion, or 3.5% of the total, so if you had $1, you could conceptually allocate 3.5 cents to Apple, 3 cents to Microsoft (which has a $3T market cap), and so on across all investable assets. This is how index funds work and I hope there’s nothing inherently strange about it. If there are non-equity assets you can’t invest in, you still aim to expand your investment base to represent the entire market as proportionally as possible. In a perfect world, you could also invest in governments, crypto, individuals, but even an approximate model works well. But of course you’re not doing this manually and this world, an index fund - like a "Vanguard S&P 500", but larger - handles it for you. You give them your money, and they allocate it proportionally across the entire market. Since this is the most stable strategy, many people trust it and invest in this company until they can effectively exchange shares of this fund among themselves as equivalent to money. And when the network effect becomes broad enough, the rest of the world economy uses the shares of this fund as a currency, and from that point on, the entire economy is measured in it, because you literally store money in it as an a priori option, and given that it is invested by capitalization, it represents "fractions of the global market". So from now people exchange fractions of the global market. Yeah, this technically depends on the "success of one company" - but the success of this index fund depends on hundreds or thousands of companies it holds. You don’t expect this "one compan
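A minimal sketch of the proportional allocation being described, using the illustrative figures above ($100T total market, $3.5T Apple, $3T Microsoft); the "Rest of market" bucket is just a placeholder so the total comes out to $100T:

```python
# Sketch: allocating one unit of currency across the market by capitalization,
# the way an index fund would. Market caps here are illustrative placeholders.
market_caps = {
    "Apple": 3.5e12,            # $3.5T, per the example above
    "Microsoft": 3.0e12,        # $3.0T
    "Rest of market": 93.5e12,  # placeholder so the total comes to $100T
}

total_cap = sum(market_caps.values())

# Weight of each asset = its share of total capitalization.
weights = {name: cap / total_cap for name, cap in market_caps.items()}

# Allocating $1 proportionally: 3.5 cents to Apple, 3 cents to Microsoft, etc.
dollar = 1.00
allocation = {name: dollar * w for name, w in weights.items()}

for name, amount in allocation.items():
    print(f"{name}: ${amount:.3f} ({weights[name]:.1%} of the market)")
```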

This put into well-written words a lot of thoughts I've had in the past but never been able to properly articulate. Thank you for writing this.

This sounds rather like the competing political economic theories of classical liberalism and Marxism to me. Both of these intellectual traditions carry a lot of complicated baggage that can be hard to disentangle from the underlying principles, but you seem to have done a pretty good job of distilling the relevant ideas in a relatively apolitical manner.

That being said, I don't think it's necessary for these two explanations for wealth inequality to be mutually exclusive. Some wealth could be accumulated through "the means of production" as you call it,... (read more)

2Benquo
I agree these mechanisms can coexist. But to test and improve our models and ultimately make better decisions, we need specific hypotheses about how they interact. The OP was limited in scope because it's trying to explain why more detailed analyses like the ones I offer in The Debtors' Revolt or Calvinism as a Theory of Recovered High-Trust Agency are decision-relevant. Overall my impression is that while the situation is complex, it's frequently explicable as an interaction between a relatively small and enumerable number of "types of guy" (e.g. debtor vs creditor, depraved vs self-interested).

Another thought I just had was, could it be that ChatGPT, because it's trained to be such a people pleaser, is losing intentionally to make the user happy?

Have you tried telling it to actually try to win? Probably won't make a difference, but it seems like a really easy thing to rule out.

Also, quickly looking into how LLM token sampling works nowadays, you may also need to set the parameters top_p to 0, and top_k to 1 to get it to actually function like argmax. Looks like these can only be set through the API if you're using ChatGPT or similar proprietary LLMs. Maybe I'll try experimenting with this when I find the time, if nothing else to rule out the possibility of such a seemingly obvious thing being missed.
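For what it's worth, with open-weights models this is easy to check directly through Hugging Face transformers, where do_sample=False gives greedy (argmax) decoding. A minimal sketch, with the model name and prompt as placeholders:

```python
# Sketch: comparing greedy (argmax) decoding against sampled decoding.
# The model name is a placeholder; any causal LM on Hugging Face should work.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("The next move in this game should be", return_tensors="pt")

# Greedy decoding: always take the most likely next token (argmax).
greedy = model.generate(**inputs, do_sample=False, max_new_tokens=20)

# Sampled decoding: temperature, top_k, and top_p all matter here.
sampled = model.generate(
    **inputs, do_sample=True, temperature=0.7, top_k=50, top_p=0.95,
    max_new_tokens=20,
)

print("greedy: ", tokenizer.decode(greedy[0], skip_special_tokens=True))
print("sampled:", tokenizer.decode(sampled[0], skip_special_tokens=True))
```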

I've always wondered with these kinds of weird apparent trivial flaws in LLM behaviour if it doesn't have something to do with the way the next token is usually randomly sampled from the softmax multinomial distribution rather than taking the argmax (most likely) of the probabilities. Does anyone know if reducing the temperature parameter to zero so that it's effectively the argmax changes things like this at all?

1Darklight
Also, quickly looking into how LLM token sampling works nowadays, you may also need to set the parameters top_p to 0, and top_k to 1 to get it to actually function like argmax. Looks like these can only be set through the API if you're using ChatGPT or similar proprietary LLMs. Maybe I'll try experimenting with this when I find the time, if nothing else to rule out the possibility of such a seemingly obvious thing being missed.

p = (n^c * (c + 1)) / (2^c * n)

As far as I know, this is unpublished in the literature. It's a pretty obscure use case, so that's not surprising. I have doubts I'll ever get around to publishing the paper I wanted to write that uses this in an activation function to replace softmax in neural nets, so it probably doesn't matter much if I show it here.

So, my main idea is that the principle of maximum entropy, aka the principle of indifference, suggests a prior of 1/n, where n is the number of possibilities or classes. The p x 2 - 1 transform (equivalently, p = (c + 1) / 2) maps c = 0 to p = 0.5. What I want is for c = 0 to lead to p = 1/n rather than 0.5, so that it works in the multiclass cases where n is greater than 2.
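A quick numerical sanity check of the formula at the anchor points it is meant to preserve (c = -1 maps to 0, c = 0 maps to 1/n, c = 1 maps to 1):

```python
# Quick check of p(c) = (n^c * (c + 1)) / (2^c * n) at the anchor points.
def correlation_to_probability(c: float, n: int) -> float:
    return (n ** c * (c + 1)) / (2 ** c * n)

for n in (2, 3, 10):
    p_neg1 = correlation_to_probability(-1.0, n)  # should be 0
    p_zero = correlation_to_probability(0.0, n)   # should be 1/n
    p_pos1 = correlation_to_probability(1.0, n)   # should be 1
    print(f"n={n}: p(-1)={p_neg1:.3f}, p(0)={p_zero:.3f} (1/n={1/n:.3f}), p(1)={p_pos1:.3f}")
```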

3cubefox
What's the solution?

Correlation space is between -1 and 1, with 1 being the same (definitely true), -1 being the opposite (definitely false), and 0 being orthogonal (very uncertain). I had the idea that you could assume maximum uncertainty to be 0 in correlation space, and 1/n (the uniform distribution) in probability space.

3cubefox
Not sure what you mean here, but p×2−1 would linearly transform a probability p from [0..1] to [-1..1]. You could likewise transform a correlation coefficient ϕ to [0..1] with (ϕ(A,B)+1)/2. For P(A)=P(B)=1/2, this would correspond to the probability of A occurring if and only if B occurs. I.e. (ϕ(A,B)+1)/2 = P(A↔B) when P(A)=P(B)=0.5.
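For completeness, a quick derivation of that identity, assuming ϕ is the standard phi coefficient for binary events:

```latex
% With P(A) = P(B) = 1/2, the phi coefficient simplifies:
\[
\phi(A,B)
  = \frac{P(A \wedge B) - P(A)P(B)}{\sqrt{P(A)(1-P(A))\,P(B)(1-P(B))}}
  = \frac{P(A \wedge B) - \tfrac14}{\tfrac14}
  = 4\,P(A \wedge B) - 1.
\]
% Since P(\neg A \wedge \neg B) = 1 - P(A \vee B) = P(A \wedge B) in this case:
\[
P(A \leftrightarrow B)
  = P(A \wedge B) + P(\neg A \wedge \neg B)
  = 2\,P(A \wedge B)
  = \frac{\phi(A,B) + 1}{2}.
\]
```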
Darklight-10

I tried asking ChatGPT, Gemini, and Claude to come up with a formula that converts from correlation space to probability space while preserving the relationship 0 = 1/n. I came up with such a formula a while back, so I figure it shouldn't be hard. They all offered formulas, all of which were shown to be very much wrong when I actually graphed them to check.

5cubefox
What's correlation space, as opposed to probability space?

I was not aware of these. Thanks!

Thanks for the clarifications. My naive estimate is obviously just a simplistic ballpark figure using some rough approximations, so I appreciate adding some precision.

Darklight1-2

Also, even if we can train and run a model the size of the human brain, it would still be many orders of magnitude less energy efficient than an actual brain. Human brains use barely 20 watts. This hypothetical GPU brain would require entire data centres' worth of power, and each H100 GPU uses 700 watts alone.

Also, even if we can train and run a model the size of the human brain, it would still be many orders of magnitude less energy efficient than an actual brain. Human brains use barely 20 watts.

For inference on a GPT-4 level model, GPUs use much less than a human brain, about 1-2 watts (across all necessary GPUs), if we imagine slowing them down to human speed and splitting the power among the LLM instances that are being processed at the same time. Even for a 30 trillion parameter model, it might only get up to 30-60 watts in this sense.

each H100 GPU uses

... (read more)
Darklight5-1

I've been looking at the numbers with regards to how many GPUs it would take to train a model with as many parameters as the human brain has synapses. The human brain has 100 trillion synapses, and they are sparse and very efficiently connected. A regular AI model fully connects every neuron in a given layer to every neuron in the previous layer, so that would be less efficient.

The average H100 has 80 GB of VRAM, so assuming each parameter is 32 bits, you can fit about 20 billion parameters per GPU. So, you'd need 10,000 GPUs to fit a single instance of a huma... (read more)
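A rough back-of-the-envelope sketch of that memory estimate (weights only, at 32 bits per parameter; training would additionally need gradients and optimizer state, which pushes the count several times higher):

```python
# Back-of-the-envelope: how many 80 GB GPUs to hold 100 trillion 32-bit parameters?
params = 100e12          # ~synapse count of the human brain
bytes_per_param = 4      # 32-bit floats
vram_per_gpu = 80e9      # H100: 80 GB

weight_bytes = params * bytes_per_param          # 400 TB of weights alone
gpus_for_weights = weight_bytes / vram_per_gpu   # ~5,000 GPUs just for weights

print(f"Weights: {weight_bytes / 1e12:.0f} TB")
print(f"GPUs needed for weights alone: {gpus_for_weights:,.0f}")
# Training also needs gradients (and optimizer state), which at minimum doubles
# the memory, putting the count on the order of 10,000+ GPUs.
```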

5Vladimir_Nesov
What I can find is 20,000 A100s. With 10K A100s, which are 300e12 FLOP/s in BF16, you'd need 6 months (so this is still plausible) at 40% utilization to get the rumored 2e25 FLOPs. We know Llama-3-405B is 4e25 FLOPs and approximately as smart, and it's dense, so you can get away with fewer FLOPs in a MoE model to get similar capabilities, which supports the 2e25 FLOPs figure from the premise that original GPT-4 is MoE. H200s are 140 GB, and there are now MI300Xs with 192 GB. B200s will also have 192 GB. Training is typically in BF16, though you need enough space for gradients in addition to parameters (and with ZeRO, optimizer states). On the other hand, inference in 8 bit quantization is essentially indistinguishable from full precision. The word is, next year it's 500K B200s[1] for Microsoft. And something in the gigawatt range from Google as well. ---------------------------------------- 1. He says 500K GB200s, but also that it's 1 gigawatt all told, and that they are 2-3x faster than H100s, so I believe he means 500K B200s. In various places, "GB200" seems to ambiguously refer either to a 2-GPU board with a Grace CPU, or to one of the B200s on such a board. ↩︎
1Darklight
Also, even if we can train and run a model the size of the human brain, it would still be many orders of magnitude less energy efficient than an actual brain. Human brains use barely 20 watts. This hypothetical GPU brain would require entire data centres' worth of power, and each H100 GPU uses 700 watts alone.

I ran out of the usage limit for GPT-4o (seems to just be 10 prompts every 5 hours) and it switched to GPT-4o-mini. I tried asking it the Alpha Omega question and it made some math nonsense up, so it seems like the model matters for this for some reason.

Darklight160

So, a while back I came up with an obscure idea I called the Alpha Omega Theorem and posted it on the Less Wrong forums. Given how there's only one post about it, it shouldn't be something that LLMs would know about. So in the past, I'd ask them "What is the Alpha Omega Theorem?", and they'd always make up some nonsense about a mathematical theory that doesn't actually exist. More recently, Google Gemini and Microsoft Bing Chat would use search to find my post and use that as the basis for their explanation. However, I only have the free version of ChatGPT... (read more)

3Darklight
I ran out of the usage limit for GPT-4o (seems to just be 10 prompts every 5 hours) and it switched to GPT-4o-mini. I tried asking it the Alpha Omega question and it made some math nonsense up, so it seems like the model matters for this for some reason.

I'm wondering what people's opinions are on how urgent alignment work is. I'm a former ML scientist who previously worked at Maluuba and Huawei Canada, but switched industries into game development, at least in part to avoid contributing to AI capabilities research. I tried earlier to interview with FAR and Generally Intelligent, but didn't get in. I've also done some cursory independent AI safety research in interpretability and game-theoretic ideas in my spare time, though nothing interesting enough to publish yet.

My wife also recently had a baby, and carin... (read more)

Thanks for the reply!

So, the main issue I'm finding with putting them all into one proposal is that there's a 1000 character limit on the main summary section where you describe the project, and I cannot figure out how to cram multiple ideas into that 1000 characters without seriously compromising the quality of my explanations for each.

I'm not sure if exceeding that character limit will get my proposal thrown out without being looked at though, so I hesitate to try that. Any thoughts?

6habryka
Oh, hmm, I sure wasn't tracking a 1000 character limit. If you can submit it, I wouldn't be worried about it (and feel free to put that into your references section). I certainly have never paid attention to whether anyone stayed within the character limit.

I already tried discussing a very similar concept I call Superrational Signalling in this post. It got almost no attention, and I have doubts that Less Wrong is receptive to such ideas.

I also tried actually programming a Game Theoretic simulation to try to test the idea, which you can find here, along with code and explanation. Haven't gotten around to making a full post about it though (just a shortform).

1Ryo
Thank you for the references! I'm reading your writings; it's interesting. I posted the super-cooperation argument while expecting that LessWrong would likely not be receptive, but I'm not sure which community would engage with all this and find it pertinent at this stage. More concrete and empirical productions seem needed.

So, I have three very distinct ideas for projects that I'm thinking about applying to the Long Term Future Fund for. Does anyone happen to know if it's better to try to fit them all into one application, or split them into three separate applications?

3habryka
Three is a bit much. I am honestly not sure what's better. My guess is putting them all into one. (Context, I am one of the LTFF fund managers)

Recently I tried out an experiment using the code from the Geometry of Truth paper to try to see if using simple label words like "true" and "false" could substitute for the datasets used to create truth probes. I also tried out a truth probe algorithm based on classifying with the higher cosine similarity to the mean vectors.

Initial results seemed to suggest that the label word vectors were sorta acceptable, albeit not nearly as good (around 70% accurate rather than 95%+ like with the datasets). However, testing on harder test sets showed much worse accur... (read more)

Update: I made an interactive webpage where you can run the simulation and experiment with a different payoff matrix and changes to various other parameters.

So, I adjusted the aggressor system to work like alliances or defensive pacts instead of a universal memory tag. Basically, now players make allies when they both cooperate and aren't already enemies, and make enemies when defected against first, which sets all their allies to also consider the defector an enemy. This doesn't change the result much. The alliance of nice strategies still wins the vast majority of the time.

I also tried out false flag scenarios where 50% of the time the victim of a defect first against non-enemy will actually be mistaken for... (read more)

Admittedly this is a fairly simple set up without things like uncertainty and mistakes, so yes, it may not really apply to the real world. I just find it interesting that it implies that strong coordinated retribution can, at least in this toy set up, be useful for shaping the environment into one where cooperation thrives, even after accounting for power differentials and the ability to kill opponents outright, which otherwise change the game enough that straight Tit-For-Tat doesn't automatically dominate.

It's possible there are some situations where this... (read more)

Okay, so I decided to do an experiment in Python code where I modify the Iterated Prisoner's Dilemma to include Death, Asymmetric Power, and Aggressor Reputation, and run simulations to test how different strategies do. Basically, each player can now die if their points fall to zero or below, and the payoff matrix uses their points as a variable such that there is a power difference that affects what happens. Also, if a player defects first in any round of any match against a non-aggressor, they get the aggressor label, which matters for some strategies t... (read more)
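A heavily simplified sketch of that setup; the payoff values, starting points, and power-scaling rule below are illustrative placeholders rather than the actual parameters of the experiment:

```python
# Simplified sketch of the modified IPD described above: players have points
# (power), die at zero or below, and pick up an "aggressor" label when they are
# the first to defect against a non-aggressor. All numbers are placeholders.
BASE_PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (-2, 5),
                ("D", "C"): (5, -2), ("D", "D"): (-1, -1)}

class Player:
    def __init__(self, name, strategy):
        self.name, self.strategy = name, strategy
        self.points = 10.0        # dies if this falls to zero or below
        self.aggressor = False

    def alive(self):
        return self.points > 0

def tit_for_tat(me, opponent, history):
    return "C" if not history else history[-1][1]   # copy opponent's last move

def retaliator(me, opponent, history):
    # Tit-for-tat, except it always defects against known aggressors.
    return "D" if opponent.aggressor else tit_for_tat(me, opponent, history)

def play_match(a, b, rounds=10):
    history = []                                     # (a_move, b_move) pairs
    for _ in range(rounds):
        if not (a.alive() and b.alive()):
            break
        ma = a.strategy(a, b, history)
        mb = b.strategy(b, a, [(y, x) for x, y in history])
        # First defection of the match against a non-aggressor earns the label.
        no_prior_defection = all("D" not in pair for pair in history)
        a_was, b_was = a.aggressor, b.aggressor
        if no_prior_defection:
            if ma == "D" and not b_was:
                a.aggressor = True
            if mb == "D" and not a_was:
                b.aggressor = True
        pa, pb = BASE_PAYOFFS[(ma, mb)]
        power = a.points / b.points                  # asymmetric power ratio
        a.points += pa * power                       # placeholder scaling rule
        b.points += pb / power
        history.append((ma, mb))
    return history

a, b = Player("TFT", tit_for_tat), Player("Retaliator", retaliator)
play_match(a, b)
print(a.name, round(a.points, 2), "aggressor:", a.aggressor, "alive:", a.alive())
print(b.name, round(b.points, 2), "aggressor:", b.aggressor, "alive:", b.alive())
```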

1Darklight
Update: I made an interactive webpage where you can run the simulation and experiment with a different payoff matrix and changes to various other parameters.
2Dagon
So, the aggressor tag is a way to keep memory across games, so they're not independent.  I wonder what happens when you start allowing more complicated reputation (including false accusations of aggression). I feel like any interesting real-world implications are probably fairly tenuous.  I'd love to hear some and learn that I'm wrong.

I was recently trying to figure out a way to calculate my P(Doom) using math. I initially tried just making a back of the envelope calculation by making a list of For and Against arguments and then dividing the number of For arguments by the total number of arguments. This led to a P(Doom) of 55%, which later got revised to 40% when I added more Against arguments. I also looked into using Bayes Theorem and actual probability calculations, but determining P(E | H) and P(E) to input into P(H | E) = P(E | H) * P(H) / P(E) is surprisingly hard and confusing.
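As a toy illustration of the Bayes route, with completely made-up numbers just to show the mechanics of P(H | E) = P(E | H) * P(H) / P(E):

```python
# Toy Bayes update with made-up numbers, only to illustrate the mechanics.
# H = "doom", E = some piece of evidence (e.g. an observed capability jump).
p_h = 0.40              # prior P(H), e.g. the 40% back-of-the-envelope figure
p_e_given_h = 0.70      # assumed P(E | H): how expected E is if H is true
p_e_given_not_h = 0.40  # assumed P(E | not H)

# P(E) via the law of total probability.
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)
p_h_given_e = p_e_given_h * p_h / p_e

print(f"P(E) = {p_e:.3f}")
print(f"P(H | E) = {p_h_given_e:.3f}")  # ~0.538 with these made-up inputs
```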

Minor point, but the apology needs to sound sincere and credible, usually by being specific about the mistakes and concise and to the point and not like, say, Bostrom's defensive apology about the racist email a while back. Otherwise you can instead signal that you are trying to invoke the social API call in a disingenuous way, which can clearly backfire.

Things like "sorry you feel offended" also tend to sound like you're not actually remorseful for your actions and are just trying to elicit the benefits of an apology. None of the apologies you described sound anything like that, but it's a common failure state among the less emotionally mature and the sycophantic.

Expanding on this...

The "standard format" for calling the apology API has three pieces:

  • "I'm sorry"/"I apologize"
  • Explicitly state the mistake/misdeed
  • Explicitly state either what you should have done instead, or will do differently next time

Notably, the second and third bullet points are both costly signals: it's easier for someone to state the mistake/misdeed, and what they would/will do differently, if they have actually updated. Thus, those two parts contribute heavily to the apology sounding sincere.

I have some ideas and drafts for posts that I've been sitting on because I feel somewhat intimidated by the level of intellectual rigor I would need to put into the final drafts to ensure I'm not downvoted into oblivion (something a younger me experienced in the early days of Less Wrong).

Should I try to overcome this fear, or is it justified?

For instance, I have a draft of a response to Eliezer's List of Lethalities post that I've been sitting on since 2022/04/11 because I doubted it would be well received given that it tries to be hopeful and, as a former... (read more)

4mike_hawke
Personally, I find shortform to be an invaluable playground for ideas. When I get downvoted, it feels lower stakes. It's easier to ignore aloof and smugnorant comments, and easier to update on serious/helpful comments. And depending on how it goes, I sometimes just turn it into a regular post later, with a note at the top saying that it was adapted from a shortform. If you really want to avoid smackdowns, you could also just privately share your drafts with friends first and ask for respectful corrections. Spitballing other ideas, I guess you could phrase your claims as questions, like "have objections X, Y, or Z been discussed somewhere already? If so, can anyone link me to those discussions?" Seems like that could fail silently though, if an over-eager commenter gives you a link to low-quality discussion. But there are pros and cons for every course of action/inaction.
Answer by Darklight62

I would be exceedingly cautious about this line of reasoning. Hypomania tends to not be sustainable, with a tendency to either spiral into a full blown manic episode, or to exhaust itself out and lead to an eventual depressive episode. This seems to have something to do with the characteristics of the thoughts/feelings/beliefs that develop while hypomanic, the cognitive dynamics if you will. You'll tend to become increasingly overconfident and positive to the point that you will either start to lose contact with reality by ignoring evidence to the contrary... (read more)

I still remember when I was a masters student presenting a paper at the Canadian Conference on AI 2014 in Montreal and Bengio was also at the conference presenting a tutorial, and during the Q&A afterwards, I asked him a question about AI existential risk. I think I worded it back then as concerned about the possibility of Unfriendly AI or a dangerous optimization algorithm or something like that, as it was after I'd read the sequences but before "existential risk" was popularized as a term. Anyway, he responded by asking jokingly if I was a journalist... (read more)

Answer by Darklight91

The average human lifespan is about 70 years or approximately 2.2 billion seconds. The average human brain contains about 86 billion neurons or roughly 100 trillion synaptic connections. In comparison, something like GPT-3 has 175 billion parameters and 500 billion tokens of data. Assuming very crudely weight/synapse and token/second of experience equivalence, we can see that the human model's ratio of parameters to data is much greater than GPT-3, to the point that humans have significantly more parameters than timesteps (100 trillion to 2.2 billion), whi... (read more)
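The crude arithmetic behind that comparison, under the stated weight/synapse and token/second-of-experience equivalences:

```python
# Crude parameters-to-data comparison under the assumptions stated above.
human_params = 100e12                 # ~synapses
human_data = 70 * 365.25 * 24 * 3600  # ~seconds in a 70-year lifespan (~2.2e9)

gpt3_params = 175e9
gpt3_data = 500e9                     # tokens

print(f"Human seconds of experience: {human_data:.2e}")
print(f"Human params per 'timestep': {human_params / human_data:,.0f}")  # ~45,000
print(f"GPT-3 params per token:      {gpt3_params / gpt3_data:.2f}")     # ~0.35
```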

I recently interviewed with Epoch, and as part of a paid work trial they wanted me to write up a blog post about something interesting related to machine learning trends. This is what I came up with:

http://www.josephius.com/2022/09/05/energy-efficiency-trends-in-computation-and-long-term-implications/

I should point out that the logic of the degrowth movement follows from a relatively straightforward analysis of available resources vs. first world consumption levels.  Our world can only sustain 7 billion human beings because the vast majority of them live not at first world levels of consumption, but third world levels, which many would argue to be unfair and an unsustainable pyramid scheme.  If you work out the numbers, if everyone had the quality of life of a typical American citizen, taking into account things like meat consumption to arabl... (read more)

I'm using the number calculated by Ray Kurzweil for his book, the Age of Spiritual Machines from 1999.  To get that figure, you need 100 billion neurons firing every 5 ms, or 200 Hz.  That is based on the maximum firing rate given refractory periods.  In actuality, average firing rates are usually lower than that, so in all likelihood the difference isn't actually six orders of magnitude.  In particular, I should point out that six orders of magnitude is referring to the difference between this hypothetical maximum firing brain and the ... (read more)
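A rough reconstruction of how that figure comes together; note that the ~1,000 connections per neuron is Kurzweil's assumption, not something stated above:

```python
# Rough reconstruction of the Kurzweil-style estimate referenced above.
neurons = 100e9          # 100 billion neurons
max_rate_hz = 1 / 5e-3   # firing every 5 ms -> 200 Hz ceiling (refractory limit)
connections = 1_000      # assumed connections per neuron (Kurzweil's figure)

ops_per_second = neurons * max_rate_hz * connections
print(f"Max firing rate: {max_rate_hz:.0f} Hz")
print(f"Estimated ops/sec at the ceiling: {ops_per_second:.1e}")  # ~2e16

# With average firing rates well below the 200 Hz ceiling, the real number
# would be correspondingly lower, as noted above.
```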

Okay, so I contacted 80,000 hours, as well as some EA friends for advice.  Still waiting for their replies.

I did hear from an EA who suggested that if I don't work on it, someone else who is less EA-aligned will take the position instead, so in fact, it's slightly net positive for myself to be in the industry, although I'm uncertain whether or not AI capability is actually funding constrained rather than personnel constrained.

Also, would it be possible to mitigate the net negative by choosing to deliberately avoid capability research and just take an ML engineering job at a lower tier company that is unlikely to develop AGI before others and just work on applying existing ML tech to solving practical problems?

I previously worked as a machine learning scientist but left the industry a couple of years ago to explore other career opportunities.  I'm wondering at this point whether or not to consider switching back into the field.  In particular, in case I cannot find work related to AI safety, would working on something related to AI capability be a net positive or net negative impact overall?

1[comment deleted]
1Yonatan Cale
Working on AI Capabilities: I think this is net negative, and I'm commenting here so people can [V] if they agree or [X] if they disagree. Seems like habryka agrees?  Seems like Kaj disagrees? I think it wouldn't be controversial to advise you to at least talk to 80,000 hours about this before you do it, as some safety net to not do something you don't mean to by mistake? Assuming you trust them. Or perhaps ask someone you trust. Or make your own gears-level model. Anyway, seems like an important decision to me

Even further research shows the most recent Nvidia RTX 3090 is actually slightly more efficient than the 1660 Ti, at 36 TeraFlops, 350 watts, and 2.2 kg, which works out to 0.0001 PetaFlops/Watt and 0.016 PetaFlops/kg.  Once again, they're within an order of magnitude of the supercomputers.

So, I did some more research, and the general view is that GPUs are more power efficient in terms of Flops/watt than CPUs, and the most power efficient of those right now is the Nvidia 1660 Ti, which comes to 11 TeraFlops at 120 watts, so 0.000092 PetaFlops/Watt, which is about 6x more efficient than Fugaku.  It also weighs about 0.87 kg, which works out to 0.0126 PetaFlops/kg, which is about 7x more efficient than Fugaku.  These numbers are still within an order of magnitude, and also don't take into account the overhead costs of things like coo... (read more)
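Recomputing those efficiency figures from the quoted specs:

```python
# Recomputing the efficiency figures quoted above (specs as given in the comments).
cards = {
    "GTX 1660 Ti": {"tflops": 11.0, "watts": 120.0, "kg": 0.87},
    "RTX 3090":    {"tflops": 36.0, "watts": 350.0, "kg": 2.2},
}

for name, c in cards.items():
    pflops = c["tflops"] / 1000.0
    print(f"{name}: {pflops / c['watts']:.6f} PFLOPS/W, "
          f"{pflops / c['kg']:.4f} PFLOPS/kg")
# GTX 1660 Ti -> ~0.000092 PFLOPS/W, ~0.0126 PFLOPS/kg
# RTX 3090    -> ~0.000103 PFLOPS/W, ~0.0164 PFLOPS/kg
```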

3Darklight
Even further research shows the most recent Nvidia RTX 3090 is actually slightly more efficient than the 1660 Ti, at 36 TeraFlops, 350 watts, and 2.2 kg, which works out to 0.0001 PetaFlops/Watt and 0.016 PetaFlops/kg.  Once again, they're within an order of magnitude of the supercomputers.

Another thought is that maybe Less Wrong itself, if it were to expand in size and become large enough to roughly represent humanity, could be used as such a dataset.

So, I had a thought.  The glory system idea that I posted about earlier, if it leads to a successful, vibrant democratic community forum, could actually serve as a kind of dataset for value learning.  If each post has a number attached to it that indicates the aggregated approval of human beings, this can serve as a rough proxy for a kind of utility or Coherent Aggregated Volition.

Given that individual examples will probably be quite noisy, but averaged across a large amount of posts, it could function as a real world dataset, with the post conte... (read more)
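A minimal sketch of what such a dataset might look like as a supervised learning target; the posts and scores below are made-up placeholders:

```python
# Sketch: treating (post text, aggregated approval score) pairs as a noisy
# value-learning dataset. All entries here are made-up placeholders.
raw_posts = [
    {"text": "Helping others effectively is good.",        "score": 120},
    {"text": "Deceiving people for personal gain is fine.", "score": -45},
    {"text": "We should be honest even when it is costly.", "score": 80},
]

# Normalize the noisy scores to [0, 1] so they can serve as approval targets
# (a crude proxy for aggregated human approval / utility).
lo = min(p["score"] for p in raw_posts)
hi = max(p["score"] for p in raw_posts)
dataset = [(p["text"], (p["score"] - lo) / (hi - lo)) for p in raw_posts]

for text, target in dataset:
    print(f"{target:.2f}  {text}")
# Averaged over a large number of posts, a model trained to predict these
# targets would be approximating something like aggregated approval.
```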

1Darklight
Another thought is that maybe Less Wrong itself, if it were to expand in size and become large enough to roughly represent humanity, could be used as such a dataset.

A further thought is that those with more glory can be seen almost as elected experts.  Their glory is assigned to them by votes after all.  This is an important distinction from an oligarchy.  I would actually be inclined to see the glory system as located on a continuum between direct democracy and representative democracy.

So, keep in mind that having the first vote free and worth double the paid votes does tilt things more towards democracy.  That being said, I am inclined to see glory as a kind of proxy for past agreement and merit, and a rough way to approximate liquid democracy where you can proxy your vote to others or vote yourself.

In this alternative "market of ideas" the ideas win out because people who others trust to have good opinions are able to leverage that trust.  Decisions over the merit of the given arguments are aggregated by vote.  As lon... (read more)

Perhaps a nitpick detail, but having someone rob them would not be equivalent, because the cost of the action is offset by the ill-gotten gains.  The proposed currency is more directly equivalent to paying someone to break into the target's bank account and destroying their assets by a proportional amount so that no one can use them anymore.

As for the more general concerns:

Standardized laws and rules tend in practice to disproportionately benefit those with the resources to bend and manipulate those rules with lawyers.  Furthermore, this proposal... (read more)

As for the cheaply punishing prolific posters problem, I don't know a good solution that doesn't lead to other problems, as forcing all downvotes to cost glory makes it much harder to deal with spammers who somehow get through the application process filter.  I had considered an alternative system in which all votes cost glory, but then there's no way to generate glory except perhaps by having admins and mods gift them, which could work, but runs counter to the direct democracy ideal that I was sorta going for.
