I’m confused as to how this fits in with UK politics. I don’t think the minority party has any kind of veto?
I guess we have the House of Lords, but this doesn’t really have a veto (at least not long term), and the House of Commons and House of Lords aren’t always, or even usually, controlled by different factions.
One extra thing to consider financially: if you have a smart meter you can get all of your hot water and a chunk of your heating done at off-peak rates. Our off-peak electricity rates are about equal per kWh to gas rates.
Without this I think our system would be roughly the same cost per year as gas or slightly more, with it we save £200 per year or so I think. (This would be a very long payback time but there was a fully funded scheme we used).
If it helps anyone, we are in Scotland and get an average COP of 2.9.
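As a rough sketch of the arithmetic, here is how the comparison works out under some illustrative numbers (the prices, heat demand and boiler efficiency below are my assumptions, not our actual tariff; the COP of 2.9 is from above):

```python
# Rough annual heating cost comparison: gas boiler vs heat pump.
# All prices and the heat demand figure are illustrative assumptions.
HEAT_DEMAND_KWH = 10_000    # annual heat delivered to the house
GAS_PRICE = 0.07            # £/kWh (assumed)
OFFPEAK_ELEC_PRICE = 0.07   # £/kWh, about equal to gas per kWh
PEAK_ELEC_PRICE = 0.28      # £/kWh (assumed)
BOILER_EFFICIENCY = 0.90    # assumed gas boiler efficiency
COP = 2.9                   # heat delivered per kWh of electricity

gas_cost = HEAT_DEMAND_KWH / BOILER_EFFICIENCY * GAS_PRICE
hp_offpeak_cost = HEAT_DEMAND_KWH / COP * OFFPEAK_ELEC_PRICE
hp_peak_cost = HEAT_DEMAND_KWH / COP * PEAK_ELEC_PRICE

print(f"gas: £{gas_cost:.0f}, heat pump off-peak: £{hp_offpeak_cost:.0f}, "
      f"heat pump peak: £{hp_peak_cost:.0f}")
```

With these assumed numbers the heat pump on off-peak rates comes out well ahead of gas, while on peak rates it is somewhat more expensive, which matches the pattern described above.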
In the UK there is a non-binding but generally observed rule that speed cameras allow you to drive 10% + 2 mph above the speed limit (e.g. 35 mph in a 30 mph zone) before they activate.
This is a bit more of a fudge but better than nothing.
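The rule of thumb is simple enough to write down (my own sketch; actual enforcement thresholds vary by police force):

```python
def camera_threshold(limit_mph: float) -> float:
    """Speed above which a camera is commonly said to activate: 10% + 2 mph."""
    return round(limit_mph * 1.10 + 2, 1)

print(camera_threshold(30))  # 35.0
print(camera_threshold(70))  # 79.0
```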
These 3 items seem like they would be sufficient to cause something like the Open Letter to happen.
In most cases number 3 is not present which I think is why we don't see things like this happen more often in more organisations.
None of this requires Sam to be hugely likeable or a particularly savvy political operator, just that people gener...
I work in equipment manufacturing for construction so can comment on excavators. Other construction equipment (loaders, dumpers) has a similar story, although excavators have gentler duty cycles and require smaller batteries, so they make sense to electrify first. Diesel-hydraulic excavators are also less efficient, giving more potential advantage to electric equipment.
Purchasers of new mac...
Something similar not involving AIs is where chess grandmasters do rating climbs with handicaps. One I know of was Aman Hambleton managing to reach 2100 Elo on chess.com when he deliberately sacrificed his Queen for a pawn on the third/fourth move of every game.
https://youtube.com/playlist?list=PLUjxDD7HNNTj4NpheA5hLAQLvEZYTkuz5
He had to complicate positions, defend strongly, refuse to trade and rely on time pressure to win.
The games weren’t quite the same as Queen odds as he got a pawn for the Queen and usually displaced the opponent’s king to f3/f6 and p...
Think you need to update this line too?
This is a bit less than half the rate for the CTA.
Is there a default direction to twist for the butt bump? The pictures all show the greeters facing in the same direction so one must have turned left and the other right! How do I know which way I should twist?
I cannot sign the assurance contract until I understand this fundamental question
Agreed, intended to distinguish between the weak claim “you should stop pushing the bus” and the stronger “there’s no game theoretic angle which encourages you to keep pushing”.
So there's no game theoretic angle, you can just make the decision alone, to stop pushing the frigging bus.
I don’t think this holds if you allow for p(doom) < 1. For a typical AI researcher with p(doom) ~ 0.1 and easy replacement, striking is plausibly an altruistic act and should be applauded as such.
I haven’t tested extensively but first impression is that this is indeed the case. Would be interesting to see if Sydney is similar but I think there’s a limit on number of messages per conversation or something?
When you did this do you let ChatGPT play both sides or were you playing one side? I think it is much better if it gets to play both sides.
I tried this with ChatGPT to see just how big the difference was.
ChatGPT is pretty terrible at FEN in both games (Zack and Erik). In Erik’s game it insisted on giving me a position after 13 moves even though 25 moves had happened. When I pointed this out it told me that because there were no captures or pawn moves between moves 13 and 25 the FEN stayed the same…
However it is able to give sensible continuations of >20 ply to checkmate for both positions provided you instruct it not to give commentary and to only provide the moves. The second you allow it...
Despite being a GPT-3 instance, DALL-E appears to be able to draw an adequate " SolidGoldMagikarp" (if you allow for its usual lack of ability to spell). I tried a couple of alternative prompts without any anomalous results.
FWIW this matches my own experience with one as a company car pretty exactly.
(On mine (UK, right hand drive) the wipers can be activated manually by pressing the button on the end of the left stalk. This also brings up an on-screen menu for selecting constant wiping)
Get out of our houses before we are driven to expend effort killing them, and similarly for all the other places ants conflict with humans (stinging, eating crops, ..)
Ant mafia: "Lovely house you've got there, wouldn't it be a shame if it got all filled up with ants?"
I can't tell you whether this is right or not but this is what ChatGPT thinks:
"Is it possible to put a latex table in a footnote?"
"Yes, it is possible to include a LaTeX table in a footnote. You can do this by using the footnote or footnotetext commands provided by the footmisc package.
Here is an example of how to use these commands to include a table in a footnote:
\usepackage{footmisc}
...
\begin{table}[h]
\centering
\begin{tabular}{c c c}
A & B & C \\
1 & 2 & 3 \\
4 & 5 & 6
\end{tabular}
\caption{Table caption}
\end{table}
...
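For what it's worth, my understanding is that a floating table environment can't actually go inside a footnote, but a plain (non-floating) tabular can — a minimal sketch that I'd expect to compile with vanilla LaTeX, no extra packages:

```latex
\documentclass{article}
\begin{document}
Some text with a footnote.\footnote{%
  A small table inside the footnote:\par
  \begin{tabular}{c c c}
    A & B & C \\
    1 & 2 & 3 \\
    4 & 5 & 6
  \end{tabular}}
\end{document}
```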
I think the article undersells the problems of ChatGPT's hallucinations. One example from the article where ChatGPT is said to win is a recipe for risotto. However, I wouldn't follow a risotto recipe from ChatGPT because I can't be confident it hasn't hallucinated some portion of the recipe, but would happily follow one from Google, even if the format is a bit more annoying. Same issue with calculating load bearing capacity for a beam, only more serious!
Having said that, it does seem like there are definitely specific areas where ChatGPT will be more use...
One thing I've found useful is to make sure I identify to the supplier what specifically I need about the product I'm ordering - sometimes they have something similar in stock which meets my requirements.
One thing I think makes a big difference to me is whether I feel like the provider is taking a collaborative or adversarial stance.
For the six/man thing my first association was six pack. Obviously the prototypical image would be topless but my guess is topless images aren’t in the training set (or Dall-E is otherwise prevented from producing them)
I realised something a few weeks back which I feel like I should have realised a long time ago.
The size of the human brain isn’t the thing which makes us smart, rather it is an indicator that we are smart.
A trebling of brain size vs a chimp is impressive but trebling a neural network’s size doesn’t give that much of an improvement in performance.
A more sensible story is that humans started using their brains more usefully (evolutionarily speaking) so it made sense for us to devote more of our resources to bigger brains for the marginal gains that would giv...
Thanks for publishing this. I’ve been around the rationality community for a few years and heard TAPs mentioned positively a lot without knowing much about them. This roughly matches my best guess as to what they were but the extra detail is super useful, especially in the implementation.
This suggests a different question. For non-participants who are given the program which creates the data, what probability/timeframe should they assign to success?
On this one I think I would have put a high probability on it being solved, but would have anticipated a longer timeframe.
I think the resulting program has lower length (so whatever string it generates has lower KC)
I don’t think this follows - your code is shorter in Python but it includes 3 new built-in functions, which is hidden complexity.
I do agree with the general point that KC isn’t a great measure of difficulty for humans - we are not exactly arbitrary encoders.
What were the noise levels on the Corsi-Rosenthal?
Humans are very reliable agents for tasks which humans are very reliable for.
For most of these examples (arguably all of them) if humans were not reliable at them then the tasks would not exist or would exist in a less stringent form.
Curious as to what the get under the desks alarm was supposed to help with and how long ago this was? I’m having trouble fitting it into my world model.
I see that the standard Playground Q&A prompt on OpenAI uses a similar technique (although boringly uses "Unknown" instead of "Yo be real").
I think the thing which throws people off is that when GPT-3 goes wrong it goes wrong in ways that are weird to humans.
I wondered if humans sometimes fail at riddles that GPT-3 would think of as weird. I tried a few that I thought would be promising candidates (no prompt other than the question itself)
Q: If a red house is made with red bricks, a blue house is made with blue bricks, a pink house is made with ...
I think the natural/manmade comparison between COVID and Three Mile Island has a lot of merit, but there are other differences which might explain the difference in response. Some of them would imply that there would be a strong response to an AI accident, others less so.
Local vs global
To prevent nuclear meltdowns you only need to ban them in the US - it doesn't matter what you do elsewhere. This is more complicated for pandemic preparedness.
Active spending vs loss of growth
It's easier to pass a law putting in nuclear regulations which limit growth, as this isn't as obvious a loss...
Assuming this is the best an AGI can do, I find this a lot less comforting than you appear to. I assume "a very moderate chance" means something like 5-10%?
Having a 5% chance of such a plan working out is insufficient to prevent an AGI from attempting it if the potential reward is large enough and/or they expect they might get turned off anyway.
Given sufficient number of AGIs (something we presumably will have in the world that none have taken over) I would expect multiple attempts so the chance of one of them working becomes high.
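The arithmetic here is stark: with independent attempts, the chance that at least one succeeds climbs quickly (a sketch; the 5% per-attempt figure is the assumption from above, and real attempts wouldn't be fully independent):

```python
# P(at least one success) = 1 - (1 - p)^n for n independent attempts
p = 0.05  # assumed per-attempt success chance

for n in (1, 10, 20, 50):
    p_any = 1 - (1 - p) ** n
    print(f"{n:3d} attempts: {p_any:.0%}")
```

Even at 5% per attempt, a few dozen independent attempts already make overall success more likely than not.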
There's a theory of humor called benign violation theory.
The BVT claims that humor occurs when three conditions are satisfied: 1) something threatens one's sense of how the world "ought to be", 2) the threatening situation seems benign, and 3) a person sees both interpretations at the same time.
I think your description of pranks etc. fits in nicely with this - you even chose the same words to describe it so maybe you're already aware?
It's worth noting that while the number of courses at Berkeley almost doubled in the period shown, the number of courses per student has increased at a lower rate due to an increase in students.
Eyeballing the graph and looking at Berkeley's enrollment numbers I think the number of courses per student has increased by around 50%. Smaller but still a big effect.
Example:
I have a couple of positions I need to fill at my work. I’ve been off on holiday this week and it occurred to me that I should change one of the roles quite a lot and redistribute work.
I’ve had this issue for a few months and while in work I’ve been a bit overworked to actually take a step back and see this opportunity.
That makes me feel less bad for doing the same...
To a first order approximation I think of bureaucracies as status maximisers. I'll admit that status can be a bit nebulous and could be used to explain almost anything but I think a common sense understanding of status gives a good prediction in most cases.
For second ...
From a practical point of view I would expect the pull fan to better ventilate the corners of the room. On the push side the flow is more directional and I think with a push fan you're more likely to end up with turbulent flow in the corners which would noticeably slow air transfer from these regions. From this point of view it's possible that the 2 x pull configuration may actually be better than 2 x push + 2 x pull but I'm no expert.
Of course if the air speed is low then the difference will be minimal.
One rich dude had a whole island and set it up to have lenses on lots of parts of it, and for like a year he’d go around each day and note down the positions of the stars
You can’t just say that without a name or reference! Not that I don’t believe you - I just want to know more!
They're a bit tricky to get the hang of and are petrifying on steep slopes, but I highly recommend them. They also make getting to the hill more fun.
Something like this is sometimes recommended in marriage courses for dealing with disagreements. The idea is to keep emotions cool and ensure each person understands what the other is saying.
So there's a technical definition of edge which is your expected gain for every unit that you bet, given your own probability and the bet odds.
I agree that not clumping up the post is probably best but to make the post correct I suggest adding the underlined text into the definition in case people don't click the link.
bet such that you are trying to win a percentage of your bankroll equal to your percent edge.
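Concretely, for a bet at net odds b with your own win probability p, that definition of edge works out as follows (my own sketch, not from the post):

```python
def edge(p: float, b: float) -> float:
    """Expected gain per unit staked: win b units with probability p,
    lose the 1-unit stake otherwise."""
    return p * b - (1 - p)

# A coin you believe is 60% to land heads, offered at even money (b = 1):
print(edge(0.6, 1))  # ~0.2 units of expected gain per unit bet
```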
A short note to start the review: the author isn’t happy with how the post is communicated. I agree it could be clearer, and this is the reason I’m scoring this 4 instead of 9. The actual content seems very useful to me.
AllAmericanBreakfast has already reviewed this from a theoretical point of view but I wanted to look at it from a practical standpoint.
***
To test whether the conclusions of this post were true in practice I decided to take 5 examples from the Wikipedia page on the Prisoner’s dilemma and see if they were better modeled by Stag Hunt or Schelling...
Yes, I agree that some symptoms are likely highly correlated. I didn't intend to rule out that possibility with that sentence - I was just trying to say how I did my math (although I'm not sure how clear I was!). The correct conclusion is in the following sentence:
So having COVID on average gives you ~0.2 persistent symptoms vs not having COVID, with presumably some people having more than one symptom.
Possibly it would be better to add the caveat "0.2 persistent symptoms of those symptoms investigated".
On the whole I agree with Raemon’s review, particularly the first paragraph.
A further thing I would want to add (which would be relatively easy to fix) is that the description and math of the Kelly criterion is misleading / wrong.
The post states that you should:
bet a percentage of your bankroll equivalent to your expected edge
However the correct rule is:
bet such that you are trying to win a percentage of your bankroll equal to your percent edge.
(emphasis added)
The 2 definitions give the same results for 1:1 bets but will give strongly diverging r...
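A numeric sketch of the divergence (my own example, not from the post): take a 60% shot offered at 2:1.

```python
def edge(p: float, b: float) -> float:
    """Expected gain per unit staked at net odds b with win probability p."""
    return p * b - (1 - p)

def kelly_fraction(p: float, b: float) -> float:
    """Correct Kelly stake as a fraction of bankroll.
    Staking edge/b means the amount you stand to *win* is edge * bankroll,
    which is exactly the corrected rule quoted above."""
    return edge(p, b) / b

p, b = 0.6, 2.0
print(edge(p, b))            # ~0.8 -> the misleading rule says stake 80%
print(kelly_fraction(p, b))  # ~0.4 -> Kelly says stake 40%, aiming to win 80%

# At even money (b = 1) the two rules happen to coincide, which hides the error:
print(edge(0.6, 1.0), kelly_fraction(0.6, 1.0))
```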
The post claims:
I have investigated this issue in depth and concluded that even a full scale nuclear exchange is unlikely (<1%) to cause human extinction.
This review aims to assess whether having read the post I can conclude the same.
The review is split into 3 parts:
Claim: There are 14,000 nuclear warheads in the world.
Assessment: True
Claim: Average warhead yield <1 Mt, probably closer to 100kt
Assessment: Probably true, possibly misleading. Values I found were:
I suppose it depends how general one is aiming to be. If by general intelligence we mean "able to do what a human can do" then no, at this point the method isn't up to that standard.
If instead we mean "able to achieve SOTA on a difficult problem which it wasn't specifically designed to deal with" then PI-MNIST seems like a reasonable starting point.
Also, from a practical standpoint PI-MNIST seems reasonable for a personal research project.
I do think D𝜋's original post felt like it was overstating its case. From a later comment it seems like they more see...
I think there's a mistake which is being repeated in a few comments both here and on D𝜋's post which needs emphasizing. Below is my understanding:
D𝜋 is attempting to create a general intelligence architecture. He is using image classification as a test for this general intelligence but his architecture is not optimized specifically for image identification.
Most attempts on MNIST use what we know about images (especially the importance of location of pixels) and design an architecture based on those facts. Convolutions are an especially obvious example of...
More “for Covid” vs “with Covid” from England:
https://www.bbc.co.uk/news/health-59862568
Ratio in October was 3:1 (for:with) but this has gone down to 2:1. “For” cases are rising but at a lower fractional rate than “with” cases.
We don’t know which patients are in the hospital because of Covid
BBC reports today (i.e. after post was published) that 3 in 10 people who are in hospital with COVID in England were admitted for something else.
Do dragon unbelievers accept this stance? My impression is that dragon agnosticism would often be considered almost as bad as dragon belief.