streawkceur — LessWrong

LESSWRONG
LW

Replying toGemini 3 is Evaluation-Paranoid and Contaminated

Gemini 3 is Evaluation-Paranoid and Contaminated

As mentioned in my other comment, the reason an LLM would behave like that is because during the time all its training data was written, end-2025 was a future date. So this is apparently something that needs to be trained out, which was not done in the case of Gemini. (when using AI studio). One way to reduce the behavior is to put "today is <date>" into the system prompt, but even then, it apparently spends an inordinate amount of tokens validating and pondering that date.

Replying toGemini 3 is Evaluation-Paranoid and Contaminated

streawkceur2mo

Gemini 3 is Evaluation-Paranoid and Contaminated

TL;DR: Gemini 3 frequently thinks it is in an evaluation when it is not, assuming that all of its reality is fabricated. It can also reliably output the BIG-bench canary string, indicating that Google likely trained on a broad set of benchmark data.

To my understanding, you only observe this effect for prompts that indicate or imply the current late-2025 time/2025 year. Gemini completes such prompts with "that must be hypothetical writing", because in the vast majority of its training data, 2025 was in the future (and end-2025 was always hypothetical). I think it is more accurate to phrase this as "Gemini 3 goes off the rails when it sees a prompt that indicates it was written in 2025, because in its training data, everything that implied a 2025 time was a fictional scenario" (that's also true for 2.5). Or did you manage to elicit such such an effect with a prompt from which the current after-training-data-cutoff date can't be inferred?

streawkceur's Shortform

streawkceur

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

Replying toHow uniform is the neocortex?

streawkceur6y

How uniform is the neocortex?

Deep learning is a general method in the sense that most tasks are solved by utilizing a handful of basic tools from a standard toolkit, adapted for the specific task at hand. Once you’ve selected the basic tools, all that’s left is figuring out how to supply the training data, specifying the objective that lets the AI know how well it’s doing, throwing a lot of computation at the problem, and fiddling with details. My understanding is that there typically isn’t much conceptual ingenuity involved in solving the problems, that most of the work goes into fiddling with details, and that trying to be clever

streawkceur6y

Coronavirus: Justified Key Insights Thread

Let's assume that half of the deaths of currently infected people have happened, due to the lockdown extending the doubling time from three days to more than a week.

How do you draw that conclusion?

Replying toCoronavirus Justified Practical Advice Summary

streawkceur6y

Coronavirus Justified Practical Advice Summary

According to this article, it seems clear by now that low oxygen is in fact dangerous even when you feel fine, so buying a pulse oximeter is useful.

https://www.nytimes.com/2020/04/20/opinion/coronavirus-testing-pneumonia.html

Replying toHow to evaluate (50%) predictions

streawkceur6y

How to evaluate (50%) predictions

1. I think the "calibration curves" one sees e.g. in https://slatestarcodex.com/2020/04/08/2019-predictions-calibration-results/ are helpful/designed to evaluate/improve a strict subset of prediction errors: Systematic over- oder underconfidence. Clearly, there is more to being an impressive predictor than just being well-calibrated, but becoming better-calibrated is a relatively easy thing to do with those curves. One can also imagine someone who naturally generates 50 % predictions that are over-/underconfident.

2.0. Having access to "baseline probabilities/common-wisdom estimates" is mathematically equivalent to having a "baseline predictor/woman-on-the-street" whose probability estimates match those baseline probabilities. I think your discussion can be clarified and extended by not framing it as "judging the impressiveness of one person by comparing their estimates against a... (read more)

Replying toWhy I'm Not Vegan

streawkceur6y

Why I'm Not Vegan

I think your argument can be strengthened by multiplying all the animal-year-values by 1000 - this would yield a value of veganism of 430 $/year, which is still less than what eating meat would be worth to a typical LW user, and yields values for the worth of animals that are probably higher than what most vegans would claim.

Replying toApril Coronavirus Open Thread

streawkceur6y

April Coronavirus Open Thread

Why are surgical or self-made masks supposed to be better at protecting others than at protecting oneself? Naively, it seems to me that the percentage of filtered droplets/aerosol should be the same regardless of the direction in which it is breathed.

Replying toApril Coronavirus Open Thread

streawkceur6y

April Coronavirus Open Thread

I'd like to point out that the growth in India is still exponential (linear on the log-scale) https://www.worldometers.info/coronavirus/country/india/. This could be or become true of other developing countries.

India and other developing countries probably have a harder time controlling the outbreak (and governments and the young, food-insecure populations may judge the economic cost of social distancing to be higher than the risk of the virus).

There was a time when the number of worldwide cases appeared to stagnate because of the Chinese lockdown, but this number just hid the exponential growth of the European+US outbreaks.

What I said doesn't contradict any explicit statement in your comment, I just want to argue against the hypothetical deduction from "the growth rate of the world as a whole has also turned linear" to "and this means that the world is over the hill".

Replying toMarch Coronavirus Open Thread

streawkceur6y

March Coronavirus Open Thread

EDIT: The South Korean press releases contain a chart somewhat like the one I wanted, see e.g. https://www.cdc.go.kr/board/board.es?mid=a30402000000&bid=0030

I am looking for a better overview of imported cases by country of origin in East Asian countries.

EDIT: I remembered incorrectly, the following is wrong. In particular, I recall a statistic according to which a significant number of imported cases in South Korea in one day ~1-2 weeks ago came from China (~12, vs ~40 Europeans).

If this is true, this would seem to me like strong evidence that China is lying about having all domestic cases isolated, and community spread suppressed.