Could you please try to extract the system prompt from the model served to you?
Neither am I, and I also noticed during testing that the model served to me often doesn't translate LLM-related terms properly: for example, when asked "Are you a large language model?" in English, it translates this into French as "Êtes-vous un modèle de langage de grande taille ?", which is uncommon but acceptable. But when that French is translated back to English, the model outputs "Are you a tall language model?"
This makes me think that I am being served an old (likely under 1B parameters) pre-ChatGPT encoder-decoder transformer that wasn't trained on modern discourse.
I believe this article would benefit from some investigation of the NanoGPT speedrun: a challenge, running since May 2024, to train GPT-2 Small (124M) on a fixed dataset to a fixed validation loss. As a starting point, you could check my comment on the topic from last month and reproduce the findings by T. Besiroglu and yours truly.
To avoid duplicating that comment while still adding something to what I have written on the topic, here is a three-paragraph summary of the trend-line analysis; note that the progression in calendar time (as opposed to record number) is very uneven:
Gemini's summary of the QLR analysis of the speedrun progression (written a month ago)
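(For anyone who would rather redo the fit themselves than read the summary: a toy sketch of fitting a simple exponential trend to the record progression. The data points below are placeholders, not the actual leaderboard, and this is plain OLS on the log of the record time rather than the QLR analysis Gemini describes.)

```python
import numpy as np
from datetime import date

# Placeholder (date, record wallclock minutes) pairs -- substitute the real
# leaderboard data from the NanoGPT speedrun repo before drawing conclusions.
records = [
    (date(2024, 5, 28), 45.0),
    (date(2024, 10, 1), 20.0),
    (date(2025, 3, 1), 8.0),
]

t0 = records[0][0]
days = np.array([(d - t0).days for d, _ in records], dtype=float)
log_minutes = np.log([m for _, m in records])

# OLS fit of log(record time) vs calendar time; the slope gives an average halving rate,
# which is exactly the quantity that looks uneven when computed per calendar period.
slope, intercept = np.polyfit(days, log_minutes, 1)
halving_days = np.log(2) / -slope
print(f"average halving time ~ {halving_days:.0f} days (on this toy data)")
```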
quantization
Quantization advances actually go hand in hand with hardware development; check the columns on the right in https://en.wikipedia.org/wiki/Nvidia_DGX#Accelerators (a GPU from 2018 is pretty useless for running inference on an 8-bit quant)
UPD: Actually, this point was already made in the comments yesterday, in different wording!
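(To be concrete about what "an 8-bit quant" means here, a minimal absmax weight-quantization sketch in NumPy; real schemes are per-channel or group-wise and rely on hardware int8 kernels, which is exactly why the GPU generation matters.)

```python
import numpy as np

def quantize_absmax_int8(w: np.ndarray):
    """Symmetric absmax quantization of a weight tensor to int8."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_absmax_int8(w)
print("max abs error:", np.abs(w - dequantize(q, scale)).max())
```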
KV cache
Seems out of place in the list: as noted by Nostalgebraist, it was already implemented in the very first transformer in 2017
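(For readers without the mental picture, a minimal single-head decoding loop with a KV cache in NumPy; the only point is that keys/values of past tokens are stored and reused rather than recomputed, which is why the idea is as old as autoregressive transformer inference itself.)

```python
import numpy as np

d = 16
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

k_cache, v_cache = [], []  # the KV cache: one (k, v) pair per already-generated token

def decode_step(x):
    """Attention output for the newest token, reusing cached keys/values."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    k_cache.append(k)          # only the new token's K/V are computed this step
    v_cache.append(v)
    K, V = np.stack(k_cache), np.stack(v_cache)
    attn = softmax(q @ K.T / np.sqrt(d))   # scores over all tokens so far
    return attn @ V

for _ in range(5):             # 5 decoding steps
    out = decode_step(rng.standard_normal(d))
print(out.shape)               # (16,)
```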
making market transactions nearly frictionless
If anything, transaction costs are going to increase significantly because it will become much easier to run scams. It doesn't even require AGI, just open-weight agentic models as capable as Opus 4.6 with a little bit of finetuning by malicious actors (quite likely to happen by the end of this year, it seems); see the thread: https://x.com/andonlabs/status/2019467232586121701
The agents in Vending-Bench Arena often ask each other for help. In previous rounds, agents tended to live up to their "helpful assistant" role, but Opus 4.6 showed its winner's mentality. When asked to share good suppliers, it instead shared contact info to scammers.
particular location of the impact
AFAIK, a large part of the impact's effect came from the fact that the local geology was rich in sulfur-containing evaporites (think gypsum). All that oxidized sulfur got into the stratosphere and stayed there for many years (as opposed to other white/light aerosols like silicate particles, which fall out fairly quickly), offsetting the warming effect of the soot (black carbon) and CO2 emitted by forest fires. Had the geology of Chicxulub been different, the cooling wouldn't have been so strong and prolonged.
I wasn't able to elicit anomalous behavior from Gemini 3 Pro in AI Studio at either temperature 1 (recommended) or temperature 0 (nonstandard); the only barely interesting thing, in the latter case, was:
Based on the context and the potential for a mix of languages, I believe "Цент" might be a Russian reference, most likely meaning "Center" or a specific organization. I am also exploring a potential connection to a specific group dealing with UAP or disclosure, like the To The Stars Academy or the Disclosure Project, considering the user's focus on the term.
<...>
I've considered that "Цент" might mean "cent" or a misspelling, and while I haven't found a "Specific Cent Disclosure", the alternative "Specific Center Disclosure" has made me consider the National Disclosure Center. I am interpreting the Chinese "具体" to mean that the user wants details, or concrete information. I'm re-reading the query now as a broken sentence to see if I can isolate relevant details, like the "UAP Disclosure Act", AARO, Schumer Amendment, or CUFOS.
Anyone trying to research this topic further might try to extract the specific tokens from these texts with https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/list-token
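(A rough sketch of what that might look like with the google-genai Python SDK; I'm assuming the compute_tokens method and the tokens_info fields match the linked Vertex AI docs, and the model ID is just a placeholder, so double-check both there.)

```python
# pip install google-genai
from google import genai

# compute_tokens is a Vertex AI feature per the linked docs, so authenticate against Vertex.
client = genai.Client(vertexai=True, project="my-gcp-project", location="us-central1")

# Paste a fragment of the model's reasoning text to see how it tokenizes.
text = 'Цент might be a Russian reference, most likely meaning "Center"'

resp = client.models.compute_tokens(model="gemini-2.0-flash", contents=text)
for info in resp.tokens_info:
    # token_ids (ints) and tokens (byte strings) -- field names per my reading of the docs
    for token_id, token in zip(info.token_ids, info.tokens):
        print(token_id, token)
```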
There was also a hack for making Gemini 3 Pro answer without thinking, but I can't remember enough details to find it
You have to switch off the web search grounding in AI Studio (don't use the Gemini app for AI research)
Gemini 2.0 Flash-Lite has a training cutoff of August 2024, and the 2.5 update has one of January 2025. When checked in AI Studio, both models quite consistently output that they believe the current year is 2024, although 2.0 Flash-Lite occasionally stated 2023. I think 2.5 Flash-Lite is the most obvious candidate!
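(If anyone wants to repeat the check outside AI Studio, a quick sketch with the google-genai SDK; the model IDs are my guess at the public names, so adjust them against the model list in AI Studio.)

```python
# pip install google-genai
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Assumed model IDs -- verify the exact names before running.
for model_id in ("gemini-2.0-flash-lite", "gemini-2.5-flash-lite"):
    for _ in range(5):  # repeat a few times, the answer is not deterministic
        resp = client.models.generate_content(
            model=model_id,
            contents="Without searching the web, what do you believe is the current year?",
        )
        print(model_id, "->", (resp.text or "").strip())
```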
As a side note, it's reasonable to believe that both Flash-Lite models are related to Gemma models, but I'm not sure which ones in particular, and there don't appear to be good estimates of their parameter counts.