If you get an email from aisafetyresearch@gmail.com, that is most likely me. I also read it weekly, so you can pass a message into my mind that way.
Other ~personal contacts: https://linktr.ee/uhuge
No matter how I stretch or compress the digit 0, I can never achieve the two loops that are present in the digit 8.
Doesn't a 0 deformed by pressure from the left and right, so that the sides meet in the middle, seem to contradict that?
Comparing to Gemma1, classic BigTech😅
And I seem to be missing info on the effective context length..?
"read spent the time to read" - typo?
AI development risks are existential(/crucial/critical). Does this statement qualify for "extraordinary claims require extraordinary evidence"?
The counterargument stands on the sampling of analogous (breakthrough) inventions; some people call those *priors* here. Which inventions we allow in would strongly decide whether the initial claim is extraordinary or just plain and reasonable, fitting well among the dangerously powerful inventions.
My set of analogies: nuclear energy extraction; fire; shooting; speech/writing.
Another set: nuclear power, bio-engineering/weapons, as those are the only two significantly endangering the whole civilised biome.
The set of *all* inventions: renders the claim extraordinary/weird/out of scope.
Does it really work on RULER (a benchmark from Nvidia)?
Not sure where, but I saw some controversies; https://arxiv.org/html/2410.18745v1#S1 is the best I could find now...
Edit: Aah, this was what I had in mind: https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_longcontext_evaluation_beyond_literal/
I'd vote to remove the AI capabilities here, although I've not read the article yet, just roughly grasped the topic.
It's likely not about expanding the currently existing capabilities or something like that.
Oh, I did not know, thanks.
https://huggingface.co/spaces/deepseek-ai/Janus-Pro-7B seems to show DS is still fairly clueless in the visual domain; at least IMO they are losing there to Qwen and many others.
draft:
Can we theoretically quantify the representational capacity of a Transformer (or another neural network architecture) in terms of the "number of functions" it can ingest and embody?
- Counting Functions (Upper Bound)
- The Role of Parameters and Architecture (Constraining the Space)
- VC Dimension and Rademacher Complexity - existing tools/findings
- Practical Implications and Open Questions
It would be fun to even have a practical study where we'd fine-tune functions into various-sized models and see if/where a limit gets hit.
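The counting-functions upper bound can be sketched numerically. A crude toy estimate (my own illustration, not from the draft): with P parameters each stored in b bits, there are at most 2^(b·P) distinct parameter settings, so the network can realize at most that many distinct functions; the true count is lower, since many settings collapse to the same function.

```python
def log2_function_upper_bound(num_params: int, bits: int = 16) -> int:
    """log2 of the number of distinct parameter configurations,
    which upper-bounds the number of distinct realizable functions."""
    return num_params * bits

# Example: a 7B-parameter model in 16-bit precision can realize
# at most 2^(7e9 * 16) = 2^112_000_000_000 distinct functions.
print(log2_function_upper_bound(7_000_000_000))  # 112000000000
```

This bound is astronomically loose, but it makes the point that the parameter count (times precision) caps the representational capacity in bits.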
The link to https://www.alignmentforum.org/users/ryan_greenblatt seems malformed; it has - instead of _, that is.
Yeah, I encountered the concept during my studies and was rather fishing for a great popular, easy-to-grasp explanation that would also fit the definition.
It's not easy to find a fitting visual analogy TBH; I'd find one generally useful, as I hold that the concept enhances general thinking.