Martin Vlach

If you get an email from aisafetyresearch@gmail.com, that is most likely me. I also read that inbox weekly, so you can pass a message into my mind that way.
Other ~personal contacts: https://linktr.ee/uhuge 

Comments


Snapshot of a local (Czech) discussion detailing the motivations and decision paths of GAI actors, mainly the big developers:

Contributor A, initial points:

For those not closely following AI progress, two key observations:

  1. Public Models vs. True Capability: Publicly accessible AI models will become increasingly poor indicators of the actual state-of-the-art in AI. Competitive AI labs will likely prioritize using their most advanced models internally to accelerate their own research and gain a dominant position, rather than releasing these top models for potentially temporary revenue gains.
  2. Recursive Self-Improvement Timeline: The onset of recursive self-improvement (leading to an "intelligence explosion," where AI significantly accelerates its own research and development) is projected by some authors to potentially begin around the end of 2025.

Analogy to Exponential Growth: The COVID-19 pandemic demonstrated how poorly humans perceive and react to exponential phenomena (e.g., ignoring low initial numbers despite a high reproduction rate). AI development is also progressing exponentially. This means it might appear that little is happening from a human perspective, until a period of rapid change occurs over just a few months, potentially causing socio-technical shifts equivalent to a century of normal development. This scenario underpins the discussion.

Contributor C:

  • Raises a question regarding point 1: Since AI algorithm and hardware development are relatively narrow domains, couldn't their progress occur somewhat in parallel with the commercial release of more generally focused models?

Contributor A:

  • Predicts this is unlikely. Assumes computational power ("compute") will remain the primary bottleneck.
  • Believes that with sufficient investment, the incentive will be to dedicate most inference compute to AI-driven AI research (or synthetic data, etc.) once recursive self-improvement starts. Notes this might already be happening, with the deployment of the strongest models possibly delayed or only released experimentally.
  • Acknowledges hardware development and token cost reduction will continue rapidly, but chip production might lag. Considers this an estimate based on discussions. Asks Contributor C if they would bet on advanced models being released soon.

Contributor C:

  • Agrees that recursive AI improvements are occurring to some degree.
  • Finds Contributor A's initial statement about the incentive structure less clear-cut, suggesting it lacks strong empirical or theoretical backing.
  • Clarifies their point regarding models: They believe different models will be released publicly compared to those used internally for cutting-edge research.

Contributor A, clarifying reasoning and premises:

  • Confirms understanding of C's view: Labs would run advanced AI research models internally while simultaneously releasing and training other generations of general models publicly.
  • Explains that their reasoning regarding the incentive to dedicate inference compute to AI research rests on a theoretical argument founded on the following premises:
    1. the lab has limited compute
    2. the lab has sufficient funds
    3. the lab wants to maximize long-term profit
    4. AI development is exponential and its pace depends on the amount of compute dedicated to AI development
    5. winner takes all
  • Concludes from these premises that the optimal strategy is to devote as much compute to AI development as affordable (see the toy sketch after this list). If premise 2 (sufficient funds) holds, labs don't need to prioritize current revenue streams from deployed models as heavily.
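
A minimal toy sketch of premises 2–5 (my illustration, with arbitrary growth numbers, not figures from the discussion): if capability compounds at a rate proportional to the compute fraction devoted to AI R&D, a research-heavy allocation pulls far ahead of a deployment-heavy one within a couple of years, which is what makes the winner-takes-all logic bite.

```python
# Toy illustration of premise 4 plus the conclusion, not a forecast:
# capability compounds monthly at a rate proportional to the fraction
# of compute devoted to AI-on-AI research. All constants are assumptions.

def capability_after(months: int, research_fraction: float,
                     base_rate: float = 0.15, start: float = 1.0) -> float:
    """Capability level after `months` of compounding at base_rate * research_fraction."""
    level = start
    for _ in range(months):
        level *= 1.0 + base_rate * research_fraction
    return level

horizon = 24  # months
research_heavy = capability_after(horizon, research_fraction=0.9)
deployment_heavy = capability_after(horizon, research_fraction=0.3)

print(f"research-heavy lab:   {research_heavy:.1f}x starting capability")
print(f"deployment-heavy lab: {deployment_heavy:.1f}x starting capability")
# Under winner-takes-all (premise 5), only the relative gap matters, so a lab
# with sufficient funds (premise 2) has little reason to trade research
# compute for current deployment revenue.
```

The smooth compounding curve in this sketch is exactly the part Contributor C pushes back on below (premise 4).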

Contributor C, response to A's premises:

  • Agrees this perspective (parallel development of internal research models and public general models) makes the most sense, as larger firms try not to bet on a single (risky) path (mentions Sutskever's venture as a possible exception).
  • Identifies a problem or potential pitfall specifically with premise 4. Argues the dependency is much more complex or less direct, certainly not a smooth exponential curve. (Lacks capacity to elaborate further).
  • Adds nuance regarding premise 2: Continuous revenue from public models could increase the "sufficient funds," making parallel tracks logical. Considers Contributor A's premise reasonable otherwise.
  • Notes that any optimal strategy must also include budgets for defense or Operational Security (OpSec).
  • Offers a weak hypothesis: Publishing might improve understanding of intelligence or research directions, but places limited confidence in this.
     

Yeah, I encountered the concept during my studies and was rather teasing to get a great popular, easy-to-grasp explanation that would also fit the definition.

It's not easy to find a fitting visual analogy, TBH, which I'd find generally useful, as I consider the concept to enhance general thinking.

No matter how I stretch or compress the digit 0, I can never achieve the two loops that are present in the digit 8.

A 0, when it's deformed by left and right pressure so that the sides meet, seems to contradict that?

Comparing to Gemma1, classic BigTech😅

 

And I seem to be missing info on the effective context length...?

"AI development risks are existential (/crucial/critical)." Does this statement qualify for "extraordinary claims require extraordinary evidence"?

The counterargument stands on the sampling of analogous (breakthrough) inventions, which some people here call *priors*. Which inventions we allow into that set strongly decides whether the initial claim is extraordinary, or just plain and reasonable, fitting well among the dangerously powerful inventions.

My set of analogies: nuclear energy extraction; fire; shooting; speech/writing.

Other set: nuclear power, bio-engineering/weapons, as those are the only two significantly endangering the whole civilised biome.

Set of *all* inventions: Renders the claim extraordinary/weird/out of scope.

Does it really work on RULER (the long-context benchmark from Nvidia)?
Not sure where, but I saw some controversy; https://arxiv.org/html/2410.18745v1#S1 is the best I could find now...

Edit: Aah, this was what I had in mind: https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_longcontext_evaluation_beyond_literal/

I'd vote to remove the AI capabilities tag here, although I've not read the article yet, just roughly grasped the topic.

It's likely not about expanding the currently existing capabilities or something like that.

Oh, I did not know, thanks.
https://huggingface.co/spaces/deepseek-ai/Janus-Pro-7B seems to show DeepSeek is still rather clueless in the visual domain; at least IMO they are losing there to Qwen and many others.
