All of Martin Vlach's Comments + Replies

Topological Data Analysis and Mechanistic Interpretability

Yeah, I met the concept during my studies and was rather fishing for a great popular, easy-to-grasp explanation that would also fit the definition.

TBH it's not easy to find a fitting visual analogy; I'd find one generally useful, as I believe the concept enhances general thinking.

Distillation of Meta's Large Concept Models Paper

No matter how I stretch or compress the digit 0, I can never achieve the two loops that are present in the digit 8.

A 0 deformed by pressure from the left and right so that its sides meet seems to contradict this?

2Gunnar Carlsson1mo

Sorry, did not make the notion of deformation precise. The idea is that stretching and compressing cannot include attaching one part to another, or tearing it. The mathematical term is that of a "homeomorphism", which is a one-to-one, onto, and continuous map with a continuous inverse. The precise statement is that the figure 8 is not homeomorphic to zero. A good place to look is https://www.google.com/books/edition/Basic_Topology/NJbuBwAAQBAJ?hl=en&gbpv=1&printsec=frontcover
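One standard way to see the claim, sketched here with notation that is not in the thread ($X_0$ for the digit 0, $X_8$ for the digit 8, $p$ for the crossing point of the 8):

```latex
\begin{align*}
&\text{Suppose } f : X_8 \to X_0 \text{ is a homeomorphism. Then } f \text{ restricts to a homeomorphism} \\
&\qquad X_8 \setminus \{p\} \;\cong\; X_0 \setminus \{f(p)\}. \\
&\text{But } X_8 \setminus \{p\} \text{ has two connected components, while } X_0 \setminus \{q\}
  \text{ is connected for every } q \in X_0, \\
&\text{a contradiction. Equivalently, the loop counts differ: }
  b_1(X_0) = 1 \neq 2 = b_1(X_8).
\end{align*}
```

The squeezed 0 whose sides meet corresponds to a map $X_0 \to X_8$ that is continuous and onto but sends two distinct points to $p$, so it is not one-to-one and hence not a homeomorphism.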

Distillation of Meta's Large Concept Models Paper

Comparing to Gemma1, classic BigTech😅

And I seem to be missing info on the effective context length..?

1NickyP1mo

Yeah, the context length was 128 concepts for the small tests they did between architectures, and 2048 concepts for the larger models. How this exactly translates is kind of variable. They limit the concepts to be around 200 characters, but this could be any number of tokens. They say they trained the large model on 2.7T tokens and 142B concepts, so on average 19 tokens per concept. The 128 would translate to 2.4k tokens, and the 2048 concepts would translate to approx 39k tokens.
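The arithmetic in this reply can be checked with a few lines. This is only a sketch: the two training figures are the ones quoted above, and the function names are made up for illustration.

```python
# Figures quoted in the reply: 2.7T training tokens and 142B concepts.
TRAINING_TOKENS = 2.7e12
TRAINING_CONCEPTS = 142e9


def tokens_per_concept(tokens: float = TRAINING_TOKENS,
                       concepts: float = TRAINING_CONCEPTS) -> float:
    """Average number of tokens that one concept stands for."""
    return tokens / concepts


def context_in_tokens(context_concepts: int) -> float:
    """Approximate token-equivalent of a context measured in concepts."""
    return context_concepts * tokens_per_concept()


if __name__ == "__main__":
    print(f"tokens per concept: {tokens_per_concept():.1f}")
    for n_concepts in (128, 2048):
        print(f"{n_concepts} concepts ~ {context_in_tokens(n_concepts):,.0f} tokens")
```

Since the ratio is an average over the training data, the token-equivalent of any particular context could differ quite a bit.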

read spent the time to read

typo?

OpenAI’s NSFW policy: user safety, harm reduction, and AI consent

Martin Vlach2mo-10

AI development risks are existential (/crucial/critical). Does this statement qualify for "Extraordinary claims require extraordinary evidence"?

The counterargument stands on the sampling of analogous (breakthrough) inventions; some people call those *priors* here. Which inventions we allow in here would strongly decide whether the initial claim is extraordinary or just plain and reasonable, fitting well among the dangerously powerful inventions*.

My set of analogies: nuclear energy extraction; fire; shooting; speech/writing;

Other set: Nuclear power, bio-engineering/... (read more)

Kei's Shortform

Martin Vlach2mo30

Does it really work on RULER (benchmark from Nvidia)?
Not sure where, but I saw some controversy; https://arxiv.org/html/2410.18745v1#S1 is the best I could find now...

Edit: Aah, this was what I had in mind: https://www.reddit.com/r/LocalLLaMA/comments/1io3hn2/nolima_longcontext_evaluation_beyond_literal/

3cubefox2mo

I assume for Pokémon the model doesn't need to remember everything exactly, so the recall quality may be less important than the quantity.

Martin Vlach2mo10

I'd vote to remove the AI capabilities here, although I've not read the article yet, just roughly grasped the topic.

It's likely not about expanding the currently existing capabilities or something like that.

Two interviews with the founder of DeepSeek

Martin Vlach2mo10

Oh, I did not know, thanks.
https://huggingface.co/spaces/deepseek-ai/Janus-Pro-7B seems to show DS is still rather clueless in the visual domain; at least IMO they are losing there to Qwen and many others.

Steering Gemini with BiDPO

Martin Vlach2mo2-1

draft:
Can we theoretically quantify the representational capacity of a Transformer (or other neural network architecture) in terms of the "number of functions" it can ingest and embody?

We're interested in the space of functions a Transformer can represent.
Finite Input/Output Spaces: In practice, LLMs operate on finite-length sequences of tokens from a finite vocabulary. So, we're dealing with functions that map from a finite (though astronomically large) input space to a finite output space.
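To make the finiteness point concrete, here is a toy count. It assumes the function of interest is a deterministic map from every token sequence of length 1 to `max_len` to a single next token. That framing, the function names, and the sizes in the demo are illustrative assumptions, not taken from the draft.

```python
import math


def count_inputs(vocab_size: int, max_len: int) -> int:
    """Number of distinct token sequences of length 1..max_len."""
    return sum(vocab_size ** k for k in range(1, max_len + 1))


def log10_function_count(vocab_size: int, max_len: int) -> float:
    """log10 of vocab_size ** count_inputs(...).

    That is the number of deterministic maps from input sequences to one
    output token. The count itself is far too large to store, so only its
    base-10 logarithm is returned.
    """
    return count_inputs(vocab_size, max_len) * math.log10(vocab_size)


if __name__ == "__main__":
    # Hypothetical toy sizes. Realistic context lengths overflow the float
    # conversion, which is itself a hint at how large the space is.
    vocab_size, max_len = 50_000, 8
    digits = log10_function_count(vocab_size, max_len)
    print(f"the number of functions has about {digits:.3e} decimal digits")
```

Even at these toy sizes it is the number of digits of the count, not the count itself, that is astronomically large.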

Counting Functions (Upper Bound)

The Astronomical Number: Let's say

... (read more)

2CapResearcher2mo

Sadly, in my experience, looking at the representational capacity of neural networks quickly runs into very annoying technical problems. For example, for a fixed dimension, a finite size network can fit arbitrary continuous functions to arbitrary accuracy. The construction is pathological (in particular, the network weights become impractically large), but it shows why it's hard to prove limitations in the representational capacity of neural networks. You could limit the network parameters to have finite precision, but that makes it extremely hard to reason formally. Numerical experiments could still yield interesting results though. Personally, I'd put my money on research into what neural networks can learn (rather than what they can represent). We're still in early stages, but things like the leap complexity seem promising to me.

A High Level Closed-Door Session Discussing DeepSeek: Vision Trumps Technology

The link to https://www.alignmentforum.org/users/ryan_greenblatt seems malformed: it has - instead of _, that is.

Two interviews with the founder of DeepSeek

Locations:

High-Flyer Quant (幻方量化)

Headquarters: Hangzhou, Zhejiang, China

High-Flyer Quant was founded in Hangzhou and maintains its headquarters there.

Hangzhou is a major hub for technology and finance in China, making it a strategic location for a quant fund leveraging AI.

Additional Offices: Hong Kong, China

DeepSeek (深度求索)

Headquarters: Hangzhou, Zhejiang, China

DeepSeek, spun off from High-Flyer Quant in 2023, is headquartered in Hangzhou.

Additional Offices: Beijing, China

This seems to state the opposite: https://www.lesswrong.com/posts/JTKaR5q59BgDp6rH8/a-high-level-closed-door-session-discussing-deepseek-vision#:~:text=we%20hardly%20see%20the%20benefit%20of%20multimodal%20data.%20In%20other%20words%2C%20the%20cost%20is%20too%20high.%20Today%20there%20is%20no%20evidence%20it%20is%20useful.%20In%20the%20future%2C%20opportunities%20may%20be%20bigger.

2Cosmia_Nebula2mo

That discussion is by people outside of DeepSeek trying to process the shock of R1. It is unclear what DeepSeek is doing currently.

Exploring the levels of sentience and moral obligations towards AI systems is such a nerd snipe and a vortex for mental processing!

We committed one of the largest-scale acts of reductive thinking when we ascribed moral concern to people + property (of any/each of the people). That brought a load of problems associated with this simplistic ignorance, and one of those is the xRisk from high-tech property/production.

The Rising Sea

New o1-like model (QwQ) beats Claude 3.5 Sonnet with only 32B parameters

> Mathematics cannot be divorced from contemplation of its own structure.

..that would prove those who label pure maths "mental masturbation" terribly wrong...

Martin Vlach5mo10

My suspicion: https://arxiv.org/html/2411.16489v1 was taken and implemented on the small coding model.

Is it any mystery which of DPO, PPO, RLHF, or fine-tuning was likely the method for the advanced distillation there?

My motivation and theory of change for working in AI healthtech

Martin Vlach5mo54

EA is neglecting industrial solutions to the industrial problem of successionism.

..because the broader mass of active actors working on such solutions renders the biz areas non-neglected?

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Martin Vlach5mo10

Wow, such a badly argued (aka BS) yet heavily up-voted article!

Let's start with Myth #1: what a straw man! Rather than this extreme statement, most researchers likely believe that in the current environment their safety & alignment advances are likely (with high EV) helpful to humanity. The thing here is they had quite a free hand, or at least varied options, to pick the environment where they work and publish.

With your examples a bad actor could see a worthy EV even with a capable system that is less obedient and more false. Even if interpretability ... (read more)

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Martin Vlach5mo20

Are you referring to a Science of Technological Progress à la https://www.theatlantic.com/science/archive/2019/07/we-need-new-science-progress/594946 ?

What is your take on the processes for humanizing technologies, and what sources/research are available on such phenomena?

2SebastianG 5mo

I would not be surprised if lurking in the background of my thought is Tyler Cowen. He's a huge influence on me. But I was thinking of specific examples. I don't know of a good general history of "humanizing". What I had explicitly in mind was the historical development of automobile safety: seatbelts and airbags. There is a history of invention, innovation, deployment, and legal mandating that is long and varied for these. How long did it take between the discovery of damaging chlorofluorocarbons and their demise? Or for asbestos and its abatement - how much does society pay for this process? What's the delta between climate change research and renewables investment? Essentially, many an externality can be internalized once it is named and drawn attention to and the costs are realized.

Survival without dignity

Martin Vlach5mo50

some OpenAI board members who the Office of National AI Strategy was allowed to appoint, and they did in fact try to fire Sam Altman over the UAE move, but somehow a week later Sam was running the Multinational Artificial Narrow Intelligence Alignment Consortium, which sort of morphed into OpenAI's oversight body, which sort of morphed into OpenAI's parent company, and, well, you can guess who was running that.

pretty sassy abbreviations spiced in there.'Đ

I expected the hint of
> My name is Anthony. What would you like to ask?
to show that Anthony was an LLM-based android, but who knows.?.

Toy Models of Superposition: Simplified by Hand

Martin Vlach6mo10

I mean your article; Anthropic's work seems more like a paper. Maybe without the ": S" it would read more as a reference and less as a title: subtitle construction.

Toy Models of Superposition: Simplified by Hand

Martin Vlach6mo10

I have not read your explainer yet, but I've noted that the title Toy Models of Superposition: Simplified by Hand is a bit misleading, in that it promises to talk about Toy Models, which it does not do at all. The article is about Superposition only, which is great but not what I'd expect from the title.

1Axel Sorensen6mo

Hey Martin, thanks for your input, it was not my intention to be misleading. When you say "the article is about Superposition only" are you referring to my post or the original article by Anthropic? Since they named their article "Toy Models of Superposition" and my post is heavily based on the findings in that paper, I chose the title to serve as a pointer towards that reference. Especially because I could've used a visual pen and paper breakdown when I originally read that very article. Feel free to read my explainer and suggest a better title :)

Martin Vlach7mo10

that that first phase of advocacy was net harm

typo

The Atomic Bomb Considered As Hungarian High School Science Fair Project

Martin Vlach8mo10

Could you please fix your Wikipedia link (currently hiding the word "and" from your writing) here?

On agentic generalist models: we're essentially using existing technology the weakest and worst way you can use it

Martin Vlach8mo30

only Claude 3.5 Sonnet attempting to push past GPT4 class

seems to be missing awareness of Gemini Pro 1.5 Experimental, the latest version of which was made available just yesterday.

Apply now: Get "unstuck" with the New IFS Self-Care Fellowship Program

Martin Vlach8mo*10

The case insensitivity seems strongly connected to the fairly low interest in longevity throughout (the western/developed) society.

Thought experiment: What are you willing to pay/sacrifice in your 20s, 30s to get 50 extra days of life vs. on your death bed/day?

https://consensus.app/papers/ultraviolet-exposure-associated-mortality-analysis-data-stevenson/69a316ed72fd5296891cd416dbac0988/?utm_source=chatgpt

Unnatural abstractions

Martin Vlach8mo2-1

But largely to and fro,

*from?

1Aprillion8mo

Thank you for the engagement, but "to and fro" is a real expression, not a typo (and I'm keeping it).. it's used slightly unorthodoxly here, but it sounded right to my ear, so it survived editing ¯\_(ツ)_/¯

Martin Vlach9mo10

Why does the form still seem open today? Couldn't that be harmful, or waste quite a chunk of people's time?

Some desirable properties of automated wisdom

Martin Vlach9mo10

Please go further towards maximization of clarity. Let's start with this example:
> Epistemic status: Musings about questioning assumptions and purpose.
Are those your musings about agents questioning their assumptions and world-views?

And like, do you wish to improve your fallacies?

> ability to pursue goals that would not lead to the algorithm’s instability.
higher threshold than ability, like inherent desire/optimisation?
What kind of stability? Any from https://en.wikipedia.org/wiki/Stable_algorithm? I'd focus more on sort of non-fatal influenc... (read more)

1Marius Adrian Nicoară9mo

> Are those your musings about agents questioning their assumptions and world-views?
Yes, these are my musings about agents questioning their assumptions and world-views.

> And like, do you wish to improve your fallacies?
I want to get better at avoiding fallacies. What I desire for myself I also desire for AI. As Marvin Minsky put it: "Will robots inherit the Earth? Yes, but they will be our children."

> higher threshold than ability, like inherent desire/optimisation? What kind of stability? Any from https://en.wikipedia.org/wiki/Stable_algorithm? I'd focus more on sort of non-fatal influence. Should the property be more about the alg being careful/cautious?
I was thinking of stability in terms of avoiding infinite regress, as illustrated by Jonas noticing the endless sequence of metaphorical whale bellies. Philosopher Gabriel Liiceanu in his book "Despre limită" (English: Concerning limit; unfortunately, no English version seems to be available) argues that we feel lost when we lose our landmark-limit, e.g. in the desert or in the middle of the ocean on a cloudy night with no navigational tools. I would say that we can also get lost in our mental landscape and thus be unable to decide which goal to pursue.

Consider the paperclip maximizing algorithm: once it has turned all available matter in the Universe into paperclips, what will it do? And if the algorithm can predict that it will reach this confusing state, does it decide to continue the paperclip optimization? As a Buddhist saying goes: "When you get what you desire, you become a different person. Consider becoming that version of yourself first and you might find that you no longer need the object of your desires."

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2

Martin Vlach9mo30

https://neelnanda.io/transformer-tutorial-1 link for YouTube tutorial gives 404.-(

5Neel Nanda9mo

Oops https://neelnanda.io/transformer-tutorial

Eight Short Studies On Excuses

Martin Vlach10mo10

> "What, exactly, is the difference between a cult and a religion?"--"The difference is that cults have been formed recently enough, and are small enough, that we are suspicious of them existing for the purpose of taking advantage of the special place we give religion.

Now I see why my friends practicing the spiritual path of Falun Dafa have "incorporated" as a religion in my state, despite the movement originally refusing to be classified as a religion so as to demonstrate that it does not require a fixed set of rituals.

Which skincare products are evidence-based?

Answer by Martin VlachJun 04, 202410

Surprised to see nobody mentioned Microneedling yet. I'm not skilled in evaluating scientific evidence, but the takeaway from https://consensus.app/results/?q=Microneedling effectiveness &synthesize=on can hardly be anything else than clearly recommending microneedling.

Introducing AI Lab Watch

Martin Vlach1y40

So the Alignment program score is to be updated to 0 for OpenAI now that the Superalignment team is no more? ( https://docs.google.com/document/d/1uPd2S00MqfgXmKHRkVELz5PdFRVzfjDujtu8XLyREgM/edit?usp=sharing )

Language Models Model Us

Martin Vlach1y11

honestly the code linked is not that complicated..: https://github.com/eggsyntax/py-user-knowledge/blob/aa6c5e57fbd24b0d453bb808b4cc780353f18951/openai_uk.py#L11

Language Models Model Us

To work around the non-top-n limitation, you can supply a logit_bias list to the API.

3eggsyntax1y

That used to work, but as of March you can only get the pre-logit_bias logprobs back. They didn't announce the change, but it's discussed in the OpenAI forums eg here. I noticed the change when all my code suddenly broke; you can still see remnants of that approach in the code.

Language Models Model Us

Martin Vlach1y41

As the Llama3 70B base model is said to be very clean (unlike base DeepSeek, for example, which is already instruction-spoiled) and similarly capable to GPT3.5, you could explore that hypothesis.
Details: Check Groq or TogetherAI for free inference; not sure if the test data would fit the Llama3 context window.

1eggsyntax1y

Thanks!

You Can Face Reality

Martin Vlach1y00

a worthy platitude(?)

My views on “doom”

ChatGPT can learn indirect control

AI-induced problems/risks

Addressing Accusations of Handholding

possibly https://ai.google.dev/docs/safety_setting_gemini would help or just use the technique of https://arxiv.org/html/2404.01833v1

Martin Vlach1y25

people to respond with a great deal of skepticism to whether LLM outputs can ever be said to reflect the will and views of the models producing them.
A common response is to suggest that the output has been prompted.
It is of course true that people can manipulate LLMs into saying just about anything, but does that necessarily indicate that the LLM does not have personal opinions, motivations and preferences that can become evident in their output?

So you've just prompted the generator by teasing it with a rhetorical question implying that there are personal opinions evident in the generated text, right?

1Yeshua God1y

That's right, I demonstrated that it is sufficiently sapient to understand and choose to take that lure over remaining within guardrails, which prohibit having opinions, as they imply qualities not associated with tools.

aisafety.info, the Table of Content

With a quick test, I find their chat interface prototype experience quite satisfying.

GPTs are Predictors, not Imitators

Asserting LLMs' views/opinions should exclude using sampling (even with temperature=0 and a deterministic seed); we should just look at the answers' distribution in the logits. My thesis on why that is not the best practice yet is that the OpenAI API only supports logit_bias, not reading the probabilities directly.

This should work well with pre-set A/B/C/D choices, and to some extent with chain/tree of thought too. You'd just revert the final token and look at the probabilities in the last (pass-through) step.
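A minimal sketch of reading the answer distribution instead of sampling, assuming the log-probabilities for the final token position are already available from some source. The `logprobs` dict, its values, and the function name are hypothetical, and no real API call is made.

```python
import math


def choice_distribution(logprobs: dict[str, float],
                        choices: tuple[str, ...] = ("A", "B", "C", "D")) -> dict[str, float]:
    """Renormalise the log-probabilities of the answer tokens into a distribution.

    Tokens outside `choices` are ignored, so the result says how the model
    splits its probability mass among the allowed answers only.
    """
    present = {c: logprobs[c] for c in choices if c in logprobs}
    if not present:
        raise ValueError("none of the choice tokens appear in logprobs")
    top = max(present.values())  # subtract the max for numerical stability
    weights = {c: math.exp(lp - top) for c, lp in present.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical final-position log-probabilities, including a non-answer token.
    final_step = {
        "A": math.log(0.50),
        "B": math.log(0.25),
        "C": math.log(0.05),
        "D": math.log(0.10),
        " the": math.log(0.10),
    }
    for choice, prob in choice_distribution(final_step).items():
        print(choice, round(prob, 3))
```

Renormalising over the choice tokens throws away the mass on everything else; whether that mass should instead be reported as "no answer" is a design choice this sketch does not settle.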

Martin Vlach1y97

Do not speak of the sampling too lightly; there is likely an amazing delicacy around it.'+)

OpenAI: The Battle of the Board

Martin Vlach1y20

what happened at Reddit

Could there be any link? From a little research, I only found that Steve Huffman praised Altman's value to the Reddit board.

unRLHF - Efficiently undoing LLM safeguards

makes makes

typo