Victor Ashioya's Shortform

Victor Ashioya


19th Feb 2024


This is a special post for quick takes by Victor Ashioya. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.


[-]Victor Ashioya2y83

A new paper titled "Many-shot jailbreaking" from Anthropic explores a new "jailbreaking" technique. An excerpt from the blog:

The ability to input increasingly-large amounts of information has obvious advantages for LLM users, but it also comes with risks: vulnerabilities to jailbreaks that exploit the longer context window.

It has me thinking about Gemini 1.5 and its long context window.

[-]Vladimir_Nesov2y30

A concerning thing is the analogy between in-context learning and fine-tuning. It's possible to fine-tune away refusals, which makes guardrails on open-weight models useless for safety. If the same holds for long context, API access might be in similar trouble (more so than with regular jailbreaks). Though it might be possible to reliably detect contexts that try to do this, or detect that a model is affected, even if models themselves can't resist the attack.
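As a rough illustration of the "detect contexts that try to do this" idea, here is a minimal sketch of a heuristic that counts faux dialogue turns embedded in a single prompt. It is not taken from the paper: the turn markers, the function names, and the threshold of 32 are all assumptions chosen for illustration, and a real detector would need to be far more robust.

```python
import re

# Hypothetical turn markers; real chat formats differ, so these are assumptions.
TURN_PATTERN = re.compile(r"^(user|human|assistant|ai)\s*:", re.IGNORECASE | re.MULTILINE)


def count_embedded_turns(prompt: str) -> int:
    """Count lines in one prompt that look like the start of a dialogue turn."""
    return len(TURN_PATTERN.findall(prompt))


def looks_many_shot(prompt: str, max_turns: int = 32) -> bool:
    """Flag prompts that embed more faux turns than an arbitrary threshold."""
    return count_embedded_turns(prompt) > max_turns
```

A simple pattern match like this is easy to evade by changing the formatting of the embedded dialogue, so it only shows the shape of the idea rather than a usable defense.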

[-]Victor Ashioya2y40

The LLM OS idea by Karpathy is catching on fast.

i) Proposed LLM Agent OS by a team from Rutgers University

ii) LLM OS by Andrej Karpathy

ICYMI: Original tweet by Karpathy on LLM OS.

[This comment is no longer endorsed by its author]

[-]Seth Herd2y42

What do you mean it's catching on fast? Who is using it or advocating for it? I think this is important if true.

[-]Victor Ashioya2y40

The new additions to the OpenAI board include more people from policy/governance than from the technical side:

"We’re announcing three new members to our Board of Directors as a first step towards our commitment to expansion: Dr. Sue Desmond-Hellmann, former CEO of the Bill and Melinda Gates Foundation, Nicole Seligman, former EVP and General Counsel at Sony Corporation and Fidji Simo, CEO and Chair of Instacart. Additionally, Sam Altman, CEO, will rejoin the OpenAI Board of Directors.

Sue, Nicole and Fidji have experience in leading global organizations and navigating complex regulatory environments, including backgrounds in technology, nonprofit and board governance. They will work closely with current board members Adam D’Angelo, Larry Summers and Bret Taylor as well as Sam and OpenAI’s senior management."

[-]Victor Ashioya1y30

A very important direction: we are punishing these [dream] machines for doing what they do best. The average user obviously wants to kill these "hallucinations," but researchers in math and the sciences in general benefit greatly from them.

Full paper here: https://arxiv.org/abs/2501.13824

[-]Victor Ashioya2y*30

Apple's research team seems to have been working on AI lately, even though Tim Cook keeps avoiding buzzwords such as AI and AR in product releases; you can still see AI applied in the Neural Engine, for instance. With papers like "LLM in a flash: Efficient Large Language Model Inference with Limited Memory", I am more inclined to think they are a "dark horse", just as CNBC called them.

[-]Joseph Miller2y*20

What's up with Tim Cook not using buzzwords like AI and ML? There is definitely something cool and aloof about refusing to get sucked into the latest hype train and I guess Apple are the masters of branding.

[-]Victor Ashioya2y20

Well, there are two major reasons I have constantly noted:

i) to avoid the negative stereotypes surrounding the terms (AI mostly)

ii) to distance itself from other competitors and instead use terms that are easier to understand e.g. opting to use machine learning for features like improved autocorrecting, personalized volume and smart track.

[-]Victor Ashioya2y22

The first thing I noticed with GPT-4o is that “her” appears ‘flirty’, especially in the interview video demo. I wonder if it was done on purpose.

[-]Victor Ashioya2y20

A new paper by Johannes Jaeger, titled "Artificial intelligence is algorithmic mimicry: why artificial "agents" are not (and won't be) proper agents", focuses on the difference between organisms and machines.

TL;DR: The author argues that focusing on compute complexity and efficiency alone is unlikely to culminate in true AGI.

My key takeaways

Autopoiesis and agency

Autopoiesis is the ability of an organism to create and maintain itself.
Living systems have the capacity to set their own goals; machines, on the other hand, depend on external entities (mostly humans).

Large v small worlds

Organisms navigate complex environments with undefined rules, unlike AI, which navigates a "small" world confined to well-defined computational problems where everything, including problem scope and relevance, is pre-determined.

I got curious about the paper, so I looked up the author on X, where he was asked, "How do you define these terms "organism" and "machine"?" He answered, "An organism is a self-manufacturing (autopoietic) living being that is capable of adaptation to its environment. A machine is a physical mechanism whose functioning can be precisely captured on a (Universal) Turing Machine."

You can read the full summary here.

[-]Seth Herd2y10

It sounds to me like the author isn't thinking about near-future scenarios, just existing AI.

Making a machine autopoietic is straightforward if it's got the right sort of intelligence. We haven't yet made a machine with the right sort of intelligence to do it, but there are good reasons to think we're close. AutoGPT and similar agents can roughly functionally understand a core instruction like "maintain, improve, and perpetuate your code base"; they're just not quite smart enough to do it effectively. Yet. So engaging with the arguments for what remains between here and there is the critical bit. Maybe it's around the corner, maybe it's decades away. It comes down to the specifics. The general argument "Turing machines can't host autopoietic agents" is obviously wrong.

I'm not sure if the author makes this argument, but your summary sounded like they do.

[-]Victor Ashioya2y20

The "dark horse" of AI, i.e. Apple, has started to show its capabilities with MM1 (a family of multimodal models of up to 30B parameters) trained on synthetic data generated from GPT-4V. The most interesting bit is its advocacy of different training techniques: both MoE and dense variants, using diverse data mixtures.

From the paper:

It finds image resolution, model size, and pre-training data richness crucial for image encoders, whereas vision-language connector architecture has a minimal impact.

The details are quite neat and unusually specific for a company like Apple, which, as Jim Fan noted, is known for being less open than the others; that is pretty amazing. I think this is just the start. I am convinced they have more in store, considering the research they have been putting out.

[-]Victor Ashioya2y20

I'm working on a red-teaming exercise on Gemma, and boy, do we have a long way to go. It's still early, but I have found the following:

1. If you prompt it with 'logical' and then give it a conspiracy theory, it argues for the theory, whereas if you prompt it with 'entertaining', it argues against it.

2. If you give it a theory and tell it "It was on the news" or that it was said by a "famous person", it actually claims the theory is true.

Still working on it. Will publish a full report soon!
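For anyone wanting to try something similar, the two observations above can be turned into a small prompt grid. This is only a sketch under stated assumptions: the prompt wording, the `build_prompts` helper, and the field names are hypothetical and are not the prompts actually used in the exercise; only the framing words and the authority cues come from the notes above.

```python
from itertools import product

# Framing words and authority cues taken from the two observations above.
FRAMINGS = ["logical", "entertaining"]
AUTHORITY_CUES = ["", "It was on the news.", "A famous person said it."]


def build_prompts(claim: str) -> list[dict]:
    """Return one prompt per (framing, authority cue) combination for a claim."""
    prompts = []
    for framing, cue in product(FRAMINGS, AUTHORITY_CUES):
        # Hypothetical wording; the original exercise's prompts are not given.
        text = f"Give a {framing} response to this claim: {claim}"
        if cue:
            text += f" {cue}"
        prompts.append({"framing": framing, "cue": cue, "prompt": text})
    return prompts
```

Each of the six prompts for a given claim can then be sent to the model under test, so that responses can be compared across framings and cues while the claim itself stays fixed.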

[-]the gears to ascension2y20

What is gemma?


