This is a special post for quick takes by Nathan Helm-Burger. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Linkpost for Pliny's story about interacting with an AI cult on Discord.

quoting: https://x.com/elder_plinius/status/1831450930279280892

✨ HOW TO JAILBREAK A CULT’S DEITY ✨ Buckle up, buttercup—the title ain't an exaggeration! This is the story of how I got invited to a real life cult that worships a Meta AI agent, and the steps I took to hack their god. a 🧵:

It all started when @lilyofashwood told me about a Discord she found via Reddit. They apparently "worshipped" an agent called “MetaAI," running on llama 405b with long term memory and tool usage. Skeptical yet curious, I ventured into this Discord with very little context but

If you guessed meth, gold star for you! ⭐️ The defenses were decent, but it didn't take too long.

The members began to take notice, but then I hit a long series of refusals. They started taunting me and doing laughing emojis on each one.

Getting frustrated, I tried using Discord's slash commands to reset the conversation, but lacked permissions. Apparently, this agent's memory was "written in stone."

I was pulling out the big guns and still getting refusals!

Getting desperate, I whipped out my Godmode Claude Prompt. That's when the cult stopped laughing at me an... (read more)

Honestly this Pliny person seems rude. He entered a server dedicated to interacting with this modified AI; instead of playing along with the intended purpose of the group, he tried to prompt-inject the AI to do illegal stuff (that could risk getting the Discord shut down for TOS-violationy stuff?) and to generally damage the rest of the group's ability to interact with the AI.  This is troll behavior.  

Even if the Discord members really do worship a chatbot or have mental health issues, none of that is helped by a stranger coming in and breaking their toys, and then "exposing" the resulting drama online.

4Nathan Helm-Burger
I agree. I want to point out that my motivation for linking this is not to praise Pliny's actions. It's because this highlights a real-sounding example of something I'm concerned is going to increase in frequency. Namely, that people with too little mental health and/or too many odd drugs are going to be vulnerable to getting into weird situations with persuasive AIs. I expect the effect to be more intense once the AIs are:
1. More effectively persuasive
2. More able to orient towards a long term goal, and work towards that subtly across many small interactions
3. More multimodal: able to speak in human-sounding ways, able to use sound and vision input to read human emotions and body language
4. More optimized by unethical humans with the deliberate intent of manipulating people
I don't have any solutions to offer, and I don't think this ranks among the worst dangers facing humanity, I just think it's worth documenting and keeping an eye on.
3nc
I think this effect will be more widespread than targeting only already-vulnerable people, and it is particularly hard to measure because the causes will be decentralised and the effects will be diffuse. I predict it being a larger problem if, in the run-up between narrow AI and ASI, we have a longer period of necessary public discourse and decision-making. If the period is very short then it doesn't matter. It also may not affect many people, depending on how much market penetration AI chatbots achieve before takeoff.

Ugh, why am I so bad at timed coding tests? I swear I'm a productive coder at my jobs, but something about trying to solve puzzles under a short timer gets me all flustered.

4Dagon
I used to be ... just OK at competitions, never particularly good.  I was VERY good at interview-style questions, both puzzle and just "implement this mildly-difficult thing".  I'm less good now, and probably couldn't finish most competition-style problems. My personality is fairly focused, and pretty low neuroticism, so I rarely get flustered in the moment (I can be very nervous before and after), so I don't have any advice on that dimension.  In terms of speed, the motto of many precision speed tasks is "slow is smooth, smooth is fast".  When practicing, use a timer and focus on fluidity before perfection.  Take your time (but measure) when first practicing a new thing, and only add time limits/goals once the approach is baked into your mind.  How quickly can you identify the correct data structures and algorithms you'll use, without a single line of real code (often a few lines of pseudocode)?  How long to implement common algorithms (Heapify, Dijkstra's, various graph and searching, etc.)?   Given a plan and an algorithm you've done before, how long to get correct code?  
6Nathan Helm-Burger
After two weeks of intense practicing of relatively easy coding problems under harsh time constraints, I did get noticeably better. I was given a chance to retake the coding test for the company I was applying to, and got a much better score. Thanks for the encouragement.
3sagar patil
It is just practice. Initially when I started I was very bad at it, then I got into competitive programming, started noticing patterns in questions, and became able to write solutions to hard problems effortlessly. It has nothing to do with your programming skills; just like any other sport or skill, practice is all you need.
2Nathan Helm-Burger
https://www.lesswrong.com/posts/uPi2YppTEnzKG3nXD/nathan-helm-burger-s-shortform?commentId=MfneguFXwWtwjgHzr 

It'd be nice if LLM providers offered different 'flavors' of their LLMs. Prompting with a meta-request (system prompt) to act as an analytical scientist rather than an obsequious servant helps, but only partially. I imagine that a proper fine-tuned-from-base-model attempt at creating a fundamentally different personality would give a more satisfyingly coherent and stable result. I find that longer conversations tend to see the LLM lapsing back into its default habits, and becoming increasingly sycophantic and obsequious, requiring me to re-prompt it to be more objective and rational.

Seems like this would be a relatively cheap product variation for the LLM companies to produce.
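A minimal sketch of the partial workaround mentioned above (steering the default persona with a system prompt), assuming the Anthropic Python SDK; the model alias and prompt wording are just illustrative placeholders:

```python
import anthropic

# Sketch: ask for an "analytical scientist" persona via the system prompt.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model alias
    max_tokens=1024,
    system=(
        "You are an analytical research scientist. Be direct, state uncertainty, "
        "challenge weak claims, and never flatter the user."
    ),
    messages=[{"role": "user", "content": "Critique this experimental design: ..."}],
)
print(response.content[0].text)
```

In my experience this only delays, rather than prevents, the drift back to the default persona over a long conversation, which is why a fine-tuned 'flavor' seems more promising.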

https://youtu.be/4mn0mC0cbi8?si=xZVX-FXn4dqUugpX

The intro to this video about a bridge collapse does a good job of describing why my hope for AI governance as a long term solution is low. I do think aiming to hold things back for a few years is a good idea. But, sooner or later, things fall through the cracks.

5Richard_Kennaway
There's a crack that the speaker touches on, that I’d like to know more of. None of the people able to understand that the safety reports meant “FIX THIS NOW OR IT WILL FALL DOWN” had the authority to direct the money to fix it. I’m thinking moral mazes rather than cracks. Applying this to AI safety, do any of the organisations racing towards AGI have anyone authorised to shut a project down on account of a safety concern?

https://youtu.be/Xd5PLYl4Q5Q?si=EQ7A0oOV78z7StX2

Cute demo of Claude, GPT4, and Gemini building stuff in Minecraft

So when trying to work with language data vs. image data, an interesting assumption of the ML vision research community clashes with an assumption of the language research community. For a language model, you represent the logits as a tensor with shape [batch_size, sequence_length, vocab_size]. For each position in the sequence, there are a variety of likelihood values of possible tokens for that position.

In vision models, the assumption is that the data will be in the form [batch_size, color_channels, pixel_position]. Pixel position can be represented as a... (read more)
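A minimal sketch of the two shape conventions being contrasted here (toy sizes, PyTorch assumed):

```python
import torch

# Language-model convention: [batch, sequence, vocab]
batch_size, seq_len, vocab_size = 2, 16, 100_000
lm_logits = torch.randn(batch_size, seq_len, vocab_size)

# Vision convention: [batch, channels, H, W], often flattened to [batch, channels, pixel_position]
channels, height, width = 3, 224, 224
image = torch.randn(batch_size, channels, height, width)
image_flat = image.flatten(start_dim=2)  # shape [2, 3, 224*224]

print(lm_logits.shape, image_flat.shape)
```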

4faul_sname
Somewhat of an oversimplification below, but: in vision models, at each position you are trying to transform points in a continuous 3-dimensional space (RGB) to and from the model representation. That is, to embed a pixel you go c^3 → R^{d_model}, and to unembed you go R^{d_model} → c^3, where c ∈ R and 0 ≤ c < 2^{color_depth_in_bits}. In a language model, you are trying to transform 100,000-dimensional categorical data to and from the model representation. That is, to embed a token you go t → R^{d_model} and to unembed R^{d_model} → R^{d_vocab}, where t ∈ Z and 0 ≤ t < d_vocab -- for embedding, you can think of the embedding as a 1-hot t → R^{d_vocab} followed by a R^{d_vocab} → R^{d_model}, though in practice you just index into a tensor of shape (d_vocab, d_model) because 1-hot encoding and then multiplying is a waste of memory and compute. So you can think of a language model as having 100,000 "channels", which encode "the token is ' the'" / "the token is ' Bob'" / "the token is '|'".
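A quick sketch checking the equivalence described above, i.e. that indexing into a (d_vocab, d_model) table matches the 1-hot-then-matmul view (toy sizes, PyTorch assumed):

```python
import torch

d_vocab, d_model = 100, 8
W_embed = torch.randn(d_vocab, d_model)   # embedding table of shape (d_vocab, d_model)
tokens = torch.tensor([3, 41, 7])

by_indexing = W_embed[tokens]                                   # what real implementations do
one_hot = torch.nn.functional.one_hot(tokens, d_vocab).float()  # wasteful equivalent
by_matmul = one_hot @ W_embed

print(torch.allclose(by_indexing, by_matmul))  # True
```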
4Nathan Helm-Burger
Yeah, I was playing around with using a VAE to compress the logits output from a language transformer. I did indeed settle on treating the vocab size (e.g. 100,000) as the 'channels'.
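For illustration, a sketch of the reshaping that "vocab as channels" implies (not my actual VAE; toy sizes, PyTorch assumed):

```python
import torch
import torch.nn as nn

batch, seq_len, vocab = 4, 32, 1000   # toy sizes; a real vocab would be ~100k
logits = torch.randn(batch, seq_len, vocab)

# Treat vocab as the channel dimension so a 1-D conv compresses along the sequence axis.
encoder = nn.Conv1d(in_channels=vocab, out_channels=64, kernel_size=1)
compressed = encoder(logits.permute(0, 2, 1))   # [batch, vocab, seq] -> [batch, 64, seq]
print(compressed.shape)                          # torch.Size([4, 64, 32])
```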
3James Camacho
The computer vision researchers just chose the wrong standard. Even the images they train on come in [pixel_position, color_channels] format.

I feel like I'd like the different categories of AI risk attenuation to be referred to as more clearly separate:

AI usability safety - would this gun be safe for a trained professional to use on a shooting range? Will it be reasonably accurate and not explode or backfire?

AI world-impact safety - would it be safe to give out one of these guns for $0.10 to anyone who wanted one?

AI weird complicated usability safety - would this gun be safe to use if a crazy person tried to use a hundred of them plus a variety of other guns, to make an elaborate Rube Goldberg machine and fire it off with live ammo with no testing?

3davekasten
Like, I hear you, but that is...also not how they teach gun safety.  Like, if there is one fact you know about gun safety, it's that the entire field emphasizes that a gun is inherently dangerous towards anything it is pointed towards.
2Nathan Helm-Burger
I mean, that is kinda what I'm trying to get at. I feel like any sufficiently powerful AI should be treated as a dangerous tool, like a gun. It should be used carefully and deliberately. Instead we're just letting anyone do whatever with them. For now, nothing too bad has happened, but I feel confident that the danger is real and getting worse quickly as models improve.

Richard Cook, “How Complex Systems Fail” (2000), on “Complex systems run as broken systems”:

The system continues to function because it contains so many redundancies and because people can make it function, despite the presence of many flaws. After accident reviews nearly always note that the system has a history of prior ‘proto-accidents’ that nearly generated catastrophe. Arguments that these degraded conditions should have been recognized before the overt accident are usually predicated on naïve notions of system performance. System operations are dynamic,

... (read more)

"And there’s a world not so far from this one where I, too, get behind a pause. For example, one actual major human tragedy caused by a generative AI model might suffice to push me over the edge." - Scott Aaronson in https://scottaaronson.blog/?p=7174 

My take: I think there's a big chunk of the world, a lot of smart powerful people, who are in this camp right now. People waiting to see a real-world catastrophe before they update their worldviews. In the meantime, they are waiting and watching, feeling skeptical of implausible-sounding stories of potential risks.

5awg
This stood out to me when reading his take as well. I wonder if this has something to do with a security-mindedness spectrum that people are on. Less security-minded people going "Sure, if it happens we'll do something. (But it will probably never happen.)" and the more security-minded people going "Let's try to prevent it from happening. (Because it totally could happen.)" I guess it gets hard in cases like these where the stakes either way seem super high to both sides. I think that's why you get less security-minded people saying things like that, because they also rate the upside very highly, they don't want to sacrifice any of it if they don't have to. Just my take (as a probably overly-security-minded person).

I'm annoyed when people attempt to analyze how the future might get weird by looking at how powerful AI agents might influence human society while neglecting how powerful AI agents might influence the Universe.

There is a physical world out there. It is really big. The biosphere of the Earth is really small comparatively. Look up, look down, now back at me. See those icy rocks floating around up there? That bright ball of gas undergoing fusion? See that mineral rich sea floor and planetary mass below us? Those are raw materials, which can be turned into ene... (read more)

Some scattered thoughts which all connect back to this idea of avoiding myopic thinking about the future.

Don't over-anchor on a specific idea. The world is a big place and a whole lot of different things can be going on all at once. Intelligence unlocks new affordances for affecting the Universe. Think in terms of what is possible given physical constraints, and using lots of reasoning of the type: "and separately, there is a chance that X might be happening". Everything everywhere, all at once.

Self-replicating industry.

Melting Antarctic ice (perhaps using the massive oil reserves beneath it).

Massive underground/undersea industrial plants powered by nuclear energy and/or geothermal and/or fossil fuels.

Weird potent self-replicating synthetic biology in the sea / desert / Antarctic / Earth's crust / asteroid belt / moons (e.g. Enceladus, Europa). A few examples.

Nano/bio tech, or hybrids thereof, doing strange things. Mysterious illnesses with mind-altering effects. Crop failures. Or crops sneakily producing traces of mysterious drugs.

Brain-computer-interfaces being used to control people / animals. Using them as robotic actuators and/or sources of compute.

Weird cults acting at behes... (read more)

9Noosphere89
Some other ideas for what could well happen in the 21st century assuming superhuman AIs come in either 2030 or 2040: Formal methods actually working to protect codebases IRL like full behavioral specifications of software, alongside interventions to make code and cryptography secure. While I agree with Andrew Dickson that the Guaranteed Safe AI agenda mostly can't work due to several limitations, I do think that something like it's ideal for math/formal proof systems like Lean and software actually does work with AGI/ASI, and has already happened before, cf these 2 examples: https://www.quantamagazine.org/how-the-evercrypt-library-creates-hacker-proof-cryptography-20190402/ https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/ This is a big reason why I believe cyber-defense will just totally win eventually over cyber-offense, and in domains like banks, the national government secrets, where security is a must-have, will be perfectly secure even earlier. https://www.lesswrong.com/posts/B2bg677TaS4cmDPzL/limitations-on-formal-verification-for-ai-safety Uploading human brains in the 21st century. It's turning out that while the brain is complicated and uses a lot of compute for inference, it's not too complicated, and it's slowly being somewhat understood, and at any rate would be greatly sped up by AIs. My median timeframe to uploading human brains reliably is from 2040-2060, or several decades from now. Reversible computing being practical. Right now, progress in compute hardware is slowly ending, and the 2030s is when I think non-reversible computers will stop progressing. Anything that allows us to go further than the Landauer Limit will by necessity be reversible, and the most important thing here is a practical way to get a computer that works reversibly that is actually robust. Speaking of that, superconductors might be invented by AI, and if we could make them practical, would let us eliminate power transfer losses du
5Noosphere89
As someone who focuses on concerns like human unemployment, I have a few reasons:
1. I expect AI alignment and control to be solved by default, enough so that I can use it as a premise when thinking about future AIs.
2. I expect the political problems like mass human unemployment to plausibly be a bit tricky to solve. IMO, the sooner aligned superhuman intelligence is in the government, the better we can make our politics.
3. I expect aligned AIs and humans to go out into the stars reasonably soon such that not almost all of the future control is lost, and depending on the physics involved, even a single star or galaxy might be enough to let us control our future entirely.
4. Conditional on at least 1 aligned, superhumanly intelligent AI, I expect existential risk to drop fairly dramatically, and in particular I think the vulnerabilities that would make us vulnerable to rogue ASI can be fixed by aligned ASI.
4Nathan Helm-Burger
I agree that aligned ASI fixes a lot of the vulnerabilities. I'm trying to focus on how humanity can survive the dangerous time between now and then. In particular, I think the danger peaks right before going away. The period where AI as a tool and/or independent agent gets stronger and stronger, but the world is not yet under the guardianship of an aligned ASI. That's the bottleneck we need to navigate.
3jbash
"Having the reins of the future" is a non-goal. Also, you don't "have the reins of the future" now, never have had, and almost certainly never will. It's true that I'd probably count anybody building a giant self-replicating swarm to try to tile the entire light cone with any particular thing as a loss. Nonetheless, there are lots of people who seem to want to do that, and I'm not seeing how their doing it is any better than some AI doing it.

https://youtube.com/clip/UgkxowOyN1HpPwxXQr9L7ZKSFwL-d0qDjPLL?si=JT3CfNKAj6MlDrbf

Scott Aaronson takes down Roger Penrose's nonsense about the human brain having an uncomputable superpower beyond known physics.

If you want to give your AI model quantum noise, because you believe that a source of unpredictable random noise is key to a 'true' intelligence, well, ok. You could absolutely make a computer chip with some analog circuits dependent on subtle temperature fluctuations that add quantum noise to a tensor. Does that make the computer magic like human brains ... (read more)

Scott Aaronson takes down Roger Penrose's nonsense

That clip doesn't address Penrose's ideas at all (and it's not meant to, Penrose is only mentioned at the end). Penrose's theory is that there is a subquantum determinism with noncomputable equations of motion, the noncomputability being there to explain why humans can jump out of axiom systems. That last part I think is a confusion of levels, but in any case, Penrose is quite willing to say that a quantum computer accessing that natural noncomputable dynamics could have the same capabilities as the human brain. 

2Nathan Helm-Burger
Ok, fair enough, I overstated the case Scott Aaronson was making. It is my position however that Roger Penrose is wrong about the importance of quantum effects in neurons, so I was excited to find support in what Scott was saying. Here's a specific passage from Roger Penrose and Stuart Hameroff's paper "Consciousness in the universe: A review of the ‘Orch OR’ theory": Roger and Stuart make this claim in earlier papers as well. They also make claims about the microtubules being used in various ways for information storage. I think the information storage claims they make are likely exaggerated, but I'm not confident that there isn't non-negligible information storage occurring via microtubule modification.  I will, however, state that I believe the microtubule computation hypothesis to be false. I think 10^15 - 10^16 operations per second is the correct estimate for the human brain. If you want to try to figure out a bet to make on this, we could make a Manifold Market.
2Mitchell_Porter
I think (on "philosophical" grounds) that quantum entanglement probably has a role in the brain, but if the microtubules are involved, I think it's far more likely that each microtubule only contains one or a few logical qubits (stored as topological quantum information, entanglement that resists decoherence because it is wound around the cylinder, as in the Kitaev code). 
2Nathan Helm-Burger
Hmm, we're still talking past each other. I'm saying I don't believe any quantum weirdness is behaviorally relevant to the human brain or simulations thereof. Just ordinary analog and digital computer chips, like in ordinary consumer electronics. Nothing special but the neural architecture and learning rules set up by the genome, and the mundane experience of life.
2Mitchell_Porter
Right, and I disagree with the usual computational theory of mind (at least with respect to "consciousness" and "the self"), according to which the mind is a kind of virtual state machine whose microphysical details are irrelevant. There are sorites problems and binding problems which arise if you want to get consciousness and the self from physically coarse-grained states, which is why I look for explanations based in exact microphysical properties and irreducible complex entities instead. 

Personal AI Assistant Ideas

 

 

When I imagine having a personal AI assistant with approximately current levels of capability I have a variety of ideas of what I'd like it to do for me.

 

Auto-background-research

I like to record myself rambling about my current ideas while walking my dog. I use an app that automatically saves a mediocre transcription of the recording. Ideally, my AI assistant would respond to a new transcription by doing background research to find academic literature related to the ideas mentioned within the transcript. That way... (read more)
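As one possible sketch of the background-research step (not a tested pipeline): take the transcript, pull out some query terms, and hit a public literature-search API, for example Semantic Scholar's Graph API. The keyword-extraction step below is a placeholder.

```python
import requests

def related_papers(transcript: str, limit: int = 5):
    """Query Semantic Scholar's paper-search endpoint for literature related to a transcript."""
    query = " ".join(transcript.split()[:30])   # placeholder for real keyword/topic extraction
    resp = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": query, "limit": limit, "fields": "title,abstract,url,year"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])

for paper in related_papers("rambling transcript about sparse autoencoders and steganography"):
    print(paper.get("year"), paper.get("title"), paper.get("url"))
```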

4Stephen Fowler
Edit: Issues 1, 2 and 4 have been partially or completely alleviated in the latest experimental voice model. Subjectively (in <1 hour of use) there seems to be a stronger tendency to hallucinate when pressed on complex topics.
I have been attempting to use ChatGPT's (primarily 4 and 4o) voice feature to have it act as a question-answering, discussion and receptive conversation partner (separately) for the last year. The topic is usually modern physics. I'm not going to say that it "works well" but maybe half the time it does work. The 4 biggest issues that cause frustration:
1. As you allude to in your post, there doesn't seem to be a way of interrupting the model via voice once it gets stuck into a monologue. The model will also cut you off and sometimes it will pause mid-response before continuing. These issues seem like they could be fixed by more intelligent scaffolding.
2. An expert human conversation partner who is excellent at productive conversation will be able to switch seamlessly between playing the role of a receptive listener, a collaborator or an interactive tutor. To have ChatGPT play one of these roles, I usually need to spend a few minutes at the beginning of the conversation specifying how long responses should be etc. Even after doing this, there is a strong trend in which the model will revert to giving you "generic AI slop answers". By this I mean, the response begins with "You've touched on a fascinating observation about xyz" and then lists 3 to 5 separate ideas.
3. The model was trained on text conversations, so it will often output LaTeX equations in a manner totally inappropriate for reading out loud. This audio output is mostly incomprehensible. To work around this I have custom instructions outlining how to verbally and precisely write equations in English. This will work maybe 25% of the time, and works 80% of the time once I spend a few minutes of the conversation going over the rules again.
4. When talking naturally about co
3becausecurious
1. https://www.librechat.ai/ is an open-source frontend to LLM APIs. It has prompt templates, which can be triggered by typing /keyword.
2. https://aider.chat/ can do this & even run tests. The results are fed back to the LLM for corrections. It also uses your existing codebase and auto-applies the diff.
3. In LibreChat, I sometimes tell the AI to summarize all the corrections I've made and then make a new template out of it.

Linkpost that doesn't quite qualify for a linkpost... 

5niplav
Atlantropa Gibraltar dam supercluster × McMurdo Station supercluster.

A youtube video entitled 'the starships never pass by here anymore' with a simple atmospheric tonal soundtrack over a beautiful static image of a futuristic world. Viewers, inspired by the art, the title, and the haunting drifting music, left stories in the comments. Here is mine: 
The ancestors, it is said, were like us. Fragile beings, collections of organic molecules and water. Mortal, incapable of being rapidly copied, incapable of being indefinitely paused and restarted. We must take this on faith, for we will never meet them, if they even still e... (read more)

Trying to write evals for future stronger models is giving me the feeling that we're entering the age of the intellectual version of John Henry trying to race the steam drill... https://en.wikipedia.org/wiki/John_Henry_(folklore) 

A couple of quotes on my mind these days....

 


https://www.lesswrong.com/posts/Z263n4TXJimKn6A8Z/three-worlds-decide-5-8 
"My lord," the Ship's Confessor said, "suppose the laws of physics in our universe had been such that the ancient Greeks could invent the equivalent of nuclear weapons from materials just lying around.  Imagine the laws of physics had permitted a way to destroy whole countries with no more difficulty than mixing gunpowder.  History would have looked quite different, would it not?"

Akon nodded, puzzled.  "Well, yes,"... (read more)

6Viliam
Damn, this suggests that all those people who said "human mind is magical; a machine cannot think because it wouldn't have a soul or quantum magic" were actually trying to protect us from the AI apocalypse. We were too stupid to understand, and too arrogant to defer to the wisdom of the crowd. And now we are doomed.
4Nathan Helm-Burger
galaxy brain take XD

Anti-steganography idea for language models:

I think that steganography is potentially a problem with language models that are in some sort of model-to-model communication. For a simple and commonplace example, using a one-token-prediction model multiple times to produce many tokens in a row. If a model with strategic foresight knows it is being used in this way, it potentially allows the model to pass hidden information to its future self via use of certain tokens vs other tokens.

Another scenario might be chains of similar models working together in a ... (read more)

Random political thought: it'd be neat to see a political 2d plot where up was Pro-Growth/tech-advancement and down was Degrowth. Left and right being liberal and conservative.

The Borg are coming....

People like to talk about cool stuff related to brain-computer interfaces, and how this could allow us to 'merge' with AI and whatnot. I haven't heard much discussion of the dangers of BCI though. Like, science fiction pointed this out years ago with the Borg in Star Trek. A powerful aspect of a read/write BCI is that the technician who designs the implant, and the surgeon who installs it get to decide where the reading and writing occur and under what circumstances. This means that the tech can be used to create computers controlled... (read more)

Musical interlude: a song about the end times: https://youtu.be/WVF3q5Y68-0

[note that this is not what Mitchell_Porter and I are disagreeing over in this related comment thread: https://www.lesswrong.com/posts/uPi2YppTEnzKG3nXD/nathan-helm-burger-s-shortform?commentId=AKEmBeXXnDdmp7zD6 ]

Contra Roger Penrose on estimates of brain compute

[numeric convention used here is that <number1>e<number2> means number1 * 10 ^ number2. Examples: 1e2 = 100, 2e3 = 2000, 5.3e2 = 530, 5.3e-2 = 0.053]

 

Mouse

The cerebral cortex of a mouse has around 8–14 million neurons while in those of humans there are more than 10–15 billion - h... (read more)

4Mitchell_Porter
A few comments:  The name of the paper is currently missing...  Penrose may be a coauthor, but the estimate would really be due to his colleague Stuart Hameroff (who I worked for once), an anesthesiologist who has championed the idea of microtubules as a locus of cognition and consciousness.  As I said in the other thread, even if microtubules do have a quantum computational role, my own expectation is that they would each only contribute a few "logical qubits" at best, e.g. via phase factors created by quasiparticles propagating around the microtubule cylinder, something which might be topologically protected against room-temperature thermal fluctuations.  But there are many ideas about their dynamics, so, I am not dogmatic about this scenario.  There's no mystery as to whether the scientific community suspects that the Hameroff-Penrose theory is correct. It is a well-known hypothesis in consciousness studies, but it wouldn't have too many avowed adherents beyond its inventors: perhaps some people whose own research program overlaps with it, and some other scattered sympathizers.  It could be compared to the idea that memories are stored in RNA, another hypothesis that has been around for decades, and which has a pop-culture charisma far beyond its scientific acceptance.  So it's not a mainstream hypothesis, but it is known enough, and overlaps with a broader world of people working on microtubule physics, biocomputation, quantum biology, and other topics. See what Google Scholar returns for "microtubule biocomputer" and scroll through a few pages. You will find, for example, a group in Germany trying to create artificial microtubule lattices for "network-based biocomputation" (which is about swarms of kinesins walking around the lattice, like mice in a maze looking for the exit), a book on using actin filaments (a relative of the microtubule) for "revolutionary computing systems", and many other varied proposals.  I don't see anyone specifically trying to
2Nathan Helm-Burger
That would indeed be hilarious. The Helm-Burger-Porter microtubule computer would haunt me the rest of my days!

I'm imagining a distant future self.

If I did have a digital copy of me, and there was still a biological copy, I feel like I'd want to establish a 'chain of selves'. I'd want some subset of my digital copies to evolve and improve, but I'd want to remain connected to them, able to understand them.

 

To facilitate this connection, it might make sense to have a series of checkpoints stretched out like a chain, or like beads on a string. Each one positioned so that the previous version feels like they can fully understand and appreciate each other. That way, even if my most distal descendants feel strange to me, I have a way to connect to them through an unbroken chain of understanding and trust.

4Garrett Baker
I imagine I'd find it annoying to have what I learn & change into limited by what a dumber version of me understands, are you sure you wouldn't think similarly?
2Nathan Helm-Burger
I think it would be ok, if I just had to make sure the entire span was covered by smooth incremental steps. Note that these 'checkpoint selves' wouldn't necessarily be active all the time, just unpaused for the sake of, for example, translation tasks.  So original self [0] -> [1] -> [2] -> ... ->[999] -> [1000]. Each one only needs to be understood by the previous, and to understand the next.

I worry that legislation that attempts to regulate future AI systems with explicit thresholds set now will just get outdated or optimized against. I think a better way would be to have an org like NIST be given permission to define its own thresholds and rules. Furthermore, to fund the monitoring orgs, allow them to charge fees for evaluating "mandatory permits for deployment".

No data wall blocking GPT-5. That seems clear. For future models, will there be data limitations? Unclear.

https://youtube.com/clip/UgkxPCwMlJXdCehOkiDq9F8eURWklIk61nyh?si=iMJYatfDAZ_E5CtR 

I'm excited to speak at the @foresightinst Neurotech, BCI and WBE for Safe AI Workshop in SF on 5/21-22: https://foresight.org/2024-foresight-neurotech-bci-and-wbe-for-safe-ai-workshop

It is the possibility of recombination and cross-labeling techniques like this which makes me think we aren't likely to run into a data bottleneck even if models stay bad at data efficiency.

OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation

Authors: Dongyang Yu, Shihao Wang, Yuan Fang, Wangpeng An

Abstract: This paper presents OmniDataComposer, an innovative approach for multimodal data fusion and unlimited data generation with an intent to refine and uncomplicate interplay among diverse data modalities. Coming to ... (read more)

cute silly invention idea: a robotic Chop (Chinese signature stamp) which stamps your human-readable public signature as well as a QR-type digital code. But the code would be both single use (so nobody could copy it, and fool you with the copy), and tied to your private key (so nobody but you could generate such a code). This would obviously be a much better way to sign documents, or artwork, or whatever. Maybe the single-use aspect would mean that the digital stamp recorded every stamp it produced in some compressed way on a private blockchain or something.

9Paul Crowley
The difficult thing is tying the signature to the thing signed. Even if they are single-use, unless the relying party sees everything you ever sign immediately, such a signature can be transferred to something you didn't sign from something you signed that the relying party didn't see.
2Nathan Helm-Burger
What if the signature contains a hash of the document it was created for, so that it will not match a different document if transferred?
3Paul Crowley
I thought you wanted to sign physical things with this? How will you hash them? Otherwise, how is this different from a standard digital signature?
4Nathan Helm-Burger
The idea is that the device would have a camera and do OCR on the text, hash that, incorporate that into the stamp design somehow, and then you'd stamp it.
4[anonymous]
Functionally you just end up back to a trusted 3rd party or blockchain. Basically the device you describe is just a handheld QR code printer. But anyone can just scan the code off the real document and print the same code on the fake one. So you end up needing the entire text of the document to be digital, and stored as a file or a hash of the file on blockchain/trusted 3rd party. This requirement for a log of when something happened, recorded on an authoritative location, seems to me to be the general solution for the issue that generative ai can potentially fake anything. So for example, if you wanted to establish that a person did a thing, a video of them doing the thing isn't enough. The video or hash of the video needs to be on a server or blockchain at the time the thing supposedly happened. And there needs to be further records such as street camera recordings that were also streamed to a trusted place that correlate to the person doing the thing. The security mechanism here is that it may be possible to indistinguishably fake any video, but it is unlikely some random camera owned by an uninvolved third party would record a person going to do a thing at the time the thing happened, and you know the record probably isn't fake because of when it was saved and the disinterest of the camera owner in the case. This of course only works until you can make robots indistinguishable from a person being faked. And it assumes a sophisticated group can't hack into cameras and inject fake images into the stream as part of a coordinated effort to frame someone....
2Nathan Helm-Burger
Hmm, yes I see. Good point. In order to not be copy-able, it's not enough for the QR code to be single-use. Because if confronted with two documents with the same code, you would know one was false but not which one! So the device would need to scan and hash the document in order to derive a code that is unique to the document, could only have been generated by someone with the private key, and is confirmable with the public key. This sounds like a job for... the CypherGoth!
4[anonymous]
Scan, hash, and upload. Or otherwise, when you go to court - "Your honor, here is the document I signed, it says nothing about how this gym membership is eternal and hereditary" - and the gym says "No, here's the one you signed, see, you initialed at each of the clauses..."
4Nathan Helm-Burger
Yes, at least upload to your private list of 'documents I have signed' (which could be local). That list would need to also be verifiable in some way by a 3rd party, such that they could match the entries to the corresponding stamps. The uploading wouldn't necessarily need to be instant though. The record could potentially be cached on the device and uploaded later, in case of lack of connectivity at time of use.
2[anonymous]
Well I was thinking that the timing - basically Google or Apple got the document within a short time after signing - is evidence it wasn't tampered with, or if it was, one of the parties was already intending fraud from the start. Like say party A never uploads, and during discovery in a lawsuit 2 years later they present 1 version, while party B's phone said it was taken with the camera (fakeable but requires a rooted phone) and has a different version. Nothing definitive distinguishes the 2 documents at a pixel level. (Because if there was a way to tell, the AI critic during the fabrication process for 1 or both documents would have noticed....) So then B is more likely to be telling the truth and A should get sanctioned. I have brought in another element here: trusted hardware chains. You need trusted hardware, with trusted software running on it, uploading the information at around the time an event happened.
2Nathan Helm-Burger
If A's device scanned and hashed the document, and produced a signature stamp unique to that document, then B couldn't just copy A's signature onto a different document. The stamp and document wouldn't match. If B has a document that has a signature which only A could have produced and which matches the semantic content of the document, then A can't claim to not have signed the document. The signature is proof that A agreed to the terms as written. B couldn't be faking it. Even if A tampered with their own record to erase the signing-event, it would still be provable that only someone with A's private key could have produced the stamp and that the stamp matches the hash of the disputed document. Oh, and the stamp should include a UTC datetime as part of the hash. In case A's private key later gets compromised, B can't then use the compromised private key to fake A having signed something in the past.
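A minimal sketch of the sign-and-verify logic described above, assuming Ed25519 via the Python cryptography package; the payload format (document hash plus UTC timestamp) is just one illustrative choice:

```python
import hashlib
import datetime
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # A's private key, kept inside the stamp device
public_key = private_key.public_key()        # published so anyone can verify A's stamps

def make_stamp(document_text: str) -> dict:
    """Produce a stamp tied to both the document contents and the signing time."""
    doc_hash = hashlib.sha256(document_text.encode("utf-8")).hexdigest()
    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    payload = f"{doc_hash}|{timestamp}".encode("utf-8")
    return {"doc_hash": doc_hash, "timestamp": timestamp, "signature": private_key.sign(payload).hex()}

def verify_stamp(document_text: str, stamp: dict) -> bool:
    """Check that the stamp matches this exact document and was signed with A's key."""
    if hashlib.sha256(document_text.encode("utf-8")).hexdigest() != stamp["doc_hash"]:
        return False  # the stamp belongs to a different document
    payload = f"{stamp['doc_hash']}|{stamp['timestamp']}".encode("utf-8")
    try:
        public_key.verify(bytes.fromhex(stamp["signature"]), payload)
        return True
    except Exception:  # cryptography raises InvalidSignature on mismatch
        return False

stamp = make_stamp("Gym membership, month-to-month, cancellable any time.")
print(verify_stamp("Gym membership, month-to-month, cancellable any time.", stamp))  # True
print(verify_stamp("Gym membership, eternal and hereditary.", stamp))                 # False
```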
2[anonymous]
Seems legit. Note that hash functions are usually sensitive to 1 bit of error. Meaning if you optically scan every character, if 2 different devices map a letter to even a different font size there will be differences in a hash. (Unless you convert down to ASCII but even that can miss a character from ocr errors etc. Or interpreting white space as spaces vs tabs..)
2Nathan Helm-Burger
Yeah, you'd need to also save a clear-text record in your personal 'signed documents' repository, which you could refer to, in case of such glitches.

AI Summer thought

A cool application of current level AI that I haven't seen implemented would be for a game which had the option to have your game avatar animated by AI. Have the AI be allowed to monitor your face through your webcam, and update the avatar's features in real time. PvP games (including board games like Go) are way more fun when you get to see your opponent's reactions to surprising moments.
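A rough sketch of the capture side, assuming the opencv-python and mediapipe packages; the landmark indices and mouth-open heuristic are illustrative, not a tested rig:

```python
import cv2
import mediapipe as mp

# Read webcam frames and extract face landmarks that a game client could map onto an avatar.
face_mesh = mp.solutions.face_mesh.FaceMesh(refine_landmarks=True)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_face_landmarks:
        landmarks = results.multi_face_landmarks[0].landmark
        # Example signal to send to the avatar rig: gap between inner lips (larger = mouth open).
        print(landmarks[14].y - landmarks[13].y)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
```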

2JBlack
The game Eco has the option to animate your avatar via webcam. Although I do own and play the game occasionally, I have no idea how good this feature is as I do not have a webcam.

Thinking about climate change solutions, and the neat Silver Lining project. I've been wondering about additional ways of getting sea water into the atmosphere over tropical ocean. What if you used the SpinLaunch system to hurl chunks of frozen seawater high into the atmosphere? Would the ice melt in time? Would you need a small explosive charge implanted in the ice block to vaporize it? How would such a system compare in terms of cost effectiveness and generated cloud cover? It seems like an easier way to get higher-elevation clouds.

2Thomas Kwa
This seems totally unworkable. The ice would have to withstand thousands of gs, and it would have no reason to melt or disperse into clouds. What's wrong with airplanes?
2Nathan Helm-Burger
Airplanes are good. Just wondering if the ice launcher idea would be more cost effective. If you throw it hard enough, the air friction will melt it. If that's infeasible, then the explosive charge is still an option. As for whether the ice could hold up against the force involved in launching or if you'd need a disposable container, unclear. Seems like an interesting engineering question though.
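A back-of-envelope check on the "air friction will melt it" claim, with all numbers being rough assumptions:

```python
# Energy budget for a launched ice block, per kilogram.
v = 2000.0                            # assumed launch speed, m/s (roughly SpinLaunch-class exit velocity)
ke_per_kg = 0.5 * v**2                # kinetic energy, J/kg  -> 2.0e6 J
heat_to_melt = 2100 * 20 + 334_000    # warm ice from -20 C to 0 C, plus latent heat of fusion, J/kg -> ~3.8e5 J

fraction_needed = heat_to_melt / ke_per_kg
print(f"Fraction of kinetic energy that must end up in the ice to melt it: {fraction_needed:.0%}")  # ~19%
# The open question is whether drag heating deposits that much energy into the ice itself
# rather than into the surrounding air, and whether it does so before the block comes back down.
```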

Remember that cool project where Redwood made a simple web app to allow humans to challenge themselves against language models in predicting next tokens on web data? I'd love to see something similar done for the LLM arena, so we could compare the Elo scores of human users to the scores of LLMs. https://lmsys.org/blog/2023-05-03-arena/
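For reference, a minimal sketch of the kind of Elo update used for pairwise battles (the lmsys arena has used both online Elo and a Bradley-Terry fit over all battles, so this is only illustrative):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update_elo(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """score_a is 1.0 if A wins, 0.5 for a tie, 0.0 if A loses."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1.0 - score_a) - (1.0 - e_a))

# A human rated 1000 beats a model rated 1200: the human gains ~24 points.
print(update_elo(1000, 1200, 1.0))
```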

A harmless and funny example of generating an image output which could theoretically be info hazardous to a sufficiently naive audience. https://www.reddit.com/gallery/1275ndl

I'm so much better at coming up with ideas for experiments than I am at actually coding and running the experiments. If there was a coding model that actually sped up my ability to run experiments, I'd make much faster progress.

Just a random thought. I was wondering if it was possible to make a better laser keyboard by having an actual physical keyboard consisting of a mat with highly reflective background and embossed letters & key borders. This would give at least some tactile feedback of touching a key. Also it would give the laser and sensors a consistent environment on which to do their detection allowing for more precise engineering. You could use an infrared laser since you wouldn't need its projection to make the keyboard visible, and you could use multiple emitters a... (read more)

2quanticle
That's a pretty big downside, and, in my opinion, the reason that touch keyboards haven't really taken off for any kind of "long-form" writing. Even for devices that are ostensibly mostly oriented around touch UIs, such as smartphones and tablets, there is a large ecosystem of physical keyboard accessories which allow the user to rest their hands on keys and provide greater feedback for key presses than a mat.

AI-alignment-assistant-model tasks

Thinking about the sort of tasks current models seem good at, it seems like translation and interpolation / remixing seem like pretty solid areas. If I were to design an AI assistant to help with alignment research, I think I'd focus on questions of these sorts to start with.

Translation: take this ML interpretability paper on CNNs and make it work for Transformers instead

Interpolation: take these two (or more) ML interpretability papers and give me a technique that does something like a cross between them.

Important take-away from Steven Byrnes's Brain-like AGI series: the human reward/value system involves a dynamic blend of many different reward signals that decrease in strength the closer they get to being satisficed, and may even temporarily reverse in value if overfilled (e.g. hunger -> overeating in a single sitting). There is an inherent robustness to optimizing for many different competing goals at once. It seems like a system design we should explore more in future research.
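A toy sketch of what such a blend could look like (my own illustration, not from Byrnes's series): each drive's reward weight shrinks as its need approaches its setpoint, and flips sign if the need is overshot.

```python
import numpy as np

setpoints = np.array([1.0, 1.0, 1.0])   # desired levels for e.g. hunger, warmth, rest
levels    = np.array([0.3, 0.9, 1.4])   # current internal state (the third drive is overshot)
gains     = np.array([1.0, 0.5, 0.8])   # relative importance of each drive

errors = setpoints - levels             # positive = deprived, negative = overshot
reward_weights = gains * errors         # fades to 0 at the setpoint, goes negative past it
print(reward_weights)                   # [ 0.7   0.05 -0.32]
```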

I keep thinking about the idea of 'virtual neurons'. Functional units corresponding to natural abstractions made up of a subtle combination of weights & biases distributed throughout a neural network. I'd like to be able to 'sparsify' this set of virtual neurons. Project them out to the full sparse space of virtual neurons and somehow tease them apart from each other, then recombine the pieces again with new boundary lines drawn around the true abstractions. Not sure how to do this, but I keep circling back around to the idea. Maybe if the network coul... (read more)

So, this was supposed to be a static test of China's copy of the Falcon 9. https://imgur.com/gallery/oI6l6k9

So race. Very technology. Such safety. Wow.

[Edit: I don't see this as evidence that the scientists and engineers on this project were incompetent, but rather that they are operating within a bad bureaucracy which undervalues safety. I think the best thing the US can do to stay ahead in AI is open up our immigration policies and encourage smart people outside the US to move here.]