samhealy

Hydrocarbon person in Edinburgh, Scotland

Comments

(Your response and arguments are good, so take the below in a friendly and non-dogmatic spirit)

Good enough for what?

Good enough for time-pressed people (and lazy and corrupt people, but they're in a different category) to have a black-box system do things for them that, in its absence, they might have invested effort in doing themselves, and in the process, intended or not, increased their understanding and opened up new avenues of doing and understanding.

We're still in the "wow, an AI made this" stage.

I'm pretty sure we're currently exiting the "wow, an AI made this" stage, in the sense that 'fakes' in some domains are approaching undetectability.

We find that people don't value AI art, and I don't think that's because of its unscarcity or whatever, I think it's because it isn't saying anything

I strongly agree, but it's a slightly different point: that art is arguably not art if it was not made by an experiencing artist, whether flesh or artificial. 

My worry about the evaporation of (human) understanding covers science (and every other iterably-improvable, abstractable domain) as well as art, and the impoverishment of creative search space that might result from abstraction-offloading will not be strongly affected by cultural attitudes toward proximately AI-created art — even less so when there's no other kind left.

the machine needs to understand the needs of the world and the audience, and as soon as machines have that...

It's probably not what you were implying, but I'd complete your sentence like so: "as soon as machines have that, we will have missed the slim chance we have now to protect human understanding from being fully or mostly offloaded."

but where that tech is somehow not reliable and independent enough to be applied to ending the world...

I'm afraid I still can't parse this, sorry! Are you referring to AI Doom, or to the point at which A(G)I becomes powerful enough to end the world, even if it doesn't?

By the time we have to worry about preserving our understanding of the creative process against automation of it, we'll be on the verge of receiving post-linguistic knowledge transfer technologies and everything else, quicker than the automation can wreak its atrophying effects.

I don't have any more expertise or soothsaying power than you, so in the absence of priors to weight the options, I guess your opposing prediction is as likely as mine.

I'd just argue that the bad consequences of mine are bad enough to motivate us to try to stop it from coming true, even if it isn't guaranteed to.

I don't think I'm selling what you're not buying, but correct me if I misrepresent your argument:

The post seems to assume a future version of generative AI that no longer has the limitations of the current paradigm which obligate humans to check, understand, and often in some way finely control and intervene in the output...

Depending on your quality expectations, even existing GenAI can make good-enough content that would otherwise have required nontrivial amounts of human cognitive effort. 

but where that tech is somehow not reliable and independent enough to be applied to ending the world...

Ending the world? Where does that come in?

and somehow we get this long period where we get to feel the cultural/pedagogical impacts of this offloading of understanding, where it's worth worrying about, where it's still our problem. That seems contradictory.

If your main thrust is that by the time GenAI's outputs are reliable enough to trust implicitly we won't need to maintain any endogenous understanding because the machine will cook up anything we can imagine, I disagree. The space of 'anything we can imagine' will shrink as our endogenous understanding of concepts shrinks. It will never not be 'our problem'.

For what it's worth, I think even current, primitive-compared-to-what-will-come LLMs sometimes do a good job of (choosing words carefully here) compiling information packages that a human might find useful in increasing their understanding. It's very scattershot and always at risk of unsolicited hallucination, but in certain domains that are well and diversely represented in the training set, and for questions that have more or less objective answers, AI can genuinely aid insight. 

The problem is the gulf between can and does. For reasons elaborated in the post, most people are disinclined to invest in deep understanding if a shallower version will get the near-term job done. (This is in no way unique to a post-AI world, but in a post-AI world the shallower versions are falling from the sky and growing on trees.)

My intuition is that the fix would have to be something pretty radical involving incentives. Non-paternalistically we'd need to incentivise real understanding and/or disincentivise the short cuts. Carrots usually being better than sticks, perhaps a system of micro-rewards for those who use GenAI in provably 'deep' ways? [awaits a beatdown in the comments]

To clarify: I didn't just pick the figures entirely at random. They were based on the below real-world data points and handwavy guesses.

  • ChatGPT took 3.23 x 10^23 FLOPs to train
  • ChatGPT has a context window of 8K tokens
  • Each token is roughly equivalent to four 8-bit characters = 4 bytes, so the context window is roughly equivalent to 4 x 8192 bytes ≈ 32KB
  • The corresponding 'context window' for AIOS would need to be its entire 400MB+ input, a linear scaling factor of 1.25 x 10^4 from 32KB, but the increase in complexity is likely to be much faster than linear, say quadratic
  • AIOS needs to output as many of the 2^(200 x 8 x 10^6) output states as apply in its (intentionally suspect) definition of 'reasonable circumstances'. This is a lot lot lot bigger than an LLM's output space
  • (3.23 x 10^23) x (input scaling factor of 1.56 x 10^8) x (output scaling factor of a lot lot lot) = conservatively, 3.4 x 10^44
  • The current (September 2023) estimate of global compute capacity is 3.98 x 10^21 FLOPS. So if every microprocessor on earth were devoted to training AIOS, it would take about 10^23 seconds ≈ 3 x 10^15 years. Too long, I suspect. (A quick arithmetic check of these figures is sketched below.)
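
For anyone who wants to poke at the numbers, here is the same back-of-envelope arithmetic as a short Python sketch. Every constant is one of the handwavy estimates above, and 'AIOS' is the hypothetical system from the original post, not a real artifact.

```python
# Back-of-envelope check of the AIOS training-cost estimate above.
# Every constant here is a handwavy estimate from the bullet list, not a measured value.

chatgpt_train_flops = 3.23e23      # rough training compute for ChatGPT
context_bytes = 4 * 8192           # 8K tokens x ~4 bytes/token ≈ 32 KB
aios_input_bytes = 400e6           # assumed 400 MB 'context window' for a whole OS image

linear_scale = aios_input_bytes / context_bytes   # ≈ 1.2 x 10^4
quadratic_scale = linear_scale ** 2               # ≈ 1.5 x 10^8, assuming worse-than-linear growth

aios_train_flops = 3.4e44          # the 'conservative' total from above
implied_output_scale = aios_train_flops / (chatgpt_train_flops * quadratic_scale)  # the "a lot lot lot" factor

global_flops = 3.98e21             # rough global compute capacity, September 2023

seconds = aios_train_flops / global_flops
years = seconds / 3.15e7           # seconds per year

print(f"implied output factor ≈ {implied_output_scale:.1e}")   # ≈ 7 x 10^12
print(f"training time ≈ {seconds:.1e} s ≈ {years:.1e} years")  # ≈ 8.5 x 10^22 s ≈ 2.7 x 10^15 years
```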

I'm fully willing to have any of this, and the original post's argument, laughed out of court given sufficient evidence. I'm not particularly attached to it, but haven't yet been convinced it's wrong.

Agreed, largely. 

To clarify, I'm not arguing that AI can't surpass humanity, only that there are certain tasks for which DNNs are the wrong tool and a non-AI approach is and possibly always will be preferred. 

An AI can do such calculations the normal way if it really needs to carry them out

This is a recapitulation of my key claim: that any future asymptotically powerful A(G)I (and even some current ChatGPT + agent services) will have non-AI subsystems for tasks where precision or scalability is more easily obtained by non-AI means, and that there will probably always be some such tasks.

Plucked from thin air, to represent the (I think?) reasonably defensible claim that a neural net intended to predict/synthesise the next state (or short time series of states) of an operating system would need to be vastly larger and require vastly more training than even the most sophisticated LLM or diffusion model.

Is a photographer "not an artist" because the photos are actually created by the camera?

This can be dispensed with via Clark and Chalmers' Extended Mind thesis. Just as a violinist's violin becomes the distal end of their extended mind, so with brush and painter, and so with camera and photographer.

As long as AI remains a tool and does not start to generate art on its own, there will be a difference between someone who spends a lot of time carefully crafting prompts and a random bozo who just types "draw me a masterpiece"

I'm not as optimistic as you about that difference being eternal. The pre-to-post-AI leap is fundamentally more disruptive, more a difference of kind than of degree, more revolutionary than the painting-to-photography leap or even the analogue-to-digital-photography leap[1]. By wirehead logic (why bother seeking enjoyable experiences if we can take dopamine-releasing drugs? Why bother with drugs at all if we can release dopamine directly?), post-threshold GenAI dangles the promise of circumventing every barrier to 'creation' except the raw want. No longer is time, ten thousand hours' practice, inspiration, sticktoitiveness or talent required. 

In short, I fear that the difference (long may it exist) between the work of the careful craft-prompter and that of the random bozo will shrink with each generation. More specifically, the group of people who can distinguish the one from the other will collapse to a smaller and smaller elite (art historians, forensic AI experts), outside which paranoia and apathy will reign. That is not the democratic utopia the CEOs promised us.

  1. ^

    The principles of composition and optics, e.g. vanishing point, golden ratio, focal length, colour balance, depth of field, etc., apply to digital photography (and even rendered 3D graphics) just as they apply to analogue: the knowledge base is the same and a good, non-bozo photographer will excel in either submedium. Contrast this to an armchair prompt engineer who can 'create' a professional-seeming photorealistic image in seconds without knowing a thing about any of these principles. 

    Whether this is good/democratic or bad/obscurantist is open for debate, but I think it shows that the GenAI shift is different from and bigger than the earlier disruptive shifts it gets compared to.

Agreed. However,

  1. Do you think those IRL art forms will always be immune to generative capture and the threshold? 
  2. Even if the analogue original (a sculpture, a particular live dance performance, a particular live theatre performance) remains immune, most people will consume it through digital reproduction (photographs, VR, AR, video, audio) for which the threshold does apply.

Oh I see! So that's just the default slideshow range, padded front or back with zeroes, and you can enter much longer locations manually?

I like this.

It feels related to the assertion that DNNs can only interpolate between training data points, never extrapolate beyond them. (Technically they can extrapolate, but the results are hilarious/nonsensical/bad in proportion to how far beyond their training distribution they try to go.)
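
As a minimal toy sketch of that point (my own illustration, not anything from the thread, and assuming scikit-learn and NumPy are available): a small MLP fits sin(x) well inside its training range but typically drifts badly once asked to predict outside it.

```python
# Toy illustration: a small neural net interpolates sin(x) well inside its
# training range [0, 2*pi] but typically degrades when extrapolating beyond it.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 2 * np.pi, 2000).reshape(-1, 1)
y_train = np.sin(x_train).ravel()

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
model.fit(x_train, y_train)

for x in [1.0, 3.0, 7.0, 12.0]:   # first two in-distribution, last two outside [0, 2*pi]
    pred = model.predict([[x]])[0]
    print(f"x={x:5.1f}  prediction={pred:+.3f}  true={np.sin(x):+.3f}")
```

The further the query point sits from the training distribution, the less the prediction has to do with the underlying function, which is the "hilarious/nonsensical/bad in proportion" behaviour described above.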

Here's how I see your argument 'formalised' in terms of the two spaces (total combinatorial phase space and a post-threshold GenAI's latent space over the same output length); please correct anything you think I've got wrong:

A model can only be trained on what already exists (though it can augment its dataset to reduce overfitting etc.), and what already exists approximately overlaps with what humans consider to be good, or at least sticky. What humans consider to be good or sticky changes over time; the points in phase space that are not in the latent space of and therefore not accessible to GenAI 1.0 are different from those not in the latent space of GenAI 2.0, assuming GenAI 2.0 has had access to new, external training data from human or other agents with ongoing life experiences. GenAI 1.0 therefore cannot claim to have subsumed GenAI 2.0 before the fact, whereas GenAI 1.0 + 2.0 has a provably broader gamut.
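
In case it's useful, a compressed symbolic restatement of the above (my notation, nothing the parent comment committed to): write Phi for the total combinatorial phase space over a given output length, and L1, L2 ⊆ Phi for the subsets reachable by GenAI 1.0 and GenAI 2.0 respectively.

$$
L_2 \setminus L_1 \neq \varnothing \;\Longrightarrow\; L_1 \cup L_2 \supsetneq L_1
$$

That is, if post-1.0 experience puts anything into L2 that was never in L1, the combined gamut is strictly broader, and GenAI 1.0 cannot have subsumed GenAI 2.0 before the fact.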

Though I like the spirit of this, it doesn't quite refute the certainty of provenance-erasure, at least in theory. With a training dataset that ends at time t1, a sufficiently high-fidelity GenAI could (re-)create anything plausibly or actually human-made at or before t1, and only fresh creativity from agents with subjective experience between t1 and t2, when the model's next generation begins training, is fully 'protected' (unless the fresh creativity is actively excluded from the next model's training data). 

Also: what about the limiting case of a CI-style model that is always in combined training-and-inference mode, so is never more than trivially out-of-date? (I know current architectures aren't designed for this, but it doesn't seem like a lunatic leap.)

Objections aside, the idea of dynamic interaction (and perhaps Hofstadterian feedback loops) between experiential agent and environment being the secret sauce that DNNs can never recreate is appealing. Can the idea be made rigorous and less dualism-adjacent? What exactly is the dynamic setup getting that DNNs aren't? Maximal or optimal transfer of order/entropy between agent and environment? An irreducible procedure in which Kolmogorov complexity = string length, so that any attempt to compress or reduce dimensionality kills its Truly Generative spirit?
