faul_sname

I am not one of them - I was wondering the same thing, and was hoping you had a good answer.

If I were trying to answer this question, I would probably try to figure out what fraction of all economically-valuable labor each year was cognitive, the breakdown of which tasks comprise that labor, and the year-on-year productivity increases on those tasks, then use that to compute the percentage of economically-valuable labor that was being automated that year.

Concretely, to get a number for the US in 1900 I might use a weighted average of productivity increases across cognitive tasks in 1900, in an approach similar to how CPI is computed:

  • Look at the occupations listed in the 1900 census records
  • Figure out which ones are common, and then sample some common ones and make wild guesses about what those jobs looked like in 1900
  • Classify those tasks as cognitive or non-cognitive
  • Come to estimate that record-keeping tasks are around a quarter to a half of all cognitive labor
  • Notice that typewriters were starting to become more popular - about 100,000 typewriters sold per year
  • Note that those 100k typewriters were going to the people who would save the most time by using them
  • As such, estimate 1-2% productivity growth in record-keeping tasks in 1900
  • Multiply the productivity growth for record-keeping tasks by the fraction of cognitive labor spent on those tasks (technically the automated fraction is 1 - 1/(1 + growth) rather than the growth rate itself, but when the productivity increase is small the difference is negligible)
  • Estimate that 0.5% of cognitive labor was automated by specifically typewriters in 1900
  • Figure that's about half of all cognitive labor automation in 1900

and thus I would estimate ~1% of all cognitive labor was automated in 1900. By the same methodology I would probably estimate closer to 5% for 2024.
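
Written out as arithmetic, using the same guessed numbers as the bullets above (so a sketch of the estimate, not data):

# Every number here is a wild guess carried over from the list above.
record_keeping_share = 0.35   # record-keeping as ~1/4 to ~1/2 of cognitive labor
productivity_growth = 0.015   # ~1-2% growth from typewriter adoption in 1900

# Fraction of record-keeping labor automated that year. The exact figure is
# 1 - 1/(1 + growth); for small growth the growth rate itself is close enough.
automated_record_keeping = 1 - 1 / (1 + productivity_growth)

# Typewriters' contribution to cognitive-labor automation, scaled up by the
# guess that typewriters were about half of all cognitive automation that year.
typewriter_contribution = automated_record_keeping * record_keeping_share
total_cognitive_automation = typewriter_contribution / 0.5

print(f"{typewriter_contribution:.2%}")     # ~0.5%
print(f"{total_cognitive_automation:.2%}")  # ~1%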

Again, though, I am not associated with Open Phil and am not sure if they think about cognitive task automation in the same way.

What fraction of economically-valuable cognitive labor is already being automated today?

Did e.g. a telephone operator in 1910 perform cognitive labor, by the definition we want to use here?

Oh, indeed I was getting confused between those. So as a concrete example of your proof we could consider the following degenerate case:

def f(N: int) -> int:
    if N == 0x855bdad365f9331421ab4b13737917cf97b5e8d26246a14c9af1adb060f9724a:
        return 1
    else:
        return 0

def check(x: int, y: float) -> bool:
    return f(x) >= y

def argsat(y: float, max_search: int = 2**64) -> int | None:
    # We postulate that we have this function because P=NP
    if y > 1:
        return None
    elif y <= 0:
        return 0
    else:
        return 0x855bdad365f9331421ab4b13737917cf97b5e8d26246a14c9af1adb060f9724a

but we could also replace our degenerate f with e.g. sha256.
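
For instance (this is my own instantiation of the sha256 version, so the 40-bit threshold and the byte encoding are arbitrary choices on my part), f could reward inputs whose hash clears a difficulty target, which keeps check and argsat the same shape while making argsat infeasible to implement without the P=NP oracle:

import hashlib

def f_sha(N: int) -> int:
    # 1 if sha256 of N's byte representation starts with at least 40 zero bits,
    # 0 otherwise. Finding such an N by brute force takes roughly 2**40 tries.
    n_bytes = N.to_bytes(max(1, (N.bit_length() + 7) // 8), "big")
    digest = hashlib.sha256(n_bytes).digest()
    leading_zero_bits = 256 - int.from_bytes(digest, "big").bit_length()
    return 1 if leading_zero_bits >= 40 else 0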

Is that the gist of your proof sketch?

Finding the input x == argmax(f) is left as an exercise for the reader, though.

Is Amodei forecasting that, in 3 to 6 months, AI will produce 90% of the value derived from written code, or just that AI will produce 90% of code, by volume? It would not surprise me if 90% of new "art" (defined as non-photographic, non-graph images) by volume is currently AI-generated, and I would not be surprised to see the same thing happen with code.

And in the same way that "AI produces 90% of art-like images" is not the same thing as "AI has solved art", I expect "AI produces 90% of new lines of code" is not the same thing as "AI has solved software".

I'm skeptical.

Did the Sakana team publish the code that their scientist agent used to write the compositional regularization paper? The post says

For our choice of workshop, we believe the ICBINB workshop is a highly relevant choice for the purpose of our experiment. As we wrote in the main text, we selected this workshop because of its broader scope, challenging researchers (and our AI Scientist) to tackle diverse research topics that address practical limitations of deep learning, unlike most workshops with a narrow focus on one topic.

This workshop focuses particularly on understanding limitations of deep learning methods applied to real world problems, and encourages participants to study negative experimental outcomes. Some may criticize our choice of a workshop that encourages discussion of “negative results” (implying that papers discussing negative results are failed scientific discoveries), but we disagree, and we believe this is an important topic.

and while it is true that "negative results" are important to report, "we report a negative result because our AI agent put forward a reasonable and interesting hypothesis, competently tested the hypothesis, and found that the hypothesis was false" looks a lot like "our AI agent put forward a reasonable and interesting hypothesis, flailed around trying to implement it, had major implementation problems, and wrote a plausible-sounding paper describing its failure as a fact about the world rather than a fact about its skill level".

The paper has a few places with giant red flags where the reviewer seems to assume that there were solid results which the author of the paper simply failed to report skillfully, for example in section B2.

I favor an alternative hypothesis: the Sakana agent determines where a graph belongs, what would be on the x and y axes of that graph, what it expects the graph to look like, and how to generate it. It then generates the graph and inserts the caption the graph would have if its hypothesis were correct. The agent has no particular ability to notice that its description doesn't match the graph it actually produced.

Plausibly going off into the woods decreases the median output while increasing the variance.

Has anyone trained a model to, given a prompt-response pair and an alternate response, generate an alternate prompt which is close to the original and causes the alternate response to be generated with high probability?

I ask this because

  1. It strikes me that many of the goals of interpretability research boil down to "figure out why models say the things they do, and under what circumstances they'd say different things instead". If we could reliably ask the model and get an intelligible and accurate response back, that would almost trivialize this sort of research.
  2. This task seems like it has almost ideal characteristics for training on: unlimited synthetic data, a granular loss metric, and it's easy for a human to catch weird reward-hacky behavior by spot-checking outputs (see the sketch below).
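
To make that concrete, here is a minimal sketch of the synthetic-data loop I have in mind. The base model (gpt2 as a stand-in), the prompt formatting, the function names, and especially the perturbation step (here I just supply a second prompt by hand) are all placeholder assumptions, not an existing recipe:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_name = "gpt2"  # stand-in for whatever base model you actually care about
tok = AutoTokenizer.from_pretrained(base_name)
base = AutoModelForCausalLM.from_pretrained(base_name)

def complete(prompt: str, max_new_tokens: int = 40) -> str:
    # Sample a continuation of `prompt` from the frozen base model.
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = base.generate(ids, do_sample=True, max_new_tokens=max_new_tokens,
                            pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

def make_example(prompt: str, perturbed_prompt: str) -> dict:
    # One synthetic training example: the inverse model should learn to map
    # (prompt, response, alt_response) -> alt_prompt.
    response = complete(prompt)
    alt_response = complete(perturbed_prompt)
    return {
        "input": (f"PROMPT: {prompt}\nRESPONSE: {response}\n"
                  f"ALT RESPONSE: {alt_response}\nALT PROMPT:"),
        "target": f" {perturbed_prompt}",
    }

# Training loss for the inverse model is ordinary next-token prediction on
# "target"; the granular end-to-end metric is log P(alt_response | generated
# alt prompt) under the frozen base model.
print(make_example("The capital of France is", "The capital of Italy is"))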

A quick search found some vaguely adjacent research, but nothing I'd rate as a super close match.

If this research really doesn't exist I'd find that quite surprising, since it's a pretty obvious thing to do and there are O(100,000) ML researchers in the world. And it is entirely possible that it does exist and I just failed to find it with a cursory lit review.

Anyone familiar with similar research / deep enough in the weeds to know that it doesn't exist?

I think the ability to "just look up this code" is a demonstration of fluency - if your way of figuring out "what happens when I invoke this library function" is "read the source code", that indicates that you are able to fluently read code.

That said, fluently reading code and fluently writing code are somewhat different skills, and the very best developers relative to their toolchain can do both with that toolchain.

When triggered to act, are the homeostatic-agents-as-envisioned-by-you motivated to decrease the future probability of being moved out of balance, or prolong the length of time in which they will be in balance, or something along these lines?

I expect[1] them to have a drive similar to "if my internal world-simulator predicts future sensory observations that are outside of my acceptable bounds, take actions that make the world-simulator predict within-acceptable-bounds sensory observations".
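
As a toy illustration of the kind of decision rule I mean (the drive names, bounds, and one-step predictions below are invented for the example, not anything from your post):

from dataclasses import dataclass

@dataclass
class Drive:
    low: float   # lower edge of the acceptable band
    high: float  # upper edge of the acceptable band

def corrective_actions(predicted: dict[str, float],
                       drives: dict[str, Drive]) -> dict[str, float]:
    # Act only on quantities predicted to leave their acceptable band:
    # "decrease if above threshold", not "minimize".
    actions = {}
    for name, value in predicted.items():
        band = drives[name]
        if value < band.low:
            actions[name] = band.low - value
        elif value > band.high:
            actions[name] = band.high - value
    return actions

drives = {"temperature": Drive(36.0, 38.0), "glucose": Drive(70.0, 140.0)}
predicted = {"temperature": 39.2, "glucose": 95.0}
print(corrective_actions(predicted, drives))  # only temperature triggers an action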

This maps reasonably well to one of the agent's drives being "decrease the future probability of being moved out of balance". Notably, though, it does not map well to that being the agent's only drive, or to the drive being "minimize" rather than "decrease if above threshold". The specific steps I don't understand are:

  1. What pressure is supposed to push a homeostatic agent with multiple drives to elevate a specific "expected future quantity of some arbitrary resource" drive above all of its other drives and set the acceptable quantity to some extreme value.
  2. Why we should expect that an agent that has been molded by that pressure would come to dominate its environment.

If no, they are probably not powerful agents. Powerful agency is the ability to optimize distant (in space, time, or conceptually) parts of the world into some target state

Why use this definition of powerful agency? Specifically, why include the "target state" part of it? By this metric, evolutionary pressure is not powerful agency, because while it can cause massive changes in distant parts of the world, there is no specific target state. Likewise for e.g. corporations finding a market niche - to the extent that they have a "target state" it's "become a good fit for the environment".

Or, rather... It's conceivable for an agent to be "tool-like" in this manner, where it has an incredibly advanced cognitive engine hooked up to a myopic suite of goals. But only if it's been intelligently designed. If it's produced by crude selection/optimization pressures, then the processes that spit out "unambitious" homeostatic agents would fail to instill the advanced cognitive/agent-y skills into them.

I can think of a few ways to interpret the above paragraph with respect to humans, but none of them make sense to me[2] - could you expand on what you mean there?

And a bundle of unbounded-consequentialist agents that have some structures for making cooperation between each other possible would have considerable advantages over a bundle of homeostatic agents.

Is this still true if the unbounded consequentialist agents in question have limited predictive power, and each one has advantages in predicting the things that are salient to it? Concretely, can an unbounded AAPL share price maximizer cooperate with an unbounded maximizer for the number of sand crabs in North America without the AAPL-maximizer having a deep understanding of sand crab biology?

  1. ^

    Subject to various assumptions at least, e.g.

    • The agent is sophisticated enough to have a future-sensory-perceptions simulator
    • The use of the future-perceptions-simulator has been previously reinforced
    • The specific way the agent is trying to change the outputs of the future-perceptions-simulator has been previously reinforced (e.g. I expect "manipulate your beliefs" to be chiseled away pretty fast when reality pushes back)

    Still, all those assumptions usually hold for humans

  2. ^

    The obvious interpretation I take of that paragraph is that at least one of the following must be true.

    For clarity, can you confirm that you don't think any of the following:

    1. Humans have been intelligently designed
    2. Humans do not have the advanced cognitive/agent-y skills you refer to
    3. Humans exhibit unbounded consequentialist goal-driven behavior

    None of these seem like views I'd expect you to have, so my model has to be broken somewhere
