Lone Pine


Comments


I honestly now believe that AGI already exists. This model may not have been it, and we will debate for hundreds of years* about whether the threshold was transformers or MLPs or multimodal, and which first model was really the first, in the same way we still debate which electronic computer was truly the first. But I do believe that it is here.

We do not have human-level machine intelligence (HLMI) yet. These systems still have a lot of limitations, in particular the context window and lack of memory. They are very limited in some domains such as robotics. However, it seems unlikely to me that we are not already in the takeoff.

* (assuming the debate doesn't get abruptly stopped)

that's only an excuse to keep people with consumer GPUs from getting LLMs,

Is this really the reason why?

Is law (an AI lawyer) safety-critical?

I think we can resolve this Manifold market question and possibly this one too.

Also, apologies for the morbid humor, but I can't help but laugh imagining someone being talked into suicide by the OG ELIZA.

Answer by Lone Pine

There is an architecture called RWKV which claims to have an 'infinite' context window (since it is similar to an RNN). It claims to be competitive with GPT-3. I have no idea whether this is worth taking seriously or not.
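For intuition only (this is a generic toy RNN cell, not RWKV's actual update rule, and all names and sizes are made up): a recurrent model carries a fixed-size state from token to token, so nothing in the architecture imposes a hard context cutoff the way a fixed attention window does.

```python
# Toy recurrent cell: the state is a fixed-size vector updated once per
# token, so the input stream can be arbitrarily long without growing memory.
# (Illustrative only -- not RWKV's actual formulation.)
import numpy as np

state_dim, vocab_size = 64, 1000
rng = np.random.default_rng(0)
W_in = rng.normal(size=(vocab_size, state_dim)) * 0.01
W_rec = rng.normal(size=(state_dim, state_dim)) * 0.01

state = np.zeros(state_dim)
for token_id in [1, 42, 7, 999, 3]:  # could just as well be millions of tokens
    one_hot = np.zeros(vocab_size)
    one_hot[token_id] = 1.0
    state = np.tanh(one_hot @ W_in + state @ W_rec)  # always 64-dimensional
```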

The entire conversation is over 60,000 characters according to wc. OpenAI's tool won't even let me compute the tokens if I paste more than 50k (?) characters, but when I deleted some of it, it gave me a value of >18,000 tokens.
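If it helps, here's a minimal sketch of counting tokens locally with the tiktoken library instead of the web tool (assuming the cl100k_base encoding used by the GPT-3.5/GPT-4 family; the filename is just a placeholder):

```python
# Count tokens locally with tiktoken rather than pasting into OpenAI's tool.
# Assumes cl100k_base; other models' encodings may give slightly different counts.
import tiktoken

def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    enc = tiktoken.get_encoding(encoding_name)
    return len(enc.encode(text))

# "conversation.txt" is a placeholder for the exported chat transcript.
with open("conversation.txt", encoding="utf-8") as f:
    print(count_tokens(f.read()))
```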

I'm not sure if/when ChatGPT starts to forget part of the chat history (drops out of the context window), but it still seemed to remember the first file after a long, winding discussion.

I'm pretty confident that I have been using the "Plugins" model with a very long context window. I was copy-pasting entire 500-line source files and asking questions about them. I assume that I'm getting the 32k context window.

To be honest, this argument makes me even more confident in short timelines. I feel like the focus on scaling and data requirements completely misses the point. GPT-4 is already much smarter than I am in the ways that it is smart. Adding more scale and data might continue to make it better, but it doesn't need to be better in that way to become transformative. The problem is the limitations -- a limited context window, no continual learning, text-encoding issues, no feedback-loop (REPL-style) wrapper creating agency, expensive inference, lagging robotics. These are not problems that will take decades to solve; they will take years, if not months.

Gary Marcus's new goalpost is that the AI has to invent new science using only training data from before a specific year. I can't do that! I couldn't do that no matter how much training data I had. Am I a general intelligence, Gary? I feel like this is all some weird cope.

To be clear, I'm not blind to the fact that LLMs are following the same hype cycle that other technologies have gone through. I'm sure there will be some media narrative in a year or so like "AI was going to take all our jobs, but that hasn't happened yet; it was just hype." Meanwhile, researchers (a group which now includes essentially everyone who knows how to install Python) will fix the limitations and make these systems ever more powerful.

I am highly confident that current AI technologies, without any more scale or data[1], will be able to do any economically relevant task within the next 10 years.

  1. ^

    We will need new kinds of training data, specifically for robotics, but we won't need more data in the scaling sense. These systems are already smart enough.

But when will my Saturn-branded car drive me to Taco Bell?

every 4 to 25 months

Is that a typo? That's such a broad range that the statistic is completely useless. Halving every 4 months is over 32 times as significant as halving every 25 months: over two years, the former compounds to a 64x reduction while the latter gives barely 2x. Those are completely different worlds.
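A quick back-of-the-envelope sketch of why the two ends of that range are worlds apart (pure arithmetic, assuming simple exponential decay over a fixed two-year window):

```python
# Compare the cumulative effect of halving every 4 vs. every 25 months
# over a 24-month window.
for halving_months in (4, 25):
    factor = 2 ** (24 / halving_months)
    print(f"halving every {halving_months} months -> {factor:.1f}x over 2 years")
# prints roughly 64x vs 1.9x
```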
