ChatGPT is OpenAI’s newest language model based on the GPT-3.5 series of LLMs, optimised for dialogue. It is trained using Reinforcement Learning from Human Feedback; human AI trainers provide supervised fine-tuning by playing both sides of the conversation.

It is evidently better than GPT-3 at following user intent and context, and it has a notably skewed moral compass. While I’m still Unsouled in the Way, my brief tenure with ChatGPT has led to non-trivial belief updates. Below is some of my playground experience. The conversations are edited for readability (with no intellectual honesty traded for dramatics).

Levels of perception

What is one clever question you would ask an AI to entertain a >50%...

Replying to OpenAI Codex: First Impressions

specbug, 5y

I wouldn't read too much into this - the challenge was buggy and slow enough that I almost ragequit, and it took me about an hour to start submitting, I expect many people had similarly bad experiences

I had the same experience (50 mins for the first problem, as seen in the post). I agree, it is possible that the server issues biased the stats greatly.

Replying to OpenAI Codex: First Impressions

specbug, 5y

The only correctness filters are the hidden test cases (as is standard in most competitive coding competitions). You can check the leaderboard: the positions correlate with the cumulative time taken to solve problems and the number of Codex assists. If there are any hidden metrics, I wouldn't know.

If so, how was Codex deployed solo? Did they just sample it many times on the same prompt until it produced something that passed the tests? Or something more sophisticated?

They didn't reveal this publicly. We can only guess here.

This makes no sense to me. Do you assume solo-Codex exploited the prompts submitted by other competitors? Or that the assistant-Codexes communicated with each other somehow? I kinda doubt either


OpenAI Codex: First Impressions

specbug

OpenAI organised a challenge to solve coding problems with the aid of an AI assistant. This is a review of the challenge, and first impressions on working with an AI pair-programmer.

OpenAI Codex

OpenAI is an AI research and development company. You might have heard some buzz about one of its products: GPT-3. GPT-3 is a language model that can generate human-like text. It can be used for chatting, text auto-completion, text summarisation, grammar correction, translation, etc.

Check out the OpenAI API to access the playground.

Codex is a descendant of GPT-3, trained on natural language data and publicly available source code (e.g. from public GitHub repos). Codex translates a natural language prompt to code. It is the very...

Replying to Future Of Work

specbug, 5y

It's hard to monitor most work in the short term, so having the engagements be longer-term makes it possible to adjust job and compensation based on years of output rather than the latest delivery.

Fair point. I agree, I am exaggerating the effectiveness of certain elements and downplaying the necessity of others.

That said, there is an inherent survivorship bias in favour of longer-term contracts, because we have never experienced an efficient short-term engagement model at scale. I do believe this adjustment buffer will shorten with time, as the trend towards finer-grained hiring accelerates, and that short-term alignment and work efficiency will increase as everyone adapts to a "faster" work culture.

Replying to Future Of Work

specbug, 5y

Yes but there's generally a long enough buffer before the messenger apps change status.

I feel that working on something personal, reading a blog, general web surfing, etc. constitute 80% of "alt work" sessions. These scenarios won't register on an instant messenger as "away". It is not about going out for a one-hour walk in the middle of the day without informing anyone. It is about these bursts of freedom, and the ability to switch context, unmonitored.

Also, pinging someone for feedback, checking someone's status or organising group activities seem like less efficient ways of monitoring than constantly being in their range of vision.

Future Of Work

specbug

The current employment model is outdated. For the majority of workers, vocation and avocation are incongruent vectors. Here, we describe a set of tools that can be used to form a better-integrated work model, to dispense high-quality work to everyone.

An Optimisation Problem

The number of jobs is ever-mutating. There has never been a fixed, finite pool of enterprises that has simply been subdivided since ancient times. Evolving cultures yield evolving problems, increasing opportunities for innovation. At the same time, a growing knowledge base produces more individuals with unique competencies and interests.

So the issue of optimal employment is one of neither quantity nor quality. As a crude formulation, we have $n$ tasks to be solved and $k$ agents to solve them. The problem is optimally matching the $k$ agents to...
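To make the crude formulation concrete, here is a minimal sketch of the matching step as an assignment problem. Everything specific in it is an assumption for illustration, not something stated in the post: the `fit` matrix of agent-task scores is hypothetical, the objective (maximise total fit, one task per agent) is one possible reading of "optimal", and the brute-force search is only practical for toy sizes.

```python
from itertools import permutations


def best_assignment(fit):
    """Brute-force matching of agents to distinct tasks.

    fit[a][t] is a hypothetical score for agent `a` doing task `t`
    (an illustrative assumption, not a quantity defined in the post).
    Returns (total_score, tasks), where tasks[a] is the task given to agent a.
    """
    k = len(fit)       # number of agents
    n = len(fit[0])    # number of tasks
    if k > n:
        raise ValueError("this sketch assumes at least as many tasks as agents")

    best_total, best_tasks = None, None
    # Try every way of handing k distinct tasks to the k agents.
    for tasks in permutations(range(n), k):
        total = sum(fit[agent][task] for agent, task in enumerate(tasks))
        if best_total is None or total > best_total:
            best_total, best_tasks = total, tasks
    return best_total, best_tasks


# Toy example: 3 agents, 3 tasks, made-up scores.
fit = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
total, tasks = best_assignment(fit)
print(total, tasks)  # 21 (2, 0, 1)
```

Note that giving every agent their individually best task is not possible here (agents 0 and 1 both prefer task 0), which is what makes this a matching problem rather than a sorting one. The search above is factorial in the number of agents; for realistic sizes a polynomial-time method such as the Hungarian algorithm would be the usual choice.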
