Ethan Caballero

Replying to We may be able to see sharp left turns coming

Read Section 6 titled “The Limit of the Predictability of Scaling Behavior” in this paper:
https://arxiv.org/abs/2210.14891

Replying to PaLM-2 & GPT-4 in "Extrapolating GPT-N performance"

Ethan Caballero, 3y

We describe how to fit a Broken Neural Scaling Law (BNSL) so that it extrapolates as well as possible in the last paragraph of Appendix A.6, "Experimental details of fitting BNSL and determining the number of breaks", of the paper:
https://arxiv.org/pdf/2210.14891.pdf#page=13
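
As a rough illustration of what fitting for extrapolation can look like, here is a minimal sketch. It is not the procedure from Appendix A.6: it assumes a one-break BNSL with the offset fixed to 0, a coarse grid search that minimizes mean squared error in log space, and synthetic data. The function names (`bnsl_one_break`, `msle`, `grid_fit`) and all grid values are hypothetical. The fit only sees the smaller scales, and the largest scales are held out to measure extrapolation error.

```python
import itertools
import math


def bnsl_one_break(x, b, c0, c1, d1, f1):
    """One-break BNSL with the offset a fixed to 0 (a simplifying assumption)."""
    return b * x ** (-c0) * (1.0 + (x / d1) ** (1.0 / f1)) ** (-c1 * f1)


def msle(params, xs, ys):
    """Mean squared error between log-predictions and log-targets."""
    return sum(
        (math.log(bnsl_one_break(x, *params)) - math.log(y)) ** 2
        for x, y in zip(xs, ys)
    ) / len(xs)


def grid_fit(xs, ys, grid):
    """Return the grid point with the lowest log-space error on (xs, ys)."""
    return min(itertools.product(*grid), key=lambda p: msle(p, xs, ys))


# Synthetic, noise-free data from known parameters (illustrative only).
true_params = (2.0, 0.1, 0.4, 1e3, 0.5)
xs = [10 ** (k / 4) for k in range(4, 25)]  # 1e1 up to 1e6
ys = [bnsl_one_break(x, *true_params) for x in xs]

# Fit on the smaller scales; hold out the largest scales to check extrapolation.
train_x, train_y = xs[:-4], ys[:-4]
test_x, test_y = xs[-4:], ys[-4:]

grid = (
    [1.0, 2.0, 4.0],     # b
    [0.05, 0.1, 0.2],    # c0
    [0.2, 0.4, 0.8],     # c1
    [1e2, 1e3, 1e4],     # d1
    [0.25, 0.5, 1.0],    # f1
)
best = grid_fit(train_x, train_y, grid)
extrapolation_error = msle(best, test_x, test_y)
```

In practice a grid result like `best` would usually serve only as the starting point for a nonlinear least-squares optimizer, and real measurements are noisy, so the held-out error would not be zero as it is in this toy case.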

Replying to PaLM-2 & GPT-4 in "Extrapolating GPT-N performance"

Ethan Caballero, 3y

Sigmoids don't accurately extrapolate the scaling behavior(s) of the performance of artificial neural networks.

Use a Broken Neural Scaling Law (BNSL) in order to obtain accurate extrapolations:
https://arxiv.org/abs/2210.14891
https://arxiv.org/pdf/2210.14891.pdf
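
For readers who want to see the shape of the function, here is a minimal sketch of the BNSL functional form, assuming the parameterization y = a + b·x^(-c0)·∏ᵢ (1 + (x/dᵢ)^(1/fᵢ))^(-cᵢ·fᵢ) from the paper; verify it against the paper before relying on it. The function and argument names are illustrative, and the parameter values are made up. Unlike a sigmoid, the curve is a sequence of power-law segments on a log-log plot, with the slope changing by cᵢ around each break location dᵢ.

```python
import math


def bnsl(x, a, b, c0, breaks=()):
    """Evaluate a BNSL at x.

    breaks is a sequence of (c_i, d_i, f_i) tuples, one per break:
    c_i is the slope change, d_i the break location, f_i the break sharpness.
    """
    y = b * x ** (-c0)
    for c_i, d_i, f_i in breaks:
        y *= (1.0 + (x / d_i) ** (1.0 / f_i)) ** (-c_i * f_i)
    return a + y


# Made-up parameters: one break at x = 1e3 that steepens the slope by 0.5.
params = dict(a=0.0, b=1.0, c0=0.2, breaks=[(0.5, 1e3, 0.1)])


def loglog_slope(x, rel_step=1.01):
    """Finite-difference slope of log(y) against log(x) at x."""
    y0 = bnsl(x, **params)
    y1 = bnsl(x * rel_step, **params)
    return (math.log(y1) - math.log(y0)) / math.log(rel_step)


slope_before = loglog_slope(1.0)  # far below the break: close to -c0
slope_after = loglog_slope(1e6)   # far above the break: close to -(c0 + c1)
```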

Replying to GPT-4

Ethan Caballero, 3y

Did ARC try making a scaling plot with training compute on the x-axis and autonomous replication on the y-axis?

Replying to AI Safety in a World of Vulnerable Machine Learning Systems

Ethan Caballero, 3y

The setting was adversarial training with adversarial evaluation. During training, a 30-iteration PGD (projected gradient descent) attacker constructs the adversarial examples that the model is trained on. During testing, the evaluation set is an adversarial test set constructed by a 20-iteration PGD attacker.

The y-axis data come from Table 7 of https://arxiv.org/abs/1906.03787; the x-axis data come from Figure 7 of https://arxiv.org/abs/1906.03787.
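
To make the role of the iteration count concrete, here is a minimal sketch of an L-infinity PGD attack. Only the iteration counts (30 for training, 20 for evaluation) come from the setting above. The tiny linear logistic model, the input, `eps`, and `step` are assumptions chosen for illustration, and the experiments in the linked paper use deep networks on image data instead.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def loss_grad_wrt_input(x, y, w, b):
    """Gradient of the logistic loss with respect to the input x (label y in {0, 1})."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def pgd_attack(x, y, w, b, eps, step, iters):
    """L-infinity PGD: signed-gradient ascent steps, each projected back
    into the eps-ball around the clean input x."""
    x_adv = list(x)
    for _ in range(iters):
        g = loss_grad_wrt_input(x_adv, y, w, b)
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Hypothetical toy model and input; eps and step are illustrative values.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, -0.5], 1

train_adv = pgd_attack(x, y, w, b, eps=0.1, step=0.01, iters=30)  # training-time attacker
eval_adv = pgd_attack(x, y, w, b, eps=0.1, step=0.01, iters=20)   # evaluation-time attacker
```

In this toy case both attackers reach the boundary of the eps-ball, so the two adversarial examples coincide; on a deep network, more iterations generally give a stronger attack.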

Replying to AI Safety in a World of Vulnerable Machine Learning Systems

Ethan Caballero, 3y

"However, to the best of our knowledge there are no quantitative scaling laws for robustness yet."

For scaling laws for adversarial robustness, see appendix A.15 of openreview.net/pdf?id=sckjveqlCZ#page=22

Replying to Ethan Caballero on Private Scaling Progress

Ethan Caballero, 3y

arxiv.org/abs/2210.14891

Replying to Parameter Scaling Comes for RL, Maybe

Ethan Caballero, 3y

See Section 5.3, "Reinforcement Learning", of https://arxiv.org/abs/2210.14891 for more RL scaling laws with the number of model parameters on the x-axis, as well as RL scaling laws with training compute or training dataset size on the x-axis.