Comments

If an AI doesn’t fully ‘understand’ the physics concept of “superradiance” based on all existing human writing, how would it generate synthetic data to get better?

I think "doesn't fully understand the concept of superradiance" is a phrase that smuggles in too many assumptions here. If you rephrase it as "can determine when superradiance will occur, but makes inaccurate predictions about physical systems will do in those situations" / "makes imprecise predictions in such cases" / "has trouble distinguishing cases where superradiance will occur vs cases where it will not", all of those suggest pretty obvious ways of generating training data.

GPT-4 can already "figure out a new system on the fly" in the sense of taking some repeatable phenomenon it can observe, and predicting things about that phenomenon, because it can write standard machine learning pipelines, design APIs with documentation, and interact with documented APIs. However, the process of doing that is very slow and expensive, and resembles "build a tool and then use the tool" rather than "augment its own native intelligence".

Which makes sense. The story of human capability advances doesn't look like "find clever ways to configure unprocessed rocks and branches from the environment in ways which accomplish our goals", it looks like "build a bunch of tools, and figure out which ones are most useful and how they are best used, and then use our best tools to build better tools, and so on, and then use the much-improved tools to do the things we want".

I think you get very different answers depending on whether your question is "what is an example of a policy that makes it illegal in the United States to do research with the explicit intent of creating AGI" or whether it is "what is an example of a policy that results in nobody, including intelligence agencies, doing AI research that could lead to AGI, anywhere in the world".

For the former, something like updates to export administration regulations could maybe make it de facto illegal to develop AI aimed at the international market. Historically, that approach briefly succeeded in making it illegal to intentionally export software which implemented strong encryption. It didn't actually prevent the export, but it did arguably make that export unlawful. I'd recommend reading that article in full, actually, to give you an idea of how "what the law says" and "what ends up happening" can diverge.

I think the answer to the question of how well realistic NN-like systems with finite compute approximate the results of hypothetical utility maximizers with infinite compute is "not very well at all".

So the MIRI train of thought, as I understand it, goes something like

  1. You cannot predict the specific moves that a superhuman chess-playing AI will make, but you can predict that the final board state will be one in which the chess-playing AI has won.
  2. The chess AI is able to do this because it can accurately predict the likely outcomes of its own actions, and so is able to compute the utility of each of its possible actions and then effectively do an argmax over them to pick the best one, which results in the best outcome according to its utility function (see the sketch just after this list).
  3. Similarly, you will not be able to predict the specific actions that a "sufficiently powerful" utility maximizer will make, but you can predict that its utility function will be maximized.
  4. For most utility functions about things in the real world, the configuration of matter that maximizes that utility function is not a configuration of matter that supports human life.
  5. Actual future AI systems that will show up in the real world in the next few decades will be "sufficiently powerful" utility maximizers, and so this is a useful and predictive model of what the near future will look like.
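
A minimal sketch of the argmax-over-actions decision rule from point 2, with predictOutcome and utility as hypothetical stand-ins for the agent's world model and utility function (nothing here is taken from an actual engine):

    // Sketch of "compute the utility of each possible action and argmax over them".
    // predictOutcome and utility are hypothetical stand-ins for the agent's
    // world model and utility function.
    function pickAction(actions, predictOutcome, utility) {
      var best = null;
      var bestUtility = -Infinity;
      actions.forEach(function (action) {
        var u = utility(predictOutcome(action)); // predicted utility of taking this action
        if (u > bestUtility) {
          bestUtility = u;
          best = action;
        }
      });
      return best; // argmax over actions by predicted utility
    }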

I think the last few years in ML have made points 2 and 5 look particularly shaky here. For example, the actual architecture of SOTA chess-playing systems doesn't particularly resemble a cheaper version of the optimal-with-infinite-compute approach of "minimax over full tree search"; instead it looks more like "pile a bunch of situation-specific heuristics on top of each other, and then tweak the heuristics based on how well they do in practice".

Which, for me at least, suggests that looking at what the optimal-with-infinite-compute thing would do might not be very informative for what actual systems which will show up in the next few decades will do.

Can you give a concrete example of a safety property of the sort that you are envisioning automated testing for? Or am I misunderstanding what you're hoping to see?

For example a human can to an extent inspect what they are going to say before they say or write it. Before saying Gary Marcus was "inspired by his pet chicken, Henrietta" a human may temporarily store the next words they plan to say elsewhere in the brain, and evaluate it.

Transformer-based models also internally represent the tokens they are likely to emit in future steps. This is demonstrated rigorously in Future Lens: Anticipating Subsequent Tokens from a Single Hidden State, though perhaps a simpler demonstration is that LLMs can reliably complete the sentence "Alice likes apples, Bob likes bananas, and Aaron likes apricots, so when I went to the store I bought Alice an apple and I got [Bob/Aaron]" with the appropriate "a/an" token.

I think the answer pretty much has to be "yes", for the following reasons.

  1. As noted in the above post, weather is chaotic.
  2. Elections are sometimes close. For example, the winner of the 2000 presidential election came down to a margin of 537 votes in Florida.
  3. Geographic location correlates reasonably strongly with party preference.
  4. Weather affects specific geographic areas.
  5. Weather influences voter turnout.[1]

During the 2000 election, in Okaloosa County, Florida (at the western tip of the panhandle), 71k of the county's 171k residents voted, with 52186 votes going to Bush and 16989 votes going to Gore, for a 42% turnout rate.

On November 7, 2000, there was no significant rainfall in Pensacola (the closest weather station I could find with records going back that far). A storm which dropped 2 inches of rain on the tip of the Florida panhandle that day would have reduced voter turnout by 1.8%,[1] which would have shifted the margin 634 votes closer to Gore. That would have tipped Florida, which would in turn have tipped the election.
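
Showing my work on the 634 figure (my own back-of-the-envelope reconstruction, assuming the turnout drop removes Bush and Gore voters in proportion to their actual vote shares):

    // Back-of-the-envelope reconstruction of the 634-vote figure.
    // Assumes the 1.8% turnout drop removes Bush and Gore voters proportionally.
    var bush = 52186, gore = 16989;
    var turnoutDrop = 2 * 0.009;                   // 2 extra inches of rain * ~0.9% per inch
    var marginShift = turnoutDrop * (bush - gore); // ~633.5 votes closer to Gore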

Now, November is the "dry" season in Florida, so heavy rains like that are not especially common. Still, they can happen: on 2015-11-02, for example, 2.34 inches of rain fell.[2] That was the only day, out of the 140 days I looked at, with enough rain to have flipped the 2000 election, and the 2000 election was, to my knowledge, the closest of the 59 US presidential elections so far. But there are a number of other tracks a storm could have taken which would also have flipped the 2000 election.[3] And in the 1976 election, somewhat worse weather in the Great Lakes region would likely have flipped Ohio and Wisconsin, where Carter beat Ford by narrow margins.[4]

So I think "weather, on election day specifically, flips the 2028 election in a way that cannot be foreseen now" is already well over 0.1%. And that's not even getting into other weather stuff like "how many hurricanes hit the gulf coast in 2028, and where exactly do they land?".

  1. ^

    Gomez, B. T., Hansford, T. G., & Krause, G. A. (2007). The Republicans should pray for rain: Weather, turnout, and voting in US presidential elections.

    "The results indicate that if a county experiences an inch of rain more than what is normal for the county for that election date, the percentage of the voting age population that turns out to vote decreases by approximately .9%.".

  2. ^

    I pulled the weather for the week before and after November 7 for the past 10 years from the weather.gov api and that was the highest rainfall date.

    // Total daily precipitation at KPNS (Pensacola) for Nov 1-14 of each year 2014-2023.
    var precipByDate = {};
    for (var y = 2014; y < 2024; y++) {
       var res = await fetch('https://api.weather.com/v1/location/KPNS:9:US/observations/historical.json?apiKey=<redacted>&units=e&startDate='+y+'1101&endDate='+y+'1114').then(r => r.json());
       res.observations.forEach(obs => {
           // Bucket each observation's precipitation by calendar date.
           var d = new Date(obs.valid_time_gmt*1000);
           var ds = d.getFullYear()+'-'+(d.getMonth()+1)+'-'+d.getDate();
           if (!(ds in precipByDate)) { precipByDate[ds] = 0; }
           if (obs.precip_total) { precipByDate[ds] += obs.precip_total }
       });
    }
    // The date with the most rainfall across the window.
    Object.entries(precipByDate).sort((a, b) => b[1] - a[1])[0]
  3. ^

    Looking at the 2000 election map in Florida, any good thunderstorm in the panhandle, in the northeast corner of the state, or on the west-middle-south of the peninsula would have done the trick.

  4. ^

    https://en.wikipedia.org/wiki/1976_United_States_presidential_election -- Carter won Ohio and Wisconsin by 11k and 35k votes, respectively.

An attorney rather than the police, I think.

Also "provably safe" is a property a system can have relative to a specific threat model. Many vulnerabilities come from the engineer having an incomplete or incorrect threat model, though (most obviously the multitude of types of side-channel attack).

Counterpoint: Sydney Bing was wildly unaligned, to the extent that it is even possible for an LLM to be aligned, and people thought it was cute / cool.

The two examples everyone loves to use to demonstrate that massive top-down engineering projects can sometimes be a viable alternative to iterative design (the Manhattan Project and the Apollo Program) were both government-led initiatives, rather than single very smart people working alone in their garages. I think it's reasonable to conclude that governments have considerably more capacity to steer outcomes than individuals, and are the most powerful optimizers that exist at this time.

I think restricting the term "superintelligence" to "only that which can create functional self-replicators with nano-scale components" is misleading. Concretely, that definition of "superintelligence" says that natural selection is superintelligent, while the most capable groups of humans are nowhere close, even with computerized tooling.
