Commit to only use superhuman persuasion when arguing towards a valid conclusion via valid arguments, in a manner that doesn't go against the interests of the person being persuaded.
In this plan, how should the AI define what’s in the interest of the person being persuaded? For example, say you have a North Korean soldier who can be persuaded to defect to the West (at the risk of getting the shitty jobs most migrants have) or who can be persuaded to remain loyal to his bosses (at the risk of raising his children in the shitty country most North Koreans have). What set of rules would you suggest?
That’s a great analogy. To me the strength of the OP is to pinpoint that LLMs already exhibit the kind of general ability we would expect from AGI, and the weakness is to forget that LLMs do not exhibit some specific abilities most thought easy, such as the agency that even clownfish exhibit.
In a way, this sounds like the universe telling us, once again, that we should rethink what intelligence is. Chess is hard and doing the dishes is easy? Nope. Language is hard and agency is central? Nope.
Perhaps you could identify your important beliefs
That part made me think. If I see bright minds falling into this trap, does blindness go with how important the belief is to that person? I would say yes, I think. As if that’s where we tend to make more « mistakes that can behave as ratchets of the mind ». Thanks for the insight!
that also perhaps are controversial
Same exercise: if I see bright minds falling into this trap, does blindness go with controversial beliefs? Definitely! Almost by definition, actually.
...each year write down the
Yup. Thanks for trying, but these beliefs seem to form a local minimum, like a trap for rational minds, even very bright ones. Do you think you understand how an aspiring rationalist could 1) recover and get out of this trap, and 2) avoid falling for it in the first place?
To be clear, my problem is not with the possibility of a lab leak itself; it’s with the assessment that the present evidence is anything but post hoc rationalization fueled by unhealthy levels of tunnel vision. If bright minds can fall for that on this topic specifically, how do I know I’m not making the same mistake on something else?
(Spoiler warning)
(Also, I didn’t check the previous survey nor the comments there, so expect some level of redundancy.)
The score itself (8/18) is not that informative, but checking the « accepted » answers is quite interesting. Here are my « errors » and how happy I am to make them:
You should be on the outlook for people who are getting bullied, and help defend them against the bullies.
I agree some rationalist leaders are toxic characters who will almost inevitably bully their students and collaborators, and I’m happy to keep strongl...
I am an old person. They may not let you do that in chemistry any more.
Absolutely! In my first chemistry lab, a long time ago, our teacher warned us that she had just lost a colleague to cancer at the age of forty, and she swore that if we didn't take the safety protocols very seriously, she would be our fucking nightmare.
I never heard her swear after that.
Not bad! But I stand by « random before (..) » as a better picture, in the following sense: a neuron doesn’t connect once to an address ending in 3. It connects several thousand times to an address ending in 3. Some connections are on the door, some on the windows, some on the roof, one has been seen trying to connect to the dog, etc. Then it’s pruned, and the result looks not that far from a crystal. Or a convnet.
(There are also long-lasting silent synapses and a bit of neurogenesis, but those are details for another time.)
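If a toy picture helps, here is a minimal sketch of that « massively redundant random wiring, then pruning » idea; the numbers are entirely made up and this is not a biophysical model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each of 100 "axons" initially makes thousands of random contacts
# of random strength onto 100 possible "targets".
n_axons, n_targets, contacts_per_axon = 100, 100, 5000
weights = np.zeros((n_axons, n_targets))
for axon in range(n_axons):
    targets = rng.integers(0, n_targets, size=contacts_per_axon)
    np.add.at(weights[axon], targets, rng.random(contacts_per_axon))

# "Pruning": keep only the strongest 5% of connections per axon.
threshold = np.quantile(weights, 0.95, axis=1)[:, None]
pruned = np.where(weights >= threshold, weights, 0.0)

# Dense and messy before, sparse and almost structured-looking after.
print("fraction of nonzero connections:", (weights > 0).mean(), "->", (pruned > 0).mean())
```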
The story went that “Perceptrons proved that the XOR problem is unsolvable by a single perceptron, a result that caused researchers to abandon neural networks”. (…) When I first heard the story, I immediately saw why XOR was unsolvable by one perceptron, then took a few minutes to design a two-layered perceptron network that solved the XOR problem. I then noted that the NAND problem is solvable by a single perceptron, after which I immediately knew that perceptron networks are universal since the NAND gate is.
Exactly the same experience and thoughts in ...
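For anyone who wants to replay that exercise, here is a minimal sketch of the two steps in the quoted story, with hand-picked weights rather than training (and using the NAND route, so it ends up deeper than strictly necessary):

```python
import numpy as np

def perceptron(w, b, x):
    """Single linear threshold unit."""
    return int(np.dot(w, x) + b > 0)

# Step 1: NAND is solvable by a single perceptron...
def nand(x):
    return perceptron([-1.0, -1.0], 1.5, x)

# Step 2: ...and since NAND is universal, a small perceptron network gives XOR.
def xor(x):
    a = nand(x)
    b = nand([x[0], a])
    c = nand([x[1], a])
    return nand([b, c])

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "NAND:", nand(x), "XOR:", xor(x))
```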
He can be rough and on rare occasion has said things that could be considered personally disrespectful, but I didn't think that people were that delicate.
You may wish to update on this. I’ve only exchanged a few words with one of the names, but that was enough to make it clear he doesn’t bother being respectful. That may work in some non-delicate research environments I don’t want to know about, but most bright academics I know like to have fun at work, and would leave any non-delicate work environment (unless they make it their personal duty to clean the place).
What do you think orthogonality thesis is?
I think it’s the deformation of a fundamental theorem (« there exists a universal Turing machine, i.e. one that can run any program ») into a practical belief (« an intelligence can pick its values at random »), with a motte-and-bailey game on the meaning of « can », where the motte is the fundamental theorem and the bailey is the orthogonality thesis.
(Thanks for the link to your own take, i.e. you think it’s the bailey that is the deformation.)
...Consider the sense in which humans are not aligned with ea
Existentially dangerous paperclip maximizers don't misunderstand human goals.
Of course they do. If they didn’t and picked their goal at random, they wouldn’t make paperclips in the first place.
There's this post from 2013 whose title became a standard refrain on this point
I wouldn’t say that’s the point I was making.
...This has been hashed out more than a decade ago and no longer comes up as a point of discussion on what is reasonable to expect. Except in situations where someone new to the arguments imagines that people on LessWrong expect such unbal
Perhaps the position you disagree with is that a dangerous general AI will misunderstand human goals. That position seems rather silly, and I'm not aware of reasonable arguments for it. It's clearly correct to disagree with it, you are making a valid observation in pointing this out.
Thanks! To be honest I was indeed surprised that was controversial.
But then who are the people that endorse this silly position and would benefit from noticing the error? Who are you disagreeing with, and what do you think they believe, such that you disagree with it?
Wel...
(Epistemic status: first thoughts after a first reading)
Most of it is very standard cognitive neuroscience, although with more emphasis on some things (the subdivision of synaptic boutons into silent/modifiable/stable, the notion of complex and simple cells in the visual system) than on others (the critical periods, brain rhythms, iso/allo cortices, brain symmetry and circuits, etc.). There are one or two bits that are wrong, but those are nitpicks or my mistake.
The idea of synapses detecting a frequency code is not exactly novel (it is the usual working hypothesis for some synapses in ...
More specifically, if the argument that we should expect a more intelligent AI we build to have a simple global utility function that isn't aligned with our own goals is valid then why won't the very same argument convince a future AI that it can't trust an even more intelligent AI it generates will share it's goals?
For the same reason that one can expect a paperclip maximizer to be both intelligent enough to defeat humans and stupid enough to misinterpret their goal, i.e. you need to believe the ability to select goals is completely separate from the ability to reach them.
(Beware: it’s hard and low-status to challenge that assumption on LW.)
Yes, that’s the crux. In my view, we can reverse…
Inability to distinguish noice and patters is true only for BBs. If we are real humans, we can percieve noice as noice with high probability.
… as « Ability to perceive noise means we’re not BB (high probability). »
Can you say more about why we can’t use our observations to solve this?
A true Boltzmann brain may have an illusion of the order in completely random observations.
Sure, just as a random screen may happen to look like a natural picture. That’s just exponentially unlikely in the picture size, whereas the scenario you suggest is indeed generic at producing brains that look like they evolved from simpler brains.
In other words, you escape the standard argument by adding an observation, namely that random fluctuations should almost never make our universe look like it obeys physical laws.
One alternative way to see this point is the following: if (2) our brains are random fluctuations, then they are exponentially unlikely to have been created long ago, whereas if (1) it is our observable universe itself that came from a random fluctuation, it could equally well have been created 10 billion years or 10 seconds ago. Then counting makes (1) much more likely than (2).
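As a toy illustration of the « exponentially unlikely with picture size » scaling (the « natural-looking » fraction below is a made-up assumption; only the exponential dependence on pixel count matters):

```python
import math

# Toy model: a binary screen of n pixels, and assume (arbitrarily and generously)
# that a fraction 2**(-0.5 * n) of all possible screens look like natural pictures.
def log10_prob_random_screen_looks_natural(n_pixels, natural_fraction_exponent=0.5):
    """Log10 probability that a uniformly random screen looks natural."""
    return -natural_fraction_exponent * n_pixels * math.log10(2)

for n in (100, 10_000, 1_000_000):
    print(n, f"~10^{log10_prob_random_screen_looks_natural(n):.0f}")
```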
0% that the tool itself will make the situation with the current comment ordering and discourse on platforms such as Twitter, Facebook, YouTube worse.
Thanks for the detailed answer, but I’m more interested in polarization per se than in the value of comment ordering. Indeed, we could imagine that your tool behaves as well as you wanted, but that this makes the memetic world less diverse and therefore more fragile (the way monocultures tend to collapse now and then). What would be your rough range for this larger question?
Nothing at all. I’m a big fan of this kind of idea and I’d love to present yours to some friends, but I’m afraid they’ll get dismissive if I can’t translate your thoughts into their usual frame of reference. But I get that you didn’t work on this aspect specifically; there are many fields in cognitive science.
As for how much specificity, that’s up to interpretation. A (1k by 1k by frame by cell type by density) tensor representing the cortical columns within the granular cortices is indeed a promising interpretation, although it’d probably still be missing an extrapyramidal tensor (and maybe an agranular one).
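For concreteness, here is a minimal sketch of one reading of that tensor, where « density » is the value stored at each (column x, column y, frame, cell type) entry; all shapes and cell-type names below are my own illustrative assumptions, not anything from the discussion:

```python
import numpy as np

# Purely illustrative dimensions: a 1k-by-1k sheet of cortical columns,
# a handful of sampled frames, and a made-up list of cell types.
n_x, n_y = 1000, 1000
n_frames = 10
cell_types = ["pyramidal", "spiny_stellate", "inhibitory"]

# One density value per (column, frame, cell type).
granular = np.zeros((n_x, n_y, n_frames, len(cell_types)), dtype=np.float32)

# Example: set the pyramidal-cell density of one column at one frame.
granular[500, 500, 0, cell_types.index("pyramidal")] = 0.8
```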
You mean this: "We're not talking about some specific location or space in the brain; we're talking about a process."
You mean there’s some key difference in meaning between your original formulation and my reformulation? Care to elaborate and formulate some specific prediction?
As an example, I once had a go at interpreting data from the olfactory system for a friend who was wondering if we could find signs of a chaotic attractor. If you ever toy with the Lorenz model, one key feature is: you either see the attractor by plotting x vs y vs z, or you can see it b...
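For readers who have never toyed with it, here is a minimal sketch of the Lorenz model with the textbook parameters; nothing here is specific to the olfactory data mentioned above:

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

def lorenz(t, state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Classic Lorenz system, the standard toy example of a chaotic attractor."""
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Integrate one trajectory from an arbitrary starting point.
sol = solve_ivp(lorenz, (0, 50), [1.0, 1.0, 1.0], dense_output=True)
x, y, z = sol.sol(np.linspace(0, 50, 10_000))

# Plotting x vs y vs z makes the butterfly-shaped attractor visible.
ax = plt.figure().add_subplot(projection="3d")
ax.plot(x, y, z, linewidth=0.5)
plt.show()
```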
Is accessing the visual cartesian theater physically different from accessing the visual cortex? Granted, there's a lot of visual cortex, and different regions seem to have different functions. Is the visual cartesian theater some specific region of visual cortex?
In my view: yes to the first, no to the second. To put some flesh on the bones, my working hypothesis is: what’s conscious is the gamma activity within an isocortex connected to the claustrum (because that’s the information which will get selected for the next conscious frame / can be considered as being in working memory).
...I'm not
I'm willing to speculate that [6 Hz to 10 Hz] is your 'one-shot' refresh rate.
It’s possible. I don’t think there was relevant human data in Walter Freeman’s time, so I’m willing to speculate that’s indeed the frame rate in mice. But I didn’t check the literature he had access to, so it’s just a wild guess.
the imagery of the stage 'up there' and the seating area 'back here' is not at all helpful
I agree there’s no seating area. I still find the concept of a cartesian theater useful. For example, it tells you where to plant electrodes if you want to...
A few comments now, more later. 😉
What I meant was that the connectionist alternative didn't really take off until GPUs were used, making massive parallelism possible.
Thanks for the clarification! I guess you already noticed how research centers in cognitive science seem to have a failure mode over a specific value question: do we seek excellence at the risk of overfitting funding agencies’ criteria, or do we seek fidelity to our interdisciplinary mission at the risk of compromising growth?
I certainly agree that, before the GPUs, the connectionist approach ...
Thanks, I didn’t know this perspective on the history of our science. The stories I heard most were indeed more about the HH model, the Hebb rule, Kohonen maps, RL, and then connectionism becoming deep learning...
If the object tends toward geometrical simplicity – she was using identification of visual objects as her domain – then a conventional, sequential, computational regime was most effective.
…but neural networks did refute that idea! I feel like I’m missing something here, especially since you then mention GPUs. Was « sequential » a typo?
Our daily whims might be a bit inconsistent, but our larger goals aren't.
It’s a key faith I used to share, but I’m now agnostic about it. To take a concrete example, everyone knows that blues and reds get more and more polarized. A grey type like my old self would have thought there must be an objective truth to extract, with elements from both sides. Now I’m wondering if ethics should end with: no truth can help decide whether future humans should be able to live like bees or like dolphins or like the blues or like the reds, especially when living like the reds...
Fascinating paper! I wonder how much they would agree that holography means sparse tensors and convolution, or that intuitive versus reflexive thinking basically amounts to the visuo-spatial sketchpad versus the phonological loop. Can’t wait to hear which other ideas you’d like to import from this line of thought.
I have no idea whether or not Hassibis is himself dismissive of that work
Well that’s a problem, don’t you think?
but many are.
Yes. Speaking as a cognitive neuroscientist myself: you’re right that many within my generation tend to dismiss symbolic approaches. We were students during a winter that many of us thought was caused by the over-promising and under-delivering of the symbolic approach, with Minsky as the main reason for the slow start of neural networks. I bet you have a different perspective. What are your three best points for changing my generation’s view?
Because I agree, and because « strangely » sounds to me like « with inconsistencies ».
In other words, in my view the orthodox view on orthogonality is problematic because it supposes that we can pick at will within the enormous space of possible functions, whereas the set of intelligent behaviors we can construct is more likely sparse and, by default, describable using game theory (think tit for tat).
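For readers who don’t have the reference in mind, here is a minimal sketch of tit for tat in an iterated prisoner’s dilemma, using the textbook payoff values:

```python
# Classic prisoner's dilemma payoffs: (row player, column player).
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    history_a, history_b = [], []   # each records the opponent's past moves
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(history_a), strategy_b(history_b)
        pa, pb = PAYOFFS[(move_a, move_b)]
        score_a, score_b = score_a + pa, score_b + pb
        history_a.append(move_b)
        history_b.append(move_a)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # mutual cooperation: (30, 30)
print(play(tit_for_tat, always_defect))  # exploited only on the first round: (9, 14)
```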
I’m a bit annoyed that Hassabis is giving neuroscience credit for the idea of episodic memory.
That’s not my understanding. To me he is giving neuroscience credit for the ideas that made it possible to implement a working memory in LLMs. I guess he didn’t want to use words like « thalamocortical », but from a neuroscience point of view transformers do look inspired by the isocortex, e.g. by the idea that a general distributed architecture can process any kind of information relevant to a human cognitive architecture.
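To make the analogy a bit more concrete, here is a minimal single-head self-attention step in plain numpy; calling it the transformer’s rough analogue of a shared workspace / working memory is my gloss, not anything Hassabis said:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention: every position can read from every other one."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                      # 5 tokens, 8-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)       # (5, 8)
```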
This is aiming at a different problem than goal agnosticism; it's trying to come up with an agent that is reasonably safe in other ways.
Well, assuming a robust implementation, I still think it obeys your criteria, but now that you mention « restrictive », my understanding is that you want this expression to refer specifically to pure predictors. Correct?
If yes, I’m not sure that’s the best choice for clarity (why not « pure predictors »?), but of course that’s your call. If not, can you give some examples of goal-agnostic agents other than pure predictors?
(The actual question is about your best utilitarian model, not your strategy given my model.)
A uniform distribution of kidney donations also sounds like the result when a donor is 10^19 times more likely to set the example. Maybe I should clarify that the donor is unlikely to take the 1% risk unless someone else is more critical to the war effort.
That may be too strong a statement. Say some new tool helps improve AI legislation more than AI design; this might end up slowing down the wheel.