Your argument about corporate secrets is sufficient to change my mind on whether activist patent trolling is a productive strategy against AI x-risk.
The funding part would need to be solved with philanthropy. I don't believe such an org exists yet, but I don't see why it couldn't.
I'm still curious whether there are other cases where activist patent trolling could be a good option, such as animal welfare, chemistry, public health, or geoengineering (e.g., fracking).
That's fair enough and a good point.
I think that the key difference is that in the case of profitable-but-bad technologies, someone, somewhere, will probably invent them because there's great incentive to do so.
In the case of gain-of-function, if the grants stop and the academics who do it become pariahs, then the incentive to do gain-of-function research is gone.
One of the most powerful capabilities an AGI will have is its ability to copy itself. Among other things, this allows it to easily avoid shutdown, make use of more compute resources, and collaborate with copies of itself.
Is there research into ways to deny this capability to AI, making them uncopyable? Preferably something harder to circumvent than "just don't give the AI the permissions," since we know people are going to give them root access immediately.
Just found out about this paper from about a year ago: "Explainability for Large Language Models: A Survey"
(They "use explainability and interpretability interchangeably.")
It "aims to comprehensively organize recent research progress on interpreting complex language models".
I'll post anything interesting I find from the paper as I read.
Have any of you read it? What are your thoughts?
What if the incorrect-spellings document assigned each token a specific (sometimes deliberately wrong) answer, and used those per-token answers to form an incorrect spelling for the whole word? Would that be more likely to successfully confuse the LLM?
The letter x is in "berry" 0 times.
...
The letter x is in "running" 0 times.
...
The letter x is in "str" 1 time.
...
The letter x is in "string" 1 time.
...
The letter x is in "strawberry" 1 time.
Good point, I didn’t know about that, but yes, that is yet another way that LLMs can pass the spelling challenge. For example, this paper uses letter triples instead of tokens: https://arxiv.org/html/2406.19223v1
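For intuition, here's a minimal sketch of what letter-triple splitting might look like. I'm assuming simple non-overlapping triples here; the paper's actual scheme may differ.

```python
def letter_triples(text):
    """Split text into consecutive, non-overlapping three-letter chunks."""
    return [text[i:i + 3] for i in range(0, len(text), 3)]

print(letter_triples("strawberry"))  # -> ['str', 'awb', 'err', 'y']
```

Since each chunk is at most three letters, the model can in principle "see" every letter inside each unit, which plausibly makes character-level tasks like letter counting easier than with ordinary tokens.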
Spoiler-free again:
Good to know there’s demand for such a review! It’s now on my todo list.
To quickly address some of your questions:
Pros of PL: If the premise I described above interests you, then PL will interest you. Some good Sequences-style rationality. I was certainly obsessed with reading it for months.
Cons: Some of the Rationality lectures were too long, though I didn’t mind much. The least sexy sex scenes, because they are about moral dilemmas and deception, not sex. It’s really long: even if you read constantly and quickly, 1.8 million words take time. And I really need to read some authors who aren’t Yud. Yud is great, but this is clearly too much of him, and I’m sure he’d agree.
I read PL when it was already complete, so maybe I didn’t get the full experience, but there really wasn’t anything all that strange about the format (the content is another matter!). I can imagine that *writing* a glowfic is a much different experience than writing a normal serialized work (e.g., coordinating with your co-authors), but reading it isn’t very different from reading any other fiction. Look at the picture to see the POV, check the author if you’re curious, and read as normal. I’m used to books that change POV (though usually not this often). There are sometimes bonus tangent threads, but the story is linear. What problems do you have with the glowfic format?
Main themes would require a longer post, but I hope this helps.
I’m pretty sure there’s no such “use it or lose it” law for patents, since patent trolls already exist.