I expect you'd get better results by using older, less hyped NLP techniques that are designed for this sort of thing:

https://stackoverflow.com/questions/15377290/unsupervised-automatic-tagging-algorithms

The tagging work that's already been done need not be a waste, because you can essentially use it as training data for the kind of tags you'd like an automated system to discover and assign. For example, tweak the hyperparameters of the topic modeling system until it is really good at independently rediscovering/reassigning the tags that have already been manually assigned.
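A minimal sketch of that tuning loop, under loud assumptions: the "tagger" here is a toy keyword-overlap model standing in for a real topic model, and every name (`fit_keywords`, `predict`, `micro_f1`, `tune`), both hyperparameters (`top_k`, `threshold`), and the toy posts are hypothetical and invented for illustration. The point is only the shape of the loop: fit on some manually tagged posts, then keep the hyperparameters that best reproduce the manual tags on posts held back.

```python
from collections import Counter
from itertools import product

def tokenize(text):
    # Crude tokenizer: lowercase, split on whitespace, drop very short words.
    return [w for w in text.lower().split() if len(w) > 3]

def fit_keywords(docs, top_k):
    """For each manual tag, keep the top_k most frequent words in its posts."""
    counts = {}
    for text, tags in docs:
        for tag in tags:
            counts.setdefault(tag, Counter()).update(tokenize(text))
    return {tag: {w for w, _ in c.most_common(top_k)} for tag, c in counts.items()}

def predict(text, keywords, threshold):
    """Assign each tag whose keywords make up at least `threshold` of the words."""
    words = tokenize(text)
    if not words:
        return set()
    return {tag for tag, kws in keywords.items()
            if sum(w in kws for w in words) / len(words) >= threshold}

def micro_f1(pairs):
    """Micro-averaged F1 over (predicted tags, manual tags) pairs."""
    tp = sum(len(p & m) for p, m in pairs)
    if tp == 0:
        return 0.0
    precision = tp / sum(len(p) for p, _ in pairs)
    recall = tp / sum(len(m) for _, m in pairs)
    return 2 * precision * recall / (precision + recall)

def tune(train, held_out, top_ks, thresholds):
    """Grid search: return (score, top_k, threshold) that best matches manual tags."""
    best = None
    for top_k, threshold in product(top_ks, thresholds):
        keywords = fit_keywords(train, top_k)
        pairs = [(predict(text, keywords, threshold), set(tags))
                 for text, tags in held_out]
        score = micro_f1(pairs)
        if best is None or score > best[0]:
            best = (score, top_k, threshold)
    return best

# Toy data, invented for illustration only.
train = [
    ("bayesian prior posterior evidence update belief", ["Bayesianism"]),
    ("newcomb predictor boxes decision causal evidential", ["Decision Theory"]),
]
held_out = [
    ("posterior belief update after evidence", ["Bayesianism"]),
    ("causal decision about predictor boxes", ["Decision Theory"]),
]
print(tune(train, held_out, top_ks=[3, 10], thresholds=[0.1, 0.5, 0.9]))
```

With a real topic model, the grid would presumably range over things like the number of topics instead, but the scoring against existing manual tags would look much the same.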

An advantage of the automated approach is that you should be able to reapply it to some other document corpus--for example, autogenerate tags for the EA Forum, or all AI alignment related papers/discussion off LW, or the entire AI literature in order to help with/substitute for this job https://intelligence.org/2017/12/12/ml-living-library/ (especially if you can get some kind of hierarchical tagging to work)

I've actually spent a while thinking about this sort of problem and I'm happy to video call and chat more if you want.

76 PSA: Tagging is Awesome

by abramdemski

30th Jul 2020

I'd like to supplement the open call for taggers with a few points:

Tagging is a neglected cause area. There is still a huge amount to do, and tagging makes a real difference in making LessWrong content easier to discover, explore, and re-find.
The problem is real. People find LessWrong difficult to read because it is full of deep inferential distances and special jargon. Tags offer an easy way to disambiguate jargon, and reference all the relevant material on a subject.
Tagging isn't just altruistic. Want to promote an idea/topic? Tagging posts and writing a good tag description is a great way to make that topic easier to discover and explore. If writing blobs of text which lots of people later read pumps your ego, tagging is a good way to do that. Write that tag before someone else! But it's also useful -- not just to other people, but also, to yourself. Tagging posts on subjects you love, and upvoting the tag on the most relevant ones, will make it easier for you to reference them later.
You will probably discover things you want to read. Tagging naturally gets you searching for content on LessWrong related to your favorite topics. You are likely to discover that more has been written on these topics than you previously realized.
Tagging is easy. Whenever you think of a tag you want to exist (usually because you're reading a post and decide to tag it with something, only to discover the tag doesn't exist yet), just do a search for that thing on LessWrong and tag all the relevant results! There are other approaches, of course, but if everyone did this, then we'd be in a pretty good position: any new tag created would already be put on most of the relevant posts. (This strategy doesn't work for tags which cover a significant portion of the content on LessWrong, such as the core tags, of course.)
- If you're not sure what to tag, take a look at the top posts without tags. You may want to familiarize yourself with the core tags and the concepts portal, so that you're not missing some obvious ones when you tag things.

Wiki/Tagging, Site Meta



[-]Yoav Ravid5y150

Seems worth adding a link to the list of top posts without tags

[-]Gurkenglas5y120

Just ask GPT to do the tagging, people.

[-]Ruby5y30

You know, it'd probably work.

[-]Raemon5y150

We sure did just talk about this for 15 minutes. Seems like GPT would actually do a decent (and/or interesting?) job. But, also, man, I sure have reservations about giving GPT any control over the ontology of the AI Alignment space.

[-]Raemon5y260

Update: we asked GPT-3 to create some tags, and it suggested, among other things: "Robotic Autonomy"

[-]Matt Goldenberg5y20

For which article? Was it an article about Robotic Autonomy? Or did you just give it a list of LW tags and have it create more?

[-]habryka5y50

And here is me asking it to extend the list of all Rationality tags:

-------- My Input -------
Alief
Anticipated Experiences
Bayes Theorem / Bayesianism
Bounded Rationality
Cached Thoughts
Conservation of Expected Evidence
Decision Theory
Epistemology
Game Theory
Hansonian Pre-Rationality
Law-Thinking
Newcomb's Problem
Robust Agents
Solomonoff Induction
Utility Functions
Value of Information
Betting
Dark Arts
Focusing
Forecasting & Prediction
Goal Factoring
Hamming Questions
Identity
Inside/Outside View
Internal Double Crux
Introspection
Noticing
Scholarship & Learning
Techniques
Trigger-Action Planning
Affect Heuristic
Bucket Errors
Compartmentalization
Confirmation Bias
Fallacies
Goodhart's Law
Groupthink
Heuristics and Biases
Mind Projection Fallacy
Motivated Reasoning
Pica
Pitfalls of Rationality
Rationalization
Self-Deception
Sunk-Cost Fallacy
Common Knowledge
Conversation
Decoupling vs Contextualizing
Disagreement
Inferential Distance
Information Cascades
Philosophy of Language
Steelmanning
Consciousness
Dual Process Theory (System 1 & 2)
General Intelligence
Subagents
Predictive Processing
Perceptual Control Theory
Center for Applied Rationality
Curiosity
Gears-Level
Memetic Immune System
Practice & Philosophy of Science
Rationality Quotes

------- GPT3's Response -------

Mental Simulation
Mind Projection Fallacy
Modal Rationality
Narrative Causality
Simulation Heuristic
Subjunctive Probability
Bayes' Theorem
Blind Spots
Counterfactuals
Deontology
Deontology Quotes
Epistemic Regret
Epistemic Utility
Eternalism
Eternalism Quotes
Experientialism
Fictionalism
Fictionalism Quotes
Game Theory
Hansonian Pre-Rationality
Law-Thinking
Logical Fallacies
Predictive Processing
Predictive Processing Quotes
Psychology of Science
Rationality
Rationality Quotes
Science & Rationality
Science Quotes
Scientific Progress
Scientific Progress Quotes
Scientific Realism
Scientific Realism Quotes
Solomonoff's Theory
Solomonoff's Theory Quotes
Solomonoff's Prior
Solomonoff's Prior Quotes
Solomonoff's Induction
Solomonoff's Induction Quotes
Solomonoff's Algorithm
Solomonoff's Algorithm Quotes
Rationalist Community
Affective Death Spirals
Anthropic Bias
Bayes' Theorem
Bias Heuristic
Confirmation Bias
Deductive vs Inductive
Epistemic Status
Epistemic Utility
Expert Testimony
Extrapolative Tendencies
False-Positive Paradox

[-]Gurkenglas5y60

Long outputs will tend to naturally deteriorate, as it tries to reproduce the existing deterioration and accidentally adds some more. Better: Sample one tag at a time. Shuffle the inputs every time to access different subdistributions. (I wonder how much the subdistributions differ for two random shuffles...) If you output the tag that has the highest minimum probability in each of a hundred subdistributions, I bet that'll produce a tag that's not in the inputs.
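The "highest minimum probability" rule above can be sketched roughly as follows. Everything named here is an assumption for illustration: `next_tag_probs` is a hypothetical callable standing in for whatever returns the model's distribution over the next tag given a prompt ordering, and `toy_model` is a fake that deliberately favours an order-dependent "Quotes" continuation. Treating a tag that is absent from a distribution as probability 0 is also a choice made here, not something specified in the comment.

```python
import random

def pick_robust_tag(existing_tags, next_tag_probs, n_shuffles=100, seed=0):
    """Return the tag whose lowest probability across shuffled prompts is highest."""
    rng = random.Random(seed)
    distributions = []
    for _ in range(n_shuffles):
        order = list(existing_tags)
        rng.shuffle(order)  # a different ordering for every sample
        distributions.append(next_tag_probs(order))
    # Every tag proposed under at least one ordering is a candidate.
    candidates = set().union(*distributions)
    if not candidates:
        return None
    # A tag missing from some distribution counts as probability 0 there.
    return max(sorted(candidates),
               key=lambda tag: min(d.get(tag, 0.0) for d in distributions))

def toy_model(order):
    """Hypothetical stand-in for a language model; not a real API."""
    probs = {"Counterfactuals": 0.3, "Epistemic Status": 0.25}
    # Order-dependent artifact: riff on whichever tag happens to come last.
    probs[order[-1] + " Quotes"] = 0.45
    return probs

tags = ["Alief", "Betting", "Curiosity"]
print(pick_robust_tag(tags, toy_model))
```

In this toy setup the "Quotes" variants each win under some orderings but score 0 under others, so the max-min rule passes over them in favour of a tag that is moderately likely under every shuffle.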

[-]abramdemski5y40

Shuffling would also be good to combat the alphabetic order, which has got to be skewing output somehow.

[-]habryka5y40

I just did it again, asking it to extend the list of all the AI tags.

------- My Input -----

AIXI(23)
Complexity of Value(8)
Corrigibility(12)
Decision Theory(83)
Embedded Agency(13)
Fixed Point Theorems(7)
Goodhart's Law(29)
Inner Alignment(17)
Instrumental Convergence(9)
Logical Induction(22)
Mesa-Optimization(10)
Myopia(6)
Newcomb's Problem(4)
Optimization(21)
Orthogonality Thesis(4)
Outer Alignment(7)
Solomonoff Induction(9)
Utility Functions(20)
AI Boxing (Containment)(8)
Debate (AI safety technique)(9)
Factored Cognition(8)
Humans Consulting HCH(7)
Impact Measures(21)
Inverse Reinforcement Learning(14)
Iterated Amplification (26)
Transparency / Interpretability(9)
Value Learning(39)
AI Risk(53)
AI Services (CAIS)(8)
AI Takeoff(38)
AI Timelines(28)
CHAI (UC Berkeley)(12)
FHI (Oxford)(13)
MIRI(72)
OpenAI(17)
Ought(7)
Alpha- (algorithm family)(10)
GPT(44)
Research Agendas(27)

-------- GPT3's Response ---------

Happiness / Life Satisfaction(3)
Existential Risks(6)
Growth Mindset(6)
Optimization(21)
Metacognition(9)
Probabilistic Thinking(9)
Rationality(33)
Retroduction(9)
Semi-HUMANs(6)
Superintelligence(40)
Tractability(18)
Trust(5)
VNM(44)
Books(1)
Causal Explanation(5)
Computational Equivalence(9)
Computational Models(13)
Computationalism(11)
Computational Psychosociology(8)
Computational Sociology(8)
Conferences(4)
Control Theory(6)
Deliberation(10)
Egalitarianism(7)
Ethical Alignment(7)
Ethical Uncertainty(6)
Evolutionary Game Theory(5)
Experimental Ethics(9)
Experimental Philosophy(9)
Fairness(5)
Futarchy(8)
Game Theory(9)
Genetic Algorithms(6)
Happiness / Life Satisfaction(3)
Human Level AI(7)
Human-level Intelligence(5)
Human-Level Systems(5)
Impact Measures(21)
Incentives(6)

[-]John_Maxwell5y30

I expect you'd get better results by using older, less hyped NLP techniques that are designed for this sort of thing:

https://stackoverflow.com/questions/15377290/unsupervised-automatic-tagging-algorithms

I've actually spent a while thinking about this sort of problem and I'm happy to video call and chat more if you want.

[-]Raemon5y*40

In this case someone just gave it a list and asked it to create more. (I do think the ideal process here would have been to feed it some posts + corresponding taglists, and then given it a final post with a "Tags: ..." prompt. But, that was a bit more work and nobody did it yet AFAICT)
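For concreteness, a sketch of what assembling that kind of few-shot prompt might look like. The `Tags:` cue comes from the description above; the `Post:` label, the blank-line separators, the function name `build_tagging_prompt`, and the example posts are all assumptions made up for illustration, and no particular model API is implied.

```python
def build_tagging_prompt(examples, new_post):
    """Build a few-shot prompt: tagged posts first, then the untagged post.

    `examples` is a list of (post_text, tag_list) pairs. The prompt ends
    with an open "Tags:" line for the model to complete.
    """
    parts = []
    for text, tags in examples:
        parts.append(f"Post: {text}\nTags: {', '.join(tags)}\n")
    parts.append(f"Post: {new_post}\nTags:")
    return "\n".join(parts)

# Invented example posts, for illustration only.
examples = [
    ("An essay on one-boxing versus two-boxing.",
     ["Newcomb's Problem", "Decision Theory"]),
    ("Notes on when a proxy measure stops tracking the goal.",
     ["Goodhart's Law"]),
]
prompt = build_tagging_prompt(examples, "A post about updating on evidence.")
print(prompt)
```

The completion would then be split on commas to recover the suggested tags, ideally after checking them against the existing tag list.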

[-]Gurkenglas5y20

You make it sound like it wants things. It could at most pretend to be something that wants things. If there's a UFAI in there that is carefully managing its bits of anonymity (which sounds as unlikely as your usual conspiracy theory - a myopic neural net of this level should keep a secret no better than a conspiracy of a thousand people), it's going to have better opportunities to influence the world soon enough.

[-]Raemon5y60

Sorry, to be clear this was a joke.

[-]Raemon5y20

(joke was more about the general principle of putting opaque AIs in charge of alignment ontology, even if this one obviously wasn't going to be adversarial about it)

[-]abramdemski5y40

I think the concern is more "it wouldn't optimize the ontology carefully".

[-]Raemon5y20

Thanks!

One thing I'll add is that the tagging we could most use help with is fairly "conceptually nuanced." We're most excited to have taggers that are something like "dedicated ontologists" who think seriously about LessWrong's concepts and how they fit together.

This role requires a bit of onboarding/getting-in-sync, and I'd be interested in chatting with anyone who's interested in that aspect of it. Send me (or Ruby) a PM if you're interested.

[-]abramdemski5y40

I've mainly been tagging special topics which have few actual posts dedicated to them, and whose posts are not usually so popular. For example, Hansonian Pre-Rationality. Special topics like this require conceptual nuance in the sense that the tagger should be familiar with the topic (which is a high bar, if these are posts which relatively few people read or which have been long forgotten by most people).

For this sort of tagging to happen, basically, someone with the niche interest has to decide to make the tag.

I'm hoping more people with niche interests will do that. I also kind of think of this as the main benefit of tagging?

It sounds like this differs somewhat from your picture of what tagging is most valuable / what the LW team primarily needs help with. Do you think so?

[-]Raemon5y40

(totally off the cuff, this is not official Ray Opinion let alone a commonly endorsed LW Team opinion)

I think there's something of a spectrum of value.

I think it's straightforwardly valuable to add the Obvious Posts to the Obvious Tags, and this is a job that most people can do.

I think, indeed, niche specialists are going to need to make the Niche Specialist tags.

The thing that gets a bit tricky is that different people might be finding the same topics but giving them different names. Something that I think is useful for Niche Specialists to do is to also be putting some effort into "taking in the overall evolving tagging structure". Then, thinking about how to resolve multiple overlapping tags, and/or competing definitions within tags, etc.

Perhaps most interestingly: most of the LW team is currently leaning towards tags evolving in a wiki-like direction. So, one of the things we need here is not just good tagging, but good pedagogy for introducing various concepts on a given tag page. And that requires some thinking about how the concept relates to other concepts.

I guess this means that there are maybe two types of people I'm especially excited to see get involved with Tagging: people with the niche interests (who would be even more beneficial if they learned to look holistically at the concept-space), and people who are good at pedagogy and/or mapping out the concept-space, who could probably stand to learn a bit more about individual niche areas to help improve the descriptions for the niche tags.

(I find myself thinking about Scott Alexander's Alchemists / Teachers story where you need a mix of people who are particularly good at a subject to even understand the details at all, and people who are good at education to make it easier to understand, and various points on the spectrum between to bridge the gaps)
