You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

Internet Research (with tangent on intelligence analysis and collapse)

11 [deleted] 31 July 2013 04:58AM

Want to save time? Skip down to "I'm looking to compile a thread on Internet Research"!

Opinionated Preamble:

There is a lot of high-level thinking on Less Wrong, which is great. It's done wonders to structure and optimize my own decisions. I think the political and futurology-related issues that Less Wrong covers can sometimes get out of sync with the reality and injustices of events in the immediate world. There are comprehensive treatments of how medical science is failing, or how academia cannot give unbiased results, and this is the milieu of programmers and philosophers in the middle-to-upper class of the planet. I at least believe that this circle of awareness can be expanded, even if it's treading into mind-killing territory. If anything, I want to give people a near-mode sense of the stakes aside from x-risk: all in all, the x-risk scenarios I've seen Less Wrong fear the most kill humanity more or less instantly. A slower descent into violence and poverty is to me much more horrifying, because I might have to live in it and I don't know how. As a matter of fact, I have no idea how to predict it.

This is one reason why I'm drawn to the Intelligence Operations performed by the military and crime units, among other things. Intelligence product delivery is about raw and immediate *fact*, and there is a lot of it. The problems featured in IntelOps are among the few things rationality is good for - highly uncertain scenarios with one-off executions and messy or noisy feedback. Facts get lost in translation as messages are passed along, and of course feeding and receiving fake facts is all part of the job - but nevertheless, knowing *everything* *everywhere* is in the job description, and some form of rationality became a necessity.

It gets ugly. The demand for these kinds of skills often lies in industries that are highly competitive, violent, and illegal. I believe that once you take a close look at how force and power are applied in practice, there is no more pretending that human evils are an accident.

Open Source Intelligence, or "OSINT", is the mining of data and facts from public information databases, news articles, codebases, and journals. Although the amount of classified data dwarfs the unclassified, the size and scope of the unclassified is responsible for a majority of intelligence reports - and thus is involved in the great majority of executive decisions made by government entities. It's worth giving some thought to how much of what we know, they know too. As illustrated in this expose, the processing of OSINT is a big chunk of what modern intelligence is about, among many other things. I think understanding how rationality as developed on Less Wrong can contribute to better IntelOps, and how IntelOps can feed the rationality community, would be awesome, but that's a post for another time.

--

The Show

Through my investigations into IntelOps I've noticed the emphasis on search. Good search.

I'm looking to compile a thread on Internet Research. I'm wondering if there is any wisdom on Less Wrong we can draw on here about how to become more effective searchers. Here are some questions that could be answered specifically, but they are just guidelines - feel free to voice associated thoughts, we're exploring here.

  • Before actually going out and searching, what would be the most effective way of drafting and optimizing a collection plan? Are there any formal optimization models that inform our distribution of time and attention? Exploration vs exploitation comes to mind, but it would be worth formulating something specific. I heard that the multi-armed bandit problem is solved?
  • Do you have any links or resources regarding more effective search?
  • Do you have any experiences regarding internet research that you can share? Any patterns that you've noticed that have made you more effective at searching?
  • What are examples of closed-source information that are low-hanging fruit in terms of access (e.g. academic journals)? What are possible strategies for acquiring closed-source data (e.g. enrolling in small courses at universities, e-mailing researchers, coercion via the law/Freedom of Information Act, social engineering, etc.)?
  • I would like to hear from SEOs and software developers about their interpretation of semantic web technologies and how these are going to affect end-users. I am somewhat unfamiliar with the semantic web, but from my understanding, information that could not be indexed before is now indexed, and new ontologies will emerge as this information is mined. What should an end-user expect, and what opportunities will there be that didn't exist in the current generation of search?
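On the exploration vs exploitation question in the first bullet, here is a minimal sketch of one way it could be made concrete. It treats each information source as an arm of a multi-armed bandit and spends a fixed budget of queries using the UCB1 rule, a standard strategy for this problem (whether it suits real collection planning is an open question). The source names, the hit rates, and the helper names `ucb1_allocate` and `simulated_pull` are all made up for illustration; in practice the "reward" would be your own judgment of whether a query turned up something useful.

```python
import math
import random


def ucb1_allocate(sources, pull, budget):
    """Spend `budget` queries across `sources` using the UCB1 rule.

    `pull(source)` returns a reward in [0, 1], e.g. 1.0 if the query
    turned up something useful. Returns per-source query counts.
    """
    counts = {s: 0 for s in sources}
    totals = {s: 0.0 for s in sources}
    for t in range(1, budget + 1):
        untried = [s for s in sources if counts[s] == 0]
        if untried:
            # Try every source once before comparing them.
            choice = untried[0]
        else:
            # Average reward so far plus an exploration bonus that
            # shrinks the more often a source has been queried.
            choice = max(
                sources,
                key=lambda s: totals[s] / counts[s]
                + math.sqrt(2 * math.log(t) / counts[s]),
            )
        counts[choice] += 1
        totals[choice] += pull(choice)
    return counts


# Hypothetical hit rates: invented numbers, not measurements.
HIT_RATE = {"web search": 0.2, "scholar": 0.6, "forum": 0.1}
rng = random.Random(0)


def simulated_pull(source):
    """Simulate one query: 1.0 on a useful result, else 0.0."""
    return 1.0 if rng.random() < HIT_RATE[source] else 0.0


counts = ucb1_allocate(list(HIT_RATE), simulated_pull, budget=300)
print(counts)
```

The point of the sketch is only the shape of the policy: every source gets sampled, but the budget drifts toward whichever source has been paying off, without ever fully abandoning the others.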

That should be enough to get started. Below are some links that I have found useful with respect to Internet Research.

--

Meta-Search Engines or Assisted Search:

Summarizers:

Bots/Collectors/Automatic Filters:

Compilations and Directories:

Guides:

Practice:

I don't really care how you use this information, but I hope I've jogged some thinking about why it could be important.

Googling is the first step. Consider adding scholarly searches to your arsenal.

19 Tenoke 07 May 2013 01:30PM

Related to: Scholarship: How to Do It Efficiently

There has been a slightly increased focus on the use of search engines lately. I agree that using Google is an important skill - in fact I believe that for years I have come across as significantly more knowledgeable than I actually am just by quickly looking for information when I am asked something.

However, there are obviously some types of information which are more accessible by Google and some which are less accessible. For example, distinct characteristics, specific dates of events, etc. are easily googleable[1] and you can expect to quickly find accurate information on the topic. On the other hand, if you want to find out more ambiguous things, such as the effects of having more friends on weight or even something like the negative and positive effects of a substance, then googling might leave you with contradictory results or inaccurate information, or at the very least it will likely take you longer to get to the truth.

I have observed that in the latter case (when the topic is less 'googleable') most people, even those knowledgeable about search engines and 'science', will just stop searching for information after not finding anything on Google, or even before[2], unless they are actually willing to devote a lot of time to finding it. This is where my recommendation comes in - consider doing a scholarly search like the one provided by Google Scholar.

And, no, I am not suggesting that people should read a bunch of papers on every topic that they discuss. By using some simple heuristics we can easily gain a pretty good picture of the relevant information on a large variety of topics in a few minutes (or less in some cases). The heuristics are as follows:

1. Read only or mainly the abstracts. This saves you time while still giving you a lot of information, and it is the key to the most cost-effective way of quickly finding information from a scholarly search. Often you wouldn't have immediate access to the paper anyway, but you can almost always read the abstract. And if you follow the other heuristics you will still be looking at relatively 'accurate' information most of the time. On the other hand, if you are looking for more information and have access to the full paper, then the discussion and conclusion sections are usually the second best thing to look at; and if you are unsure about the quality of the study, then you should also look at the method section to identify its limitations.[3]

2. Look at the number of citations for an article. The higher the better. Fewer than 10 citations in most cases means that you can find a better paper.

3. Look at the date of the paper. Often more recent = better. However, you can expect fewer citations for more recent articles and you need to adjust accordingly. For example, if the article came out in 2013 but has already been cited 5 times, this is probably a good sign. For new articles the subheuristic that I use is to evaluate the 'accuracy' of the article by judging the author's general credibility instead - argument from authority.

4. Meta-analyses/Systematic Reviews are your friend. This is where you can get the most information in the least amount of time!

5. If you cannot find anything relevant, fiddle with your search terms in whatever ways you can think of (you usually get better at this over time by learning which search terms give better results).
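Heuristics 2 to 4 can be sketched as a rough triage score. This is only an illustration of the "adjust citations for age" idea, not a validated metric: the `Paper` fields, the function names, the example papers, and the doubling weight for reviews are all assumptions invented here.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    year: int
    citations: int
    is_review: bool = False  # meta-analysis or systematic review


def score(paper, current_year=2013):
    """Citations per year since publication, doubled for reviews.

    Dividing by age keeps recent papers from being penalized for
    having had less time to be cited. The factor of 2 for reviews
    is an arbitrary illustrative weight.
    """
    age = max(current_year - paper.year + 1, 1)
    per_year = paper.citations / age
    return per_year * (2.0 if paper.is_review else 1.0)


def rank(papers, current_year=2013):
    """Return papers sorted from most to least promising."""
    return sorted(papers, key=lambda p: score(p, current_year), reverse=True)


# Made-up search results, purely for illustration.
papers = [
    Paper("Old classic", 1995, 190),
    Paper("Recent study", 2013, 5),
    Paper("Recent meta-analysis", 2010, 40, is_review=True),
    Paper("Obscure note", 2005, 4),
]
ranked = rank(papers)
for p in ranked:
    print(f"{score(p):6.2f}  {p.title}")
```

With these invented numbers the 2013 paper with 5 citations lands above the older paper with 4, which matches the adjustment described in heuristic 3. The ordering only suggests which abstracts to read first; it says nothing about whether a paper is right.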

That's the gist of it. By reading a few abstracts in a minute or two you can search for our scientific knowledge on a subject with almost the same speed as searching for specific information on topics that I dubbed googleable. In my experience scholarly searches on pretty much anything can be really beneficial. Do you believe that drinking beer is bad but drinking wine is good? Search on Google Scholar! Do you think that it is a fact that social interaction is correlated with happiness? Google Scholar it! Sure, it might seem obvious to you that X, but it doesn't hurt to search on Google Scholar for a minute just to be able to cite a decent study on the topic to those X disbelievers.

 

This post might not be useful to some people but it is my belief that scholarly searches are the next step of efficient information seeking after googling and that most LessWrongers are not utilizing this enough. Hell, I only recently started doing this actively and I still do not do it enough. Furthermore I fully agree with this comment by gwern:

My belief is that the more familiar and skilled you are with a tool, the more willing you are to reach for it. Someone who has been programming for decades will be far more willing to write a short one-off program to solve a problem than someone who is unfamiliar and unsure about programs (even if they suspect that they could get a canned script copied from StackExchange running in a few minutes). So the unwillingness to try googling at all is at least partially a lack of googling skill and familiarity.

A lot of people will be reluctant to start doing scholarly searches because they have barely done any or because they have never done it. I want to tell those people to still give it a try. Start by searching for something easy, maybe something that you already know from Less Wrong or from somewhere else. Read a few abstracts; if you do not understand a given abstract, try finding other papers on the topic - some authors adopt a more technical style of writing, others focus mainly on statistics, etc. - but you should still be able to find some good information if you read multiple abstracts and identify the main points. If you cannot find anything relevant then move on and try another topic.

 

P.S. In my opinion, when you are comfortable enough to have scholarly searches as a part of your arsenal you will rarely have days when there is nothing to check for. If you are doing 1 scholarly search per month for example you are most probably not fully utilizing this skill.

 


1. By googleable I mean that the search terms are google friendly - you can relatively easily and quickly find relevant and accurate information.
2. If the people in question have developed a sense for what type of information is more accessible by google then they might not even try to google the less accessible-type things.
3. If you want to get a better and more accurate view on the topic in question you should read the full paper. The heuristic of mainly focusing on abstracts is cost-effective but it invariably results in a loss of information.

 

 

Use Search Engines Early and Often

0 katydee 05 May 2013 08:33AM

The Internet contains vast amounts of useful content. Unfortunately, it also contains vast amounts of garbage, superstimulus hazards, and false, meaningless, or outright harmful information. One skill that is hence quite useful in the modern day is using search engines correctly, allowing you to separate the wheat from the chaff. When doing so, you can often uncover preexisting work that solves your problem for you, the answers to relevant factual questions, and so on. It is rare to find a situation where search engines are outright useless-- at the very least they tend to point you in the direction of useful information.

Further, the time cost of setting up and refining a search is extremely low, meaning that most of the time "just Google it" should in fact be your default response to a situation where you don't have very much information.[1] Overall, I consider one's ability to use search engines-- and, just as importantly, one's ability to recognize what types of situations can benefit from using them-- a basic but fairly significant instrumental rationality skill.

Much of the above sounds extremely obvious, and in point of fact it should be-- but the fact remains that people don't use search engines anywhere near as often as they seemingly should. I've frequently found myself in situations where someone in the same room as me asks me a trivially searchable factual question while we are both using computers. Worse still, I've been in situations where people do the same over IRC! The existence of lmgtfy indicates that others have noticed this issue before, and yet it remains a problem.

So, how can we do better?

One easy trick that I've found very helpful is to use Goodsearch instead of Google. Goodsearch is a service that automatically donates a cent to a charity of your choice whenever you search.[2] Further, it can be installed into your search toolbar in Firefox, making the activation cost of using Goodsearch rather than Google essentially zero if, like me, you tend to search in the search bar instead of the URL field. Goodsearch has had profound effects on my tendency to perform searches because it gives me a little hit of "doing good" every time I perform a search, thus encouraging me to do so in more situations, thus causing me to accrue more money via Goodsearch, etc.

This has not only made me more productive by causing me to search more but added positive externalities to every search I conduct. Earlier, I would say that I frequently used search engines to find out information about a new topic or project-- now I would say that I nearly automatically do this as the first step in most situations where I need some information before proceeding. The potential information gained from a search is very high, the costs of performing a search are very low, and with Goodsearch you can donate a little bit to charity while you do so.

If you're reading this in Firefox and haven't already spent large amounts of time getting used to advanced search methods in other engines (and maybe even if you have), I strongly suggest navigating over to Goodsearch, signing up for an account, and installing the Goodsearch App to make it your default toolbar search. For me, this proved to be a big win-- opportunities to increase instrumental rationality for only a minimal time expenditure while also earning free money for charity are not exactly common!

 

[1] Note that there are some things you might not want to Google. I would, for instance, be very careful about what terms I used if I were looking into the history of political assassinations.

[2] Before anyone gets too clever, there are restrictions.

Learn Power Searching with Google

18 [deleted] 02 July 2012 07:09PM

Google Search makes it amazingly easy to find information. Come learn about the powerful advanced tools we provide to help you find just the right information when the stakes are high.

Daniel Russell is doing a free Google class on how to search the web. Besides six 50-minute classes it will include interactive activities to practice new skills. Upon passing the post-course assessment you get a Certificate of Completion.

Advanced search skills are not only a useful everyday skill but vital to doing scholarship. Searching the web is a superpower that would make thinkers of previous centuries green with envy. Learn to use it well. I recommend checking out Inside Search, Russell's blog, or perhaps reading the article "How to solve impossible problems" to get a feel for what you can expect to gain from it.

I think for most people the value of information is high enough to be worth the investment. I also suspect it will be plain fun. I am doing the class and strongly recommend it to fellow LessWrong users. If anyone else has registered, please say so publicly in the comments as well. :)

Registration is open from June 26, 2012 to July 16, 2012.