
PSA: Please list your references, don't just link them

21 Post author: Benja 04 January 2013 01:57AM

In what became the 5th most-read new post on LessWrong in 2012, Morendil told us about a study widely cited in its field... except that the source cited, which isn't online and is really difficult to get, makes a different claim -- and turns out to not even be the original research, but a PowerPoint presentation given ten years after the original study was published!

Fortunately, the original study turns out to be freely available online, for all to read; Morendil's post has a link. The post also tells us the author and the year of publication. But that's all: Morendil didn't provide a list of references; he showed how the presentation is usually cited, but didn't give a full citation for the original study.

The link is broken now. The Wayback Machine doesn't have a copy. The address doesn't give hints about the study's title. I haven't been able to find anything on Google Scholar with author, year, and likely keywords.

I rest my case.

Comments (43)

Comment author: RomeoStevens 04 January 2013 03:07:42AM *  10 points [-]

I only take citations as weak evidence until I've reviewed them. Too many people dabbling in scientism these days with the internet making it easy to link to a few articles whose abstracts support your point. Oh look another nutrition article based on rat studies and elderly stroke victims. Fun.

Comment author: John_Maxwell_IV 05 January 2013 10:41:41AM *  1 point [-]

You don't think an article's abstract is significant Bayesian evidence? (How about the abstract of a meta-analysis?) Which is the weaker link here: from blog post to abstract or from abstract to actual paper?

Too many people dabbling in scientism these days with the internet making it easy to link to a few articles whose abstracts support your point.

Can't have those unwashed masses linking to scientific papers now can we? :)

Comment author: RomeoStevens 05 January 2013 12:05:06PM 3 points [-]

Abstracts of meta-analyses are significantly better. The problem with normal papers is that the abstract doesn't always specify the methodology, effect size, and clinical relevance.

Comment author: gwern 04 January 2013 02:33:04AM 9 points [-]

Actually, the Wayback Machine might have a copy, but even if it did, you couldn't get it: http://findarticles.com/robots.txt now specifies User-agent: * Disallow: / which is a big FU to the Internet Archive and also the Google cache.
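To make the effect of those two directives concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler names and the article path are illustrative assumptions, not taken from the site; the point is only that a blanket `Disallow: /` under `User-agent: *` blocks every well-behaved crawler from every path.

```python
from urllib.robotparser import RobotFileParser

# The two directives quoted above, laid out as they would be in robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical article URL and example crawler names, for illustration only.
    url = "http://findarticles.com/some/article"
    for agent in ("ia_archiver", "Googlebot"):
        print(agent, "allowed:", is_allowed(agent, url))
```

A crawler that honors robots.txt will skip the whole site under these rules, so no copy of a page can be fetched through it.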

By the way, based solely on the information in the article, I was able to find the citation and the actual full original publication in under 2 minutes. Can you guess how?

Comment author: Benja 04 January 2013 02:58:03AM 4 points [-]

Ah, thanks. (Here, page 57 and following; the article is "Dissecting software failures", published in the Hewlett-Packard journal, April 1989.) I did forget to try that, but it's rather a piece of luck that Morendil's article contains that.

Comment author: gwern 04 January 2013 03:02:47AM *  8 points [-]

So how did you find it this time? I'm always curious about this phenomenon where person A goes "I can't do it!", person B says "there is a solution", and person A then goes "ah!"

This incident actually looks a bit like the Shannon anecdote I quote in http://www.gwern.net/on-really-trying

BTW, if you hate Scribd as much as I do, once you know what issue of the HP journal it's in, you can easily find the official HP archives and download the PDF at http://www.hpl.hp.com/hpjournal/pdfs/IssuePDFs/1989-04.pdf (Scribd shows up as the main hit just because they did OCR on their copy of the PDF, but where did their uploader get it, one should wonder upon seeing it.)

Comment author: Qiaochu_Yuan 04 January 2013 05:20:54AM 4 points [-]

I'm always curious about this phenomenon where person A goes "I can't do it!", person B says "there is a solution", and person A then goes "ah!"

Yes, I noticed this back when I was doing math competitions: it was often much easier for me to find a solution to a problem if someone told me that they had found a solution, especially if they had found it quickly. The obvious corollary is that you should first approach problems as if you knew someone who had found a solution quickly, but I never successfully internalized this.

Comment author: TheOtherDave 04 January 2013 05:46:47AM 11 points [-]

My favorite variation of this was when one of our developers asked me to review a design she was contemplating for fixing a defect.

So she went through it in some detail, and I worked through some edge cases, and finally said "Yeah, this looks OK to me. You should go talk to Mark about the tax allocation bit over here, though, because he understands the tax code better than I do and he may notice stuff I won't. For example, he'd probably notice that this will fail in cases where thus-and-such is true.... um... which I, er, wouldn't notice."

And she looked at me a little confused, and I said "So, there's a problem with this design in cases where thus-and-such is true. We should modify the design" and we kept going as if that particular brain failure hadn't been narrated out loud.

My guess is I do this all the time, but I remember that incident because I was vocalizing my thoughts.

Comment author: Qiaochu_Yuan 04 January 2013 10:50:08AM 8 points [-]

I have also been told to use this as a problem-solving technique (namely pretending you are a different person and seeing what they would notice), but I am not very good at this either. I tried to run a simulation of MoR!Quirrell in my head, but my head is not a sufficiently interesting place for him to be at the moment, so I think he left.

Comment author: TheOtherDave 04 January 2013 02:45:34PM 3 points [-]

*chuckle*

I've done some playing around with this and have come to the tentative conclusion, backed up by no evidence, that the key thing isn't really pretending to be someone else, but rather relaxing the constraints that I keep around "me". That is, it's not so much creating a "what would Mark think?" simulation as it is temporarily purging my "what kinds of things does Dave not think?" filters.

Which is to say, it's basically a question of maximizing creativity.

Comment author: gwern 04 January 2013 04:03:39PM 2 points [-]

So you think these sorts of incidents are just another form of rubber-ducking?

Comment author: TheOtherDave 04 January 2013 04:35:54PM 2 points [-]

Mm... in a sufficiently broad sense, yes, but in detail, not really.

I would say that rubber-ducking (by which I assume you mean the exercise of explaining a complex technical concept, like the flow of control through code, to an inanimate object before submitting it to group review) is primarily a technique for attentional control; it forces me to actually think through a problem rather than simply telling myself that I have thought through the problem.

I think what goes on in these sorts of incidents is somewhat different, though related in many ways.

Basically, I think I've got a set of "the sorts of things Dave thinks" filters that run in my head, and there are some useful thoughts that my brain is capable of generating that tend to get excluded from my conscious awareness by those filters (because they "aren't the sort of thing Dave would think"), and sometimes it can be useful to subvert or reconfigure those filters.

And role-playing of this sort ("What would I say if I were Mark?") is one way to reconfigure those filters.

Comment author: gwern 04 January 2013 04:43:24PM 0 points [-]

And role-playing of this sort ("What would I say if I were Mark?") is one way to reconfigure those filters.

So is that what is going on in this search and that Shannon example? But that seems a little weird, why would Benja have a 'gwern filter' in his head which says 'the article has a direct quote from G89, gwern would try searching a direct quote, so I should too'?

Comment author: TheOtherDave 04 January 2013 05:24:35PM 1 point [-]

WRT the Shannon example... well, yes and no.

I suppose something similar is going on: Shannon has been invited to step out of the frame that he's in and step into a new one, where he is identifying with his brother, who knows something important about how to get to a solution to the puzzle from where Shannon is now, and that reframing helps encourage creativity. But also, and significantly, Shannon's brother has given him a new datum: there is a discrete thing-to-be-told which would significantly help. (This is, admittedly, implicit. But if I don't assume it, the story makes no sense to me.)

So no, I don't think it's the only thing going on, or necessarily the most important thing.

And I disagree with "you can always give it to yourself," actually. Or, rather, with the implicit statement that doing so is necessarily useful. For some puzzles his brother might have instead said "Huh. You probably want to rethink your whole approach." Which is also a hint I can always give myself, but it's a different hint that leads me in different directions.

There's probably a huge number of hints like that I can give myself for any given problem, but picking them at random is perhaps not the best problem solving strategy.

Still, if I'm stuck, trying a few is better than nothing.

WRT the Benja search... I suspect that was more of a case of trying harder by virtue of being motivated by the knowledge that success is possible/likely, and to some extent breaking out of transient mental sets.

But even if it were a case of temporarily reconfiguring more persistent unhelpful filters like I describe, it wouldn't follow that Benja has a "gwern filter", merely that Benja, like gwern, has some learned techniques for finding stuff on Google, which includes 'search for direct quotes' along with a million other things, and that the default Benja filter for whatever reason excludes that technique when it searches for techniques to suggest for this kind of problem, and the role-playing exercise encourages disabling the default Benja filter, making that technique easier to access. The "gwernyness" of that reconfiguration, much like the "markiness" in my example, is rather tangential; the importance of being gwern, in this hypothetical, would be that it entails not being Benja.

Comment author: gwern 04 January 2013 05:40:13PM 0 points [-]

But also, and significantly, Shannon's brother has given him a new datum: there is a discrete thing-to-be-told which would significantly help. (This is, admittedly, implicit. But if I don't assume it, the story makes no sense to me.)

But he's solving a puzzle, there's always a thing-to-be-told!

Comment author: Luke_A_Somers 04 January 2013 04:53:34PM 0 points [-]

What you said out loud wasn't wrong. There are likely cases which are much like the one that you did find, except that you would not be able to find them.

Comment author: TheOtherDave 04 January 2013 04:59:59PM 0 points [-]

True, though it's less clear that Mark would probably notice them.
Still, that's probably true as well.

Comment author: BlazeOrangeDeer 07 January 2013 10:25:52PM *  0 points [-]

The other question is whether it's helpful to quickly look for obvious answers when there isn't one. The information content of "there is a solution" is actually not only one bit (yes vs no), because the fact that that person told it to you means that they solved it quickly using techniques that they already know about. This usually helps you because you either share much of their knowledge, or have an idea of what things they are knowledgeable about. The correct advice in some other cases might have been "you need to learn something else completely new before you'll get it" or "just stop trying because this problem is really of no value and has no easy answer".

Comment author: Benja 04 January 2013 03:22:09AM 2 points [-]

Gurer'f n qverpg dhbgr sebz gur negvpyr va Zberaqvy'f cbfg. (Rot'ed in case someone else feels like trying themselves.)
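For anyone unfamiliar with the convention: rot13 shifts each letter 13 places, so applying it twice gives back the original text. A minimal sketch using Python's built-in `rot13` text codec follows; the sample string is an arbitrary placeholder rather than the spoiler above, so nothing is given away here.

```python
import codecs


def rot13(text: str) -> str:
    """Apply rot13; the same call both hides and reveals text."""
    return codecs.encode(text, "rot13")


if __name__ == "__main__":
    hidden = rot13("Hello, world")
    print(hidden)         # Uryyb, jbeyq
    print(rot13(hidden))  # Hello, world
```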

Comment author: gokfar 04 January 2013 12:17:08PM 1 point [-]

Nygreangviryl, frnepuvat sbe gur hey jvgubhg /ct_2/ lvryqf gur shyy pvgngvba.

Comment author: gwern 04 January 2013 03:26:34AM 1 point [-]

Yes, that's how I did it too. V nffhzr V pbhyq nyfb unir tbar sebz gur erfrnepure'f fheanzr naq gur lrne gb gur negvpyr gvgyr naq sbhaq vg gung jnl nyfb, ohg V unira'g gevrq fvapr gur dhbgr zrgubq jbexrq vgf hfhny frnepu zntvp.

Comment author: someonewrongonthenet 09 January 2013 03:00:08AM *  0 points [-]

Person B saying "there is a solution" provides person A with useful information.

Little details, such as the speed at which another person finds the solution (and the fact that they found it at all), give clues as to what type of problem it is - divergent or convergent thinking, overall hardness, etc.

The fact that a specific person x was able to find the solution narrows the space to "things that person x would be good at solving".

Finally, the resources which another person put into finding the solution provide a rough upper bound to how many resources the seeker will have to devote to find it for himself, reducing the risk involved in the investment.

All of these effects are social in nature, which means that it is not unlikely that we humans have in-built mechanisms to use this information without being able to consciously articulate what exactly the information we have gained is.

Comment author: gwern 09 January 2013 03:43:30AM 0 points [-]

That someone found the solution cannot be relevant in cases where it's known that there is a solution, where this effect seems to still apply. I don't see how one could extract anything about divergent or convergent thinking, since you don't know how they solved it or usually how long they took; if you knew how long it took and you knew whether they tended towards convergent thinking, then you could infer whether you should focus harder on convergent or divergent thinking, but if you know neither...?

Comment author: someonewrongonthenet 09 January 2013 08:09:50AM *  1 point [-]

I think my explanation of my thoughts is lacking; let me give a specific example of what I mean.

Imagine a teacher with a penchant for pointless questions asking non-mathematics students the following question:

"What is 6+7+8+9+...+347"?

Most of the students in the classroom will begin dutifully adding the numbers up. Some of them won't even bother - they've estimated the time it will take and it isn't worth the effort to solve such uninteresting busywork.

Of course, someone will take about five seconds to shout out that they have an answer.

Now the other students know that there is a way to solve the problem that doesn't involve investing a large amount of time. They'll get out of "let's tediously add all the numbers" mode and go into "let's find a quick shortcut to solving this" mode.

Everyone knew a solution existed, but they didn't imagine it would be the quick, clever sort of solution until someone actually solved it quickly. The fact that someone found the answer without investing large amounts of time and resources into the problem gave them vital information about the best method for finding the answer.
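The two modes in this example can be sketched side by side. The function names are made up for illustration; the shortcut shown is the standard arithmetic-series formula (pair the first and last terms), which is presumably what the quick student used, though the example does not say so.

```python
def tedious_sum(first: int, last: int) -> int:
    """Add the numbers one at a time, as most of the class would."""
    total = 0
    for n in range(first, last + 1):
        total += n
    return total


def shortcut_sum(first: int, last: int) -> int:
    """Arithmetic-series formula: (first + last) * number_of_terms / 2."""
    count = last - first + 1
    # (first + last) * count is always even, so integer division is exact.
    return (first + last) * count // 2


if __name__ == "__main__":
    print(tedious_sum(6, 347))   # 60363
    print(shortcut_sum(6, 347))  # 60363
```

There are 342 terms from 6 to 347, so the shortcut is (6 + 347) * 342 / 2 = 353 * 171 = 60363, one multiplication instead of 341 additions.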

Comment author: gwern 09 January 2013 06:51:12PM 0 points [-]

One could also appeal to the story about Gauss as a child adding up 1..100 by a clever trick, and none of his classmates figuring it out despite clearly seeing that Gauss must've done something clever.

But notice how your example does not fit my points: "since you don't know how they solved it or usually how long they took"; in this case, you have a very good estimate of how long it will take them to use the O(n) summation algorithm from all your past sums, and since you were all assigned the problem at the same time, you also know precisely how long it took them.

In the Shannon anecdote, you know nothing about how long it took the brother to answer it nor, given how heterogeneous puzzles can be, how long it might take him to solve it, nor is there even any 'brute force' approach for most puzzles which you could compare against a 'clever' approach and so choose to look for a clever approach rather than spend more time executing the brute force approach.

Similarly for web searching, there's typically no brute force approach at all: if Google spits out a list of 10 hits total for the paper title and you look at all 10 and they fail, then what? What's the dumb brute force approach in searching? You simply have to try another 'clever' approach, because you've exhausted all your available data.

Comment author: someonewrongonthenet 09 January 2013 07:11:59PM *  0 points [-]

Sorry, you're right, I didn't read your previous post carefully enough.

I agree that if this phenomenon is real, in order to explain it in terms of a rational agent you do need to either know something about the person who solved it, or how long they took, or some other detail about them in order for this to be helpful in any way.

In the real world, however, a declaration of having solved the problem always leaves some sort of knowledge. In the web search case that just unfolded in this thread, by posting a solution you leaked the information that a solution existed and that it didn't take an unreasonable amount of time to figure out, which provided Benja additional incentive to start looking for a clever approach.

I'll agree that it does seem like there is more than simple information gain going on here though. Perhaps there are other factors, such as the insertion of an element of competition?

Comment author: gwern 09 January 2013 07:23:03PM 0 points [-]

I'll agree that it does seem like there is more than simple information gain going on here though. Perhaps there are other factors, such as the insertion of an element of competition?

Certainly seems possible. I admit I tend to announce the time it took to find something that someone failed to as part of showing off and elevating myself, so it would be no surprise if the recipient felt shamed and inflamed into looking better - the difference between peak and average performance might explain the differential.

Comment author: Morendil 04 January 2013 07:18:35AM 8 points [-]

Thanks for the heads-up!

Yes to your overall point: link rot is a nasty problem; one that will increasingly mess with things like scientific citation.

Now for the nitpicks. G89 wasn't even the "original" study, just the earliest source I could find that discussed those "results".

What I wanted was to show the quote in question - to make it available to the reader of my post so they could check that I had my facts right. For that purpose the link is what I really needed, not "merely" a citation; and it sucks that the link went dead, but that wasn't under my control.

I have updated the post with another link (the last extant copy of this content; we can hope the link remains longer than the previous one, but I'm under no illusion that it will). I have also added the title of the original article and the publication.

BTW, I don't know what "PSA" means?

Comment author: Benja 04 January 2013 09:02:14PM 3 points [-]

Point taken on "original", and thanks for updating the article! Gwern has also found a link on the HP homepage.

What I wanted was to show the quote in question - to make it available to the reader of my post so they could check that I had my facts right. For that purpose the link is what I really needed, not "merely" a citation; and it sucks that the link went dead, but that wasn't under my control.

I'm not saying you shouldn't have given the link -- I'm saying that if you had also given the citation, then even after the link broke, it would have been slightly more inconvenient but not difficult for me to look it up! That's the main point of also giving the citation: to make the source available to the reader of your post even if the link rots.

Comment author: drethelin 04 January 2013 09:09:27AM 4 points [-]

Public Service Announcement

Comment author: Konkvistador 05 January 2013 12:42:25PM 1 point [-]

It is kind of superfluous in the title, I wish it was removed. Besides being Americentric it actually made me want to read a thoughtful suggestion less.

Comment author: Morendil 04 January 2013 05:36:59PM 0 points [-]

Ah. Thanks!

Comment author: lukeprog 04 January 2013 02:58:03AM 9 points [-]

Plenty of people complain about my long lists of references in earlier posts. :(

Maybe the best is to put the references in a comment that is linked from the end of the post.

Comment author: gwern 04 January 2013 03:07:08AM 15 points [-]

My current solution, besides paranoid archiving for Internet links, is to exploit tool tips: so the full title & authors exists in the page, but entirely unobtrusive to readers of either the article or comments. It seems to be working out so far.

Comment author: lukeprog 04 January 2013 03:16:59AM 4 points [-]

I couldn't find this quickly: What's the Markdown and HTML for adding tooltips?

Comment author: gwern 04 January 2013 03:22:55AM *  13 points [-]

In Markdown, it goes like [displayed text](hyperlink "tooltip alt text"); in HTML I think it's an additional attribute on the <a> tag, which goes title="tooltip alt text", so for my example above:

  • Markdown:

    [paranoid archiving for Internet links](http://www.gwern.net/Archiving%20URLs "'Archiving URLs', gwern 2013")

  • HTML:

    <a href="http://www.gwern.net/Archiving%20URLs" rel="nofollow" title="&#39;Archiving URLs&#39;, gwern 2013">paranoid archiving for Internet links</a>
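If you write a lot of such links, the two forms above can be generated rather than typed by hand. This is only a sketch: `markdown_link` and `html_link` are hypothetical helper names, and the Markdown version assumes the tooltip contains no double quotes, since those would end the title early.

```python
from html import escape


def markdown_link(text: str, url: str, tooltip: str) -> str:
    """Build [text](url "tooltip"); assumes tooltip has no double quotes."""
    return f'[{text}]({url} "{tooltip}")'


def html_link(text: str, url: str, tooltip: str) -> str:
    """Build an <a> element carrying the citation in its title attribute."""
    return (
        f'<a href="{escape(url, quote=True)}" '
        f'title="{escape(tooltip, quote=True)}">{escape(text)}</a>'
    )


if __name__ == "__main__":
    args = (
        "paranoid archiving for Internet links",
        "http://www.gwern.net/Archiving%20URLs",
        "'Archiving URLs', gwern 2013",
    )
    print(markdown_link(*args))
    print(html_link(*args))
```

Because the full title and author sit in the tooltip, the citation stays on the page even if the link itself later rots.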

Comment author: ChristianKl 04 January 2013 08:48:55PM 3 points [-]

This should probably be added to the official formatting help page of LessWrong.

Comment author: gwern 04 January 2013 09:04:02PM 2 points [-]
Comment author: Morendil 04 January 2013 07:24:16AM 0 points [-]

I like this solution a whole lot. I'm late for work so I won't search for the specific occasion but I seem to recall I've suggested stealing it for LW.

Comment author: Qiaochu_Yuan 04 January 2013 10:47:59AM 7 points [-]

Really? I think they're wonderful.

Comment author: [deleted] 04 January 2013 11:06:08AM 1 point [-]

They are when I'm reading on a desktop or a notebook, but they're a pain in the ass to scroll down beyond when I'm reading on a smaller device such as a smartphone.

Comment author: RolfAndreassen 04 January 2013 09:12:34PM 2 points [-]

Then don't do that.

Seriously: I strongly suggest that our articles should be optimised for checkability and in-depth reading, not convenience on the bus.