If it's something I'd trust an idiot to draw the right conclusion on given good data, I'll look for meta-analyses, p<<0.05, or do a quick and dirty meta-analysis myself if the number of studies is sufficiently small. If it's something I'm surprised has even been tested, I'll give a single study more weight. If it's something that I'd expect to be tested a lot, I'll give it less. If the data I'm looking for is orthogonal to what the study was published for, it probably doesn't suffer from selection bias, so I'll take it at face value. If the study's result is 'convenient' in some way for the source that showed it to me, I'll be more suspicious of selection bias and misinterpretation.
If it's a topic where I can see easy-to-make methodological flaws or interpretation errors, then I'll try to actually dig in and look for them, and see whether a different set of obvious conclusions follows.
Separately from determining how strong the evidence is, when there's only a study or two I'll still 'put it in my brain' if it tests a hypothesis I already suspected was true, or if it makes a lot of sense in hindsight (aka high priors); otherwise I'll put it in my brain with a 'probably untrue but something to watch out for' tag.
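For concreteness, the 'quick and dirty' meta-analysis mentioned above could be as simple as a fixed-effect, inverse-variance pooling of each study's effect estimate. The sketch below is only one way to do it, and it assumes every study reports an effect on the same scale along with a standard error. The function name and the example numbers are made up for illustration and do not come from any real study.

```python
import math

def fixed_effect_meta(effects, std_errors):
    """Pool study effects with inverse-variance weights (fixed-effect model).

    Assumes all effects are on the same scale and the studies are independent.
    Returns (pooled_effect, pooled_se, z, two_sided_p).
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("need one standard error per effect, and at least one study")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    # Two-sided p-value under a normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p

# Hypothetical inputs: three small studies of the same effect.
effects = [0.30, 0.10, 0.22]
std_errors = [0.15, 0.12, 0.20]
pooled, se, z, p = fixed_effect_meta(effects, std_errors)
print(f"pooled effect = {pooled:.3f} +/- {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

A fixed-effect model treats all studies as estimating one common effect. If the studies plausibly differ in what they measure, a random-effects model would be the more cautious choice, and no pooling method corrects for selection bias in which studies got published.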
Related to: Parapsychology: the control group for science, Dealing with the high quantity of scientific error in medicine
Some of you may remember past Less Wrong discussion of the Daryl Bem study, which claimed to show precognition, and was published with much controversy in a top psychology journal, JPSP. The editors and reviewers explained their decision by saying that the paper was clearly written and used standard experimental and statistical methods, so their disbelief in it (driven by physics, the failure to show psi in the past, etc.) was not appropriate grounds for rejection.
Because of all the attention received by the paper (unlike similar claims published in parapsychology journals) it elicited a fair amount of both critical review and attempted replication. Critics pointed out that the hypotheses were selected and switched around 'on the fly' during Bem's experiments, with the effect sizes declining with sample size (a strong signal of data mining). More importantly, Richard Wiseman established a registry for advance announcement of new Bem replication attempts.
A replication registry guards against publication bias, and at least 5 attempts were registered. As far as I can tell, at the time of this post the subsequent replications have, unsurprisingly, failed to replicate Bem's results.[1] However, JPSP and the other high-end psychology journals refused to publish the results, citing standing policies of not publishing straight replications.
From the journals' point of view, this (common) policy makes sense: bold new claims will tend to be cited more and raise journal status (which depends on citations per article), even though this means most of the 'discoveries' they publish will be false despite their p-values. However, this means that overall the journals are giving career incentives for scientists to massage and mine their data for bogus results, but not to challenge bogus results by others. Alas.
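The claim that most published 'discoveries' can be false despite their p-values is just arithmetic on base rates. A minimal sketch, assuming a simple model with no data mining at all: a fraction `prior` of tested hypotheses are true, true effects are detected with probability `power`, and false ones pass with probability `alpha`. The function name and all the numbers below are illustrative assumptions, not estimates for any actual journal.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Fraction of statistically significant results that reflect true effects.

    prior: assumed share of tested hypotheses that are actually true
    power: probability a true effect reaches significance
    alpha: probability a false hypothesis reaches significance anyway
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)

# Illustrative scenarios only; priors and power are assumptions.
for label, prior, power in [
    ("plausible hypotheses", 0.50, 0.80),
    ("bold new claims", 0.10, 0.50),
    ("long shots", 0.01, 0.50),
]:
    ppv = positive_predictive_value(prior, power)
    print(f"{label}: {ppv:.0%} of significant results are true")
```

Under these assumptions, a journal that selects for bold, low-prior claims ends up with a large share of false positives even when every paper honestly reports p < 0.05. Data mining and hypothesis switching effectively raise `alpha`, which makes the picture worse.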
[1] A purported "successful replication" by a pro-psi researcher in Vienna turns out to be nothing of the kind. Rather, it is a study conducted in 2006 and retitled to take advantage of the attention on Bem's article, selectively pulled from the file drawer.
ETA: The Wikipedia article on Daryl Bem makes an unsourced claim that one of the registered studies has replicated Bem.
ETA2: Samuel Moulton, who formerly worked with Bem, mentions an unpublished (no further details) failed replication of Bem's results conducted before Bem submitted his article (the failed replication was not mentioned in the article).
ETA3: There is mention of a variety of attempted replications at this blog post, with 6 failed replications, and 1 successful replication from a pro-psi researcher (not available online). It is based on this ($) New Scientist article.
ETA4: This large study performs an almost straight replication of Bem (same methods, same statistical tests, etc) and finds the effect vanishes.
ETA5: Apparently, the mentioned replication was again submitted to the British Journal of Psychology: