SilasBarta comments on Is Google Paperclipping the Web? The Perils of Optimization by Proxy in Social Systems - Less Wrong

37 Post author: Alexandros 10 May 2010 01:25PM


Comments (104)


Comment author: SilasBarta 10 May 2010 10:38:10PM *  12 points

sufficiently advanced spam is indistinguishable from content

Great phrase. It's a reminder that you know you have a good proxy when you're not sure that people who are gaming it are actually doing any harm.

Comment author: Lightwave 13 May 2010 09:06:07AM *  2 points

I hereby propose the Turing test for spam - if a human judge cannot reliably distinguish spam from "genuine" content, it passes the test and qualifies as actual content. Or does it?

Comment author: HDriscoll 12 March 2011 05:52:10AM 0 points

You may say the same about industrially processed food: yes, you can consume it. Eventually you become malnourished and the receptor sites of your cell membranes have trouble distinguishing the hormones you really need from the hormone inhibitors that get to the receptor sites first. Over time, you will lose vitality.

Spam, of course, is a 'meat-like' substance, so the analogy may hold...

Spam content degrades the whole, just like bad food saps your vitality over time.

Comment author: Alexandros 11 May 2010 10:52:50AM *  1 point

Indeed, if you can't tell spam from content, you may have identified the 'correct' definition of the quality you are trying to measure. I think one devious feature of made-for-AdSense content is that it can't be too informative, otherwise the visitors have no incentive to click on the ads. It has to be informative enough to get the users through, but not informative enough to satisfy them. Normal content is not usually like that. But figuring that out is like judging intent, a task difficult for humans, never mind machines. Would the true definition of quality need to catch even that type of abuse? Hmm...

Comment author: JenniferRM 12 May 2010 04:24:40AM 3 points

My cynicism leads me to speculate that Google's ownership of both the adword market and the search market means it may already have the data set it would need to notice people finding a page via search and then moving on to click on the ads because the content didn't satisfy them.

The "metrics" from the two systems are probably very voluminous and may not be strongly bound to each other (for example, by within-session GUIDs, which would make things really easy), so it wouldn't be trivial to correlate them in the necessary ways, but it doesn't strike me as impossible. A simple estimate of the "ad bounce through" (percent of users who click on ads at a site within N seconds of arriving there via search) could probably be developed and added to PageRank as a negative factor if this is not already in the algorithm.

However, despite access to the necessary data set, Google may not have the incentive to do this.
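
As a rough illustration of the "ad bounce through" estimate described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Visit` record, its field names, and the premise that search arrivals and ad clicks have already been joined per visit (which, as noted, is the hard part). It says nothing about how Google's systems actually work.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Visit:
    """Hypothetical joined record of one visit to a site."""
    site: str
    arrived_via_search: bool
    # Seconds between arrival and the first ad click; None if no ad was clicked.
    seconds_to_ad_click: Optional[float]


def ad_bounce_through(visits: Iterable[Visit], site: str, n_seconds: float) -> Optional[float]:
    """Percent of search arrivals at `site` that clicked an ad within n_seconds.

    Returns None when the site has no search arrivals, so the caller can
    tell "no data" apart from a genuine 0%.
    """
    search_visits = [v for v in visits if v.site == site and v.arrived_via_search]
    if not search_visits:
        return None
    bounced = sum(
        1
        for v in search_visits
        if v.seconds_to_ad_click is not None and v.seconds_to_ad_click <= n_seconds
    )
    return 100.0 * bounced / len(search_visits)


if __name__ == "__main__":
    sample = [
        Visit("a.example", True, 3.0),    # quick ad click: counts
        Visit("a.example", True, 10.0),   # exactly at the threshold: counts
        Visit("a.example", True, 30.0),   # ad click, but too late
        Visit("a.example", True, None),   # no ad click
        Visit("a.example", False, 1.0),   # did not arrive via search: ignored
    ]
    print(ad_bounce_through(sample, "a.example", n_seconds=10))  # prints 50.0
```

A high value alone would not prove a page is spam; a ranking system would presumably need a threshold or a comparison against similar sites to keep false positives down.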

Comment author: Alexandros 12 May 2010 07:26:14AM *  1 point

This is a very good thought I hadn't considered. Thinking about it, on the one hand, I can imagine it being easy to circumvent by switching ad providers. On the other hand, this would drive many spammers to alternative ad providers, which would degrade those services, so it may be strategically good for Google. Or perhaps, by driving spammers and affiliate marketers on to a competitor, it would help that competitor achieve critical mass, something Google would like to avoid. Also, using some kind of 'ad bounce through' ratio may produce unacceptably high false positive rates, again a bad outcome.

I hope this was not too much rambling, thanks for the interesting perspective.

Comment author: thomblake 10 May 2010 10:44:53PM 0 points

I hadn't thought of that angle. If we end up with a lot of actually good original machine-generated content (somehow) then surely that wouldn't be a loss.

Comment author: SilasBarta 11 May 2010 04:06:30PM *  3 points

Yes, and imagine if spammers went through the effort to make an android indistinguishable from a human on the outside (in behavior and form), and had it "spam" you after reading your internet postings/websites, on the pretense that it has some questions and wants to collaborate with you.

Then, it fakes an entire friendship, in which it gives you many useful ideas, in order to be able to slip in a few remarks here and there of the form, "Hey, I know a good Mexican pharmacy where you can get cheap Viagra." (Which you point out to your "friend" is probably a scam.)

If that's what spam comes to look like one day, I don't want a filtered inbox!

Comment author: Yvain 11 May 2010 06:27:28PM 6 points

If that's what spam comes to look like one day, I don't want a filtered inbox!

http://www.smbc-comics.com/index.php?db=comics&id=1024#comic

Comment author: SilasBarta 13 May 2010 04:03:49PM 2 points

I think it's freaking awesome that someone had already made a comic about that concept.

Comment author: kpreid 13 May 2010 06:32:59PM 1 point

Comment author: NancyLebovitz 11 May 2010 04:19:33PM 2 points

I expect there would still be a range of spam: crude spam only needs a very low success rate to continue to be produced, so you'll still want your filters.

Comment author: SilasBarta 11 May 2010 04:27:07PM 2 points

Eh, I was just going for a zinger. You're right, it would be more accurate to say, "I don't want my inbox to call that spam!"

Comment author: thomblake 11 May 2010 06:31:53PM 1 point

Don't forget your VK couples testing

Comment author: Caspian 14 May 2010 11:42:52AM 0 points

But it could suggest fake online shops that appear similar to the real ones you use, and you'd be more likely to fall for those than for the Viagra ones.

Comment author: Alexandros 14 May 2010 10:14:03AM *  0 points

Kinda sounds like having a useful service and supporting it with an ad-based model (but without clearly delineating the 'sponsored links'). If I could have someone interact with my work and give me useful ideas, I would probably pay for the privilege.

Comment author: Leonhart 10 May 2010 11:39:16PM 3 points

This is indeed happening. Not so much the machine-generated aspect, but the second biggest question I ask myself about my SEO clients these days is "What interesting media could they author about their field of expertise?" The biggest question is, of course, "How do I persuade them that they need to actually DO this?"

In extremis, of course, we end up with comparethemeerkat. It's the only way to make a financial services aggregator unboring enough to get people to link to it.

Comment author: RichardKennaway 14 May 2010 11:59:59AM *  1 point

This reminds me of a short story by O. Henry. I don't remember many of the specifics, but it's set in the world of American (or perhaps it was Mexican) small-town politics and graft. There's a character, a career con-man, who gets to be town mayor by discovering what he says is the best graft of all: honesty. You just do what you say you're going to do and don't try to con people. They'll flock to do business with you, and you make a pile of money without having to steal anything! They can't even put you in jail for it!

ETA: A quick look at Wikipedia suggests this is from his collection of linked short stories, Cabbages and Kings, set in Central America.