Book review: The Reputation Society. Part II

5 Stefan_Schubert 14 May 2014 10:16AM

This is the second part of my book review of The Reputation Society. See the first part for an overview of the structure of the review. 

 

Central concepts of The Reputation Society

 

Aggregation of reputational information. Since the book is entirely non-technical, and since aggregation rules are by their nature mathematical formulae, there isn't much in the book on aggregation rules (i.e., on how individuals' ratings of, e.g., a person or a product are to be aggregated into one overall rating). The choice of aggregation rules is, however, obviously very important for optimizing the different functions of reputation systems.
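To make the idea of an aggregation rule concrete, here is a minimal sketch (my own illustration, not from the book) of one rule that is common in practice: a damped mean, which pulls items with few ratings toward a global prior so that a single 5-star vote does not outrank a well-tested item. The function name and parameter values are hypothetical.

```python
def damped_mean(ratings, prior=3.0, prior_weight=5):
    """Aggregate individual ratings into one overall score.

    Items with few ratings are pulled toward a global prior rating,
    so sparse, extreme votes carry less weight than a long track record.
    """
    n = len(ratings)
    return (sum(ratings) + prior * prior_weight) / (n + prior_weight)

# A single perfect vote scores lower than a long run of good votes:
one_vote = damped_mean([5.0])                                  # pulled toward 3.0
many_votes = damped_mean([5.0, 4.0, 5.0, 4.0, 5.0, 4.0, 5.0, 4.0])
```

Even this toy rule shows why the choice matters: the damping weight directly trades off responsiveness to new information against robustness to gaming by a handful of votes.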


One problem that is discussed, though, is whether the aggregation rules should be transparent or not (e.g., in chs. 1 and 3). Concealing them makes it harder for participants to game the system, but on the other hand it makes it easier for the system providers to manipulate it themselves (for instance, Google has famously been accused of manipulating search results for money). Hence concealment of the aggregation rules can damage the credibility of the site. (See also Display of reputational information.)

 

Altruism vs self-interest as incentives in rating systems. An important question for any rating system is whether it should appeal to people’s altruism (their community spirit) or to their self-interest. Craig Newmark (foreword) seems to take the former route, arguing that “people are normally trustworthy”, whereas the authors of ch. 11 argue that scientists need to be given incentives that appeal to their self-interest to take part in their reputation system.

 

It could be argued that the success of Wikipedia shows that appealing to people's self-interest is not necessary to get them to contribute. On the other hand, it could also be argued that the notion that Wikipedia has been successful reflects a lack of imagination concerning the potential of sites with user-generated content. Perhaps Wikipedia would have been still more successful if they had given contributors stronger incentives.

 

Anonymity in online systems. Dellarocas (ch. 1) emphasizes that letting social network users remain anonymous while failing to guard against the creation of multiple identities facilitates gaming greatly. On the other hand, prohibitions against remaining anonymous might raise privacy concerns.

 

Display of reputational information. Dellarocas (ch. 1, Location 439, p. 7) discusses a number of ways of displaying reputational information:

 

  1. Simple statistics (number of transactions, etc.)
  2. Star ratings (e.g., Amazon reviews)
  3. Numerical scores (e.g., eBay's reputation score)
  4. Numbered tiers (e.g., World of Warcraft player levels)
  5. Achievement badges (e.g., Yelp elite reviewer)
  6. Leaderboards (lists where users are ranked relative to other users; e.g., the list of Amazon top reviewers)

 

See gaming for a brief discussion of the advantages and disadvantages of comparative (e.g., 6) and non-comparative systems (e.g., 5).
  

Expert vs peer rating systems. Most pre-Internet rating systems were run by experts (e.g., movie guides, restaurant guides, etc.). The Internet has created huge opportunities for rating systems where large numbers of non-expert ratings and votes are aggregated into an overall rating. Proponents of the Wisdom of the crowd argue that even though many non-experts are not very reliable, the noise tends to even out as the number of raters grows, and we are left with an aggregated judgment which can beat that of experienced experts.
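The noise-averaging claim is easy to check with a quick simulation (my own illustration of the standard statistical argument, not code from the book): if each rater's judgment is the true value plus independent noise, the crowd's average error shrinks roughly as one over the square root of the number of raters.

```python
import random

def avg_crowd_error(n_raters, trials=200, noise=1.0, seed=0):
    """Average absolute error of a crowd's mean rating around the
    true value (taken as 0 here), over many simulated items."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each rater reports the true value plus independent Gaussian noise.
        ratings = [rng.gauss(0.0, noise) for _ in range(n_raters)]
        total += abs(sum(ratings) / n_raters)
    return total / trials

# A crowd of 100 noisy raters is far more accurate than a single one:
solo_error = avg_crowd_error(1)
crowd_error = avg_crowd_error(100)
```

Note the assumption doing the work: the errors must be independent. If raters copy each other, or share a systematic bias, the averaging argument breaks down.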


However, the Internet also offers new ways of identifying experts (emphasized, e.g., in ch. 8). People whose written recommendations are popular, or whose ratings are reliable as measured against some objective standard (if such a standard can be constructed – that obviously depends on context) can be given a special status. For instance, their recommendations can become more visible, and their ratings more heavily weighted. It could be argued that such systems are more meritocratic ways of identifying the experts than the ones that dominate society today (see, e.g., ch. 8).

 

Explicit vs implicit reputation systems. In the former, your reputation is a function of other users’ votes, whereas in the latter, your reputation is derived from other forms of behavior (e.g., the number of readers of your posts, your number of successful transactions, etc.). This is a distinction made by several authors, but unfortunately they use different terms for it, something which is never acknowledged. Here the editors should have done a better job.


In the language of economics, the implicit reputation systems (such as Google's PageRank) are, by and large, based on people's revealed preferences (expressed through their actions), whereas explicit reputation systems are built on their stated preferences. Two main advantages of revealed preferences are that we typically get them for free (since we infer them from publicly observable behavior that people engage in for other reasons, e.g., linking to a page, whereas we need to ask people if we want their stated preferences) and that they typically express people's true preferences (whereas their stated preferences might be false; see untruthful reporting). On the other hand, we typically get only quite coarse-grained information about people's preferences by observing their behavior (e.g., observing that John chose a Toyota over a Ford does not tell us whether he did so because it was cheaper, because of a preference for Japanese cars, because of its lower fuel consumption, etc.), whereas we can get more fine-grained information by asking them to state their preferences.

 

Functions of reputation systems. Dellarocas (ch. 1, Location 364, p. 4) argues that online reputation systems have the following functions (to varying degrees, depending on the system):


 a) a socializing function (rewarding desired behavior, punishing undesired behavior, and building trust). As pointed out in chs. 6 and 7, this makes reputation systems an alternative to other systems intended to socialize people, in particular government regulation (backed by the threat of force). This should make reputation systems especially interesting to those opposed to the latter (e.g., libertarians).


 b) an information-filtering function (makes reliable information more visible).


 c) a matching function (matching users with similar interests and tastes in, e.g., restaurants or films – this is similar to b) with the difference that it is not assumed that some users are more reliable than others).


 d) a user lock-in function – users who have spent considerable amounts of time creating a good reputation on one site are unlikely to change to another site where they have to start from scratch.

 

Gaming. Gaming has been a massive problem at many sites making use of reputation systems. In general, more competitive/comparative displays of reputational information exacerbate gaming problems (as pointed out in ch. 2). On the other hand, strong incentives to gain a good reputation are to some extent necessary to solve the undersupply of reputational information problem.


Dellarocas (ch. 1) emphasizes that it is impossible to create a system that is totally secure from manipulation. Manipulators will continuously come up with new gaming strategies, and therefore the site’s providers constantly have to update its rules. The situation is, however, quite analogous to the interplay between tax evaders and legislators and hence these problems are not unique to online rating systems by any means.

 

Global vs personalized/local trust metrics (Massa, ch. 14). While the former give the same assessments of the trustworthiness of person X to each other person Y, the latter give different assessments of the trustworthiness of X to different people. Thus, the former consist of statements such as “the reputation of Carol is .4”, the latter of statements such as “Alice should trust Carol to degree .9” and “Bob should trust Carol to degree .1” (Location 3619, p. 155). Different people may trust others to different degrees based on their beliefs and preferences, and this is reflected in the personalized trust metrics. Massa argues that a major problem with global rating systems is that they lead to “the tyranny of the majority”, where original views are unfairly down-voted. At the same time, he also argues that the use of personalized trust metrics may lead to the formation of echo chambers, where people only listen to those who agree with them.
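The distinction can be made concrete with a toy trust network (my own sketch, loosely modeled on Massa's Alice/Bob/Carol example; the data and the one-hop propagation rule are hypothetical, much simpler than real local trust metrics):

```python
# trust[a][b] = how much a directly trusts b, on a 0-1 scale.
trust = {
    "Alice": {"Dave": 0.9},
    "Bob":   {"Eve": 0.9},
    "Dave":  {"Carol": 1.0},
    "Eve":   {"Carol": 0.1},
}

def global_trust(target):
    """Global metric: one score for everyone, here the plain average
    of all direct trust statements about the target."""
    scores = [t[target] for t in trust.values() if target in t]
    return sum(scores) / len(scores)

def local_trust(source, target):
    """Local metric: a source-specific score. Use direct trust if it
    exists; otherwise propagate one hop, multiplying trust along the
    path through each mutual acquaintance and averaging the paths."""
    direct = trust.get(source, {})
    if target in direct:
        return direct[target]
    hops = [w * trust[friend][target]
            for friend, w in direct.items()
            if friend in trust and target in trust[friend]]
    return sum(hops) / len(hops) if hops else None
```

Here `global_trust("Carol")` gives everyone the same middling score, while `local_trust("Alice", "Carol")` is high and `local_trust("Bob", "Carol")` is low, because Alice and Bob reach Carol through acquaintances with opposite opinions of her. This also shows where the echo-chamber worry comes from: your view of Carol is filtered entirely through the people you already trust.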

 

Immune system disorders of reputation systems (Foreword). Rating systems can be seen as “immune systems” intended to give protection against undesirable behavior and unreliable information. However, they can also give rise to diseases of their own. For instance, the academic “rating systems” based mainly on number of articles and numbers of citations famously give rise to all sorts of undesirable behavior (see section IV, chs. 10-12, on the use of rating/reputation systems in science). An optimal rating system would of course minimize these immune system disorders.

 

Karma as currency. This idea is developed in several chapters (e.g., 1 and 2) but especially in the last chapter (18) by Madeline Ashby and Cory Doctorow, two science fiction writers. They envision a reputation-based future society where people earn “Whuffie” (karma or reputation) when they are talked about, and spend it when they talk about others. You can also exchange Whuffie for goods and services, effectively making it a currency.

 

Moderation. Moderation is to some extent an alternative to ratings in online forums. Moderators could either be paid professionals, or picked from the community of users (the latter arguably being more cost-efficient; ch. 2). The moderators can in turn be moderated in a meta-moderation system used, e.g., by Slashdot (their system is discussed by several of the authors).


Yet another system which in effect is a version of the meta-moderation system is the peer-prediction model (see ch. 1), in which your ratings are assessed on the basis of whether they manage to predict subsequent ratings. These later ratings then in effect function as meta-ratings of your ratings.   
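To illustrate the peer-prediction idea, here is a deliberately simplified sketch (my own; real peer-prediction mechanisms in the literature use proper scoring rules rather than this naive distance measure): an early rating is scored by how close it lands to the ratings that come in afterwards.

```python
def peer_prediction_score(my_rating, later_ratings):
    """Score a rating by how well it predicts subsequent ratings:
    1 minus the mean absolute distance to the later ratings,
    normalized by the maximum possible distance on a 1-5 scale."""
    if not later_ratings:
        return None  # nothing to predict yet
    mean_gap = sum(abs(my_rating - r) for r in later_ratings) / len(later_ratings)
    return 1.0 - mean_gap / 4.0

# A rating that anticipates the consensus scores near 1; an outlier scores low:
accurate = peer_prediction_score(4, [4, 5, 4])
outlier = peer_prediction_score(1, [4, 5, 4])
```

The later ratings thus act as meta-ratings, exactly as described above, without anyone having to explicitly moderate the moderators.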

 

Privacy – several authors raise concerns over privacy (in particular chs. 16-18). In a fully-fledged reputation society, everything you did would be recorded and counted either for or against you. (Such a society would thus be very much like life according to many religions – the vicious would get punished, the virtuous rewarded – with the crucial difference that the punishments and rewards would be given in this life rather than in the after-life.) While this certainly could improve behavior (see Functions of reputation systems) it could also make society hard and unforgiving (or so several authors argue; see especially ch. 17). People have argued that it therefore should be possible to undergo “reputational bankruptcy” (cf. forgiveness of sins in, e.g., Catholicism), to escape one’s past, as it were, but as Eric Goldman points out (ch. 5, Location 1573, p. 59), this would allow people to get away with anti-social behavior without any reputational consequences, and hence make the reputation system’s socializing effects much weaker. 


As stated in the introduction, in small villages people often have more reliable information about others’ past behavior and their general trustworthiness. This makes the villages’ informal reputation systems very powerful, but it is also to some extent detrimental to privacy. The story of the free-thinker who leaves the village where everyone knows everything about everyone for the freedom of the anonymous city is a perennial one in literature.


 It could thus be argued that there is necessarily a trade-off between the efficiency of a reputation system and the degree to which it protects people’s privacy. (See also Anonymity in online systems for more on this.) According to this line of reasoning, privacy encroachments are immune system disorders of reputation systems. It is a challenge for the architects of reputation systems to minimize this and other immune system disorders.

 

Referees – all rating systems need to be overseen by referees. The received view seems to be that they need to be independent and impartial, and the question is raised whether private companies such as Google can function as such trustworthy and impartial referees (ch. 3). An important problem in this regard is “who guards the guards?”. In ch. 3, John Henry Clippinger argues that this problem, which “has been the Achilles heel of human institutions since times immemorial” (Location 1046, p. 33), can be overcome in online reputation systems. The key, he argues, is transparency:


In situations in which both activities and their associated reputation systems become fully digital, they can in principle be made fully transparent and auditable. Hence the activities of interested parties to subvert or game policies or reputation metrics can themselves be monitored, flagged, and defended against.

  

Reporting bias (ch. 1) – e.g., that people refrain from giving negative votes for fear of retaliation. Obviously this is more likely to happen in systems where it is publicly visible how you have voted. Another form of reporting bias arises when a certain good or service is consumed only by fans, who tend to give high ratings.

 

Reputation systems vs recommendation systems. This is a simple terminological distinction: reputation systems are ratings of people, recommendation systems are ratings of goods and services. I use “rating systems” as a general term covering both reputation and recommendation systems.


Undersupply of reputational information; i.e., that people don’t rate as much as would be socially optimal. This concept too is mentioned by several authors, but in most detail in ch. 5 (Location 1520, p. 57):

 

Much reputational information starts out as non-public (i.e. “private”) information in the form of a customer’s subjective impressions about his or her interactions with a vendor. To the extent that this information remains private, it does not help other consumers make marketplace decisions. These collective mental impressions represent a vital but potentially underutilized social resource.

 

The fact that private information remains locked in consumers’ head could represent a marketplace failure. If the social benefit from making reputational information public exceeds the private benefit, public reputational information will be undersupplied.

 

Personally I think this is a massively underappreciated problem. People get countless such subjective impressions every day. At present we harvest but a tiny portion of these subjective impressions, or judgments, as a community. If the authors’ vision is to stand a chance of being realized, we need to make people share these judgments to a much greater extent than they do today. (It goes without saying that we also need to distinguish the reliable ones from the unreliable ones.)

 

Universal vs. constrained (or contextual) reputation systems (ch. 17). The former are a function of your behavior across all contexts and influence your reputation in all contexts, whereas the latter are constrained to a particular context (say, selling and buying stuff on eBay).

 

Untruthful reporting (ch. 1). This can happen either because raters try to game the system (e.g., in order to benefit themselves, their restaurant, or whatnot) or because of vandalism/trolling. Taking a leaf out of Bryan Caplan’s "The Myth of the Rational Voter", I’d like to add that even people who are neither gaming nor trolling typically spend less time and effort giving accurate ratings for others’ benefit than they do when making decisions that affect their own pockets. Presumably this decreases the accuracy of their ratings.

 

Book Review: The Reputation Society. Part I

10 Stefan_Schubert 14 May 2014 10:13AM

The Reputation Society (MIT Press, 2012), edited by Hassan Masum and Mark Tovey, is an anthology on the possibilities of using online rating and reputation systems to systematically disseminate information about virtually everything: people, goods and services, ideas, etc. Even though the use of online rating systems is an overarching theme, the book is quite heterogeneous (like many anthologies). I have therefore chosen to structure the material in a somewhat different way. This post consists of a short introduction to the book, while in the next, far longer post, I list a number of concepts and distinctions commented on by the authors (either explicitly or implicitly) and briefly summarize their take on them.

 

My hope is that this Wiki-style approach maximizes the amount of information per line of text. Also, though these concepts and distinctions are arguably the most useful stuff in the book, they are unfortunately not gathered in any one place in the book. Hence I think that my list should be of use for those that go on to read the book, or parts of it. I also hope that this list of entries could be a start to a series of Less Wrong Wiki entries on reputation systems. Moreover, it could be a good point of departure for general discussions on rating and reputation systems. I would be happy to receive feedback on this choice of presentation form (as well as on the content, of course).

 

A chapter-by-chapter review (more of a guide to which chapters to read, really) can be found on my blog. (This review is already too long, which is why I put the chapter-by-chapter overview there rather than here at Less Wrong.) Monique Sadarangani has also written a review (which focuses on various legal aspects of online rating systems). Another associated text you might consider reading is Masum and Yi-Cheng Zhang's "Manifesto for the Reputation Society" (2004).


Introduction

People have of course always relied on others' recommendations on a massive scale. We often don't have time to figure out who is reliable and who is not, what goods are worth buying and what are not, which university education is valued by employers and which is not, etc. Instead we look to the testimonies and recommendations of others.


As pointed out by several of the authors, these recommendations have, however, often been given in a quite unsystematic fashion. In small societies, this lack of systematicity and structure was, though, to some extent outweighed by the wealth of information you would obtain about any individual person or item. Everybody knew everybody, which meant that a crook would sooner or later typically be identified as such, even though information about people’s trustworthiness was not being spread in an organized, rational fashion.


However, when people moved into cities, it became easier for dishonest people to hide in the crowds. One-off encounters with strangers became much more common, and with them the incentives to cheat increased: these strangers could typically not identify you, which meant that your reputation was not damaged by dishonorable behavior (see chs. 4, 6).


The inhabitants of cities, particularly those working in professions such as trade, tried to counter these problems by forming associations which guaranteed that their members conducted themselves properly (or else they would be thrown out). As the complexity of society has increased, so has the number and efficiency of these recommendation and reputation systems (italicized terms appear as entries in the next post). Today there are countless organizations that keep track of the creditworthiness of individuals (e.g., FICO), companies and countries (e.g., Standard & Poor's and Moody's), the quality of education (e.g., The Guardian's University League Table, which provides an influential annual university ranking in the UK), the quality of restaurants (the Guide Michelin), etc.


As virtually all of the authors argue, the Internet offers, however, spectacular opportunities for constructing rating systems that are more reliable and vastly more pervasive than anything yet seen. The editors sum up this optimism in their introduction (Location 182, 2nd page of introduction):

 

In today’s world, reliable advice from others’ experience is often unavailable, whether for buying products, choosing a service, or judging a policy. One promising solution is to bring to reputation a similar kind of systematization as currencies, laws, and accounting brought to primitive barter economies. Properly designed reputation systems have the potential to reshape society for the better by shining the light of accountability into dark places, through the mediated judgments of billions of people worldwide, to create what we call the Reputation Society.


There are of course already a great number of Internet rating systems, including those used by Google, Facebook (the like system), Amazon, eBay, Slashdot, Yelp, Netflix, Reddit, and, not to forget, Less Wrong. Many of these systems are discussed in the book (not Less Wrong, though). In particular, the authors try to assess what we can learn from the successes and failures of these rating sites.

 

There is a general (though seldom explicitly stated) sentiment in the book that the existing rating systems do not nearly exhaust the opportunities that the Internet provides us with. I certainly share this sentiment. As pointed out in ch. 5 (see the entry Undersupply of reputational information), people make a great number of “private judgments” in their heads which it would be very useful for others to learn about, but which they do not share. If they could be persuaded to share them to a greater extent, the social gains would be huge. If consumers got more reliable information about the quality of different goods and services (via consumer rating systems), the providers of those items would be forced to increase the quality of their products. In some areas this is already happening, but others are lagging. The potential gains stretch far beyond goods and services, though: public debates would be conducted more rationally if rating systems penalized bullshitters (ch. 15), government would be better run if its actions and policies were rated in a rational way (ch. 13), and science could improve if peers rated others’ work more rationally than they do at present (chs. 10-12). You could even imagine people rating your life plans, your behavior, and other things that are primarily of interest to yourself. Only imagination limits the potential uses of rating systems.


Even though there is some research on rating systems (Chrysantos “Chris” Dellarocas, the author of the first chapter, is an expert on it), most rating systems seem to be created in a quite unsystematic, trial-and-error fashion. Instead we should draw from the full range of the social sciences – e.g., from psychology, sociology, economics, law, history, anthropology and political science – when constructing such systems. I am convinced that we could benefit greatly as a society if we spent more time and resources on the construction of efficient rating systems. I also think that the Less Wrong community, with its combination of an intellectually curious and rational attitude and strong programming skills, potentially has a lot to contribute here.

 

At the same time, one shouldn't get over-optimistic. There are lots of hurdles to pass. I certainly do not share the wild optimism of Craig Newmark, the founder of Craigslist, who writes as follows in the foreword (Location 70, 1st page of Foreword): 

 

By the end of this decade, power and influence will have shifted largely to those people with the best reputations and trust networks and away from people with money and nominal power. That is, peer networks will confer legitimacy on people emerging from the grassroots.

 

These bold words remind me of similarly bold predictions, made in Cass Sunstein’s Infotopia (2006), of how prediction markets, wikis, and other forms of collective enterprise in the spirit of the Wisdom of the crowd will transform society and especially human knowledge. So far, Sunstein’s predictions haven’t been borne out, and the odds don’t look too good for Newmark either. I do agree with Newmark that there is a huge potential in rating systems, but realizing that potential is not going to happen by itself. It will take lots of testing, lots of ingenuity, lots of hard work, and certainly considerably more time than Newmark believes.