The ratings game

In the continuing series, “Ask WITW,” we respond to reader questions with sometimes serious, sometimes insane, commentary. Today, we’ll take up two questions from Peter over at Rethinking Markets:

I’d like you to write about:
1) how and why recommendation engines seem to get suckier over time using them – what’s with the degradation of predictive power, the more it knows what you like/are listening to/are buying? How could Netflix imagine that I’d want to watch Apollo 13, much less that it’d be a top 10 for me?

2) Pushback by content producers in the technological transformation of cultural product delivery. E.g., How is it that a mass paperback like David McCullough’s The Great Bridge, from 1983, is available for $.35 as a used paperback, but is $15 on a kindle? How is it $18 to ‘buy’ Mean Girls on AppleTV?

I love these questions, especially since I’m unlikely to look tragically unhip answering them.

On #1:

Two broad answers seem likely.

First, it could be that prediction engines are only stochastically good at predicting your preferences (right some of the time, more or less at random), but you’ve misinterpreted them as providing less accurate predictions over time.+ As I understand it, the algorithms combine a small amount of basically unsophisticated information that you provide about your own likes and dislikes (ratings out of 5 stars, with only whole stars possible) with billions of pieces of information about millions of other users. From time to time, the algorithm identifies what we might think of as “true” resemblances between users and is thus able to accurately predict a movie you will enjoy in the future (or one you will not enjoy).

So, let’s take each half of my argument in turn. First, why should we expect that recommendation engines sometimes, and sort of without warning, correctly predict your preferences?

Imagine that I rated the Zach Galifianakis “Live at the Purple Onion” comedy special with “4 stars.” And I rated “The Hangover” (a movie in which he plays a lead character) with “4 stars.” The recommendation engine combines my high ratings with those of other users (and in this case, there are millions, because so many people saw The Hangover) and recommends “The Hangover 2” with an expected rating of 3.7 stars. It did not expect that I would give The Hangover 2 a rating of 2 stars. This is because it wasn’t able to determine that I have no attachment to Galifianakis as an actor–no consistent “liking” of him–but rather particular preferences for aspects of the first two titles (in the first, it is the resemblance to Andy Kaufman-style comedy; in the second, the situational pleasure I took in seeing the film with a good friend, in the right mood–I’ve never gone back to “adjust” the rating downward to reflect my repeated viewing of the film and my diminished pleasure in it).
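
For the curious, here is a minimal sketch of the kind of calculation I have in mind. I don’t know what Netflix actually runs, so treat this as an assumption: a toy “user-based collaborative filtering” predictor, in which my predicted rating is an average of other users’ ratings, weighted by how closely their past ratings match mine. The users, the numbers, the similarity measure, and the function names (`similarity`, `predict`) are all made up for illustration.

```python
# Hypothetical whole-star ratings (1-5). Every user and number here is invented.
ratings = {
    "me":     {"Live at the Purple Onion": 4, "The Hangover": 4},
    "user_a": {"Live at the Purple Onion": 5, "The Hangover": 4, "The Hangover 2": 4},
    "user_b": {"Live at the Purple Onion": 4, "The Hangover": 5, "The Hangover 2": 3},
    "user_c": {"Live at the Purple Onion": 1, "The Hangover": 2, "The Hangover 2": 5},
}


def similarity(a, b):
    """Score in (0, 1]: higher when two users' ratings on shared titles are closer.

    Assumed measure: 1 / (1 + mean absolute difference). Real systems use
    other measures; this one is chosen only because it is easy to read.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_abs_diff = sum(abs(a[t] - b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + mean_abs_diff)


def predict(user, title, ratings):
    """Similarity-weighted average of other users' ratings for `title`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == user or title not in theirs:
            continue  # skip myself and anyone who has not rated the title
        w = similarity(ratings[user], theirs)
        weighted_sum += w * theirs[title]
        weight_total += w
    # No usable neighbors means no prediction at all.
    return weighted_sum / weight_total if weight_total else None


predicted = predict("me", "The Hangover 2", ratings)
print(round(predicted, 2))  # about 3.76 with these toy numbers
```

Notice what the sketch can and cannot see. It has the stars, and nothing else: no mood, no friend on the next seat, no Andy Kaufman. Two users who gave the same 4 stars for entirely different reasons look identical to it.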

In contrast, the engine correctly predicted that I would like the TV series “White Collar” despite my not having rated many “crime” TV shows, not having rated any shows and movies starring any of the major characters, etc.

A few hypotheses flow from my explanation:

a. Users who rate more videos will probably not have more accurate predictions than those who rate fewer videos because:

  1. The relevant predictor variables are rare (e.g., watching a film in the right mood with a specific friend);
  2. The relevant predictor variables are impossible to code (e.g., that Galifianakis reminds me of another comedian whom I like, but who does not appear in the dataset, or whom I only like occasionally, or whose resemblance to the present case I am not conscious of);
  3. User preferences are unstable (e.g., I stopped “liking” The Hangover).

b. Rating engines cannot be improved by adding more data, but only by using different data, which will prove to be enormously burdensome on the user, and therefore might be impossible to execute. (Plus, what makes us think that I am any better at predicting my future preferences than engines might be?)

The second part of my argument is that you would view the recommendation engines as having less accurate picks over time, despite the fact that they are consistently inconsistent in the accuracy of their predictions. Why?

Here I have to defer a little to the cognitive/cultural sociology literature, about which I know very little. But it is my understanding that we tend to impose order on our experience, and that the order we impose privileges our own autonomy, knowledge, etc. In other words, we’re much less likely to impose an order that casts us in a negative light. In the case of taste, it is clear that our typical bias is to view ourselves as having stable, predictable, coherent tastes, even if they are tastes we view as inexpert. So, in comparing our vision of ourselves (e.g., I would have known when I watched “The Hangover” that I would not enjoy “The Hangover 2”) to the recommendation engines, we view them as having done a bad job at a task we would have done well. And the fact that a “learning” engine continues to present recommendations for movies we do not enjoy heightens our sensitivity to, and irritation at, this fact, making it seem like a stronger pattern than it truly is.

I have loads of things to add about this “science news” about the prediction of “hits” based on song qualities, but that will have to wait for a future post.

Now, the second answer.

Flowing from this psychological theory that we enjoy rejecting or questioning the authority of the recommendation engine (as a means to promote a positive self-image), it would follow that there would be a performativity effect with recommendation engines (here is Omar writing about performativity in the sense meant here). That is, the fact of a measure of our taste being presented to us would cause our taste to formulate itself in reference to that measure. I happen to think it is as likely that this performativity would cause us to agree with the engine’s recommendations more often (than we would have without recommendations being presented to us in this fashion), as to disagree.*

I think we arguably find this effect with professional critics, who act as “recommendation engines” for readers who follow their columns and reviews. As Shyon Baumann showed in his research on film, professional film critics played a key role in legitimizing some American film as art, leading filmgoers to increasingly view it that way.

The fact that some of us are patterned in our disagreements with the recommendations of these engines probably points more to our class backgrounds–our “habitus” of dispositions–than it does to some feature of the engines themselves. People with rather high levels of social and cultural capital might tend to reject the authority of recommendation engines more often than those from different backgrounds and with different resources.

This leads to some hypotheses:

  1. Users with high levels of educational attainment, or high SES birth families, may be more likely to perceive recommendation engines as poor predictors of taste/ratings.
  2. The more frequent the exposure to recommendation engines, the more strongly held one’s opinions of their performance (whether positive or negative).

On the second question: I disagree with the premise. Here it is in case you’ve lost track:

Pushback by content producers in the technological transformation of cultural product delivery. E.g., How is it that a mass paperback like David McCullough’s The Great Bridge, from 1983, is available for $.35 as a used paperback, but is $15 on a kindle? How is it $18 to ‘buy’ Mean Girls on AppleTV?

The issue of pricing doesn’t reflect issues of “technological transformation” as much as the premium associated with owning a unique copy. While we certainly understand that ours is not the first or only copy of a book or a movie, the value of an object that has been owned by someone else is lower than when we know it has no such provenance. This is different, of course, when the provenance includes people of high status. If that McCullough book had been owned by George W. Bush, who gave McCullough the Presidential Medal of Freedom, or by the head of the Pulitzer Prize committee during the year he got one of his two awards, the value of the book would no longer be $.35–it would certainly be higher.

Anyway, that’s my thought. Pre-owned materials are less valuable unless there’s a high status person in the provenance. Maybe you’ll choose different examples to make the “technological transformation” argument more persuasive? And don’t let me sneak out of the next version with a simple “that’s the price the market will bear.”

+N.B. I’ve just learned the proper term for this: “apophenia.” I read it in Chuck Klosterman’s new book, “The Visible Man,” which I’d especially recommend to sociologists, given that the second protagonist’s life philosophy dominates the book and engages quite directly several of our most famous theories, including the constitution of the self, the nature of reality and perception, and the impact of reactivity. Philosophers should like it for the same reason/s, although both groups will be a little annoyed if they prefer a more subtle thematic presentation. This one kind of bonks you in the forehead.

* Note that my “hunch” here diverges from what the Salganik, Watts and Dodds paper would predict, which is that everyone is likely to increase their positive evaluations of songs evaluated positively by others (when you can see their ratings). I would argue this experiment is a poor guide to the situation I describe because there is no apparent or implied “critic” (either a professional critic, or a “recommendation engine”–a “crowdsourced” critic). My sense is that the addition of such a filter–a surrogate consumer–changes the situation significantly enough to render their results less applicable.

One response to “The ratings game”

  1. yowza, I want to reserve the right to reply proper when I get back. goodsy goodsy stuff though.
