In the continuing series, “Ask WITW,” we respond to reader questions with sometimes serious, sometimes insane, commentary. Today, we’ll take up two questions from Peter over at Rethinking Markets:
I’d like you to write about:
1) how and why recommendation engines seem to get suckier over time using them – what’s with the degradation of predictive power, the more it knows what you like/are listening to/are buying? How could Netflix imagine that I’d want to watch Apollo 13, much less that it’d be a top 10 for me?
2) Pushback by content producers in the technological transformation of cultural product delivery. E.g., How is it that a mass paperback like David McCullough’s The Great Bridge, from 1983, is available for $.35 as a used paperback, but is $15 on a kindle? How is it $18 to ‘buy’ Mean Girls on AppleTV?
I love these questions, especially since I’m unlikely to look tragically unhip answering them.
Two broad answers seem likely.
First, it could be that prediction engines are only stochastically good at predicting your preferences, and you've misinterpreted that noise as a decline in accuracy over time.+ As I understand it, the algorithms combine a small amount of basically unsophisticated information that you provide about your own likes and dislikes (a rating out of 5 stars, with only whole stars possible) and batch it with billions of pieces of information about millions of other users. From time to time, the algorithm identifies what we might think of as "true" resemblances between users and is thus able to accurately predict a movie you will enjoy in the future (or one you will not enjoy).
So, let’s take each half of my argument in turn. First, why should we expect that recommendation engines sometimes, and sort of without warning, correctly predict your preferences?
Imagine that I rated the Zach Galifianakis "Live at the Purple Onion" comedy special 4 stars, and rated "The Hangover" (a movie in which he plays a lead character) 4 stars. The recommendation engine combines my high ratings with those of other users (and in this case, there are millions, because so many people saw The Hangover) and recommends "The Hangover 2" with an expected rating of 3.7 stars. It did not expect that I would give The Hangover 2 a rating of 2 stars. This is because it wasn't able to determine that I have no attachment to Galifianakis as an actor–no consistent "liking" of him–but rather particular preferences for aspects of the first two films: in the first, the resemblance to an Andy Kaufman-style comedy; in the second, the situational pleasure I took in seeing the film with a good friend, in the right mood. (I've never gone back to "adjust" that rating downward to reflect my repeated viewings of the film and my depressed pleasure in it.)
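The general mechanism described above can be sketched in a few lines of code. This is emphatically not Netflix's actual algorithm–just a toy user-based collaborative filter with hypothetical ratings, showing how "people whose past ratings resemble yours" get averaged into a prediction for you:

```python
# Toy user-based collaborative filtering: find users whose past star
# ratings resemble mine, then take a similarity-weighted average of
# their ratings of the title in question. All ratings are hypothetical.
import math

# Whole-star (1-5) ratings, keyed by user.
ratings = {
    "me":    {"Purple Onion": 4, "The Hangover": 4},
    "alice": {"Purple Onion": 5, "The Hangover": 4, "The Hangover 2": 4},
    "bob":   {"Purple Onion": 1, "The Hangover": 2, "The Hangover 2": 2},
    "carol": {"Purple Onion": 4, "The Hangover": 5, "The Hangover 2": 3},
}

def similarity(a, b):
    """Cosine similarity over the titles both users have rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][t] * ratings[b][t] for t in shared)
    norm_a = math.sqrt(sum(ratings[a][t] ** 2 for t in shared))
    norm_b = math.sqrt(sum(ratings[b][t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)

def predict(user, title):
    """Similarity-weighted average of other users' ratings of `title`."""
    pairs = [(similarity(user, u), r[title])
             for u, r in ratings.items()
             if u != user and title in r]
    total_weight = sum(w for w, _ in pairs)
    return (sum(w * r for w, r in pairs) / total_weight
            if total_weight else None)

print(round(predict("me", "The Hangover 2"), 1))  # prints 3.0
```

Notice that the engine only "sees" the stars: it predicts I'll rate the sequel around 3, and it has no column for "saw it with the right friend, in the right mood" or "reminds me of Andy Kaufman." (Raw cosine over all-positive ratings also treats a user with opposite tastes, like bob here, as fairly similar; real systems typically mean-center ratings first, but the basic blindness to unrecorded reasons remains.)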
In contrast, the engine correctly predicted that I would like the TV series “White Collar” despite my not having rated many “crime” TV shows, not having rated any shows and movies starring any of the major characters, etc.
A few hypotheses flow from my explanation:
a. Users who rate more videos will probably not have more accurate predictions than those who rate fewer videos because:
- The relevant predictor variables are rare (e.g., watching a film in the right mood with a specific friend);
- The relevant predictor variables are impossible to code (e.g., that Galifianakis reminds me of another comedian whom I like, but who does not appear in the dataset, or whom I only like occasionally, or whose resemblance to the present case I am not conscious of);
- User preferences are unstable (e.g., I stopped “liking” The Hangover)
b. Recommendation engines cannot be improved simply by adding more of the same data, but only by using different data, which would prove enormously burdensome to the user and therefore might be impossible to execute. (Plus, what makes us think that I am any better at predicting my future preferences than the engines are?)
The second part of my argument is that you would come to view the recommendation engines as making less accurate picks over time, despite the fact that they are consistently inconsistent in the accuracy of their predictions. Why?
Here I have to defer a little to the cognitive/cultural sociology literature, about which I know very little. But it is my understanding that we tend to impose order, and that we impose an order that privileges our own autonomy, knowledge, etc. In other words, we're much less likely to impose an order that casts ourselves in a negative light. In the case of taste, it is clear that our typical bias is to view ourselves as having stable, predictable, coherent tastes, even if they are tastes we view as inexpert. So, in comparing our vision of ourselves (e.g., I would have known when I watched "The Hangover" that I would not enjoy "The Hangover 2") to the recommendation engines, we view them as having done a bad job at a task we would have done well. And the fact that a "learning" engine continues to present recommendations for movies we do not enjoy heightens our sensitivity to, and irritation with, this fact, making it seem like a stronger pattern than it truly is.
I have loads of things to add about this “science news” about the prediction of “hits” based on song qualities, but that will have to wait for a future post.
Now, the second answer.
Flowing from this psychological theory that we enjoy rejecting or questioning the authority of the recommendation engine (as a means to promote a positive self-image), it would follow that there would be a performativity effect with recommendation engines (here is Omar writing about performativity in the sense meant here). That is, the fact of a measure of our taste being presented to us would cause our taste to formulate itself in reference to that measure. I happen to think it is as likely that this performativity would cause us to agree with the engine’s recommendations more often (than we would have without recommendations being presented to us in this fashion), as to disagree.*
We arguably find this effect with professional critics, who serve as "recommendation engines" for readers who follow their columns and reviews. As Shyon Baumann showed in his research on film, professional film critics played a key role in legitimizing some American film as art, leading filmgoers to increasingly view some American film as art.
The fact that some of us are patterned in our disagreements with the recommendations of these engines probably points more to our class backgrounds–our "habitus" of dispositions–than to some feature of the engines themselves. People with rather high levels of social and cultural capital might tend to reject the authority of recommendation engines more often than those from different backgrounds and with different resources.
This leads to some hypotheses:
- Users with high levels of educational attainment, or high SES birth families, may be more likely to perceive recommendation engines as poor predictors of taste/ratings.
- The more frequent the exposure to recommendation engines, the more strongly held one’s opinions of their performance (whether positive or negative).
That leaves Peter's second question: pushback by content producers in the technological transformation of cultural product delivery. E.g., how is it that a mass paperback like David McCullough's The Great Bridge, from 1983, is available for $.35 as a used paperback, but is $15 on a Kindle? How is it $18 to 'buy' Mean Girls on AppleTV?