Recently I bought an expensive book, Recommender Systems: An Introduction, by Dietmar Jannach, Markus Zanker, Alexander Felfernig, and Gerhard Friedrich. This book could be used as a textbook for a class, which may explain why it was so expensive. I bought it anyway because I think recommendation systems are the most important application of artificial intelligence technology. The purpose of this blog post is to explain my reasoning.
According to Wikipedia, a recommender system or a recommendation system (sometimes replacing “system” with a synonym such as platform or engine) is a subclass of information filtering system that seeks to predict the “rating” or “preference” that a user would give to an item.
Undoubtedly the recommendation system which I have had the most exposure to is the Amazon recommendation system, now powered by artificial intelligence technology. I buy a lot of books on Amazon and I’m always interested in the related titles that show up on a book’s product page. There are two separate lists of recommended books: Customers who bought this item also bought and Inspired by your browsing history. We can assume that Customers who bought this item also bought is a form of the wisdom of the crowd. The other books chosen by readers of the book I am buying are a good indication of what I might want to read too. My intellectual development is therefore given the benefit of the wisdom of hundreds or thousands of other readers who have gone on to read various related books. But Inspired by your browsing history is a curious secondary list. Where is this inspiration coming from? We know that Amazon is using an open source artificial intelligence framework that it developed, the Deep Scalable Sparse Tensor Network Engine (DSSTNE), pronounced “destiny”. We also know that this software is used for building Deep Learning models. This means it could be weighing thousands or millions of factors to decide whether a book should be recommended to me. You can think of artificial intelligence as a sophisticated probability calculator which boils down many deterministic factors into one probability: is it probable that I would buy this book?
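To make the wisdom-of-the-crowd idea concrete, here is a minimal sketch of how a Customers who bought this item also bought list could be computed by counting co-purchases. This is an assumption about the general technique, not Amazon's actual method; the function name also_bought and the purchase histories are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def also_bought(orders, top_n=3):
    """Map each item to the items most often bought by the same customers."""
    co_counts = defaultdict(Counter)
    for basket in orders.values():
        # Count every ordered pair (a, b) of distinct items in one customer's history.
        for a, b in permutations(set(basket), 2):
            co_counts[a][b] += 1
    return {
        item: [other for other, _ in counts.most_common(top_n)]
        for item, counts in co_counts.items()
    }


# Hypothetical purchase histories, invented for illustration.
orders = {
    "alice": ["Recommender Systems", "Pattern Recognition", "Deep Learning"],
    "bob": ["Recommender Systems", "Deep Learning"],
    "carol": ["Recommender Systems", "Deep Learning", "Harlequin Romance"],
    "dave": ["Harlequin Romance"],
}

recs = also_bought(orders)
print(recs["Recommender Systems"])
```

In this toy data, "Deep Learning" comes first for "Recommender Systems" because all three buyers of that book also bought it. Nothing about the content of the books is consulted; the list is built purely from what other readers did.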
Amazon’s recommendation system is a black box. We cannot know exactly how all these probability factors are being calculated to determine whether a book appears on that Inspired by your browsing history list. There are two reasons this is going to be a mystery to us. First, Amazon cannot share the exact nature of its production algorithm because this would allow writers and publishers to game the system. Publishers would love to know how Amazon’s recommendation system works because then they could tweak a book’s title, keywords, and blurb to give the book an unfair advantage in the marketplace. Writers could even write their books to give themselves a little boost in the marketplace. The second reason we can’t know exactly why a book shows up on that list is that the computations performed by the artificial intelligence are so complex that they cannot be traced back. In other words, we cannot know how all those probability factors came together to output the result. Sure, you could look at the raw numbers, but there could be thousands or millions of them, depending upon just how deep the deep learning goes.
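As a rough illustration of what it means for factors to come together into one probability, here is a deliberately tiny sketch using a logistic function. The feature names and weights are hypothetical and hand-picked; a real deep learning model learns its weights from data and stacks many layers of this kind of computation, which is part of why its output is so hard to trace back.

```python
import math


def purchase_probability(features, weights, bias=0.0):
    """Combine weighted factors into a single probability with a logistic function."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical factors and weights, invented for illustration only.
weights = {"same_subject": 2.0, "viewed_recently": 1.0, "price_above_budget": -1.5}

textbook = {"same_subject": 1.0, "viewed_recently": 1.0, "price_above_budget": 0.0}
romance = {"same_subject": 0.0, "viewed_recently": 0.0, "price_above_budget": 0.0}

p_textbook = purchase_probability(textbook, weights, bias=-1.0)
p_romance = purchase_probability(romance, weights, bias=-1.0)
print(round(p_textbook, 2), round(p_romance, 2))
```

With only three factors you can still read off why the textbook scores higher than the romance novel. With millions of learned weights feeding into each other, that kind of inspection stops being practical.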
This raises interesting questions. My intellectual development may now be influenced by the computation of an unknown number of probability factors. Granted, there is no guarantee that I will buy and read a book just because it is suggested by Amazon. If Amazon is showing me Harlequin Romance novels while I’m looking at computer science textbooks then I will certainly disregard those suggestions for further reading. But of course Amazon is not going to do that. They will tweak their system until there is a high probability that I won’t disregard the other books they are recommending. Remember, that is the whole point of a recommendation system. A recommendation system is designed to over-determine an outcome. It is all about probabilities.
I’ve sometimes dreamed about a supercomputer which would consider every book that has ever been written. This supercomputer would actually read all those books and inter-relate their content until it had discovered the mysteries of the universe, at least as far as man has been able to figure them out and put them into writing. Then, taking into consideration my interests and goals, this supercomputer would recommend a book for me to read which would completely change my life. This book would provide me with the answers to all the questions which I have ever asked. Well, now it seems like my dream may have become a reality. There is now a supercomputer which will direct my intellectual development along a path that is optimized for maximum wisdom!
Is that a fantasy or is that the reality? It is actually hard to say. It is quite possible for a recommendation system to take content into consideration. It is far simpler to rely on user ratings, but when you are dealing with obscure books that few people have rated it may be necessary to actually parse the content and do some textual analysis. A recommendation system can be very simple or very sophisticated. I think we can assume that Amazon’s recommendation system will be mind-boggling in its complexity. Amazon could devote so many resources to their system that you could literally be assigned your very own artificial intelligence instance tasked just with dealing with you as a customer.
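Here is a minimal sketch of what "parsing the content" could look like at the simple end of the scale: each book's blurb is turned into a TF-IDF word vector and books are compared by cosine similarity. The blurbs are invented, the tokenizer is just a whitespace split, and the helper names tfidf_vectors and cosine are my own; a production system would be far more elaborate.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a TF-IDF weight dictionary for each document."""
    tokenized = {title: text.lower().split() for title, text in docs.items()}
    n_docs = len(tokenized)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(word for words in tokenized.values() for word in set(words))
    vectors = {}
    for title, words in tokenized.items():
        counts = Counter(words)
        vectors[title] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical blurbs, invented for illustration.
blurbs = {
    "Book A": "collaborative filtering for recommender systems",
    "Book B": "matrix factorization for recommender systems",
    "Book C": "a duke falls in love in regency london",
}

vectors = tfidf_vectors(blurbs)
sim_ab = cosine(vectors["Book A"], vectors["Book B"])
sim_ac = cosine(vectors["Book A"], vectors["Book C"])
print(sim_ab > sim_ac)
```

No ratings are needed here, which is why this kind of approach can still say something about an obscure book that nobody has reviewed yet.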
There are many factors which could lead you to read a book. You could walk into a book store or a library and just randomly pick a book. But writers and publishers obviously put a lot of effort into increasing the probability that you will read their book. In the vast marketplace of ideas there is fierce competition for the attention of the reader. Every aspiring writer would do well to consider the criteria used to make a selection of reading material. Every aspiring writer must now face the fact that there is a black box in this equation, a recommendation system powered by artificial intelligence, a black box which you cannot peer into.
As an intellectual, I am most curious about what drives a person’s intellectual development. What effect will artificial intelligence have on my intellectual development? Will it make me smarter? Or will it divert me down a path I did not intend to go down? One factor which often determines how often a book will appear in search results is the number of other books citing that work. The more frequently a book is cited by other scholars, the more often readers of those other texts will be led back to it. This is why the most authoritative work tends to be the book that ranks at the top of search engine results. But let’s say you just want to sell the most expensive book no matter what. The most expensive book will likely be a textbook and it will probably be the thickest textbook available. The thickest textbook is the one most likely to fully develop a concept so that is all well and good for your intellectual development.
One of the reasons that I decided to devote serious study to recommendation systems is that this branch of computer science is actually exploring the factors that determine the outcome of cultural work. Every creative writer needs to get his work past the cultural gatekeepers. In the future, creative writers are going to be faced with digital cultural gatekeepers. Today your work is read by a human literary agent but in the future it could be read by an artificial intelligence literary agent. Even if the artificial intelligence isn’t good enough to be relied upon to make human aesthetic decisions, it could still be used to winnow the thousands of incoming manuscripts. And of course every published book’s fate is already determined by where it appears in search results.
There are many deep, philosophical questions at play here. For example, consider affinity. Affinity is the probability that you will like something based on something else that you like. But what is responsible for this affinity? Affinity is based on the relationships between ideas, concepts, or stories. There is a web site called TV Tropes. TV Tropes is the all-devouring pop-culture wiki which catalogs and cross-references recurrent plot devices, archetypes, and tropes in all forms of media. This web site serves as an unintentional recommendation system because it inter-relates content in such a way that you are likely to stumble upon stories which are similar to the stories you like. The relationships have been established by thousands of users, but you could plug this information into your recommendation system to get the benefit of all that human evaluation. Creative writers already use this technology to develop their stories within their favorite genre. Creative writers frequently get additional ideas for their stories based on the most popular tropes, or they decide to mash up tropes to appeal to two sets of readers. Basically what you have here is a database of story ideas combined with an index of how story ideas are related to other story ideas.
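As a sketch of how trope tags could be plugged into a recommendation system, the snippet below scores the affinity between two stories by how much their trope sets overlap (Jaccard similarity). Note that this overlap score is a stand-in for affinity, not a true probability. The stories and tags are invented, and this does not use any real TV Tropes data or API.

```python
def jaccard(a, b):
    """Share of tropes two stories have in common, between 0.0 and 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(title, tropes_by_story):
    """Rank every other story by trope overlap with the given one, best first."""
    target = tropes_by_story[title]
    scores = [
        (jaccard(target, tropes), other)
        for other, tropes in tropes_by_story.items()
        if other != title
    ]
    return sorted(scores, reverse=True)


# Hypothetical stories and trope tags, invented for illustration.
tropes_by_story = {
    "Story A": {"chosen one", "mentor dies", "dark lord"},
    "Story B": {"chosen one", "mentor dies", "space opera"},
    "Story C": {"meet cute", "love triangle"},
}

ranked = most_similar("Story A", tropes_by_story)
print(ranked)
```

Story B shares two of the four distinct tropes with Story A, so it scores 0.5, while Story C shares none and scores 0.0. A reader who liked Story A would be pointed to Story B, even though one is fantasy and the other is space opera.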
What all this technology appears to be doing is increasing the probability that cultural works or even intellectual concepts which have affinity will be brought together. This serves to reinforce ideas and increases cross-fertilization of ideas within a narrow domain. It may have the negative effect of discouraging the cross-fertilization of ideas without strong affinity. In other words, it significantly reduces random factors. Random connections become increasingly unlikely as our world becomes increasingly over-determined by meaningful inter-connectivity.