Goodreads: An Analysis on Context and Algorithms

If you love books as much as I do, you have probably come across Goodreads. Goodreads is a website that allows people, mainly those who read or listen to books, to connect with one another by reviewing, rating, and discussing books. Founded in January 2007, Goodreads markets itself as a place for readers all over the world to come together to share and recommend books. After launching, it quickly became a favorite among bibliophiles, so much so that it has become one of the biggest websites for books. While Goodreads’ popularity is interesting, what really captures my interest is the recommendation algorithm it uses to suggest books to individual readers. How does it work? Is it accurate? More importantly, how can it be improved to provide even better book recommendations?


Purpose

Before answering any of these questions, it is important to figure out the purpose of Goodreads’ recommendation algorithm. The simple answer is obvious: to recommend books to users. But there is also a monetary incentive spurring the company to give readers books that they enjoy. In 2013, Goodreads was acquired by Amazon. This is important to note because Amazon has been credited with the decline in profitable book sales at bookstores and other independent retailers. A study by researchers from the Indian Institute of Technology Kharagpur found that they could predict which books would become Amazon bestsellers by looking at their Goodreads score (Maity et al.). Additionally, in order to make a profit, Goodreads runs ads on its site and partners with authors and publishing houses to promote certain books. If the recommendation algorithm does its job well and gets books to users, then both Goodreads and its parent company, Amazon, have reason to celebrate. Put simply, Goodreads recommends books, and Amazon then sells them to those same users.


How It Works

But how does the algorithm actually work? Based on the features that readers can interact with, Goodreads’ algorithm seems to rely on a few different factors to sort creative works. The first is genre and content tags. These tags can be provided either by readers or by the publishing companies and authors that upload a book to the website, and they cover things such as specific genres and subgenres, themes, or tropes within the text. Alongside this, readers also have the power to influence the algorithm. Readers can rate books on a 5-star scale, and all of these ratings are averaged to give a book its final score. Readers are also able to sort books using “shelves”. Shelves are a way for readers to organize their personal libraries on the website by customizing what tags they want to give books. Some users make shelves to organize their books by the year they read them, the rating they gave them, whether the book was a favorite, whether they DNFed (“did not finish”) it, and more. Shelves allow users to organize and tag their own books in a hyper-specific way, creativity being the only limitation. While shelves are a useful mechanism for users, they also allow the Goodreads recommendation algorithm to better understand how a certain book is being received and which shelves it is most often placed on.
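
To make the two reader-driven signals described above concrete, here is a minimal sketch that averages star ratings into a single score and tallies the shelves a book has been placed on. The ratings, shelf names, and function names are invented for illustration; Goodreads has not published how it actually computes either signal.

```python
from collections import Counter
from statistics import mean

def average_rating(ratings):
    """Return the mean of 1-5 star ratings, rounded to two decimals."""
    return round(mean(ratings), 2)

def top_shelves(shelvings, n=3):
    """Count how often each shelf name appears and return the n most common."""
    return Counter(shelvings).most_common(n)

# Hypothetical data for a single book (not real Goodreads data).
ratings = [5, 4, 4, 3, 5, 2]
shelvings = ["fantasy", "favorites", "fantasy", "dnf",
             "read-in-2023", "fantasy", "favorites"]

print(average_rating(ratings))      # 3.83
print(top_shelves(shelvings, n=2))  # [('fantasy', 3), ('favorites', 2)]
```

Even in this toy version, the shelf counts say more about how the book is being received than the single averaged score does.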


Machine Learning within the Algorithm?

The Goodreads recommendation algorithm uses all of these variables to recommend books to people. It does not appear to use machine learning when showing books that are similar to others (it seems to base this on genre tags), and the same seems to hold for its personalized recommendation feature. Under the “recommendations” section of the website, users are given books that Goodreads thinks they would like. It is here that we can see how the algorithm works. Under the “How to improve recommendations…” link, the website explains that to get better results, users need to be more specific about their favorite and preferred genres, rate more books, make more shelves, and hit the “not interested” button on any recommendations that they do not like. By encouraging these actions, Goodreads promotes the algorithm as being able to learn and tailor itself to a specific reader’s preferences in order to deliver books that have a higher chance of being bought and enjoyed. If users hover over a book that was recommended to them, they can see exactly why it was chosen. Goodreads does a great job of showing why the algorithm made a choice by showing which specific book or genre influenced it. In this way, the algorithm is relatively transparent and easy to grasp.
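
If “similar” books really are chosen by comparing genre tags, the comparison could be as simple as measuring tag overlap. The sketch below uses Jaccard similarity (shared tags divided by all tags) as one plausible stand-in; this is my assumption, not Goodreads’ documented method, and the catalog, titles, and function names are hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two books have in common, from 0.0 to 1.0."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similar_books(target, catalog, n=2):
    """Rank every other book in the catalog by tag overlap with the target."""
    scores = [
        (title, jaccard(catalog[target], tags))
        for title, tags in catalog.items()
        if title != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:n]

# Hypothetical catalog: title -> set of genre and content tags.
catalog = {
    "Book A": {"fantasy", "romance", "enemies-to-lovers"},
    "Book B": {"fantasy", "romance", "young-adult"},
    "Book C": {"fantasy", "grimdark", "war"},
    "Book D": {"memoir", "nonfiction"},
}

print(similar_books("Book A", catalog))  # [('Book B', 0.5), ('Book C', 0.2)]
```

Notice that Book C still counts as similar to Book A purely because both carry the “fantasy” tag, even though a romance and a grimdark war novel may appeal to very different readers.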


Weaknesses

While Goodreads’ recommendation algorithm seems great on paper, in practice it may not be the best. This is in part because Goodreads’ recommendation model was always meant to be social. Otis Chandler, the founder of Goodreads, states that his website was originally built “so that you could find new books based on what your friends are reading”. This means that users were always meant to receive recommendations through their friends and Goodreads circle rather than through a specific algorithm.

One of the first problems with the Goodreads algorithm can be found within its rating system. The system is simple to use, but its simplicity is its downfall. The star system is too ambiguous, especially considering that every reader is different and will have very subjective opinions about what they read. Some readers prefer purple sentences that go on and on, while others prefer a much more relaxed, colloquial sound. Some readers enjoy trope-filled stories, while others enjoy reading something new. Because each reader applies the star scale in their own way, an averaged rating cannot be used as an accurate representation of quality. This is especially true given that Goodreads’ social aspect often turns ratings into popularity contests rather than a true measure of a particular book’s merit.
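
A small worked example shows how much an averaged star rating can hide. The two rating lists below are invented: one book that every reader finds merely fine, and one that splits readers between love and hate. Both end up with the same average.

```python
from statistics import mean, pstdev

# Hypothetical ratings for two different books.
consensus = [3, 3, 3, 3, 3, 3]   # everyone thinks it is fine
polarizing = [5, 5, 5, 1, 1, 1]  # readers either love it or hate it

for name, ratings in [("consensus", consensus), ("polarizing", polarizing)]:
    # pstdev measures how far the individual ratings spread around the mean.
    print(name, mean(ratings), round(pstdev(ratings), 2))
```

Both books average 3 stars, but the spread is 0 for the first and 2 for the second, so the single displayed score treats two very different reading experiences as the same.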

My main issue with Goodreads’ algorithm is its lack of context. The algorithm it currently employs does a good job of recognizing basic patterns, such as genre, target audience, or themes, but it does not do a good job of providing recommendations based on an individual’s unique and complicated reading history. The algorithm mainly focuses on genres and subgenres, with no real attention paid to the context in which they are used. This often causes books to be inappropriately recommended or marked as “similar”. Additionally, if you pay attention to the way the algorithm chooses which books to recommend, it usually relies on two or three books at a time, not the entirety of a user’s library. This is understandable, since an algorithm that considered the whole library would be immensely large, difficult, and expensive to create, but could it be a possible step toward better recommendations?
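
The difference between recommending from two or three books and recommending from a whole library can be sketched with a simple tag count. Everything here is hypothetical (the library, the candidate titles, and the scoring rule), and it is only meant to show how the choice of input changes the result, not how Goodreads scores books.

```python
from collections import Counter

def tag_profile(books, catalog):
    """Count how often each tag appears across the given books."""
    profile = Counter()
    for title in books:
        profile.update(catalog[title])
    return profile

def best_match(profile, candidates):
    """Pick the candidate whose tags score highest against the profile."""
    return max(
        candidates,
        key=lambda title: sum(profile[tag] for tag in candidates[title]),
    )

# Hypothetical reader: a long history of historical fiction, two recent thrillers.
catalog = {
    "H1": {"historical", "literary"},
    "H2": {"historical", "literary"},
    "H3": {"historical", "war"},
    "H4": {"literary", "family-saga"},
    "T1": {"thriller", "crime"},
    "T2": {"thriller", "mystery"},
}
library = list(catalog)  # every book the reader has shelved, oldest first
seeds = library[-2:]     # only the two most recent books

candidates = {
    "New Thriller": {"thriller", "crime"},
    "New Historical": {"historical", "literary"},
}

print(best_match(tag_profile(seeds, catalog), candidates))    # New Thriller
print(best_match(tag_profile(library, catalog), candidates))  # New Historical
```

With only the two seed books, the reader looks like a thriller fan; with the full library, their long-standing taste for historical fiction wins out.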


Future Directions

The current algorithm used within Goodreads is not the best, but it certainly gets the job done for some users. Looking ahead to further research on this topic, I am interested in exploring the way that humans naturally tend to recommend books and the way that social media has influenced book-reading culture. With a significant influx of readers during the 2020 pandemic, social media websites (such as YouTube and Twitter, but primarily TikTok) have given new life to reading and sharing books. Since my main concern with Goodreads’ algorithm is its lack of media literacy and context when recommending, it would be interesting to study how recommendation occurs between humans on the internet through trends, fandom, tropes, and other abstract methods. The question I am left with is this: would studying human-based recommendations aid in designing a better, more cohesive machine-learning algorithm?


Conclusion

Goodreads as a social media platform works better than Goodreads as a book recommendation platform, especially if you’re looking for uber-specific niches to read. This is because books do not fit easily into neat categories; they often overlap or subvert those categories, and recommending them well requires media literacy, context, and other factors that a machine cannot so easily distinguish. As a social media platform, Goodreads lets users connect with other readers who they feel share their reading taste. Humans can better articulate what specific aspects of a text made them adore or dislike it. Humans can distinguish between technically good writing and bad. Humans can recommend books based on things that may not be so easily programmed into machines, such as aesthetics or “vibes”. Returning to a more human-focused and human-driven recommendation model might not be the most profitable path for Goodreads, but it could certainly result in better recommendations for its users.

Works Cited

Chandler, Otis. “Recommendations and Discovering Good Reads – Goodreads News & Interviews.” Goodreads, 10 Mar. 2011, https://www.goodreads.com/blog/show/271-recommendations-and-discovering-good-reads.

Maity, Suman Kalyan, et al. “Book Reading Behavior on Goodreads Can Predict the Amazon Best Sellers.” 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), July 2017, pp. 451–54.