Editor’s note: This is a guest post by Jessica Luna.
Searching has become a common thread in my daily life as an MLIS student. Learning basic and advanced methods of searching for different types of information, evaluating search results, and understanding how algorithms shape our queries are just a few of the things I have explored during my program. And it’s those algorithms that have dominated our lives since before Web 2.0 existed.
One of my projects this semester was exploring Personalized Information Retrieval (PIR) methods using collaborative tagging, also known as folksonomies. Collaborative tagging has become more ingrained in next-gen online library systems because it’s familiar and convenient for users to browse and discover similarly tagged materials. Personalization has become an essential factor in shaping today’s folksonomies.
Folksonomies are used on social media websites like Flickr, Twitter, and Instagram to help users search for topics that interest them. Content creators on platforms like YouTube and Twitch use folksonomies to make their content more searchable and reach a broader audience. This made me wonder: are PIR recommender models the future of next-gen library catalogs?
How Does It Work?
The basic folksonomy model in use today contains three essential variables: users, resources, and tags. These three variables form the basis for calculating the weight and ranking of tags in various tagging systems. PIR models use user profiles and resource profiles to customize each user’s search experience based on that user’s interests. The goal is to identify the closest answer to a person’s need based on the characteristics of their user account.
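To make the three-variable model concrete, here is a minimal Python sketch. It assumes tag assignments are stored as (user, resource, tag) triples and that a resource is scored by how strongly its tags overlap with the tags in a user’s profile. The sample data, the function names, and the scoring rule are all illustrative assumptions, not the method of any particular PIR system.

```python
from collections import Counter, defaultdict

# Hypothetical tag assignments: (user, resource, tag) triples.
ASSIGNMENTS = [
    ("ana", "book1", "python"),
    ("ana", "book2", "python"),
    ("ana", "book2", "data"),
    ("ben", "book1", "python"),
    ("ben", "book3", "cooking"),
    ("cy", "book3", "cooking"),
    ("cy", "book4", "data"),
]


def build_profiles(assignments):
    """Count how often each tag appears per user and per resource."""
    user_profiles = defaultdict(Counter)
    resource_profiles = defaultdict(Counter)
    for user, resource, tag in assignments:
        user_profiles[user][tag] += 1
        resource_profiles[resource][tag] += 1
    return user_profiles, resource_profiles


def score(user_profile, resource_profile):
    """Weight each of the resource's tags by how often the user has used it."""
    return sum(user_profile[tag] * count for tag, count in resource_profile.items())


def rank_for_user(user, assignments):
    """Return all resources, best match first (ties broken by resource name)."""
    users, resources = build_profiles(assignments)
    scores = {res: score(users[user], profile) for res, profile in resources.items()}
    return sorted(scores, key=lambda res: (-scores[res], res))


ranking = rank_for_user("ana", ASSIGNMENTS)
print(ranking)
```

In this toy data, "ana" has used "python" twice and "data" once, so resources carrying those tags rise to the top and the "cooking" resource falls to the bottom. A real system would also normalize for popular tags and would usually exclude resources the user has already tagged.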
Recommender systems are a big part of PIR models for folksonomies. There is a lot of research on different PIR methods that may drastically change how documents are ranked and what tags are recommended to users. Some PIR recommender systems acknowledge that users’ tagging behaviors often change over time, so the ideal system should consider a user’s historical tagging behavior when evaluating the weight and rank of a tagged resource. Other algorithms explore the relationship between query relevance and user interest relevance. The underlying theory is that the more often a user applies a tag in their profile, the more interested they are in its topic.
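One way to picture the idea that tagging behavior changes over time is to let older tag uses count for less than recent ones. The sketch below assumes an exponential decay with a 30-day half-life; the decay rule, the half-life, and the day-number timestamps are my own illustrative assumptions, not something taken from a specific study.

```python
def decayed_tag_weights(events, today, half_life_days=30.0):
    """Sum a time-decayed weight per tag.

    events: iterable of (tag, day) pairs, where day is the day number on
    which the user applied the tag. The day numbering and the 30-day
    half-life are illustrative assumptions.
    """
    weights = {}
    for tag, day in events:
        age_in_days = today - day
        # Each use counts for 1.0 when fresh and halves every half_life_days.
        decay = 0.5 ** (age_in_days / half_life_days)
        weights[tag] = weights.get(tag, 0.0) + decay
    return weights


# Hypothetical history: two old uses of "python", one use of "cooking" today.
history = [("python", 0), ("python", 0), ("cooking", 60)]
weights = decayed_tag_weights(history, today=60)
print(weights)
```

By raw count, "python" (two uses) would outrank "cooking" (one use). With the decay applied, each 60-day-old use is worth 0.25, so "python" totals 0.5 while the fresh "cooking" tag is worth 1.0 and reflects the user’s more recent interest.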
Sentiment analysis is a machine learning technique for inferring users’ interests and sentiments from the topics they follow and the tags they have searched in the past. It applies similarity measures to improve the precision of retrieved resources. Algorithms generate relationships between words and sentences and categorize the semantic relationships between sentences and phrases. The terms are then ranked within a vector space, and those with the highest scores carry more weight in representing a subject.
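The vector-space step can be illustrated with cosine similarity, a standard similarity measure. In this sketch, a user and each resource are represented as sparse vectors of tag weights, and resources are ranked by how closely their vectors point in the same direction as the user’s. The example vectors are made up, and cosine similarity is only one of several measures a PIR system might use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# Hypothetical tag-weight vectors.
user = {"python": 2.0, "data": 1.0}
resources = {
    "intro_to_python": {"python": 3.0},
    "data_with_python": {"python": 1.0, "data": 1.0},
    "weeknight_meals": {"cooking": 2.0},
}

ranked = sorted(resources, key=lambda name: cosine(user, resources[name]), reverse=True)
print(ranked)
```

Because cosine similarity ignores vector length, a resource tagged three times with "python" is not automatically a better match than one tagged once each with "python" and "data"; what matters is how well the mix of tags lines up with the user’s profile.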
Issues with Personalization
A common complaint about PIR systems is privacy. Personalization depends on collecting user data, which often happens without users’ knowledge or consent.
Recommendation systems can force users into filter bubbles. These systems isolate users from other perspectives and world views, perpetuating people’s ignorance of alternative beliefs and cultures. It’s essentially another type of censorship that narrows a person’s perception. Luckily, methods for diversifying PIR recommendations have been developed.
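As one example of what diversification can look like, the sketch below uses maximal marginal relevance (MMR), a well-known re-ranking technique from information retrieval. I chose it for illustration only; it is not necessarily the method used in the PIR studies I read. The relevance scores, the tag sets, and the use of Jaccard overlap as the redundancy measure are all assumptions.

```python
def jaccard(a, b):
    """Share of tags two resources have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(relevance, tags, k, lam=0.5):
    """Pick k items, trading relevance against overlap with earlier picks.

    relevance: dict of item -> relevance score for the user.
    tags: dict of item -> set of tags on that item.
    lam: 1.0 means pure relevance; lower values favor diversity.
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def mmr(item):
            # Redundancy is the highest overlap with anything already chosen.
            redundancy = max(
                (jaccard(tags[item], tags[chosen]) for chosen in selected),
                default=0.0,
            )
            return lam * relevance[item] - (1 - lam) * redundancy

        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical scores and tags: "a" and "b" are near-duplicates.
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}
tags = {"a": {"python", "data"}, "b": {"python", "data"}, "c": {"cooking"}}

print(mmr_rerank(relevance, tags, k=2, lam=0.5))
print(mmr_rerank(relevance, tags, k=2, lam=1.0))
```

With lam=0.5, the second pick is "c" rather than the near-duplicate "b", even though "b" has the higher relevance score. With lam=1.0 the ranking falls back to pure relevance and returns "a" followed by "b", which is the filter-bubble behavior the re-ranking is meant to soften.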
Looking Towards the Future
PIR studies provide some insight into how algorithms rank and weigh folksonomies. They also let us infer how private companies might use similar methods when suggesting resources to users. Twitter comes to mind when I think about PIR models used with folksonomies. When you interact enough times with another user or a particular hashtag, the algorithm adds similar users and hashtags to your feed.
I bring up Twitter because it’s been in the news lately. To promote transparency, Elon Musk stated in a recent TED talk that he intends to make Twitter’s algorithm open source. It will be our first opportunity to critically analyze the algorithms of a social media platform and to further research on PIR systems.
With Twitter’s algorithms becoming open source, we may see more studies on how PIR systems shape users’ experiences. It also opens the door to further research on the ethical implications of PIR systems and possible solutions.
At this time, I haven’t observed any next-gen catalogs that use PIR methods for folksonomies. ILS/LSPs often use tags as facets in their catalogs. Koha has a tag cloud that displays the most commonly used tags. Adding sentiment analysis measures could prove fruitful in identifying similarities between tags. Tags that might have been considered annoying and subjective would now have meaning.
It’s unclear whether any ILS/LSPs will include PIR methods in their systems someday, but it’s good to know how they work.
Jessica Luna is an MLIS student at St. Catherine University interested in systems librarianship and digital libraries. She enjoys reading books and playing video games in her spare time.