That technology influences information and behavior through built-in and often invisible assumptions is neither a new phenomenon nor new to dialogue among librarians. (Although we could always stand to talk about it more than we already do.) In this post, I highlight some recent contributions on algorithms and libraries in hopes of keeping the topic at the forefront. If librarians didn’t care about algorithms before, we have to now.
Why should we care about algorithms?
Algorithms shape how we find, consume, share, and understand information, from search engines to social media to the news to library discovery systems. Safiya Noble analyzed how Google search results reflect built-in racist and sexist bias, for example by “hyper-representing” Black women in stereotypes and sexually exploitative contexts. Noble explains the threat biased search algorithms present (in part) this way:
“Google’s algorithmic practices of biasing information toward the interests of the powerful elites in the United States, while at the same time presenting its results as generated from objective factors has resulted in a provision of information that perpetuates the characterizations of women and girls through misogynist and pornified websites.”
“The Politics of Online Information: Algorithmic Ethics and Big Data(bases) in Libraries,” Noble’s keynote this week for the 2016 Library Technology Conference, continues this line of investigation. Watch the talk here and read some audience tweets here.
Search engines are often how we deliberately seek out information, but social media is becoming a key means of encountering information, including information we weren’t necessarily trying to find. The Pew Research Center offers this snapshot of demographic trends in social media use over the past 10 years. In short, use is rising across the board and social networks are increasingly the site of information seeking and sharing for matters of health, “civic life,” and news from the personal to the national. Pew surveys have shown social media to be where civic engagement and dialogue happen, but in between election talk and sharing personal news we navigate an algorithm-generated stream of stories, images, video, and audio.
At this point, I suspect most of us know that our Facebook experiences run on algorithms, which continually adjust our feeds based on how we click, like, react, type, and even scroll. Far less likely is that any of us knows exactly how those adjustments happen. Will Oremus at Slate wrote a profile of the “sprawling complex” of Facebook algorithms, including how they’re continually redesigned — by people! — in search of a better fit. He writes, “Algorithms, in the popular imagination, are mysterious, powerful entities that stand for all the ways technology and modernity both serve our every desire and threaten the values we hold dear.” Oremus’ larger point is that algorithms are neither magical nor autonomous, but are in fact built and adjusted by people using limited proxies for human preferences and behavior. Moreover, users aren’t privy to the assumptions and code behind the algorithms. The stories that drift by or resurface, plus the ads we see, do so through processes we have no way to inspect or understand.
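To make the “limited proxies” point concrete, here is a minimal, purely illustrative sketch of how a feed-ranking function might work. This is not Facebook’s actual code; the signals, weights, and names are all invented. The key detail is that the weights are chosen by people, which is exactly where human judgment (and bias) enters an “automated” system:

```python
from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    clicks: int           # proxy: did the user click through?
    likes: int            # proxy: explicit positive reaction
    dwell_seconds: float  # proxy: time spent, inferred from scrolling

# Hypothetical weights, hand-tuned by engineers. Users never see these.
WEIGHTS = {"clicks": 1.0, "likes": 2.5, "dwell_seconds": 0.1}

def score(post: Post) -> float:
    """Combine behavioral proxies into a single relevance score."""
    return (WEIGHTS["clicks"] * post.clicks
            + WEIGHTS["likes"] * post.likes
            + WEIGHTS["dwell_seconds"] * post.dwell_seconds)

def rank_feed(posts: list) -> list:
    """Order the feed by score, not by time posted."""
    return sorted(posts, key=score, reverse=True)
```

Every number in `WEIGHTS` is an editorial decision dressed up as math: change one weight and different stories rise to the top, with no visible trace of why.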
The growing use of automation in the production and dissemination of news is worth considering too, especially in light of increasing news consumption via social media. Emily Bell’s recent lecture on “The End of the News as We Know It: How Facebook Swallowed Journalism” is all about major shifts in control and authority in journalism — but is completely applicable to other kinds of information sharing and seeking as well. There’s plenty of bias and baggage in journalism (also produced by people!), but I’m not convinced that handing information sources over to algorithms is a case of “New boss, same as the old boss.” Instead, I believe it means we’re less and less in control of how we get information, while being persuaded otherwise through personalization, curation, and customization.
Facebook isn’t the only social media platform using algorithms to connect users with content. Twitter’s Trending Topics have been algorithm-driven for a while. The While You Were Away feature surfaces old tweets via algorithm, determining what’s important based on your past faving and retweeting tendencies, plus tracking where and when you’ve logged in. Earlier this year, it was reported and confirmed that Twitter would introduce an algorithmic timeline (“more Facebook-style”) ordering tweets by “quality” rather than chronologically. (The feature is rolling out to users in waves, and in fact today is the day my feed was switched over.) Instagram also made a change to its feed this week. The blog post announcing the shift is titled, “See the Moments You Care About First,” which I guess is code for using past behavior to design and adjust timeline algorithms.
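The difference between a chronological and a “quality”-ordered timeline can be sketched in a few lines. This is illustrative only: Twitter’s actual model is not public, so the predicted-engagement scores below (which in a real system would come from a model trained on your past faves and retweets) are pure assumption:

```python
from datetime import datetime

# (tweet_text, posted_at, predicted_engagement)
tweets = [
    ("breaking news",  datetime(2016, 3, 1, 9, 0),  0.2),
    ("friend's joke",  datetime(2016, 3, 1, 8, 0),  0.9),
    ("brand promo",    datetime(2016, 3, 1, 10, 0), 0.5),
]

# Same tweets, two orderings: one transparent, one opaque.
chronological = sorted(tweets, key=lambda t: t[1], reverse=True)  # newest first
algorithmic   = sorted(tweets, key=lambda t: t[2], reverse=True)  # "quality" first
```

With a chronological feed you can at least reason about what you’re seeing (newer is higher); with the algorithmic one, the ordering depends on a score you can’t see.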
It’s interesting to note some of the reactions to these changes, particularly to Twitter’s, with the #RIPTwitter hashtag and various think pieces popping up. Writing about the ubiquity and opacity of algorithms, I find it impossible not to see our limited understanding of how our tools and platforms work as a universal problem, one not at all limited to systems that rely on algorithms.
Libraries also have this problem
Of course, library systems are not immune either. One of the best things I read last week was a post by Matthew Reidsma titled, “Algorithmic Bias in Library Discovery Systems.” Everyone feel free to take a minute if needed to roll your eyes at the idea that bias in library systems is brand-new and strictly limited to algorithm-driven platforms. Take as long as you like.
While the standards, vocabularies, and various other conventions of information organization in libraries are recognized as structurally biased, Reidsma’s post is especially valuable for how it dismantles certain assumptions about libraries’ ability to support users seeking information. One is that simple interfaces promote better understanding of how a system works and how to use it. Reidsma cites numerous instances, library-based and otherwise, in which simple interfaces concealing complex software encourage users to imagine an equally straightforward “back end” and discourage delving further.
Another assumption challenged here is that searching library discovery systems retrieves all relevant results, and that sorting and/or ranking are the wicked problems in making those result sets usable. Reidsma’s tests using Summon 2.0 (a ProQuest product) and its Topic Explorer feature reveal systematic biases in the results, including suppression of relevant keywords for some searches. Read the post for detailed descriptions of findings, data downloads, and plenty of starting points for learning more about algorithms and libraries. (It does not hurt at all that Reidsma cites one of my all-time favorite pieces of writing about library systems.)
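One way a system can silently shape results is with rules baked into code that users never see. The toy topic-suggestion function below is purely hypothetical — the topics and the suppression list are invented, and this is not how Summon or any real product works — but it shows how a single invisible line can make a subject simply fail to appear:

```python
from typing import Optional

# Invented example data: search keywords mapped to related-topic cards.
TOPIC_INDEX = {
    "poetry": "Modernist poetry",
    "privacy": "Information privacy",
    "zines": "Zine culture",
}

# An invisible editorial choice inside the "neutral" system.
SUPPRESSED = {"zines"}

def suggest_topic(query: str) -> Optional[str]:
    """Return a related-topic card for a search, or nothing at all."""
    for keyword, topic in TOPIC_INDEX.items():
        if keyword in query.lower() and keyword not in SUPPRESSED:
            return topic
    return None
```

A user searching the suppressed subject gets no card and no error; from the outside, the topic just doesn’t seem to exist. That is the kind of behavior Reidsma’s tests were designed to surface.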
Whose responsibility is this?
As librarians, we’re often in the unique position of being both users and designers of tools for mediating access to information. One of the challenges we face lies in recognizing the limits in how our experience as users can inform how we design systems for others to use. Another is in recognizing the power we hold as designers and managers of systems. Speaking to the latter, Andreas Orphanides gave a talk at code4lib 2016 called “Architecture is politics: The power and the perils of system design,” in which he drew parallels between social control in built spaces (physical architecture) and information systems from forms to maps to websites and more. It’s powerful and worth your time whether or not you’ve made this particular connection between “architectures” before. Watch the talk on the code4lib YouTube channel and follow Dre’s slides and notes here.
In another code4lib talk about the future of web archives (watch the recording here), Ian Milligan and Nick Ruest discussed how library systems could continue to develop along these lines absent deliberate action to change course. Milligan, a historian building and using web archives to research the history of the 1990s, brought up the slide below to describe where he fears web archives discovery and research could be headed:
“I’m the input, I ask the question, I ask this magical black box, output comes out, I don’t understand the ranking mechanism, I don’t understand the decisions. That black box is writing my book; I’m not writing my book.”
That we know so little about algorithms may not make them inherently dangerous, but not knowing certainly places us at risk of exploitation.
How to learn more
There’s ever more writing out there about algorithms, artificial intelligence, bots, and more. I’ve found two newsletters extremely helpful: This Week in Algorithms, Automation, and Artificial Intelligence (published by Jay Cassano of Data & Society) and Real Future (published by Alexis Madrigal of The Atlantic). The scholars mentioned in this post continue to work through the knotty ethical issues that algorithms present, so I encourage the curious to follow breadcrumbs and see what the work I’ve linked to links to. For even more takes on algorithms and libraries, read Anna-Sophia’s Hack Library School posts on Google and the Librarian and AI and LIS education. You won’t regret it.
Personal experience is a potentially powerful means of breaking down assumptions built into technologies in and out of libraries, so I’d love to know: What resonates with you in any of the above?
— Amy Wickner | @amelish