The Academic Library, where ignorance is proverbial

Your Humble Blogger came across an article called The Academic Library and the Promise of NGDLE, by Megan Oakleaf, Scott Walter and Malcolm Brown on some site called EDUCAUSE Review, and, well, I have thoughts.

OK, first of all, NGDLE stands for Next Generation Digital Learning Environment, and if you need a moment after that, I understand. I won’t hit you with interoperability just yet. It’s coming, though. Also data-driven and outcome-based. Possibly high-impact, too. I know, I know. But part of the conversation is about the conversation, which means speaking the mammyloshen, innit? Anyway. I don’t really advise reading the article, and I’m not going to claim to summarize it, either. I don’t think I really understood it, or at least not in detail. But it did make me see an argument that I think the writers would agree with, even if it isn’t what they’re actually saying in this article, and that is the argument that I have thoughts about, so that’s where I’m going. Ready?

OK, so it’s this: The current fashion in Higher Education administration is a focus on what they call data-driven, outcome-based programs. Which is great! They want evidence that stuff works, is all that means. Or all it should mean. In practice, it means that people are in meetings with administrators talking about how participants in such-and-such a program in their first semester show an 8% increase in retention rate, or that the GPA of such-and-such a subset of students is 0.65 points lower than a different subset. Which, again, fine, as long as everyone doing the talking and the listening really understands statistical analysis. Which, optimistic, I think.

Still and all, I am inclined to the notion that in the perpetual struggle for resources in the academy, it’s helpful to look at actual evidence, and it’s helpful to look at effects down the road, not just in any particular course. I do not intend to disparage the use of data, or a focus on what comes out. Education, as we have discussed in the past, doesn’t work, in the sense that a car works, but that doesn’t mean that we should waste a lot of money and time on programs that don’t make any difference a year down the road. And relying on anecdotal evidence, or the shining smiles of the successful students, is not the best way to go about allocating resources.

Well. Be that as it may, whether it’s rubbish or reason, academic administration is all about the data-driven and outcome-based right now. And academic libraries do not, on the whole, have outcome-based data to present to their administrations. Do students who use EBSCO FirstSearch have a higher retention rate than those who use ProQuest? Do students who check out books during their first semester have a higher GPA through their degree completion than those who don’t? We have no idea.

In fact, we go to some trouble to avoid knowing that information—we have our system delete the individual data daily, so that we know how often a book has circulated but never to whom. Our principle is that library use is not only confidential but almost sacrosanct. Took out a book? Took out twenty books? We don’t know. As long as you bring them back on time, we keep no record of them whatsoever.

I think the history of this started back in the late 1980s, when there were (or were reported to be) FBI investigations of library records, and library staff were said to be asked to inform on their patrons who took out certain books. I don’t know the actual history, but I know that after the Patriot Act (I am tempted to write the so-called Patriot Act, for it never seemed remotely patriotic to Your Humble Blogger) libraries were indeed the object of National Security Demands. The standard defense has become the basic fact that we have no such records. We don’t keep ’em. Gone, daddy, gone.

And that has worked out well for libraries, over the last thirty years or so. I think. The way the law is structured makes it possible that we are in fact scooping up information and delivering it to the feds before we delete it, and that I am being kept in the dark about it for reasons having quite a lot to do with fighting terrorism and nothing at all to do with politics. But be that as it may, that’s how we libraries currently roll, and it would take a major philosophical shift as well as substantial workflow and software changes to keep data on individual students.

So. Where does that leave us? When higher-education administrative types start talking about programs that help students, they are looking for things we don’t have, things that we are philosophically opposed to having. The article that started this business simply takes for granted (I think) that this needs to be rectified. If the administration wants outcome-based data, then we need to provide it; it has to be interoperable and next-generation and creatively disruptive, and probably innovative, too. They are the bosses, and what the bosses want, they tend to get, or the people who don’t give it to them don’t get much support in return, right? And in truth, there is a good deal of truth in that. If we are not giving presentations about longitudinal studies in the meetings about Excellence in Undergraduate Outcomes or Developing Innovative Pedagogy, then those meetings will be held without us. We will not be, as the kids say, in the room where it happens. And on top of that, there probably are some things to be learned from those longitudinal studies. Things we think we know about how students use the library may not be true, and it would be better to learn that than to remain deluded. Correlations that never occurred to us might leap out of the data, and might even be causative correlations. Maybe we could improve our libraries and suck up to the administration at the same time.

And yet.

I don’t know the answer to this one. My instinct is to stick with anonymity. My instinct is that the current fad for the data-driven and outcome-based will be short-lived. My instinct is that we would be doing real damage to the library in pursuit of illusory gains. Speaking about this with my coworkers, it was clear that their instincts run along the same lines as my own. And yet we are all aware that our instincts are what they are because we have worked where we have, not because we all have fantastic instincts.

My initial instinct was to do something with pseudonymity: Give each library patron a number, which is uniquely theirs, but which isn't linked anywhere to their actual identity. This sounds like it'd be logistically annoying, because obviously things like their library card number *do* need to be linked to them, so that if they don't bring back the thing they've checked out, you know where to find them. So they'd have to have some additional number, call it their NGDLE number, which they'd presumably present at the same time as their actual library card number, but which wasn't actually connectable to their library card number in any way. You could potentially put both numbers on the cards when the cards are minted, and immediately destroy the record of which card number went with which NGDLE number, but if the cards know, the Authorities could potentially round up the cards and find out who's who (or at least prove or disprove that particular persons went with particular NGDLE numbers). So maybe two cards, but that sounds like a pain for the patrons. Hmm.

I was thinking of something somewhat similar myself, but then to examine correlations with the outcomes the University cares about, we'd have to connect the student's NGDLE number with her academic record (GPA etc) which we can't do while keeping the records properly anonymous.

What we could probably do is keep some numerical information while wiping out the connection from patron to item. We do this on the item end; we know how often a book was checked out and when, but not who did the outgechecking. And we could do similar, presumably, with database use, knowing that patron X looked at N articles in database B on such-and-such a date, without keeping the identity of the article. But (a) the two tables could be matched up again without too much difficulty, so that defeats the purpose, and (b) probably more important, it would require a shift in philosophy from keeping as little as possible (while still doing what we need to do) to keeping as much as possible (without putting students in too much jeopardy).
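To make the matching-up worry concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two logs, their field layout, and the `relink` function are invented for illustration and do not describe any real library system. It assumes an item-side log that records only what was used and when, and a patron-side log that records only who used how many things and when, and shows that dates and counts alone can be enough to pair them back up.

```python
from collections import defaultdict

# Hypothetical item-side log: (item_id, date). No patron is recorded.
item_log = [
    ("book-17", "2018-03-01"),
    ("book-42", "2018-03-01"),
    ("book-99", "2018-03-02"),
]

# Hypothetical patron-side log: (patron_id, date, count). No item is recorded.
patron_log = [
    ("patron-A", "2018-03-01", 2),
    ("patron-B", "2018-03-02", 1),
]


def relink(item_log, patron_log):
    """Guess which items each patron used by matching dates and counts."""
    items_by_date = defaultdict(list)
    for item_id, date in item_log:
        items_by_date[date].append(item_id)

    patrons_by_date = defaultdict(list)
    for patron_id, date, count in patron_log:
        patrons_by_date[date].append((patron_id, count))

    guesses = {}
    for date, patrons in patrons_by_date.items():
        # The match is unambiguous only when a single patron was active
        # that day and their count equals the number of items used.
        if len(patrons) == 1 and patrons[0][1] == len(items_by_date[date]):
            guesses[(patrons[0][0], date)] = sorted(items_by_date[date])
    return guesses


relinked = relink(item_log, patron_log)
```

On a busy day with many patrons the match becomes ambiguous, but on a quiet day, or for an unusual pattern of use, the two supposedly separate tables identify each other.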


I also would tend to support anonymity, but I also think that if the day comes when the Powers That Be insist on outcome measures, the library should have an approach ready, so one isn't imposed on them from above. One suggestion would be this:

First, classify library usage into broad categories that might still be useful for data purposes.
Second, at the end of the semester, give the student a printout of the summary of their usage (e.g. borrowed four reserved textbooks, printed out 10 articles from X database, borrowed two ILL books from LC subclass BD, etc.)
Third, with the student's permission, that summary data can be submitted to a database along with their grades. The database doesn't include individual identities, just the usage summary and the grade.

So, you only store the summary data, and you destroy the association with the individual after the semester ends. Probably you would ask the student to opt-in at the beginning of the semester in order to store any data.

This approach does not allow long-term outcome measures (students who read X in their first two years end up with better grades in their fourth year) but could be a reasonable compromise. You could tweak the approach to include slightly more retained data (e.g. student's major and class year) but the more you do, of course, the more risk of being able to match them up.
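The three steps above can be sketched roughly as follows, in Python. This is only an illustration under assumptions: the category names, the student ids, and the `summarize` and `end_of_semester` functions are all hypothetical, and a real system would also need the printout and consent steps handled by people rather than code. The sketch keeps only broad-category counts plus a grade for students who opted in, and discards the per-student records afterward.

```python
from collections import Counter


def summarize(events):
    """Collapse raw usage events into counts per broad category."""
    return dict(Counter(category for category, _detail in events))


def end_of_semester(usage, grades, opted_in):
    """Return anonymous (summary, gpa) rows for students who opted in.

    The returned rows carry no student id, and the per-student usage
    records are cleared before returning.
    """
    rows = []
    for student_id, events in usage.items():
        if student_id in opted_in and student_id in grades:
            rows.append({"summary": summarize(events), "gpa": grades[student_id]})
    # Sort so that row order does not echo the order of student ids.
    rows.sort(key=lambda r: (r["gpa"], sorted(r["summary"].items())))
    usage.clear()  # destroy the association with the individual
    return rows


# Hypothetical semester data; the detail field is dropped by summarize().
usage = {
    "s1": [
        ("reserve_textbook", "detail not retained"),
        ("reserve_textbook", "detail not retained"),
        ("database_article", "detail not retained"),
    ],
    "s2": [("ill_book", "detail not retained")],
    "s3": [("database_article", "detail not retained")],
}
grades = {"s1": 3.4, "s2": 2.9, "s3": 3.8}
opted_in = {"s1", "s3"}

rows = end_of_semester(usage, grades, opted_in)
```

Student s2 did not opt in, so nothing about s2 reaches the stored rows. The re-identification risk mentioned above still applies: a student with an unusual summary could be recognized from the summary alone, and every extra retained field makes that easier.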

That's an interesting notion. I have concerns about the details, but I suspect that something could be worked out that retains some information and then anonymizes at the end of the semester, with either an opt-out or an opt-in at the start. At the least, if there is a demand for such data, a library could suggest such an approach, together with an estimate for the cost of changing our system to accommodate it...


A different approach -- a key information literacy outcome should be "students understand the importance of privacy to free access to information." Engagement with this learning outcome can be incorporated into instruction around using the library and research, and surveys can be used to assess the extent to which students are achieving this learning outcome. Data on student achievement can support research into best practices for student learning with respect to this aspect of information literacy in a free society.

Our library instructors would be happy to take such an approach, but I'm not sure it would really help us in the eyes of the administration…


To deepen my response to make it more serious, I think an authentic and defensible approach to establishing the value of libraries and library resources through assessment runs more or less like this:

(1) With faculty support, make the case for the centrality of information literacy skills to student success in the 21st century; demonstrate the functional role of libraries and librarians for students' development of these skills; assess student work to show their achievement and the role of library resources in it.
(2) With faculty support, make the case for the importance of undergraduate research to student success; demonstrate the functional role of libraries and librarians for students' conduct of undergraduate research; assess student research to show their achievement and the role of library resources in it.

What the administration needs to hear is a case that is (a) cogent, (b) marketable, and (c) grounded in sound assessment practices. "The Promise of NGDLE" project is weak on all of these fronts, primarily because it is going to be at least a decade before it even turns up, and the kind of results it will offer are not the kind that current assessment research indicates are most important. To show why the promise of NGDLE is of no current use to libraries re assessment that demonstrates their value, let me quote a bit from the article in which they describe the revolutionary potential for assessment of the "Next Generation Digital Learning Environment." That revolution will unfold along the following lines:

"First described in 2015, the NGDLE posits that the future of the digital learning environment will be marked by a shift from an over-dependence on the learning management system (LMS) to a new vision of learning environment architecture, one made up of a variety of pedagogical tools and applications all connected by means of open standards. One of the most visible and tangible ways the NGDLE will manifest is in the domain of learning data and learning analytics. Through the use of interoperability standards, all applications associated with an institution's teaching and learning mission can contribute learning data to a central repository. By analyzing this aggregated data, institutions can apply more powerful analysis techniques that will result in more useful information about learner success."

So they are offering a recently developed theory about what the future digital learning environment will be, with no specification of when or how this new learning environment will be implemented nationwide in higher education. They further posit that as a result of institutions running digital learning environments on open standards software, the software's interoperability standards will enable it to be shared seamlessly between institutions. These institutions will then decide (I suppose they could be coerced by accrediting bodies?) to share vast amounts of proprietary data about their students' learning in a "central repository," to whose aggregated data institutions can apply "more powerful analysis techniques that will result in more useful information about learner success."

To make this work:

First, we need to revolutionize the digital learning environment, replacing the platforms of established for-profit vendors like Blackboard with open standards modular software units that are all mutually compatible through interoperability standards (or get the for-profit operators to create mutual compatibility). The software that would do this does not yet exist, so constructing the open standards modules is actually step 1, and getting institutions to replace their current platforms with it is step 2. Then, step 3, we need to get institutions to use this system to collect data and share it. Then, step 4, we need to turn data scientists loose on it with the idea that such big data analytics will give us useful information about learner success.

That's four massive steps, each of which is in the "vision" stage at present, as far as I can tell. We are not going to arrive at this future for a long time -- this is a decade away even if there is a massive push behind it from the Bill and Melinda Gates Foundation, which is the group that was targeted by the 2015 article that invented the NGDLE idea.

If administrators want assessment data showing the value of libraries, it's going to be a long time coming if this is the route you take. And I am not aware that big data analytics has yet demonstrated that it can provide much in the way of "more useful information about learner success," though it is entirely possible that there are big and important studies out there that I haven't heard of. What I am hearing on assessment these days is that the drive is for (a) direct assessment of student learning to get meaningful results and (b) closing the loop between assessment results and pedagogical innovation. Neither of these priorities is advanced much by emphasizing the role of big data analytics in assessment, esp. when it will be years before the big data sets could be created.

I think it makes zero sense for libraries to even contemplate abrogating their standards on user privacy to prepare for this hypothetical NGDLE future, especially if the motive for doing so is to provide assessment data about libraries to administrators who are anxious to have it. (It's not clear to me that this article is suggesting that they do so, although that could be one implication of what it is saying, and it is troubling that the authors don't specifically address ethical concerns and the importance of maintaining patron privacy.) It does make sense for libraries to think now about how data that they do have and that could be ethically used might be made to interface with other data streams about student learning within an institution. That's a reasonable step to take on the basis of this article's analysis. Good IR professionals are going to be very concerned about privacy and ethics issues and will help steer institutions away from approaches to assessment that violate student privacy.

But if a library is feeling assessment pressure from an administration, there are much more practical, effective, and timely approaches to meet that information need.

You make excellent points, Chris. The NGDLE paradigm is definitely far from practically useful at this point, and it isn't clear that it ever will be.

Among my problems, though, is that I am extremely skeptical about all forms of assessment, even while sympathetic to the administration's need to assess what's going on. Among other problems, I don't think that it's usually clear to the administration what the purpose of the academic library actually is, which makes it difficult to assess how well it is achieving that purpose.

I also am not entirely sure that the faculty, taken as a whole, really does support the library's role in developing information literacy skills. It's unfortunate (to my mind), but my experience is that there is as much resistance as support to the library involving itself as more than a warehouse for material (virtual or physical). I would be happy to be wrong about that, and certainly there are many in the faculty who do support libraries as more than warehouses, but I don't think the sentiment is general. And it's in the absence of that support that the libraries have to use the language of the administration (the outcome-based, data-driven assessment) to insist that what they do works, for values of working that the administration values.

Having said that, it is tremendously wonderful to have faculty around such as you (and you are not alone in this at all) who are really interested in teaching students for their lives, not just their degrees or even their first jobs, and who recognize and support the libraries' part in that. Because in the end, while I suspect that the academic library is in fact helping the students succeed in institutionally helpful ways, it is doing just as much (and can do more) to help them succeed in becoming better people and citizens in the long run.


This is one of those rare cases where there's an argument from technical feasibility that supports the case of responsible conduct, as Chris points out above. To deepen that case, I found this article to be extremely useful when evaluating high-level mission statements that involve plucking abstract insight out of aspirational data collection: The AI Hierarchy of Needs. tl;dr: data do not imply meaning.

Going further, there's a whole field of study involved in the development of limited-retrieval (i.e., privacy-preserving) data structures. Not my area of expertise (and not even where I spend my amateur research time), but I believe the phrases "homomorphic encryption" and "zero knowledge proof" might provide a couple of ends to pull on that ball of yarn. If the money people really insist on data collection, I hope that the librarian community could use the existence of this technology to hold the line on building anonymity into the design specs.

Three caveats, and then a few brief points:

Caveat The First: I worked very closely with Megan Oakleaf from 2004 to 2006 at the NCSU Libraries; she was an early professional mentor (though in instruction, not in "Big Data") and while I don't always agree with every position she's taken, she's someone I respect highly as a colleague and a friend.

Caveat The Second: I may veer into librariansplaining here, and if so, please feel free to call me out on it and accept my apologies in advance.

Caveat The Third: There is a lot (really, seriously, A LOT) to unpack here and I haven't the time to do all of it justice because I have, you know, a full-time job and I prefer when possible not to do that job when I'm not at that job and I kind of feel like responding to this fully is more like "my job" than "not my job." So, I'm going to address the few points that I actually can address briefly and leave the bigger thinkier things for later, or more likely never.

Point The First: For general discussion of Issues-with-a-capital-I in the intersection between libraries and "Big Data," may I refer you to the following piece: http://crl.acrl.org/index.php/crl/article/view/16603 (Subsidiary caveat: I know the second author in a glancing, internet-enabled kind of way. I don't always agree with her.)

Point The Second: you wrote, "Do students who check out books during their first semester have a higher GPA through their degree completion than those who don’t? We have no idea." In point of fact, we do (though perhaps not with precisely those parameters). There are many other studies that have looked at correlations between aspects of library usage and outcomes generally considered to be markers of student success (GPA, retention, graduation rates, etc.) including one that looks at the relationship between library instruction and 4-year GPA with controls in place for differential grading standards across disciplines, on which I'm a co-author and which this afternoon finally progressed from revise-and-resubmit to accept-with-revisions. My point being: this is being studied, rather a lot, and it is by no means a new field of study.

Point The Third: Chris wrote, "a key information literacy outcome should be 'students understand the importance of privacy to free access to information.'" In point of fact, it is (see the last "knowledge practice" item under "Information Has Value").

Point The Final (For Now At Least): A good friend of mine, in responding to people getting freaked out about the intersection of libraries and "Big Data" said (and I paraphrase), libraries have, and have access to, data that can help us learn what works and what doesn't. Who better than librarians, bound by a code of ethics almost eighty years old, to work with that data in ethical and responsible ways?

Thank you! I understand the notion of this being more like work than play, but at the same time, I totally appreciate getting your view, particularly since I know you have the inter-library conversations that I don't have, and actually do the instructional work. So, awesome!

I am inclined to think that when I said "We have no idea", I was thinking of We as being my own library. It's good to know that there is research at other institutions; ours was definitely not keen on assessment up to a very few years ago, and may or may not be playing catch-up these days. I don't hear about it, if it is. So it's terrific to see those links and at least get personally caught up.

I should also have probably said that I am assuming that the more general environment will continue to have the library culture of confidentiality and privacy fighting a rearguard action against the broader forces of Big Data being hoovered up by Big Corps and Big Government. It might be as well to consider the possibility of changing that environment as well.


