Saturday, February 9, 2013

The data you need?

Walt Crawford, who has done some rather amazing analyses of library data, notably the freely available data from IMLS (for public libraries) and NCES (for academic libraries), appears to be feeling a little, well, under-appreciated.  In his post, "The data you need? Musings on libraries and numbers" (which he admits to having edited "to reduce the whininess"), he expresses his concern that there appears to be a lot of data out there but nobody seems to care.  He cites a series of examples, including the low sales of his own work, the apparent demise of Tom Hennen's American Public Library Ratings (which, admittedly, I was previously unaware of), and the unfortunate circumstances that led a PhD colleague to pursue other avenues because there were no jobs analyzing data in libraries.  This last example hits home with me because I feel quite fortunate to have the title of Collection Assessment Librarian.  Not only am I paid to analyze data about our collections, but that is my primary responsibility; it was not tacked on to the list of responsibilities of the Collection Development Librarian or the Reference Librarian or even the Dean.  This is what I do.  I do not mean to boast, but rather to express my appreciation.  I also hope to point out that while such data analysis may be under-appreciated, I think there is interest, however scattered it may be.

But this is a general problem with the LIS field itself - are we a science or are we a profession?  Can we be both?  The science of LIS implies that data is analyzed to answer fundamental questions regarding the who, what, when, where, why, and how of libraries and information.  But the big questions are almost always asked by the academicians.  Those in the profession are generally more concerned with the little questions regarding their collections, their budgets, their users.  What I think is needed is a stronger connection between the little questions and the bigger ones.  How is the collection at my library affected by the economic forces of scholarly communication?  In exactly what ways will the local, state and national economies affect the services and collections of my library?

My concern is that we are not preparing professional librarians to make these connections.  While "research" is mentioned in 16 of the approximately 90 sections/standards in the 2008 Accreditation Standards, courses in research methods and data analysis are only haphazardly required of SLIS graduates.  Of the three LIS schools in Texas, only UT requires a course in research methods, and even it does not require statistics.  While I don't think that a practicing librarian needs the same training and skills as an epidemiologist or social sciences researcher, I do believe they should be able to read and evaluate published research and apply it to their smaller questions.  I also think they should be able to conduct small-scale studies to answer their questions using methods that will provide valid answers.

Walt goes on to question his own contributions, given the lack of response from the library community.  He is, essentially, taking a sounding, asking: Is anybody there? Does anybody care?  Well, Walt, I think we do care and some of us do read your results.  I think what is contributing to this apparent anomie is not disinterest, but perhaps a kind of paralysis - what do we do with this?  It is interesting that the libraries in my state have generally moderate circulation rates, or that circulation is correlated with expenditures - but what can I do with that information?  While this may be taught in the core curriculum of MLS programs, it may be forgotten as graduates enter the workforce and get sucked into the drudgery of their everyday routines.

Trying to address these issues, he asks some questions that are quite familiar to me:
  • Am I asking the right questions?
  • Is there any analysis that is worth doing?
  • How can this information be made "meaningful and useful to librarians"?
  • Are librarians "willing to deal with data at all–to work with the results, to go beyond the level of analysis I can do and make it effective for local use"?
  • Can librarians "get" the differences in statistical measures, such as averages versus medians?
I don't believe the answer is clear.  It is probably something like this: some of us do care but don't get it; some of us get it but don't care; some care and understand it, but don't have the time; and some of us are quite interested and can follow through.  And it's not clear whether this last group is growing in numbers or just staying on the fringes.
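That last distinction Walt raises - averages versus medians - matters a great deal for skewed library data, where one large system can drag the mean far away from the typical library.  A minimal sketch in Python, with entirely invented circulation figures:

```python
import statistics

# Hypothetical annual circulation counts for ten libraries; the skew is
# deliberate - one very large system pulls the mean far above the median.
circulation = [4200, 5100, 5800, 6300, 7000, 7400, 8100, 9000, 11500, 250000]

mean_circ = statistics.mean(circulation)      # dominated by the outlier
median_circ = statistics.median(circulation)  # the "typical" library

print(f"mean:   {mean_circ:,.0f}")    # 31,440
print(f"median: {median_circ:,.0f}")  # 7,200
```

The mean here (31,440) describes no actual library in the list, while the median (7,200) is a fair picture of the typical one - which is why per-library comparisons usually lean on medians and percentiles rather than averages.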

Finally, he asks the one question that got me to start this post in the first place: What is the data we need?  This struck me because I've been working on my first collection assessment using a more formal process, and I keep thinking of other measures to include:
  • Relative circulation rates compared with holdings rates
  • Distribution of age of books
  • Distribution of materials by type, age and usage
  • Comparisons of these against our peer institutions
  • Comparisons of databases against our peers
  • Spending on these subject areas compared with our peers
  • Acquisitions of recognized materials (highly-recommended, highly-cited, award-winning, etc.)
  • Coverage of resources in databases
  • Usage of all of our resources (notably electronic)
  • Publication of materials in this area, especially given changes to the ecology and economy of scholarly communications
  • Impact of primary research materials on the field and in the school
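The first measure in the list - relative circulation compared with holdings - is simple enough to sketch.  All figures below are invented; the idea is just each subject's share of circulation divided by its share of holdings, where a ratio above 1 suggests the subject is used more heavily than its shelf space would predict:

```python
# Invented holdings and annual circulation counts by subject area.
holdings = {"History": 12000, "Biology": 4000, "Education": 6000}
circulation = {"History": 3000, "Biology": 2500, "Education": 1500}

total_holdings = sum(holdings.values())
total_circ = sum(circulation.values())

relative_use = {}
for subject in holdings:
    holdings_share = holdings[subject] / total_holdings
    circ_share = circulation[subject] / total_circ
    # >1: used more heavily than its share of the collection predicts.
    relative_use[subject] = circ_share / holdings_share

for subject, ratio in relative_use.items():
    print(f"{subject:10s} {ratio:.2f}")
```

With these made-up numbers, Biology (about 1.96) would be flagged as a candidate for growth, while History (about 0.79) circulates below its footprint in the collection.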
Some of this data we have or can start collecting.  We are paying dearly for the WorldCat Collection Analysis System so that we can compare ourselves with our peers (I understand the risks and problems with this, but we believe it can still provide valid trends and comparisons).  We have been working to standardize how circulation and in-house usage data are collected across the different collections and libraries within our system.  And we have been working to bring all the data into central repositories to make comparisons and analysis a little easier.

Other measures, notably usage and publication, are notoriously difficult to gather.  It would be very useful to know whether our usage of selected databases differed significantly from usage of the same resources at our peer institutions.  Heck, even after nearly 10 years of the COUNTER standard, our own usage data is still quite difficult to compile and understand (for some vendors, the data is absolutely worthless because it is masked by queries through a common interface).
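Even the mundane part of that chore - rolling multiple vendors' reports into one total per title - takes code or a lot of spreadsheet labor.  A toy sketch, with made-up titles and a deliberately simplified CSV layout (real COUNTER reports carry header blocks, totals rows, and per-vendor quirks that make this far messier):

```python
import csv
import io

# Two vendors' simplified monthly usage exports (invented data).
vendor_a = """title,jan,feb,mar
Journal of X,10,12,9
Journal of Y,4,6,5
"""
vendor_b = """title,jan,feb,mar
Journal of X,3,2,4
Journal of Z,7,8,6
"""

totals = {}
for report in (vendor_a, vendor_b):
    for row in csv.DictReader(io.StringIO(report)):
        uses = sum(int(row[month]) for month in ("jan", "feb", "mar"))
        # The same journal can appear on more than one platform.
        totals[row["title"]] = totals.get(row["title"], 0) + uses

for title, uses in sorted(totals.items()):
    print(f"{title}: {uses}")
```

In practice the files would come from each vendor's administrative site, and reconciling title lists across platforms is where most of the real work hides.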

So, Mr. Crawford, I just wanted to say, I feel your pain.  It can seem lonely doing all this work without the formal recognition and the ultimate expression of value (money).  This is why I will start looking at the NCES data more carefully and try to think of how this information can be applied locally.  And I finally put my money where my mouth is...I purchased the book (print is still my preferred format) and downloaded the Graphing ebook - you now have one sale.

1 comment:

  1. Hi Karen,
    Thanks very much for this commentary (and for the first sale of GPLB!--feel free to pass it along to others). Your list of relevant data is interesting (and only the first one is clearly possible to analyze externally, I think). I believe the problem--"disinterest in numbers" is a negative way to put it--is more significant in public libraries than in academic libraries, but I could be wrong.

    Certainly, what I've done in Give Us a Dollar... is only a starting point for a local public library: It gives the library a set of benchmarks compared to similar libraries. The rest is up to the library, if/as they wish to try to improve their funding situation: Going beyond the raw numbers to determine meanings and pressure points.

    If I do that level of study again, I'd simplify it in some ways (fewer direct metrics, fewer levels per metric) and extend it in other ways (some year-to-year analysis, a couple of possible correlations). I'm still unclear whether that makes sense to do. Similarly, I'm pondering a book about dealing with numbers--working title, "The Mythical Average Library: Dealing with Numbers."

    The next Cites & Insights--probably out on February 11--will use NCES data to look at one bit of common knowledge (falling circulation in academic libraries), but that's a narrow analysis. In a later issue, I do plan to focus on and illustrate the misleading nature of averages where libraries are concerned.