In addition to providing the raw data to our campus community, I think the library can take a leadership role in providing the tools and expertise to mine this data into something usable and useful. However, many of the tools used to transform data are highly specialized and have a pretty steep learning curve. So I’m going to provide an overview of the tools available, focusing on those that would be useful in the context of undergraduate education. Continue reading “Tools for Data Analysis”
We are talking a lot about data, data literacy, and how North Park University can use Chicago data in the classroom. There are already a lot of courses using data in instruction and research so part of my work is figuring out what is already happening. Continue reading “Chicago Data for Undergraduate Research”
I recently revised my CV to include the following line:
Manage interlibrary loan systems; increased the local fulfillment rate from 59% to 80% while decreasing average turnaround time
I thought it would be good to provide a little additional context for this claim by supplying some data, visualizing that improvement, and talking about how we accomplished this change here at North Park, as well as what I learned from looking at the data. Continue reading “Report on Interlibrary Loan Improvements”
First, I’ve determined that there is no one “roadmap” that will lead my library into digital publishing. So, instead of creating a map, I’m going to do the best I can to sketch out the terrain ahead and think about questions that can guide our path.
This section tries to address two main questions: What is happening in the world of scholarly publishing that is relevant to North Park? What is happening within the North Park setting that is relevant to a library publishing endeavor? Quick thoughts:
- Continued movement toward Open Access. There is still work to be done in our local context but that is the clear movement. The Covenant Quarterly and Journal of Hip Hop Studies indicate this trend is taking root on campus.
- Institutional branding. There is a renewed focus in institutional branding and online presence. There could be powerful connections to make here.
- Publishing and the North Park mission. My sense is that North Park values diverse contributions to the academic community more than creating a specialized repository.
- Chicago. There might be some opportunities to promote North Park within the regional context through research and student projects.
We need to define the scope of this project. There are many different efforts that fall under the broad category of “digital publishing”, including:
- Institutional Repositories
- Digital Humanities
- Data Repository
- Open Educational Resources
- Campus multimedia (lectures, performances, etc.)
Of these options, I think the most appropriate level and scope would be an institutional repository that contains simple/static documents such as PDFs. A next step would be to curate multimedia from across campus.
Even within this scope, the library will need to make editorial and collection development decisions to make sure that (1) we have a critical mass of content and (2) that there is some editorial scope. I think we should prioritize the following content areas and focus on building relationships with relevant parties.
- Honors Projects and Papers
- Student Research
- Master’s Theses
- NPPress Student Research
- Covenant History Papers
- Partnerships with different courses/programs.
- Journal Articles
- Faculty/Staff Presentations and other “gray” literature
- Papers from campus symposiums
- Offer hosting/support for existing campus projects
Political Realities/Soft Skills
We would need some strong support from across campus to take on this project and lead the campus here. Given the proposed scope of this project, here are the people I think it would be important to connect with:
- The President
- Campus Deans
- The University Marketing and Communication Office
- Honors Program
- Seminary Faculty
- Faculty/Tenure Committee
- NPPRESS Leadership
- Student Research Committee
Some of these needed connections blend into the next set of questions that seek to define the scope of this project and effort. I think if we have 5 strong allies (willing to contribute the content they are responsible for) that would make a strong starting point.
Do we have the technical and social workflows to produce, distribute and preserve this content? There are many overlapping questions here, but here is an attempt to list the important ones:
- Do we have the rights/permissions to publish these materials? Who will work with each group to determine these permissions and who will maintain the paperwork?
- Do we have the staff expertise, staff time, and faculty/staff connections to successfully manage this project?
- What is the ongoing cost of this project in terms of hosting costs, incentives and open access fees, etc.?
- Where does this rank compared to other library/institutional priorities?
- What are peer institutions doing? What can we learn from them?
I’ve been tasked with creating a “Roadmap to Digital Publishing” for the Brandel Library. I’ll post the draft of the “roadmap” I develop later on, but right now I’m in the research phase of the project and wanted to document the sources I’m reading. I could include this as a bibliography in the other post, but this is happening chronologically first and I thought it would be pretty interesting to document. So here it goes!
Library-as-Publisher: Capacity Building for the Library Publishing Subfield
Katherine Skinner, Sarah Lippincott, Julie Speer, Tyler Walters
Volume 17, Issue 2: Education and Training for 21st Century Publishers, Spring 2014 DOI: http://dx.doi.org/10.3998/3336451.0017.207
Although this article focused more on the professional development needs of library staff taking on publishing activities, it also provided a very good overview and definition of library publishing efforts, and its section on “Core Knowledge and Skills for 21st Century Publishers” could provide a helpful guide.
This article borrows the following definition from the “Library Publishing Coalition” (http://librarypublishing.org/about-us) that defines library publishing as:
the set of activities led by college and university libraries to support the creation, dissemination, and curation of scholarly, creative, and/or educational works.
This relatively basic definition seems like a good shared starting point and distinguishes library publishing from other forms of digital humanities work.
Knowledge and Skills
The list of core knowledge and skills identified in this article proved to be very helpful as well. Here is a summary of that section – with my summarizing comments in the form of questions:
- Scholarly Publishing Context – what is happening in the world of academic publishing?
- Academic Context – what is the role of the library within the larger institution? to publish something unique? enhance the institutional brand?
- Soft Skills – who do you need to build relationships with? What political connections need to be forged?
- Business Planning and Management – what is your business plan? who is your audience?
- Technology and Workflows for Production, Distribution, and Preservation – do you have the technology and workflow to publish (and care for) everything you want to publish? For text, audiovisual, datasets, interactive things, etc.
- Editorial and Acquisitions – can you get and edit the stuff you want to publish?
Overall, this was a very helpful article. The two sections I’ve highlighted were particularly relevant – even as I read the article from an “institutional roadmap” perspective while the original was written in terms of professional skills and development. The Notes section also proved to be quite valuable in highlighting other relevant sources and authorities.
My library just fielded a question from the Nursing department who, after reading this article from the Chronicle of Higher Education, wanted to know our policy for posting articles and chapters into our Learning Management System (LMS).
While I was drafting a response in private, I thought it would be good to summarize that article and then post my response here for future updating and public re-use.
The article is commentary on the Georgia State University lawsuit, in which three publishers – Cambridge University Press, Oxford University Press, and Sage Publications – challenged Georgia State University’s policy allowing faculty members to upload excerpts from books into their LMS. Thankfully, the court decided that the vast majority (70 of 75) of these uses were “Fair Use” and therefore legal.
But, as the article points out, the issue at stake is not just the Georgia State University uses but clarifying (perhaps defining?) the legal limits of copyright and fair use as they relate to academic libraries. So the case is not limited to those three publishers and that one university; the results are much more far-reaching.
The publishers’ request for a very broad injunction is not really a surprise. The plaintiffs always intended for the GSU case to establish a precedent that publishers could use to persuade colleges to pay for digital licenses from a company they work with, the Copyright Clearance Center.
So, like the author of this commentary from the Chronicle of Higher Education and likely most academic librarians, I am rooting for GSU in this case and hope that it establishes a precedent that ensures a broad definition of fair use and does not impose time-consuming record keeping to track the fair use of copyrighted material.
So, given my thoughts, how should I respond to the faculty inquiry about our policy regarding fair use? I think it’s an opportunity to establish the broad playing field, underscore the ramifications of this decision, and invite further conversation.
Response to Faculty Inquiry
Thanks for reaching out with a question about copyright and fair use as it relates to articles and book chapters in an academic setting. This is clearly an important and heavily contested issue – one that really precludes a simple policy or rule – so I’m happy to provide some background and some safe best practices and then invite further conversation.
Best Practices for Licensed Content
In general, if you are using electronic resources licensed by the Brandel Library, we encourage you to post permalinks to the library’s subscription content in Moodle. Two main reasons for this policy:
- This is almost always a permitted use within our license agreements. Some database license agreements allow articles to be uploaded directly into a LMS, but other licenses expressly forbid this. Linking avoids this confusion and creates a better experience for students and faculty.
- Linking back to the publisher provides the library with vital statistics. Linking this way ensures that we can make collection development decisions that reflect accurate usage – posting a PDF in Moodle prevents the library from tracking usage and impairs our ability to use data to make collection development decisions.
Fair Use for Non-Licensed Content
This gets slightly more thorny with non-licensed content such as print book chapters, articles from print journals, or articles not available through the library’s online resources. Assuming that such materials are under copyright – which is a safe assumption unless it was published before 1923 or published with a Creative Commons license of some kind – the only legal option to consider is Fair Use.
The US legal code (Section 107 of the Copyright Act) defines four factors to consider with fair use:
- the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
- the nature of the copyrighted work;
- the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
- the effect of the use upon the potential market for or value of the copyrighted work.
Given that we are a university, the purpose and character of the use is educational, and therefore the first factor will almost always support fair use. However, not all educational uses are permitted – copying an entire book and distributing it to a class would not be a fair use – and therefore all four factors should be considered.
The Georgia State University ruling seems to indicate that the courts view using a single chapter from a book as fair use but consider multiple chapters from a single book problematic. However, posting a PDF of a scholarly article in Moodle would be problematic and would likely not qualify as a fair use of that material. We are working on building up our electronic reserve capabilities here in the library and should be able to provide more robust services in this area soon.
I will conclude by underscoring a few things:
One issue at stake in the GSU case is how extensive our institutional record keeping needs to be in this area. The publishers want to require extensive record keeping that GSU (and most schools) would view as very burdensome and a hindrance to fair use.
The proposed injunction would also require university personnel to confirm that every excerpt uploaded to course websites met the fair-use criteria and to keep track of information about the book, which parts were used, the number of total pages, the sources that were consulted to determine whether digital permissions were available, the date of the investigation, the number of students enrolled in the course, and the name of the professor. The university would have to maintain those records for three years.
North Park does not currently require any record keeping and entrusts faculty members to make informed decisions about fair use. The library will continue to follow this case and inform the campus if our record keeping policies need to change.
Second, one reason that fair use is so fuzzy and unclear is that there have not been many cases testing the limits of fair use as it relates to academic institutions. As an academic library, we want to rigorously defend the rights of authors and content creators by respecting fair use and honoring our licensing agreements with the publishers we work with. On the other hand, we also want to claim the full expression of fair use afforded to us in the law.
I’m taking the Foundations of Data Curation class at GSLIS and just finished a progress report for the MODIS Snow Frequency data set. Here is a link to the report.
Scanning and File Naming
To create high quality digital copies of photographs, scan at a high resolution and save in an approved file format. For most images, scanning at a moderate resolution (600 DPI) and saving as a JPEG should be adequate for this project. For particularly important images and/or images with preservation concerns, scan at a high resolution and save as a TIFF file.
For documents, maps, or other textual materials, scanning at 600 DPI and saving as a JPEG should be adequate. For multimedia files, save the highest quality rendering and avoid proprietary file types.
All files should get a unique file name. I am proposing the following file naming convention: NHS_##_###. The goal is to give each file a unique name and make file management easier. File names are composed of:
- NHS = a generic prefix to distinguish Northfield Historical Society files.
- ## = Accession number. Each new project, donation, or scanning batch should be given a new accession number.
- ### = File number. Within each accession, give each file a unique number.
Therefore, for example, NHS_01_024 would be the 24th file in the first accession group.
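To show how mechanical this convention is, here is a minimal Python sketch that builds and validates names under it (the function names are my own, not part of any proposed tooling):

```python
import re

def make_file_name(accession: int, file_number: int) -> str:
    """Build a file name under the NHS_##_### convention,
    e.g. accession 1, file 24 -> 'NHS_01_024'."""
    return f"NHS_{accession:02d}_{file_number:03d}"

def is_valid_file_name(name: str) -> bool:
    """Check that a name follows the NHS_##_### convention exactly."""
    return re.fullmatch(r"NHS_\d{2}_\d{3}", name) is not None

print(make_file_name(1, 24))           # NHS_01_024
print(is_valid_file_name("NHS_1_24"))  # False (numbers not zero-padded)
```

The zero-padding matters: it keeps files sorting correctly in any file browser, which is much of the point of the convention.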
Metadata – that is, the descriptions of images or collections – is an important part of this project. I’m proposing the following metadata plan where every item has the following pieces of descriptive information:
- Title = Short description of the item.
- Creator = Person/Organization responsible for creating the item, if known.
- Description = free-form description of any length.
- Publisher = for published works, name of the original publisher.
- Contributor = name of the person/organization who donated the resource.
- Date = in the form YYYY-MM-DD. Use “Approximately” for date ranges, etc.
- Type = Still Image, Text, Map, etc.
- Original = Photograph, Newspaper, Map, etc.
- Digital = image/jpeg, image/tiff, etc.
- Identifier = File Name (as outlined above)
- Language = Language of material, if applicable.
- Spatial = all should be “Northfield (Ill.)”
- Temporal = Grouped by decade (1910-1920, etc.)
- Rights = Any copyright information. If none, include “No known restrictions”
- Collection = the collection the item is a part of.
- Tags = uncontrolled and informal short descriptions to add access points.
- Subject, Source, Relation = additional fields that could be added as needed.
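To make the metadata plan concrete, here is what one record could look like as a Python sketch, along with a simple completeness check before upload (the item, donor name, and all values are invented for illustration):

```python
# A hypothetical record following the proposed metadata plan.
record = {
    "Title": "Main Street looking north",
    "Creator": "Unknown",
    "Description": "Street scene with storefronts and early automobiles.",
    "Publisher": "",                      # not a published work
    "Contributor": "Jane Smith",          # donor (invented name)
    "Date": "1925-06-01",
    "Type": "Still Image",
    "Original": "Photograph",
    "Digital": "image/jpeg",
    "Identifier": "NHS_01_024",           # file name per the convention above
    "Language": "",
    "Spatial": "Northfield (Ill.)",
    "Temporal": "1920-1930",
    "Rights": "No known restrictions",
    "Collection": "Main Street Photographs",
    "Tags": ["street scene", "automobiles"],
}

# Flag any required descriptive fields left empty before upload.
required = ["Title", "Date", "Identifier", "Spatial", "Rights"]
missing = [f for f in required if not record.get(f)]
print(missing)  # [] when the record is complete
```

Which fields count as “required” is a judgment call for the Society; the list above is only a suggestion.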
- Install Omeka on the northfieldhistoricalsociety.org domain.
- Migrate all existing content (all images and descriptions) from current site.
- Update this digitization and metadata guide.
- Provide some training and support.
I’m back for the second day of my internship with the Center for Railroad Photography and Art and wanted to record some of the technical parts of our process.
While most of this collection has already been scanned, some needs to be scanned for the first time. Here is the process I’m using.
I’m scanning using Lake Forest College’s “Epson Perfection V750 Pro” scanner with the frame for negatives. I am scanning at 600 dpi in 16-bit grayscale and saving as JPEG. The purpose of these scans is to create decent access copies for the Center, with the option to add them to online databases (Flickr or the Center’s website) in the future, so this level of scanning is appropriate.
Most of the files are named using the following convention: Collection Name/Box/Envelope. For example:
- Springer_02_123 = Springer Collection, Box 2, Envelope 123
- Springer_04_067 = Springer Collection, Box 4, Envelope 67
I’m proposing that we standardize this practice across the entire collection. Additionally, to account for instances where one envelope contains multiple photos, I’m proposing that we extend this convention to be: Collection Name/Box/Envelope/Image. For example:
- Springer_01_049_A = Springer Collection, Box 1, Envelope 49, First Image
- Springer_01_049_B = Springer Collection, Box 1, Envelope 49, Second Image
- Springer_01_050 = Springer Collection, Box 1, Envelope 50 (the only image).
I think it makes sense to supply the alphabetical distinction only when needed and to use letters instead of numbers because (1) it will improve computer sorting and (2) the original order (within the individual envelopes) is difficult to preserve.
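The sorting benefit is easy to demonstrate with a small Python sketch that parses names under the extended convention and sorts them (the sample names are invented):

```python
def parse(name: str):
    """Split 'Springer_01_049_A' into (collection, box, envelope, image).
    The letter suffix is present only when an envelope holds multiple images."""
    parts = name.split("_")
    collection, box, envelope = parts[0], int(parts[1]), int(parts[2])
    image = parts[3] if len(parts) == 4 else ""
    return (collection, box, envelope, image)

names = ["Springer_01_050", "Springer_01_049_B", "Springer_01_049_A"]
print(sorted(names, key=parse))
# Letter suffixes keep a multi-image envelope together and in order:
# ['Springer_01_049_A', 'Springer_01_049_B', 'Springer_01_050']
```

Sorting on the parsed tuple rather than the raw string also stays correct if box or envelope numbers ever lose their zero-padding.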
This is the core metadata for the collection:
- File Name (see above!) – This will serve as the unique identifier for each image in the collection.
- Railroad – This collection is organized by railroad, so that information will be preserved. I would imagine that acronyms and abbreviations will eventually be replaced by the standard form of each name.
- Railroad Number – Because I lack the context, I’m transcribing numbers as I find them on the object. Again, I think standardizing to a formal, controlled vocabulary will be important at some point.
- QUESTION – what should I do when multiple trains are listed? Should I create two fields for this data or combine them in one field?
- Description – A few images have short (2-3 word) annotations. I wanted to preserve these notes and couldn’t think of a better field.
- Location – I’ve separated city from state in the spreadsheet on the basis that (1) it would be easy to combine these fields in the future and (2) this separation is easier to manipulate.
- QUESTION – some cards contain information like “MP 60,” which I’m interpreting as “Mile Post 60.” This seems like valuable metadata but data that doesn’t fit squarely in the “Location” field. Is this information worth preserving and – if so – how should I record it?
- Date – So far, all photos have a clearly marked date. I’m recording that using the YYYY-MM-DD standard recommended by the Center.
- Collection – This is all the Springer Collection.
- Original Format – This collection is black and white negatives in several standard sizes.
- Digital Format – I am creating jpegs.
- Date Scanned – This technical metadata is recorded in the system.
As an aside, I thought it would be fun to mention two pieces of train-specific metadata I’ve encountered so far. I’ve mentioned one already – “MP 60.” Again, I think that means “mile post” but I’m really not sure. I think it would be great to record this and could be very useful to certain people in a certain context. However, it doesn’t fit with other standard vocabularies (city/state) and requires additional context to be useful.
The second piece of train-specific information I encountered is this: 2-8-4. A quick Google search took me to Wikipedia, where I learned that this is the Whyte notation for a particular wheel arrangement (http://en.wikipedia.org/wiki/2-8-4).
I’ve volunteered to speak at the ALCTS Symposium on January 30th about North Park’s experience using demand driven acquisitions for ebooks. The topic of the day is “Collection Directions: The Evolution of Library Collections and Collecting” (website) and is based on the following article:
Dempsey, Lorcan, Constance Malpas, and Brian Lavoie. 2014. “Collection Directions: The Evolution of Library Collections and Collecting” portal: Libraries and the Academy 14,3 (July): 393-423. Links to full text here: http://oclc.org/research/news/2014/10-14.html
The instructions for the day invited us to create a short presentation that “should be practical in nature (what are you doing in your local and/or consortial environment that can serve as a model for other institutions) but also touch on what you think this means (if anything) for the future of collecting.” Because this is my first time ever speaking at such an event, I think I will stay as close to the prompt as possible.
My name is Andy Meyer and I am the Digital Information Specialist at North Park University here in Chicago. North Park University is a small, private liberal arts college with an FTE of less than 2,500. We have a print collection of about 250,000 books that is focused on supporting undergraduate research and graduate study in a few areas. We have a full time staff of nine people. So our context is quite different from that of some of my fellow panelists. I’m here to reflect on how changes in library collections and collecting are influencing small schools in unique ways.
This is also my first time speaking at such a large forum and – to be honest – I’m a little nervous. So I’m going to stay close to my notes and answer as directly as possible the invitation that I received in December. So be prepared for 15 minutes on the projects I’ve worked on for my institution that can serve as a model for other institutions while touching on what I think that means for the future of collecting.
In particular, I’m going to talk about two different PDA programs that I managed. First, I’m going to talk about a program I led that helped my library’s reference collection transition from print to electronic. Second, I’m going to talk about the opportunities and challenges of managing a multivendor DDA program, particularly how we planned, implemented, and maintain that program. Lastly, I’ll touch on the next steps we have planned for North Park as well as some broader thoughts about the future of library collections and collecting.
DDA for our Reference Collection
I implemented North Park’s first demand driven program with the head of collection development as a way to help us move our reference collection from print to online. We wanted to provide a lot of eReference content to serve our undergraduate student population and we had to do so with relatively limited funds. This project was relatively straightforward and involved the following steps:
- Reviewed a title list supplied by the vendor and removed items that were clearly outside the local scope of our collection (Dentistry, etc.).
- Batch loaded vendor supplied records into our catalog and created additional access points (that is, we added a vendor supplied widget).
- Marketed this new resource to the library and the rest of campus.
Here is my first bit of “practical advice”: become an expert on your DDA program. We focused on marketing this resource to outside audiences (students), but I didn’t anticipate how much I would need to reach out to my co-workers. I learned that I needed to explain everything well and in multiple formats (email, meetings, informal conversation) so that everyone was on the same page. I would imagine this is true at every scale, and it was certainly true in our context.
So that is a quick walk through of that process as well as a piece of practical advice. On a broader scale, I think this project raised interesting questions about the use of data in libraries and reduced transactional costs.
Data-Driven Decision Making
An interesting part of this program was a paradoxical lack of data in certain areas and an excess of data in others. Perhaps a better characterization was that we had usage data in a variety of formats that took effort to use and interpret. For example, although we wanted to leverage usage data about our print collection to inform and prioritize our digital purchases, items in our reference collection didn’t have circulation or browsing information. Instead, we had to rely on different sources of usage data – recollections, impressions, stories, etc.
This wealth of anecdotal information and lack of hard data is quite different from the information provided by the eReference platform. There, all you get is hard data without any real sense of how patrons are using the resource. This data is wonderful but is much more complicated to interpret and evaluate. I’ll return to this point later on.
Transactional Costs and Infrastructure
I thought the easiest way to talk about transactional cost and infrastructure was to simply say that acquiring thousands of eReference titles was fundamentally different than acquiring thousands of print reference books. This is especially true in terms of cataloging, processing, and physical infrastructure. Using the language of the article, some of this is the difference between print books and ebooks (no physical processing or shifting!). However, the more fundamental shift was not print vs. electronic but our reliance on the vendor’s digital network. We relied on their cataloging services. We spot checked a few records and ran reports to get an overall impression, but this process was nowhere near as complete as our cataloging procedures for print books. However, our evaluation before loading and our ongoing use have confirmed that these vendor-supplied records are totally adequate for our needs. I think many small schools fear the “flood” of vendor-supplied records because it is viewed as an attack on “traditional cataloging” – I’m here to say that in my experience, these fears are largely unfounded: we’ve been pleased with the quality of these records. Furthermore, it would have been wildly impractical to maintain old practices – in fact, I would argue that most DDA programs require a change in processing and (for better or worse) an increased reliance on outside digital networks.
Multivendor DDA for General Use
Our major DDA program has been a multivendor program for general-use ebooks. Our goal in this project was to create a “critical mass” of ebooks so that ebooks would become normal for our student community. We thought that a DDA program could quickly provide this “critical mass” as well as be an interesting experiment for us. I want to focus on the word “experiment” because that was and is how I view this DDA program. I also think this language resonated across the library – as a way to demonstrate innovation and cost savings to administrators and as a way to assuage certain fears.
- We set aside funding. We started with $3,000 – this amount was large enough to provide a sufficient data set without overwhelming our budget or fundamentally changing other programs. I say this “experiment” is on-going because we haven’t fully expended this fund yet. Our plan was to assess the results at the end of this process and evaluate whether we would want to continue this program.
- I had to take care of a lot of administrative work.
- Create a fund structure and lots of other ILS settings
- Determine what vendors and platforms you want to include
- With vendors, determine short term loan models, etc.
- Create a “collection profile” that specifies what sort of titles we want included in our DDA pool. To make this profile, I pulled circulation information from our ILS as well as collection information. We based our DDA profile on our print collection and print circulation – more on this later.
- With CARLI, our state-wide consortium, set up batch loading procedures.
- Then we tested to make sure all the parts fit together. I’ll say now that it has been wonderful to work with our partners – our main book vendor and CARLI – though I’ll have more to say about how these groups intersect at the end of my time.
- Maintain the program
- Load new records each week.
- When licensing rights change, delete old records from the system.
- Monitor all short term loans and purchases.
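The weekly record maintenance above boils down to set arithmetic on identifier lists: compare what is in the catalog against the vendor’s current list, load the difference, and delete what dropped out. A minimal Python sketch (the identifiers are invented, and real workflows operate on batch-loaded MARC files rather than bare strings):

```python
# Identifiers currently in the catalog's DDA pool vs. this week's vendor list.
in_catalog = {"ebk001", "ebk002", "ebk003"}
from_vendor = {"ebk002", "ebk003", "ebk004"}

to_add = from_vendor - in_catalog      # new titles to batch load
to_delete = in_catalog - from_vendor   # titles whose rights changed; remove

print(sorted(to_add))     # ['ebk004']
print(sorted(to_delete))  # ['ebk001']
```

Keeping this comparison explicit makes it easy to report, each week, exactly how many records entered and left the pool.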
Hopefully this overview of the process has implicitly conveyed a point – setting up a DDA program takes a fair amount of work at the local level as well as lots of coordination with larger networks.
As a way to pivot from the practical to the philosophical, I want to share a success story from this project. After doing all this work, I was naturally very curious to see what our first DDA loan would be. Our first loan of this program was “Religious Ethics and Migration: Doing Justice to Undocumented Workers.” This was a tremendous success for a number of reasons:
- It seemed to fit our collection scope as a Christian University focused on social justice and
- The author is actually a North Park professor and
- The book was on reserve for a class.
So our first DDA loan was a perfect fit for our collection. This felt like great news (and a relief!) and showed how this new service could extend our collection to meet the needs of our community.
So our first loan was a tremendous success…but it was also a little unsettling and raised questions: Why didn’t we order an electronic copy of this book (it’s by a North Park author and a required text)? Do we want both a print copy and an electronic copy? Also, I wasn’t aware that this record was even in our DDA pool. It just shows that the lines between the library, the vendor, and our community have blurred. This “blurring” is particularly clear when I process the file of new records, delete the old records, and process the short term loans and purchases. It’s very apparent that we don’t have local control of this collection in the same way we control our print collection.
Implications/Questions for the Future
Overall, this project is 100% doable for small institutions with limited budgets and staff time. I’ve provided a brief overview of our experience so far – feel free to reach out to me directly with additional questions or concerns – I’m happy to share more. As I said, this program is still ongoing, and I’d like to conclude my time with two questions that I’ll be exploring when this experiment is over.
Data-Driven Decision Making
In both of these projects, we’ve assumed that print collection and circulation information would correspond to eReference and eBook usage. This assumption seems logical but also somewhat troubling. By looking at the data and talking to my community, I’d like to see how closely print usage matches electronic usage. While I anticipate general correspondence, I also wouldn’t be surprised to see striking differences – differences that perhaps point to the changing patterns of research and learning mentioned in this article and elsewhere.
I’ve talked a bit about how success in our PDA programs relied heavily on partnerships between our institution and the different vendors. In particular, I’ve mentioned what is essentially outsourced cataloging, shared technical infrastructure, and the blurred lines between library collection and vendor service.
However, North Park also exists in a highly networked environment with our consortial network – CARLI – and we haven’t fully studied the impacts of our DDA programs on that network. How do the various DDA programs of member institutions affect the consortium as a whole – especially in terms of a shared catalog and interlibrary loan? Essentially, I’m curious to know how our local experiments ripple through these shared networks.
Overall, for a small liberal arts school, these experiments in DDA have been very positive. We’ve been able to provide a lot of content to our community at relatively little cost, we have learned a lot from this experiment, and we are thinking more critically about the future of collections and collection development.