Mailing List Message #114934
From: Cliff Lynch <>
Sender: <>
Subject: Some interesting reading
Date: Wed, 30 Jun 2021 23:55:00 -0400
For a change, I'm sharing these in a reasonably timely fashion. I hope these are helpful. I thought it would be better to aggregate these rather than send separate emails; if readers feel strongly that I shouldn't do this, let me know.


First off, the University of Notre Dame obtained an IMLS grant looking at a number of machine learning issues and held a really good focus group session (I was very grateful to be able to participate in some of this) prior to the pandemic. Eric Lease Morgan and his colleagues have now completed the outputs from this work, which are extensive and really valuable. With his permission, here's Eric's summary of the work and the outputs.

  Goal - To understand the unique current practices of domain
  experts, librarians, and computer science specialists and to
  identify possibilities to use topic modeling and NLP to enhance
  or augment current library classification in order to meet
  current cross-disciplinary research needs

  Findings: 1) Interest in machine learning is high and appears to
  be on a precipice, 2) The biggest issues with cross-disciplinary
  research are not discovery related, 3) There is a high need for
  interdisciplinary collaboration, 4) Community effort for greater
  ROI, 5) "Garbage in, garbage out"; machine learning requires
  good data, 6) Ethics are a really big concern for machine
  learning, especially regarding bias, and 7) There is a need for
  greater machine learning literacy

  Recommendations: 1) Increase the community, 2) Develop
  machine learning education for scholars and library
  professionals, 3) Form learning communities and networks, 4)
  Create and curate a clearinghouse for machine learning models, 5)
  Support consortia around subject strength to develop machine
  learning tools, 6) Develop processes to enhance discovery tools,
  and 7) Support diversified machine learning innovations

As a bonus, we also published a freely available edited volume of fourteen essays on machine learning, entitled "Machine Learning, Libraries, and Cross-Disciplinary Research: Possibilities and Provocations". From the preface:

  This collection of essays is the unexpected culmination of a
  2018–2020 grant from the Institute of Museum and Library Services
  to the Hesburgh Libraries at the University of Notre Dame... The
  resulting essays cover a wide ground. Some present a practical,
  "how-to" approach to the machine learning process for those who
  wish to explore it at their own institutions. Others present
  individual projects, examining not just technical components or
  research findings, but also the social, financial, and political
  factors involved in working across departments (and in some
  cases, across the town/gown divide). Others still take a larger
  panoramic view of the ethics and opportunities of integrating
  machine learning with cross-disciplinary higher education,
  veering between optimistic and wary viewpoints.

For more detail and full access to the final report as well as the edited volume, please see:

  1. project home page -
  2. final report -
  3. edited volume -

Again, thank you for your participation!

Eric Morgan for Team IMLS Machine Learning Grant
Hesburgh Libraries
University of Notre Dame

Next, a very interesting report that looks at publishing patterns over time and what they suggest about evolving national roles in science and its communication. This was done by the Center for Security and Emerging Technology at Georgetown University. Curiously, I've not seen a lot of coverage or discussion of this report (though perhaps I'm just not paying attention to the right places). See

Also in the general area of research and national security issues, here is a useful piece from the same group that relates to our work last year on Science Nationalism developments. See


This is a paper that's just come out looking closely at how open science practices have evolved during the pandemic, focusing on COVID-19 research. We will need to see a lot more analysis of the issues raised here across various disciplines in the coming months. See


I've been tracking exemplars of possible post-PDF successors to the traditional scholarly paper for over 20 years; while there have been many of these, I don't think any have really seen adoption at scale. Here is a new set of examples released by eLife called "executable research articles" which are interesting because of their connection to the reproducibility and replicability movements in scholarship, and also because they use increasingly common and widely adopted tools. Here's the announcement and the pointers:

The collection follows the launch last year of new open-source ERA technology that lets eLife authors publish articles that treat live code and data as first-class citizens. This article format sets a new, open standard for the transparency, interactivity and reproducibility of published research. Authored with popular tools such as Jupyter Notebooks and R Markdown, ERA is available for all papers published in eLife.

Our collection showcases the breadth and variety of possibilities this new format offers, and you can view it at

You can also read about the experiences of our first authors in an accompanying blog post, at


Finally, here's a link to a good piece from Steven Bell of Temple University that EDUCAUSE has just put out looking at what's happened to the relationship between library reserves and textbooks, largely (but certainly not entirely -- these trends were already well underway) as a result of the pandemic. See


Clifford Lynch
Director, CNI
