From: "Clifford Lynch"
To: CNI-ANNOUNCE
Date: Thu, 23 Jun 2011 10:18:42 -0400
Subject: Study on Journal Article Data Mining from Publishing Research Consortium

There's a very interesting new report out on Journal Data Mining; it was prepared by Eefke Smit and Maurits van der Graaf on behalf of the Publishing Research Consortium, so it has a strong publisher perspective, but as far as I know it's the first extensive look at the issues involved in practical and operational large-scale data mining of the journal literature. One of the really interesting things that emerges from the report, at least the way I read it, is that many of the commercial publishers seem to be thinking about literature mining as a separate activity, not included in the traditional electronic subscription arrangements (site licenses) that they have with research libraries. (Indeed, many such licenses forbid bulk downloading of journal articles, which in the absence of text mining facilities built into the vendor platforms is a prerequisite for such mining; even if such facilities exist, they essentially mean that the publishers control the evolution of mining technology.) Rather, the publishers seem to envision a future where they'll do business directly with potential literature miners.

This is one of several issues framed by the report that I think merit very careful thought by research library leaders, and broad conversations engaging faculty.

The report is at:

http://www.publishingresearch.net/documents/PRCSmitJAMreport20June2011VersionofRecord.pdf

and there is an accompanying press release at

http://www.publishingresearch.net/Media_page.htm

Disclosure: I was one of the many people interviewed for this study, presumably at least in part because of my 2006 paper on open computation.

Clifford Lynch
Director, CNI