Tuesday, March 15th, 2011

VIAF, OCLC and open data

Yesterday I released a service called “LC AuthoritiesThing.” The service solved a problem many have had with the LC Authorities website. Although a fine searchable resource, LC Authorities does not have stable URLs. Links die after a short period and are tied to sessions in a way that prevents sharing URLs during that period. LC AuthoritiesThing provides a window into the LC Authorities site which allows hard, reliable links. Various catalogers have thanked us for making the service, as it will allow them to refer to authority records more easily.

In an update to that post I took note of VIAF, the Virtual International Authority File, which a cataloger on Twitter had recommended as a substitute. I assumed (apparently wrongly) that VIAF would at some point supersede LC Authorities. And I wrote that VIAF wasn’t a good substitute because it is an OCLC project and encumbered by licensing restrictions.

Since then, a number of people have written to tell me I was wrong. Although OCLC hosts its data and developed and runs its services, VIAF is not an OCLC project, and the project has no access terms. Thomas Hickey from OCLC even wrote on this blog that full dumps are available, although project leaders must approve them somehow.

This is welcome news. LibraryThing will be submitting a request for a full VIAF dump, and we’ll see where that goes. We will also look into automated harvesting of the website, or at least the LC portion of the data.

So far, so good. But the situation is illustrative. Select people within the library community may believe that VIAF is free, but every public indication is that it is not.

These indications include:

  1. OCLC copyright notices on every single VIAF.org page, and all VIAF-related pages on OCLC.org.
  2. Links to the OCLC Terms and Conditions from multiple VIAF.org pages, including the Privacy page.
  3. A robots.txt file that prohibits automated access to result pages.
  4. A prominent statement on the “About VIAF” project page: “Use of our prototypes is subject to OCLC’s terms and conditions. By continuing past this point, you agree to abide by these terms.”
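
For anyone planning automated harvesting, the robots.txt point is the one a program can check for itself. What follows is only a minimal sketch using Python’s standard urllib.robotparser module. The rules, the example.org URLs and the “ExampleHarvester” user agent are all made up for illustration; they are not the actual contents of VIAF’s robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real VIAF robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs: one result page and one ordinary page.
    for url in (
        "https://example.org/search?query=twain",
        "https://example.org/about",
    ):
        print(url, allowed(SAMPLE_ROBOTS_TXT, "ExampleHarvester", url))
```

A real harvester would point the parser at the live file with set_url() and read() rather than at a hard-coded string. Either way, robots.txt only expresses the site’s wishes about crawling; it says nothing about licensing terms.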

As all catalogers surely know, the OCLC Terms and Conditions are lengthy and explicit. Among other things they prohibit commercial use, automated use, storage of data, and use of the data for cataloging (!). They state that OCLC has sole and arbitrary discretion to discontinue access to anyone for any reason. They state that exceptions to the terms require permission in writing from OCLC.

Meanwhile, apart from a blog comment from Thom Hickey, I can find no assertion that OCLC terms don’t apply to VIAF, and no mention of dumps or of a process for getting them.

VIAF is to be commended for its openness and lack of terms. This is a great move forward for open bibliographic data. But it needs to make greater efforts to make others aware of this state of affairs, and to define the level and character of that openness. (It’s still unclear to me whether VIAF asserts any ownership, or whether it is all in the public domain.) And VIAF should remove the multiple statements asserting that OCLC terms apply to VIAF data.

Labels: cataloging, oclc

2 Comments:

  1. Jami says:

    I’m very curious: what’s the status of the request? any news?

    “LibraryThing will be submitting a request for a full VIAF dump, and we’ll see where that goes. We will also look into automated harvesting of the website, or at least the LC portion of the data.”