Archive for the ‘oclc’ Category

Tuesday, March 15th, 2011

VIAF, OCLC and open data

Yesterday I released a service called “LC AuthoritiesThing.” The service solved a problem many have had with the LC Authorities website. Although a fine searchable resource, LC Authorities does not have stable URLs. Links die after a short period and are tied to sessions in a way that prevents sharing URLs during that period. LC AuthoritiesThing provides a window into the LC Authorities site which allows hard, reliable links. Various catalogers have thanked us for making the service, as it will allow them to refer to authority records more easily.

As an update to the post I took notice of VIAF, the Virtual Authority File, recommended to me as a substitute by a cataloger on Twitter. I assumed (apparently wrongly) that VIAF would at some point supersede LC Authorities. And I wrote that VIAF wasn’t a good substitute because it is an OCLC project, and encumbered by licensing restrictions.

Since then, I have received a variety of communications telling me that I am wrong. Although its data is hosted by and its services were developed and served by OCLC, VIAF is not an OCLC project, and the project has no access terms. Thomas Hickey from OCLC even wrote on this blog that full dumps are also available, although they must be approved somehow by project leaders.

This is welcome news. LibraryThing will be submitting a request for a full VIAF dump, and we’ll see where that goes. We will also look into automated harvesting of the website, or at least the LC portion of the data.

So far so good. But the situation is illustrative. Select people within the library community may believe that VIAF is free. But every public indication is that it is not free.

These indications include:

  1. OCLC copyright notices on every single page, and all VIAF-related pages on
  2. Links to the OCLC Terms and Conditions from multiple pages, including the Privacy page.
  3. A robots.txt file that prohibits automated access to result pages.
  4. The “About VIAF” project page prominently states “Use of our prototypes is subject to OCLC’s terms and conditions. By continuing past this point, you agree to abide by these terms.”
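The robots.txt point is easy to check mechanically, which is part of why it reads as a real restriction. Here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleHarvester` user agent, and the example.org URLs are made up for illustration; they are not VIAF's actual file or paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only; NOT the actual VIAF robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A result page falls under the Disallow rule; an ordinary page does not.
print(allowed(EXAMPLE_ROBOTS_TXT, "ExampleHarvester",
              "https://example.org/search?query=twain"))  # False
print(allowed(EXAMPLE_ROBOTS_TXT, "ExampleHarvester",
              "https://example.org/about"))  # True
```

A well-behaved harvester runs a check like this before every request, so a Disallow line on result pages is a "no" to exactly the sort of automated use a dump would replace.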

As all catalogers surely know, the OCLC Terms and Conditions are lengthy and explicit. Among other things they prohibit commercial use, automated use, storage of data, and use of the data for cataloging (!). They state that OCLC has sole and arbitrary discretion to discontinue access to anyone for any reason. They state that exceptions to the terms require permission in writing from OCLC.

Meanwhile, apart from a blog comment from Thom Hickey, I can find no assertions that OCLC terms don’t apply to VIAF, no mention of dumps or of a process to get them.

VIAF is to be commended for its openness and lack of terms. This is a great move forward for open bibliographic data. But it needs to make greater efforts to make others aware of this state of affairs, and define the level and character of openness. (It’s still unclear to me whether VIAF asserts any ownership, or whether it is all in the public domain.) And VIAF should make efforts to remove multiple statements asserting that OCLC terms apply to VIAF data.

Labels: cataloging, oclc

Thursday, May 21st, 2009

Non est potestas: OCLC Policy withdrawn

Non est potestas super terram quae comparetur ei / There is no power on earth that compares with it (frontispiece to Hobbes’ Leviathan)

The OCLC Review Board finally put OCLC’s Policy to bed. In a short speech to the OCLC Members Council, board chair Jennifer Younger affirmed “that a policy is needed, but not this policy.” After the drubbing they got from the ICOLC and ARL—neither of which took into consideration OCLC’s recent push into the software market—you can be sure OCLC will take its board’s advice.

While the result was the right one, and I’m sure the members are good, conscientious librarians, I’m not going to echo others’ praise about their decision. The writing was on the wall. If they had pushed forward, OCLC would have met even more hostility than it already engendered. The speech itself, like their “push poll” survey, shows where the OCLC Review Board’s sympathies lie. I don’t think you could read the ICOLC or ARL reports against it and conclude OCLC “gets” it. It was, as one of my email correspondents put it, “all about protecting WorldCat, identifying ‘threats,’ and ‘appropriate use by members.'”

I think one of Dr. Younger’s phrases nicely encapsulates the flaw in OCLC’s approach, namely “we must revisit the Social Contract between OCLC and its members.” I’d like to go into the phrase a little deeper, not without some fun-poking.

The Social Contract. The phrase “social contract” is an interesting one. The idea also appeared in the ARL report*—and may well have been used before. As everyone knows, it’s a key concept in political theory (hence the Hobbes frontispiece above) as the thing that, some believe, makes government power legitimate.

Why, therefore, does a cooperative need to express itself in terms of state-formation, rather than a voluntary cooperative? Why does OCLC want to cast itself as a government?

Sitting in OCLC’s Brobdingnagian Dublin, OH headquarters, with thousands of OCLC workers shuttling about like so many ministry secretaries, and an interior hall bedecked with flags like the United Nations, it must be easy to think of OCLC as a sort of “government of libraries.”**

Architectural criticism aside, OCLC’s answer might be that, unlike cooperatives, a government gets to enforce its will more broadly. If you run afoul of a cooperative, the mutual consent that bound you to the cooperative is withdrawn, and you leave or they kick you out. But if you break the rules of a state, they put you in prison, whether you consent or not. Indeed, while philosophers have sometimes proposed a formal ceremony of consent, states act on non-consenting members all the time. Anarchists go to prison too (indeed, they tend to go to prison for being anarchists). This model, therefore, fits in better with OCLC’s plan to bind former members’ and non-members’ use of library data. As a cooperative, OCLC is a purely member institution. With a “social contract,” OCLC gets to dictate more like a state.***

If the idea of OCLC-as-state is accepted, however, there’s a gap between the ARL report’s idea of a “mutual social contract”—a social contract between citizen equals—and Younger’s description of the “Social Contract between OCLC and its members.” The latter is a nice description of the more antique view of a contract between citizens and their sovereign. Even if libraries accept a government over them, I suspect they would be more comfortable with the more modern view of a contract between equals.

Nullum timeret? As libraries consider the matter, one factor can be put to rest: Fear.

Writing for the Next Generation Catalog for Libraries list, Karen Coyle speculated that libraries were responding to OCLC through organizations like ARL and ICOLC, out of a “combination of ‘strength in numbers’ and ‘safety in numbers.'”

“I’ve seen a remarkable tendency of libraries to not want to confront OCLC. Remember that the proposed policy had penalties for mis-use of records, and severe ones at that (loss of rights to use OCLC records altogether). There is an intimidation factor involved. Agreed, as a member organization the members should not be afraid, but I think they are.”

That fear should be much diminished. Members spoke up, and OCLC backed down again and again (by my count they postponed, revised, revised, revised, delayed for revision and now shelved). Libraries—and, quite importantly, lots of people who merely love libraries—rose up and forced OCLC to back down.

The frontispiece to Hobbes’ Leviathan, shown above, quotes the Vulgate of the first clause of Job 41:33, non est super terram potestas quae conparetur ei (“There is no power on earth that compares to him.”). No doubt Hobbes or anyway his cover-designer felt this a good description of the legitimate sovereign.

But for our purposes, the second clause is particularly apt, qui factus est ut nullum timeret, “There is no power on earth that compares to him, who was made to fear nothing.” Job thought that whales were fearless.

Well, OCLC, you’ve met the architeuthis. And he is all of us.

*”In the eyes of the community, the guidelines expressed a mutual social contract, and the new Policy represents an authoritarian, unilaterally imposed legal restriction.”
**The trees growing inside are also a nice touch. The Assyrian throne room just had a carving of a tree, and the White Tree of Gondor was outside. 😉
***Similarly, nobody minds if you belong to two cooperatives. But states tend to be jealous about allegiance. As ICOLC wrote, libraries are involved in a “complex set of relationships,” of which, “OCLC is one vital component among many that libraries will use.” That’s not really a “social contract” idea.

Photo by Flickr user OZinOH.

Labels: oclc

Tuesday, May 12th, 2009

OCLC Policy, Good night

The International Coalition of Library Consortia, a very loose but extremely large group of library consortia, just released a Statement on the Proposed OCLC Policy for Use and Transfer of WorldCat Records.

For a while there it looked like OCLC was going to succeed in locking down the world’s library data, converting a wonderful sharing and coordination tool into an unbreakable data monopoly. But, together with OCLC’s recent, revealing decision to enter the library systems market, the ICOLC statement effectively ends that possibility. OCLC isn’t getting its new Policy, or anything like it.* Good night, OCLC Policy.

The details are worth a look. The ICOLC’s statement was short, signing onto the “substantial and broad” concerns highlighted by the Association of Research Libraries. It goes on to add three concerns, two of which address the risk to innovation—a topic the ARL report barely touched on:

  1. The proposed policy appears to freeze OCLC’s role in the library community based on historical and current relationships. We share the concern, voiced by many, that the policy hinders rather than encourages innovation, and we urge the Review Board to carefully examine this issue. It is unclear that the policy has been constructed with a focus on an evolving role of OCLC in enhancing the missions of an international library community with diverse and complex interests.
  2. The scope of the proposed policy goes well beyond any concerns about inappropriate commercial exploitation of WorldCat records. It applies as well to non-commercial uses. ICOLC member consortia are member-created, member-driven innovation agents. Our initiatives are generally non-commercial and undertaken with member approval based on member needs. Any OCLC record use policy should account for the rich and diverse innovation that takes place through many consortia.
  3. The proposed policy is legally murky. There is no mechanism for negotiation of terms and conditions nor is it clear what constitutes acceptance by member libraries. A new policy must address these problems.

As significant as the content was the list of signatories. Lyrasis, the former Palinet and Solinet, includes over 2,500 members. With “regional base and national scope” (their words), and about to merge with Nelinet, bringing their members to 4,500, Lyrasis is a major player. They’re no longer just a “regional service provider” for OCLC, and can be expected to collaborate or compete with OCLC as its members’ interests lead. They were joined by many of the big regional and state networks out there—MINITEX, NERL, the Florida Center for Library Automation, the Washington Research Library Consortium, the Michigan Library Consortium, WiLS, four Canadian consortia and both the Swedish and Finnish national libraries. Some of the signatories ought to have been sympathetic. Orbis Cascade, a source of much original cataloging, is also an important OCLC partner in developing consortial software. In Ohio, OCLC’s home state, OhioLINK, OHIONET and INFOhio all signed. Other members will add their names to the list as they affirm it.

The Next Step. It’s time now for the library world to step back and consider what, if anything, they want to do about restricting library data in a fast-moving, digital world. Some, including some who’ve deplored OCLC’s process and the policy, want restrictions on how library data is distributed and used. Once monopoly and rapid, coerced adoption are off the table, that’s a debate worth having, and one with arguments on both sides.

From my perspective, restrictions on the use and transfer of cataloging data—which is not usually copyrightable and is most frequently created by bodies responsible to the public good—is legally dubious and ethically stingy.

Instead, libraries should embrace “radical openness,” a commitment to sharing what they know freely, something that looks less radical in light of the library’s historic dedication to the free exchange of information. Selling other people’s library records isn’t a real threat, but, if it were, the answer would be more openness, not less. When you sell tickets, you get scalpers. But nobody makes money selling passes to Central Park. (A few people make money walking dogs around it. Most just enjoy the free grass and sunshine.) And in a world that’s looking less and less friendly to the long-term success of libraries, an unwavering commitment to sharing and openness may well be libraries’ saving grace.

So, three cheers to ICOLC for speaking up on this issue. Now, librarians and library programmers, let’s get back to work. Let’s earn our freedom.

Artwork: “Flickr is Freedom.” Creative Commons, Attribution, by Timtak.

*I note with some interest that Edward Corrado, whose OCLC posts have been very perceptive, isn’t quite as excited about this as I am. Where he wishes the statement was “worded a little stronger” I take great solace in phrases like “The proposed policy appears to freeze OCLC’s role in the library community based on historical and current relationships.” I’m hoping some others weigh in. The library world is, I think, somewhat exhausted by the whole OCLC Policy affair, and now that the organizations are weighing in strongly and negatively, the bloggers and newslist-ers who raised the initial questions—and were excoriated for it—may no longer be as necessary.

Labels: oclc

Friday, April 24th, 2009

OCLC news reactions

This post follows on The OCLC End Game, posted early this morning.

Library Journal’s Josh Hadro did an excellent follow-up article. Besides citing this blog post, Hadro got responses from Carl Grant, president of Ex Libris, on OCLC’s tenuous non-profit status—I’ll have another post about that soon—and a number of bloggers. Iris Jastram/Pegasus Librarian’s thoughts deserve quotation:

“I’m pleased that this is yet another competitor against the current lumbering giants in the ILS market, and I like the idea that (if I understand correctly) this will add a hosted option to the ILS market. … On the other hand, this means that that pesky new policy on the transfer and use of OCLC records really wasn’t just about protecting a bunch of member-produced data after all. There were bigger plans afoot, and these plans involved leaning even farther toward the vendor model rather than the service model. And if OCLC is a vendor rather than a service, that new policy feels even more like a land-grab rather than an effort to protect member investments.”

Ms. Jastram’s misgivings are comforting to me, at least, as her previous thoughts on the OCLC Policy were more mixed. Ultimately, the fate of OCLC’s Policy will be decided by the people in the middle—the fair-minded people, not the ones who equate OCLC with the Matrix, The Empire or the All Your Bases villain.*

The Smithsonian Libraries on the OCLC Policy. I missed this, but on April 2 the official blog of the Smithsonian Institution Libraries weighed in on the OCLC Policy and the ACRL/ARL response (PDF), “support[ing] the recommendations” emphasizing a number of points. Among these were:

  • “The policy should recognize and affirm traditional library values of cooperative cataloging and shared bibliographic information without any claim of ownership of the bibliographic records.”
  • “OCLC’s new policy should recognize, and not be in conflict with, existing legal obligations or requirements that may apply to some OCLC member libraries (such as federal libraries).”

It’s great to see a federal library making such a public statement. Having been passed by in the OCLC Policy discussion—Federal librarians have told me they were amazed OCLC thought it could unilaterally change licensing terms with government entities—and not included on the ARL/ACRL board either, at least one is lending its voice to the criticism. Hooray for them. James Smithson, who left his estate for the “increase and diffusion of knowledge”—and to a country he had never even visited!—would, I think, be proud.

*Chris Bourg left a comment to the effect that the AYBABTU reference was purely humorous, and she does not consider OCLC a villain, even if she thinks I’ve got a good argument. Now, can anyone think of a way to tape Jay Jordan saying “You have no chance to succeed make your time”? I’m thinking we could sky write it over OCLC headquarters in Dublin, OH and secretly film OCLC employees puzzling it out. Ideally, though, he’d need to wear the bionic monocle.

I will never run out of interesting Flickr chess images. This one’s by Shyald, from a series.

Labels: oclc, worldcat, worldcat local

Friday, April 24th, 2009

The OCLC End Game

Two years ago I predicted what OCLC, the library-data organization, was after with its WorldCat Local pilot program—“They’re trying to convert a data licensing monopoly into a services monopoly.” To illustrate, I changed the OCLC logo to the Death Star.

I was hardly alone in this speculation. But this concern was soon overtaken as OCLC brought forth its Revised Policy for Use and Transfer of WorldCat® Records. The Policy, which turned a de facto data monopoly into a legally enforceable one, became a focus of intense debate in the library world. On the one side just about every library blogger with a keyboard, and eventually a review board at the ACRL/ARL, raised questions about the idea of anyone “owning” records meant for sharing and most frequently produced by government entities. On the other side, OCLC’s defenders (in truth, mostly employees), talked of OCLC’s “curation” of community content, of “protecting members’ investment,” of the “best interest of libraries,” “OCLC’s public purposes” and of WorldCat’s role as an essential “switching mechanism” to local catalogs (references: 1, 2, 3).

Yesterday, OCLC unveiled the end game that brings everything together. As reported by Marshall Breeding in Library Journal:

“This new project, which OCLC calls “the first Web-scale, cooperative library management service,” will ultimately bring into WorldCat Local the full complement of functions traditionally performed by a locally installed integrated library system (ILS).”

The new service will be “free” to (paying) WorldCat First Search customers.

The move to “web scale” (OCLC-speak for “web”) catalogs was an inevitable one, and is a good one. It’s silly to have every library in the country running their own racks of servers. The economics of server architecture, equipment and systems administration make a single, hosted solution economically superior. It makes particular sense for OCLC. With a large percentage of world libraries’ data sitting in servers for copy-cataloging purposes, a locally branded and faceted web-app. catalog was the next logical step.

The move casts new light on its Policy defenses. OCLC isn’t “curating” library records; it’s leveraging them to enter a new market. It wasn’t “protecting members’ investment,” it was investing members’ money, intended to support OCLC’s core mission, to build a new service. WorldCat isn’t a “switching mechanism” to local catalogs. It will replace them.

I’d love to follow them. I’d love to make a large-scale hosted library catalog. I think LibraryThing could do a lot better. OCLC is full of smart people, but it develops slowly and has shown singular inability to produce social features that anyone would want to use. I think Talis, AquaBrowser, LibLime and Equinox could do better too. And I think, if library programmers got together, they could make a truly open, community-run service—something others, like LibraryThing, could provide plug-ins for.

We’d all love to try, but we aren’t allowed. According to the Policy, you can’t build the sort of truly “web scale” database that would make such a project economically viable. Anything that replicates the “function, purpose and/or size” of WorldCat is not “Reasonable Use.” Any library participating in such a venture would lose its right to OCLC-derived records, something that would literally shutter most public and all academic libraries in the country. When it comes to large-scale online catalogs, there can be no competing with OCLC.

Let me be clear: I have no problem with OCLC developing software. They do good work. I for one think WorldCat/WorldCat Local is a better product than most server-based OPACs.

But, now more than ever, OCLC must end its attempts to restrict and monopolize library data. It was ugly and unfair for OCLC to claim ownership over what is largely public data. It is obscene to leverage that data monopoly into a software monopoly.

Chess images from Flickr users malias and furryscaly. Chess outside makes me think of the Deus’ song Slow. What is it with Europeans and outdoor chess sets anyway?

Labels: oclc, worldcat, worldcat local

Monday, February 23rd, 2009

Research libraries clobber OCLC Policy

The Association of Research Libraries released its report on the new, now delayed OCLC Policy, and it’s a doozy—a forceful rejection of both the process and content of the Policy.

The full report makes for enjoyable reading—outside of Dublin, Ohio anyway. The task force members, research-library heavyweights all, fully and finally put to rest the notion that the only people bothered by OCLC’s power grab are open-data crazies and evil commercial companies.

There appears to have been a significant split. The majority felt it “desirable to have a policy that limits large-scale redistribution of records that could be harmful to the collective” and a minority did not. (It’s great to hear that a team of veterans had at least one member willing to reject the whole structure of cooperative-restriction!) But if the majority felt some policy was called for, they were apparently unanimous in condemning OCLC’s unilateral, non-consultative approach and concerned by a host of issues, large and small. Surveying the current Policy they urge a “fresh start.”

Vague legal language, unclear goals, worrying process, the split between the “nice” FAQs and the actual language of the Policy, issues of clouded ownership and responsibility for bibliographic data, termination provisions, the lack of respect for federal libraries and the legal impossibility of binding them without explicit renegotiation—it’s all here! There’s even a legal opinion, attached to the document, pouring cold water on the idea that the Policy will have any “downstream” effect on parties that haven’t explicitly agreed to it (i.e., LibraryThing members). In all, a good drinking game could be invented—every time the ARL report validates or recapitulates a point made on this blog, and on other opponents’, drink. (If you’re going to Code4Lib this week, I’ll buy the drinks!)

Most striking are the report’s vision of OCLC as a cooperative, and the ways the OCLC policy undermined that trust:

“The collective activity of shared cataloging is a source of deep pride and success in libraries in the U. S. and around the world. OCLC was created as, and is viewed as, a membership organization formed for the purpose of enabling this collective activity…. Members view WorldCat as a collective enterprise, not as a product that they license for use. …”

“The new Policy is clearly intended as a unilateral contract, unilaterally imposed on any entity using records from the WorldCat database, including member libraries…. The member community has seen the introduction of the new Policy as a fundamental change in the nature of the relationship between OCLC and its member libraries. In the eyes of the community, the guidelines expressed a mutual social contract, and the new Policy represents an authoritarian, unilaterally imposed legal restriction.”

Now let’s see what comes of this. OCLC has a needle to thread. The ARL report sets a high bar for consultation and consensus—higher than I think OCLC can reach without rethinking its whole communication model. And the core research-library concerns are serious*. I don’t think they can address them without failing to ensure what I believe to be the Policy’s true intent—establishing a permanent and lucrative data monopoly.

My prediction: Keep an eye on OCLC’s “regional service providers.” Various signs, including what reporters call “highly-placed sources,” confirm that OCLC/regional tension is at an all-time high, with OCLC increasingly rewriting the rules there too—selling directly to libraries in unprecedented ways. I think we can see in these moves a common historical pattern: when the structures that give a powerful institution strength start to weaken, it reaches for a new level of authority not based in the previous structure and therefore not susceptible to weakening. (In this case, OCLC is moving from a robust, often mediated cooperative to an unmediated, contractually-drawn licensure.) Sometimes the effort succeeds; sometimes the attempt crystallizes opposition and hastens and ensures the institution’s decline.

*Even if they picked the members of the Review Board, they may still face trouble from that direction. I doubt that OCLC’s Review Board has what the ARL board apparently had—members who apparently questioned the very idea of restricting access and use!—but all but one of the board members are academic/research librarians and can be expected to understand and appreciate the concerns raised by their ARL colleagues.

Labels: arl, oclc

Thursday, February 19th, 2009

Seeing parallels

Steve Lawson wrote this wonderful piece for his blog See also…, reprinted here (by permission) in full:

There is a large organization whose main business isn’t producing information, but instead hosting and aggregating information for many thousands of users on the web. Users upload content, and use the service to make that content public worldwide, and, likewise, to find other users’ content. Then one day the large organization decides to change the rules about how that information is shared, giving the organization more rights–to the point where it sounds to some people like the organization is trying to claim ownership of the users’ content, rather than simply hosting it and making it available on the web.

A small but vocal and influential group of users object to the policy change. The organization protests that it isn’t their intent to fundamentally change their relationship with their users and that legal documents tend to sound scarier than they really are. Most customers are either unaware or unconcerned by the change in policy, but the outcry continues until the organization backs down a bit, sticking with the old policy for the time being. The future, though, is up in the air.

Facebook? Or OCLC?

Perfect, just perfect.

Labels: facebook, oclc, open data, steve lawson

Sunday, February 15th, 2009

Why Wirral? One partial explanation.

A recent article in the Telegraph describes a worrying fall-off in library books and library usage in the UK.

Over the past six years books in public libraries in the UK have fallen 12%, from 116 million to 103.2 million. Library check-outs have fallen faster—16.5%. According to the Telegraph, UK librarians are bracing for another round of declining numbers, coming amid budget shortfalls across the board—and expecting to get their budgets slashed.

Reflecting on these problems, the CEO of the Museums, Libraries and Archives Council (MLA) told the Telegraph:

“[W]e live in an age where books can be bought cheaply from supermarkets or the internet so the reasons to visit a library have changed for many users.”

Wirral as a microcosm. Cuts have started. The Wirral council system in NW England (LibraryThing Local), is closing 11 of 24 branches.

They sure don’t deserve it. Taking a look at the Wirral Libraries website, anyone can see they’re doing a lot of things right. The branches look well-organized and inviting. They’ve got a fair number of computers and free Wifi. They have a special outreach program for the house-bound. They even lend toys!*

But they are doing one thing very wrong—namely that Wirral, like most libraries, isn’t really “on” the web.

People are finding things in supermarkets and on the internet because it’s easy to do so. On the internet, one-stop shopping means that a huge panoply of useful and interesting things are available from a single, unified and well-understood interface—from local bars, to local bands, to some 600 pizza and 400 curry joints in the area (Man, I love Britain!). Many of these resources are not only in Google searches, but Google will plot them on a map for your convenience.

What isn’t online are library books! The Wirral Libraries’ catalog, a Talis Prism OPAC, hardly registers in Google, which knows only 7,000 pages, from a library with more than 300,000 items. Worse, virtually every Wirral page in Google is broken. On the right is a representative sample of what Google knows about from the Wirral catalog. Each link has the same title. And each links to an expired session that proclaims:

You can, of course, get to the Wirral Libraries catalog if you know that’s where you want to go—fifth link down, then the top rounded button on the right. That’s not the same thing.

And even if you find a book, you can’t bookmark it for yourself or forward it to a friend–the links will die off in a few minutes. In refusing to allow links and spiders, the Wirral website sets itself apart from the other websites Wirral residents might use. The rest of the web just works—it’s in your search box, where most internet-aware people do most of their information finding.
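For the programmers in the audience, the difference between a link that dies and a link that works can be sketched in a few lines of Python. The parameter names and the catalog.example.org URLs below are hypothetical stand-ins; I am not reproducing how Talis Prism, or any real OPAC, actually builds its URLs.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical query-parameter names that tie a link to one visitor's session.
SESSION_PARAMS = {"session", "sessionid", "sid"}


def is_stable_link(url: str) -> bool:
    """Return True if the URL carries no session-style query parameter."""
    query = parse_qs(urlparse(url).query)
    return not any(name.lower() in SESSION_PARAMS for name in query)


# A session-bound result link: useless once the session expires.
print(is_stable_link("https://catalog.example.org/item?session=8F3A21&pos=4"))  # False
# A record addressed by its own identifier: safe to bookmark, share and index.
print(is_stable_link("https://catalog.example.org/item/12345"))  # True
```

The second kind of URL is what a search engine can keep in its index and what a patron can email to a friend. The first kind means something only to the server, and only for a few minutes.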

Lastly, where is WorldCat in all this, the “switching mechanism” and “point of concentration” (Karen Calhoun) OCLC provides libraries as an alternative to the “lunacy” (Roy Tennant) of libraries being on the web for themselves? Nowhere. None of the Wirral Libraries are in it, and WorldCat doesn’t list a copy of Harry Potter and the Deathly Hallows closer than 60 miles away (postal code: CH46 6DE?). One may speculate that Wirral wasn’t willing to pay for the service, which anyway gets quite insignificant traffic.***

Who’s to blame? Wirral Libraries’ misfortunes are no doubt many, and not being part of the web is not the largest. But it’s a part. Wirral citizens aren’t seeing their library appear in their search results. They aren’t as aware of its riches as they might otherwise be. If they were aware, it’s likely they’d use these resources more, and the system would be easier to defend politically.

It won’t do to blame Wirral for this. Library vendors have long handicapped their products in this way, and Wirral Libraries surely bought their Talis Prism system a while ago.** Budgets are short—and getting shorter. Both the web and this recession have hit libraries by surprise.

But refusing to participate in the central information technology of the age has its costs. And the leaders of Libraryland who advocated and continue to advocate for closed solutions, closed data and staying out of search indexes—except as “negotiated” with Google—have contributed to this situation. The respected guides have taken libraries off the great river of information, and left them grounded on the shore. Now someone’s coming for the boat.

I hope the residents of Wirral fight like hell to keep their libraries open. Then they should fight like hell to make their libraries truly open.

*I don’t know how common this is in Britain. I get the sense it’s not too common in the US, but it happens. The Hingham Public Library in Hingham, MA lends practically everything, from toys to paintings on the wall.
**It’s ironic that Wirral’s OPAC was made by Talis, now one of the more progressive and forward-thinking library vendors. I’ll put this in a footnote to avoid “shilling,” but if Wirral can get a new OPAC, I’ll arrange for them to get LibraryThing for Libraries for free until they get back most of their funding. Maybe Talis would kick in an incentive to upgrade their OPAC?
***WorldCat is supposed to be the central website of Libraryland, but third-tier websites like LibraryThing and Dogster—the social network for dog lovers!—are currently beating it.

Labels: indexing, oclc, riverine metaphors, web, wirral, wirral libraries

Sunday, February 1st, 2009

The evil 3.26%

The question has arisen of why I advocate against OCLC’s attempt to monopolize library data. Roy Tennant of OCLC, an intelligent, likeable man who, although we disagree on some issues, has done more for libraries than most, accused me of writing and talking about the issue because:

“… your entire business model is built on the fact that you can use catalog records for free that others created and not contribute anything back unless they pay (yes, there is a limited set of data available via an API, but then they need the chops to do something with it).”

Fair enough. Let’s look at the numbers, and the argument.

I did a comprehensive analysis, available here as a text file, with both output and PHP code. If anyone doubts it, send me an email and I’ll let you run the SQL queries yourself.

The numbers. As of 6:17pm Sunday, some 3.5 years after LibraryThing began, our members have added 35,831,904 books from 690 sources:

  • 85.48% came from bookstore data (almost exclusively Amazon).
  • 4.88% were entered manually by members.
  • 9.63% were drawn from library sources.

Now, where did that 9.63% come from?

These sources were in every case free and open Z39.50 connections our members accessed through us. Very frequently they accessed records of their own academic institution, but in any case, these members accessed these records alongside everyone else—libraries, museums, public agencies of one sort or another and all the students and scholars who use RefWorks, EndNote and other such services. Meanwhile LibraryThing has never been asked to stop accessing a source. On the contrary, libraries frequently ask to include themselves on our list of sources.

Of the 9.63%, by far the largest source is the US Library of Congress, the source of 2,203,182 books, or 6.15% of the total. The Library of Congress is a Federal organization, created for the benefit of the country and falling under the government-wide rule that public work is for the benefit of the public, and cannot be copyrighted or otherwise “owned.” For as long as the technology has been there, the Library of Congress has allowed access to its cataloging data; the OCLC policy change will not affect that.* We are grateful the Library of Congress does this. But insofar as we are taxpayers and support the American notion of public ownership of public resources, I will not apologize for it. (On the contrary, I feel that OCLC should apologize for attempting to restrict and profit from public work.)

3.26%. That leaves 3.48%—more appropriately 3.26%**—the evil sliver upon which our “entire business model is built.” Take a look at the top fifteen here:

  • Koninklijke Bibliotheek : 130,406 books (0.36%)
  • National Library of Scotland : 80,826 books (0.23%)
  • British Library (powered by Talis) : 80,205 books (0.22%)
  • Gemeinsamer Bibliotheksverbund (GBV) : 77,190 books (0.21%)
  • National Library of Australia : 72,896 books (0.2%)
  • Helsinki Metropolitan Libraries : 70,551 books (0.2%)
  • The Royal Library of Sweden (LIBRIS) : 63,430 books (0.18%)
  • Italian National Library Service : 60,643 books (0.17%)
  • Vlaamse Centrale Catalogus : 58,936 books (0.16%)
  • LIBRIS, svenska forskningsbibliotek : 54,339 books (0.15%)
  • ILCSO (Illinois Libraries) : 28,517 books (0.08%)
  • Yale University : 26,885 books (0.08%)
  • Det kongelige Bibliotek : 24,564 books (0.07%)
  • University of California : 20,098 books (0.06%)
  • : 19,628 books (0.05%)

With 690 possible sources, it’s a long, long tail. We take 2,087 records from the Russian State Library, 1,067 from the Magyar Országos Közös Katalógus, 286 from Princeton, 106 from Koç (in Istanbul), 63 from Hong Kong Baptist, 4 from the Universidad Pública de Navarra, etc.

It should be apparent to anyone looking at the above that the 3.26% is largely about satisfying the needs of foreign LibraryThing members, a small percentage of our membership and hardly central to our “business model.” Equally clear is the government orientation of the list: only one, Yale, is a private institution. The rest are all government agencies. Of course, no records actually came from OCLC itself!

All-in-all, library data from non-federal sources is a negligible component of LibraryThing’s content. LibraryThing is not some big plot to capture library records. That idea is simply not in the figures.
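For anyone who wants to check the arithmetic without running the full analysis, here is a minimal sketch in Python that re-derives the 6.15%, 3.48% and 3.26% figures from the counts quoted above. It is not the PHP code from the linked text file; the variable and function names are made up for illustration, and it assumes the shares are rounded to two decimal places before subtracting, as the post does.

```python
from decimal import Decimal

# Counts quoted in the post.
TOTAL_BOOKS = 35_831_904
LOC_BOOKS = 2_203_182           # US Library of Congress
BL_VIA_TALIS_BOOKS = 80_205     # British Library (powered by Talis)

# Share of all books drawn from library sources, as quoted in the post.
LIBRARY_SHARE = Decimal("9.63")


def share(count, total=TOTAL_BOOKS):
    """Return count as a percentage of total, rounded to two decimal places."""
    return round(Decimal(count) * 100 / Decimal(total), 2)


loc_share = share(LOC_BOOKS)

# Library records that did not come from the Library of Congress.
non_federal = LIBRARY_SHARE - loc_share

# The same figure with the contracted British Library records set aside.
without_bl = non_federal - share(BL_VIA_TALIS_BOOKS)

print(f"Library of Congress share: {loc_share}%")
print(f"Other library sources:     {non_federal}%")
print(f"Excluding British Library: {without_bl}%")
```

Decimal is used instead of floats so that the subtraction of rounded percentages comes out exact.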

Do we give back? What of the second half of the accusation, that we do “not contribute anything back unless they pay,” and the bit against APIs?

First, assuming Roy means LibraryThing data generally, it’s absurd to suggest that because LibraryThing draws 3.26% of its data from free, unlicensed sources, our members’ data and services are owned by OCLC or its members. OCLC no more owns members’ tags and reviews on bibliographic metadata than Saudi Aramco owns the furniture I bring home in my car. Who in their right mind would ever accept a list of titles and authors from a library, if that meant ceding ownership over what you think about the book?

LibraryThing and OCLC both have terms. But LibraryThing’s license terms are unlike OCLC’s in a number of ways. LibraryThing members knew what they were getting, unlike OCLC members, who thought they were sharing with other libraries, but find themselves the lynchpin of a monopoly. From our inception LibraryThing has reserved a right to sell aggregate or anonymized data. We also sell some reviews—giving members the option to deny them to us. All our member data is non-exclusively licensed, so members can do anything they want with it outside of LibraryThing, and members can leave at any time. Neither is true of OCLC members’ data under the Policy.

Cataloging data. That leaves LibraryThing cataloging data, of which we have three types. We don’t have any legal responsibility to make it free, but we do so anyway.

First, we would be happy to offer downloads of original or modified MARC records! We haven’t done so in order to avoid attracting a suit from OCLC. But perhaps we were mistaken. If OCLC would like us to start releasing our MARC records to others, someone should let us know. We will release them under the same terms they were given to us—freely.

Second, our Common Knowledge cataloging (series, awards, characters, etc.) is free and available to all. We can’t think of a better way to provide it other than through an API, but we’re all ears if Roy knows of a better way. And if OCLC would like to admit it to WorldCat, without subverting its always-free license, they don’t even need our permission. Go on, OCLC, make my day!

Third, there’s ThingISBN, which was directly patterned on OCLC’s xISBN service. Despite Roy’s criticism, they are identical in format and delivery, so if there’s something wrong with its XML APIs, OCLC has only itself to blame. Indeed the only difference is cost: ThingISBN is completely free, both as an API and as a feed; xISBN, which member data creates, is sold back to members.

Stop killing the messenger. It’s time for OCLC to recognize they made this mess, not others. They have perpetrated some astounding missteps—from attempting to sneak through a major rewrite of the core member policy in a few days without consultation, to a comic series of rewrites and policy reversals, culminating in withdrawing the policy entirely for discussion. (It now seems clear they did so on the heels of a member revolt, whether general or just of some key libraries.)

It’s also important to see that, before OCLC started threatening companies and non-profits doing interesting but non-competing things with book data—notably LibLime, Open Library and LibraryThing—they had none of the problems they have now. Now, by attempting to control all book data, they’ve spurred the creation of LibLime’s ‡Biblios system, a free, free-data alternative to OCLC and, well, sent me, Aaron Swartz of Open Library and dozens of prominent library bloggers into orbit.

Being caught so flat-footed can’t feel nice. It must be hard feeling like royalty and discovering your subjects think themselves a confederacy. But this is no time for OCLC to start attacking the credibility of its opponents. Surely LibraryThing is an unusual case—a company that has an opinionated, crusading—okay, loud—president. But the thousands of librarians and other individuals who supported our calls, or raised other objections to the OCLC policy are not less well-motivated than OCLC and its employees. They do not love libraries less. They are, rather, concerned that OCLC’s urge to control library metadata threatens longstanding library traditions of sharing, and sets libraries on a path of narrowness and restriction that will surely prove no benefit in this increasingly open, connected world.

*I need to write a blog post on this, but I was recently informed that whatever changes OCLC makes cannot touch federal libraries without explicit authorization. That is, federal law does not recognize clauses like “if you continue to use” or “we can change this at any time.”
** It should more accurately be 3.26%, because we are getting our British Library records through Talis, who have a contract with the British Library.

Labels: oclc

Thursday, January 22nd, 2009

The Guardian asks “Why you can’t find a library book in your search engine?”

The OCLC data-grab has hit the “real” media—an article in the Guardian. The article asks the simple question, “Why you can’t find a library book in your search engine?”

It’s an obvious question. The answer isn’t quite as simple as they put it. Libraries would be in Google if their library catalogs could be spidered. But they’d still be hampered by OCLC in various ways. Anyway the coverage of OCLC, Open Library, and LibraryThing is spot-on. And the subtle nationalist angle—an American site!—can’t hurt.*

Three cheers for the Guardian. Next up, the New York Times? We can hope.

*Did you know OCLC invaded Iraq?

Labels: guardian, oclc