Casey « The Thingology Blog

Author Archive

Monday, March 30th, 2009

LibraryThing at Computers In Libraries 2009

LibraryThing, your favorite makers of libraries in computers, will be at Computers in Libraries this week. We’ll be passing out free stuff and showing off our new LibraryThing for Libraries feature, so if you’re at CIL, stop by booth 214 and say hi. Unfortunately, we’re rhino-less this time, but we do have T-shirts and laptop stickers (and Tim).

Our new feature allows our catalog enhancements to run even on items that don’t have an ISBN. Check it out in action on this 1948 edition of Tom Jones, or this 1937 edition of David Copperfield.

There’s no ISBN on those items, but our code is still smart enough to load the right tags and recommendations info. It uses a combination of our new What Work API and the LibraryThing Connector (the JavaScript that powers LTFL) to pull title and author information out of the catalog’s HTML and then match it against our system. This new feature should help our academic libraries in particular, since they tend to have a lot of older pre-ISBN books.
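For the curious, here is a rough sketch of the general idea in Python. It is not our actual code (the Connector is JavaScript, and the real matching happens on our servers). The "title" and "author" class names, the toy index, and the work ids are all made up for illustration; every catalog marks up its pages differently.

```python
import re
from html.parser import HTMLParser


class TitleAuthorParser(HTMLParser):
    """Collect text from elements whose class is 'title' or 'author'.

    The class names are hypothetical; a real catalog page needs its own rules.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("title", "author"):
            self._current = css_class

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] = self.fields.get(self._current, "") + data

    def handle_endtag(self, tag):
        # Stop collecting as soon as any element closes.
        self._current = None


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def find_work(html, index):
    """Return the work id for the title/author found in html, or None."""
    parser = TitleAuthorParser()
    parser.feed(html)
    key = (normalize(parser.fields.get("title", "")),
           normalize(parser.fields.get("author", "")))
    return index.get(key)


# Toy stand-in for the matching service: (title, author) -> made-up work id.
WORKS = {
    ("david copperfield", "dickens charles"): "work-1",
    ("the history of tom jones a foundling", "fielding henry"): "work-2",
}

page = """
<div class="record">
  <span class="title">David Copperfield.</span>
  <span class="author">Dickens, Charles</span>
</div>
"""
print(find_work(page, WORKS))
```

Normalizing both sides before comparing is what lets a catalog's "David Copperfield." line up with a differently punctuated title elsewhere; a real matcher would need to be far more forgiving than an exact dictionary lookup.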

Labels: apis, CIL, CIL2009, conference, librarything for libraries, rhinos

Tuesday, March 24th, 2009

Polaris support for LibraryThing for Libraries

Following on yesterday’s announcement of Koha support, we’re happy to announce that LibraryThing for Libraries catalog enhancements are now available for Polaris OPACs.

First off, we probably owe the Polaris people a public apology for this being such a long time coming. They first contacted me about integrating LTFL in their systems a year and a half ago, when we only had 5 or 6 customers. One of their libraries had asked about it, and as a company, they’ve decided to be incredibly responsive to the cutting edge things their libraries want to do. They’ve kept pushing us (on behalf of their customers), even as technical and non-technical obstacles (mostly non-technical) have prevented us from seeing it through.

It’s a great corporate philosophy, and far too rare in the library world. Now that everybody takes our phone calls and wants to work with us, they deserve a lot of credit for being down from day one. It’s unsurprising to me that they scored among the highest customer satisfaction of any commercial ILS vendor in a recent poll; clearly service is a high priority for them.

Want to see the catalog enhancements in action? Here are a couple of examples from our first Polaris customer to go live, Glendora Public Library: (dogs), (fantasy). Several more Polaris libraries are testing it.

Because of the way Polaris’ system works, you currently have to press the LibraryThing button to get the content for a particular item. In the next version of Polaris, not only will LTFL be installable without editing any template files, but there will be no LibraryThing button; our content will load when somebody clicks on the “full display” button. So far, we haven’t added review support, but we’re happy to do it if there are interested customers.

Currently we have two installation options. The first only requires a single line of code to be added to your templates, but it uses the LibraryThing button instead of loading our content along with the item details. This is what Glendora is using. The other installation option (provided by an engineer at Polaris) requires more involved editing of their templates but makes the current version of Polaris work with LTFL the way the forthcoming version will.

Interested in getting LibraryThing for Libraries for your Polaris catalog? Contact us through the Interested? form.

Labels: librarything for libraries, ltfl

Wednesday, March 12th, 2008

Street-grade metadata of unknown origin and quality

“Steroid” Scandal Rocks Major League Libraries:

“Why not make the top quality stuff available to everyone? That’s the only way to really level the playing field,” says metadata advocate Harley Trion. “If we close down the labs creating high-quality metadata, you will see widespread adoption of street-quality metadata like social tagging and folksonomies, because that’s all you will be able to get. I’d rather know that my kids were using metadata that is made in a clean lab with experts and quality assurance processes than have them experimenting with street-grade metadata of unknown origin and quality.”

Sadly, street-grade metadata has already polluted our most venerable cataloging institution, the Library of Congress. Check out this MARC record for Fourth Comings, by Megan McCafferty (interesting part in bold):
=LDR 01506cam 2200361 a 4500
=001 14768798
=005 20070917102332.0
=008 070314s2007\\nyu\\\\\001eng\
=010 \$a 2007010818
=020 \$a9780307346506
=020 \$a0307346501
=035 \$a(OCoLC)ocm86109925
=035 \$a(OCoLC)86109925
=040 \$aDLC$cDLC$dYDX$dBAKER$dBTCTA$dWIQ$dYDXCP$dDLC
=043 \$an-us-ny
=050 00$aPS3613.C34$bF68 2007
=082 00$a813/.6$222
=100 1$aMcCafferty, Megan.
=245 10$aFourth comings :$ba novel /$cMegan McCafferty.
=250 \$a1st ed.
=260 \$aNew York :$bCrown Publishers,$cc2007.
=300 \$a310 p. ;$c25 cm.
=650 $aDarling, Jessica (Fictitious character)$vFiction.
=650 $aYoung women$vFiction.
=650 $aPeriodicals$xPublishing$vFiction.

=650 $aChick lit.

=651 $aBrooklyn (New York, N.Y.)$vFiction.
=856 42$3Contributor biographical information$uhttp://www.loc.gov/catdir/enhancements/fy0743/2007010818-b.html
=856 42$3Publisher description$uhttp://www.loc.gov/catdir/enhancements/fy0743/2007010818-d.html
=856 41$3Sample text$uhttp://www.loc.gov/catdir/enhancements/fy0743/2007010818-s.html
Chick Lit is now a subject heading in the Library of Congress. We’ve entered the asterisk era of metadata.
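If you want to go hunting for headings like this yourself, the text form of a MARC record shown above is easy to pick apart. Here is a minimal Python sketch, assuming that same one-field-per-line layout with "$" introducing each subfield; the function name is made up, and a real project would be better served by a proper MARC library.

```python
def subject_headings(record_text):
    """Return the $a value of every 650 (topical subject) field."""
    headings = []
    for line in record_text.splitlines():
        line = line.strip()
        if not line.startswith("=650"):
            continue
        # Each subfield is "$" plus a one-character code, then its value.
        for chunk in line.split("$")[1:]:
            if not chunk:
                continue
            code, value = chunk[0], chunk[1:]
            if code == "a":
                headings.append(value.strip())
    return headings


# A few fields copied from the record above.
sample = """\
=100 1$aMcCafferty, Megan.
=650 $aYoung women$vFiction.
=650 $aPeriodicals$xPublishing$vFiction.
=650 $aChick lit.
=651 $aBrooklyn (New York, N.Y.)$vFiction.
"""
print(subject_headings(sample))
```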

[Tim adds: I’ve known about the Chick Lit LCSH for some time now, first spotting it while giving a talk on how great the Chick Lit tag was! I think it’s a great move, but also strange in light of well-established policies against adding subjects afterwards. The LCSH “Chick Lit” missed chick lit’s actual heyday! Anyway, I’m not betting on the LC getting into all the great tags—steampunk, cyberpunk, paranormal romance and, of course, vampire smut.]

Labels: chick lit, library of congress, metadata

Thursday, February 21st, 2008

Taxation without web presentation

The Library of Congress recently signed a deal to accept 3 million dollars worth of “technology, services and funding” from Microsoft towards building a new website powered by Microsoft’s Silverlight plug-in. I (Casey) usually leave the blogging to Tim, but I’ve got to say something about this.

Microsoft, in general, is very good to libraries, and libraries are very good to them. Microsoft gets huge tax breaks for donating software licenses — something that doesn’t really cost them a thing — and libraries get software they couldn’t afford otherwise.

This is a different beast, however. It sounds like Microsoft technologies will be used from the ground up: if you use Microsoft’s Silverlight to do the front-end, your developers pretty much have to use Visual Studio and Microsoft languages, your database admins have to use MS SQL Server, and your systems admins have to use Windows and IIS. In any case, it seems unlikely that Microsoft would consult on a project and not recommend you use Microsoft as much as possible.

Once you’re locked in to the entire Microsoft stack, you pretty much can’t change a single piece without completely redoing your entire IT operation from top to bottom. When the free deal expires or you need new servers, you end up having to buy new Microsoft licenses and software. It’s like giving somebody a kitten for a present: they’ll still be paying for and cleaning up after your gift 10 years from now.

Most disturbingly, users are locked in, too: anybody using an iPhone, an old version of Windows, any version of Linux, or any other operating system or device not supported by Silverlight will be unable to use the Library of Congress’ new website. How is that compatible with the principles of democracy or librarianship? It’s taxation without web presentation. And how exactly is that a quantum leap forward? (If the LOC really wanted to make a quantum leap, it would open up its data.)

Giant package deals are the wrong way to make both technical and business decisions about software; it doesn’t matter who’s doing the packaging, or how. You should be able to use the best operating system for the job, the best database for the job, and the best programming language for the job. You should be able to hire developers and systems administrators, not Microsoft developers and Windows administrators, and should give them the freedom to use the best solution, not the Microsoft solution. Sometimes the Microsoft solution is best, sometimes it isn’t, but that’s something that shouldn’t be dictated unilaterally.

I take comfort when I see one of our competitors looking to hire Microsoft developers instead of software developers, for reasons the hacker/entrepreneur Paul Graham explained well:

“If you ever do find yourself working for a startup, here’s a handy tip for evaluating competitors. Read their job listings. Everything else on their site may be stock photos or the prose equivalent, but the job listings have to be specific about what they want, or they’ll get the wrong candidates.

“During the years we worked on Viaweb I read a lot of job descriptions. A new competitor seemed to emerge out of the woodwork every month or so. The first thing I would do, after checking to see if they had a live online demo, was look at their job listings. After a couple years of this I could tell which companies to worry about and which not to. The more of an IT flavor the job descriptions had, the less dangerous the company was. The safest kind were the ones that wanted Oracle experience. You never had to worry about those. You were also safe if they said they wanted C++ or Java developers. If they wanted Perl or Python programmers, that would be a bit frightening– that’s starting to sound like a company where the technical side, at least, is run by real hackers. If I had ever seen a job posting looking for Lisp hackers, I would have been really worried.”

But it’s disappointing to see an institution you respect, admire, and fund with your tax dollars going down that same road. It’s even more disappointing because the Library of Congress does make smart decisions about technology. They announced another major project a few months back that took an entirely different approach to selecting the tools they would use. The people behind the World Digital Library sat down and thought about the best tools for the job, and they came up with an interesting and eclectic list: “python, django, postgres, jquery, solr, tilecache, ubuntu, trac, subversion, vmware”. Those tools are free, open-source, designed with developer productivity in mind, aren’t tightly linked to each other, and don’t inherently limit who can access your website. That’s what should matter.

Labels: library of congress, microsoft, open data, open source

Monday, December 3rd, 2007

MARCThing: A simple, self-contained MARC and Z39.50 application

Over the past couple of weeks, LibraryThing has been rolling out major improvements to our cataloging system—a new system for retrieving and parsing book information we’re calling “MARCThing.”

MARCThing is a major advance for LibraryThing. We’ve sunk months of development time into it, but we’re not going to keep it to ourselves. We will be releasing all the code for non-commercial use in libraries and elsewhere.

When the dust settles, LibraryThing members will be able to draw on nearly 700 data sources worldwide, with greatly improved foreign character support and better data manipulation behind the scenes. With MARCThing underneath we will be able to introduce many new features and to reach a truly global audience. But we are confident that developers outside of LibraryThing will find many other, equally compelling uses for MARCThing, and make useful changes and extensions.

What it is. When I was given the task of improving LibraryThing’s cataloging system and other work involving library data, I immediately thought of Solr, one of the most influential pieces of software to come out in the past couple of years. The big idea behind Solr is that it provides a “magic box”: an easy, self-contained interface to some very powerful but complex technology, the Lucene search engine. Solr hides the messy details of Lucene from the developer and provides all sorts of extra goodies in a self-contained package. The net result is you can instantly stick an extremely powerful search engine into your project with almost no work. This combination of power and ease-of-use has quickly made it a developer favorite, and spawned all sorts of interesting projects that never would’ve come out without Solr.

I wanted my own magic box that would handle the two main protocols used by libraries to transfer cataloging data, MARC and Z39.50, without anyone having to go into the details of how they work. And since I didn’t want to have to find or build another magic box, ever, I wanted something that could be easily used from any programming language.

Writing it was pretty easy—I used Django for the web part, Pymarc for MARC, and PyZ3950 for the Z39.50 support. With a good software library, working with Z39.50 or MARC records isn’t hard. The hard (or at least time-consuming) part of MARCThing was tracking down servers and dealing with oddball cases. There are many lists of Z39.50 servers out there, but the data is often incomplete, incorrect, or out of date. When you do find a Z39.50 server, oftentimes it’s non-standard in some way, or only has limited functionality. So the process of connecting to libraries using Z39.50 is fraught with guesswork and manual fiddling. That’s bad. The whole point of a standard should be to free you from guesswork.

How to use it. Using MARCThing is simple. Either send it some MARC records, or tell it which Z39.50 server you want to search and what you want to search for, and get back XML (or a variety of other formats) that you can use in applications without having to know a lick about library cataloging. All the messy details (and there are a lot of them) are hidden from view. Everything just works. You don’t need to know what a nonfiling indicator or a use attribute is, or the difference between MARC8 and UTF-8. You just need to know how to make an HTTP request.
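To give a taste of the kind of detail being hidden, take the nonfiling indicator. In a MARC title field (245), the second indicator is a digit saying how many leading characters to skip when sorting, so that “The history of Tom Jones” files under H rather than T. The Python below is only an illustration of that one rule, not MARCThing code; the function name and the sample titles are made up.

```python
def filing_title(title, nonfiling_chars):
    """Drop the leading characters a 245 field says to ignore when sorting."""
    return title[nonfiling_chars:]


# (title, second indicator of the 245 field)
records = [
    ("The history of Tom Jones", 4),  # skip "The "
    ("David Copperfield", 0),         # nothing to skip
    ("A tale of two cities", 2),      # skip "A "
]

# Sort by the title with its nonfiling characters removed.
by_filing_title = sorted(records, key=lambda r: filing_title(*r).lower())
for title, _ in by_filing_title:
    print(title)
```

Get the count wrong, or ignore it, and every title starting with an article piles up under “A” and “T”. Multiply that by every field in the record and you can see why a magic box is appealing.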

What I hope is that this inspires and allows people not in the library world to do cool things with library data. It’s sad that working with library data is such a hassle; there are so many underused resources out there. I won’t go too much into the technical problems with Z39.50 and MARC, but I do have a recommendation for anybody involved in implementing a standard or protocol in the library world. Go down to your local bookstore and grab 3 random people browsing the programming books. If you can’t explain the basic idea in 10 minutes, or they can’t sit down and write some basic code to use it in an hour or two, you’ve failed. It doesn’t matter how perfect it is on paper; it’s not going to get used by anybody outside the library world, and even in the library world, it will only be implemented poorly.

Open source plans. LibraryThing was already the only major cataloging site that used any library data. (The rest use Amazon’s data exclusively, a severe hurdle to book lovers in the US and an absolute barrier to those in most other countries.) It took us a long time to develop, and we have limited resources. We are not eager to give our competitors such a valuable tool — they can get their own library geeks. At the same time, we are eager to encourage non-profit use and to license its non-competing commercial use for a token amount.

We’re thinking of releasing the code under the Creative Commons Attribution-Noncommercial-Share Alike license, but it will depend on what people want to do with it. If you were bitten by a radioactive librarian and suddenly gained the power to search 700 libraries worldwide, what would you do?

Stay tuned; code is coming soon!

Labels: django, librarything for libraries, marcthing

Tuesday, October 30th, 2007

LibraryThing for Libraries: October

October was another good month for LibraryThing for Libraries, with 7 new libraries coming on board.

First up on the list is the Los Gatos Public Library in California. Although a very small library, they show yet again that you don’t need to be big to innovate. They’ve promoted LibraryThing for Libraries extensively on their blog; they’ve even made a cool little video on YouTube of the widgets in action.

Library number two is the East Brunswick Public Library in New Jersey. Much more than most libraries, EBPL has really positioned themselves as a part of their community. You can see this in their offering of notary and passport renewal services at their libraries and their involvement with the community TV station, EBTV. I like seeing libraries that try to integrate themselves into their patrons’ daily lives like that. For an LTFL action shot, here’s “Treasure Island” in their catalog.

Next up is the Institute of Technology Tallaght, Dublin, our second library in Ireland. Along with our first one, the Waterford Institute of Technology, they’re a part of our nefarious plan to get every Irish engineering major hooked on LibraryThing.

Number four is the Kingston Information and Library Service in Melbourne, Australia. They have the honor of being our first Australian library, but we’ve got a bunch more on the way, starting with number five, the Australian Tax Office. The ATO’s running LibraryThing for Libraries on their intranet only, so I don’t have a demo URL for them. I’d like to thank them for coming up with a righteous javascript hack to make our widgets work with SirsiDynix’s new EPS/Rooms system.

Arlington Heights Memorial Library in Illinois is next up. I’ve already had a couple of other prospective customers ask to have their installation “look like Arlington Heights.” You can see why — they’ve done a great job blending LibraryThing for Libraries into their III catalog.

Finally, GMILCS is a consortium of academic and public libraries in New Hampshire. GMILCS runs SirsiDynix’s Horizon Information Portal catalog. It’s been cool to work with so many of the same people I knew when I used to support Horizon Information Portal for Dynix. Tim will be giving a talk about LibraryThing for Libraries tomorrow at CODI, the annual SirsiDynix user conference, along with Colleen Medling of the Salt Lake County Public Library. It should be a good one, so if you’re at CODI, check it out.

Along with picking up pencils, spatulas, and other stuff with vendor names on them, and talking to a lot of people in denim shirts, annual user conferences are always a good place to learn about new ways to make the software you’re stuck with do new things. That’s really valuable when change happens so slowly in the library software world; I remember helping GMILCS out when they first brought up their current catalog back in 2002. 5 years is a lifetime on the internet, and the gap between the speed that enterprise library software moves and the speed the web moves only seems to be getting bigger and bigger. So it’s vital for software vendors to make catalogs that can be modified, extended and customized both internally and externally. Customers shouldn’t have to wait for years for the shiny next generation product to get new features. It’s not just up to the vendors, though; customers need to keep finding ways to improve their out of the box systems (like David Pattern’s interesting new HIPPie project), library managers need to create a culture where change is embraced, and services like LibraryThing for Libraries need to keep adding more new functionality to existing systems. Legacy library software is inescapable — major upgrades will always be a gigantic chore, and even minor changes to the core of the system will often have huge repercussions on dozens of staff and thousands of patrons. That should not keep libraries from constantly making improvements to their public interfaces.

Labels: ahml, arlington heights, ato, australian tax office, codi, east brunswick, ebpl, gmilcs, iii, itt tallaght, kils, kingston, librarything for libraries, los gatos, ltfl, sirsidynix, slco

Tuesday, September 25th, 2007

LibraryThing for Libraries: Richland County, Cal State – Channel Islands and San Francisco State University

Richland County Public Library

San Francisco State University (source)

Cal State University – Channel Islands (source)

LibraryThing for Libraries just passed another milestone: we now have too many customers to keep track of in short-term memory.

Our first new library is the Richland County Public Library. We’re really excited to have them on board, since they’re the biggest public library we’ve worked with so far—at nearly three million checkouts a year. They’re doing a lot of simple yet innovative things, like offering reference via instant messaging and having a kid-friendly website. Of course, they have a blog too. I have a soft spot for large public libraries, having worked in one for several years and having lived in big cities with great library systems (Denver, Salt Lake City and Seattle) for most of my life. We hope to be adding many more large public libraries in the coming months.

Our second library is the San Francisco State University library. With around 30,000 students and four million items owned by their library, they’re a big one too. They’ve got one of the best-looking and easiest-to-use library websites I’ve seen (and I look at a lot of them – occupational hazard). Their electronic resources librarian did an excellent presentation on LibraryThing for Libraries a few weeks back.

Our third library is Cal State University – Channel Islands, located in the beautiful area between Los Angeles and Santa Barbara. They’re our first Voyager customer, and we’d like to thank them for helping us work out how to make Voyager work with our widgets. They’ve also volunteered to be our latest data source for book searching.

Photo credits: (1) Courtesy Richland County Public Library. (2) CSUCI bell tower by Flickr:AIBakker (CC Attribution-NonCommercial-NoDerivs 2.0). (3) Library by Flickr:relic (CC Attribution-NonCommercial-NoDerivs 2.0)

Labels: csuci, librarything for libraries, rcpl, richland county, sfsu

Monday, September 17th, 2007

Evilness — Opposition — Policy and Procedure

Someone recently called LibraryThing for Libraries out over our terms and privacy policy. Guess what? They were right to do it!

The policy was vague. It didn’t describe what we actually use library data for and how we use it. It gave us potential room to do bad things.

Well, we don’t want the room. We’ve always treated our user and library data carefully, and we always will. So we’ve written it again, this time as a straitjacket.

You can read the full text here, but the Cliff’s Notes version is this:

We don’t collect any data from or about library patrons;
We only use a library’s data to enrich their own catalog;
We’re not allowed to change the policy suddenly.

If anyone feels we’ve left anything out, let us know.

Labels: librarything for libraries

Tuesday, September 4th, 2007

LibraryThing for Libraries: Randolph County, Bowdoin and Claremont Colleges

Bowdoin College (source)

The Libraries of Claremont Colleges (Honnold/Mudd Library) (source)

Randolph County Public Library, Asheboro Public Library (source)

We just added three new and very different members to LibraryThing for Libraries—a public library system in North Carolina, a liberal arts college in Maine, and a collegiate consortium in California.

The first is the Randolph County Public Library, a system of seven libraries in the Asheboro, North Carolina area. On their blog, the library has described LibraryThing for Libraries as “stunning” and a “quantum leap.” We couldn’t agree more.

Randolph County is also our first public demonstration of LibraryThing for Libraries within what is probably the most widely-used online catalog, the Horizon Information Portal (HIP) from SirsiDynix. Up until now, our live libraries have all used WebPac and WebPac Pro from Innovative Interfaces. As we promised, LibraryThing for Libraries works with any library OPAC, and just great with HIP.

Check out Randolph County Public Library searches for regency fiction or the novel Eragon.

The second is Bowdoin College, located in Brunswick, Maine, just up the road from LibraryThing’s global HQ in Portland. Bowdoin is a small liberal arts college with about 1,700 students. For a small library, they are doing a lot of innovative things and have a good-looking, easy-to-use website. They’ve put in a neat little JavaScript tooltip to explain what tags are that we just might have to steal. Check out LTFL in action here and here.

The third is the Libraries of the Claremont Colleges, which serves Pomona, Harvey Mudd, Claremont McKenna, and several other colleges I couldn’t get into. They’re our largest collection to date, with LibraryThing providing data on over 173,000 of their titles! Reflecting the diversity of the colleges they serve, they have a wide collection of materials, from combinatorics to gender studies. The alternate editions widget is proving especially useful for academic libraries, as can be seen for this translation of the poetry of Catullus.

It’s extremely gratifying to watch how quickly LibraryThing’s data keeps growing. LibraryThing for Libraries was originally envisioned as a product for public libraries, but LibraryThing’s continued growth is making that distinction seem less relevant. We’re now up to three academic libraries, with several more in the pipeline, and we’ve even started working with a couple of corporate/special libraries.

In the three months since our first library started using LibraryThing for Libraries, we’ve gone from 17 million tags and 13 million items to 23 million tags and 18 million items. Every item and tag added to LibraryThing improves the reach and power of LTFL. It’s really cool to be involved with a product that gets better and more powerful every minute of the day.

Photo credits: (1) Bowdoin College photo by Flickr:cybertaur1 (CC Attribution). (2) Honnold Mudd Library by Jarod Hightower-Mills (Public Domain). (3) Asheboro Public Library photo by Flickr: Asheboro Public Library (CC Attribution-ShareAlike 2.0)

Labels: bowdoin, claremont colleges, librarything for libraries, randolph county public library