Archive for the ‘Open Shelves Classification’ Category

Friday, June 19th, 2009

Project Managers Sought for OSC

Due to an increase in work commitments for both Laena and David, new project managers are sought for the Open Shelves Classification project. Below is a status report of the project. Interested leaders should contact Tim Spalding (tim@librarything.com).

OSC status report June 2009:

One year into the project, here is what we have accomplished so far:
-Many wide-ranging discussions were held in the LibraryThing Build the Open Shelves Classification group and on the OSC blog.

-Optional facets were agreed upon initially as the way to handle audience, format, and language.

-An initial list of top level categories was compiled by the end of 2008 and put out for review.

-In January 2009, LibraryThing members tested these categories by applying them to works in LibraryThing using the ClassifyThis feature.

-In January, a brainstorming meeting was held at the ALA midwinter meeting and was attended by librarians and non-librarians.

-In February, the feedback from the testing was used to further refine the top level categories.

-Starting in February and running through May, small groups began to construct the secondary levels for certain categories.

-Throughout the spring, Laena and David did outreach for the project, writing pieces for the PLA blog and the IFLA newsletter, and reached out to libraries in an unsuccessful search for public library data.

-In May, the current list of OSC categories was added to the sandbox of the National Science Digital Library Metadata Registry.

Categories with second levels in development:
-Art
-Biography & Autobiography
-Design
-Fiction
-History
-Performing Arts
-Religion
-Science

After working on the project for a year, we have the following recommendations:
-The project needs a steering committee structure for leadership. The project is too large in scope for one or two librarians to manage without other leadership.

-More involvement and leadership from public librarians! They know the intended audience of the OSC best.

Labels: Open Shelves Classification, OSC

Monday, February 23rd, 2009

Classify your heart out

Here it is, the revised list of top level categories. These have been vetted by all of us for a while and it’s time to start building subcategories. We’ve created threads in the Group to discuss the subcategories of each top level. Keep in mind that these need to be comprehensive, but not excessively granular. Take a look at this example of possible subcategories for PETS.

After more of the second levels are fleshed out, we plan to have a new classify-this feature to test out the classification system on books in LibraryThing.

Until then, classify and discuss!

Labels: Open Shelves Classification, OSC

Monday, February 9th, 2009

Open Shelves Classification Update

Hello! Well, we have been busy since Tim announced the classify-this feature. The OSC group has been extremely active, with more than 300 posts about the top level categories (not to mention insightful threads popping up to discuss second level categories). Thank you for your feedback! Meanwhile, at the Midwinter meeting of the American Library Association we were able to have a really valuable face-to-face conversation with LibraryThing users.

We have been processing all your feedback and working on version 2.0 of the top level categories. Before we get to that, we wanted to let everyone know that we do read all the posts in the Open Shelves Classification group. Because of the high quantity of posts (and our day jobs) we cannot comment or respond individually as often as we would like.

Some key points after discussion, feedback and analysis:

The number of categories in the top level. As decided last summer, we will have more rather than fewer top level categories. The top levels are not supposed to represent an even distribution of all possible branches of knowledge. Instead, the OSC top levels should represent the largest categories that public libraries will want to use. [Similar to how Library of Congress classification was built to meet the needs of the Library of Congress, while Dewey’s system tried to contain all recorded knowledge.]

Complaints about specific topics in the top level. Remember, there is no value judgment in a topic being placed at the top level or underneath a broader topic. For now, topics like Pets, Gardening, and True Crime are present because of feedback from public librarians that these are heavily requested books that are often pulled out into their own sections. As a guiding principle, the OSC will be statistically tested, so some of our top level categories may change as actual libraries begin to reclassify their collections.

The nature of classification. Any classification system forces us to choose one topic for the book, even though that book may be about more than one topic. This is not a flaw in the OSC categories but in the nature of classification. Libraries will still use multiple subject headings in the catalog to capture all the topical aspects of the work.

Facets. As talked about a few months ago, we currently plan on the top level categories being only topical while other aspects of the work will be represented by facets. For example, format will be captured in a separate facet. [And to clear up any lingering confusion, Comics will be a format facet.] Another facet talked about was audience. This means children’s books will be tagged in the audience facet. We envision that these facets will be optional and libraries can use them if, for example, they want to pull out all the comics and shelve them in a unique section. Alternatively, the facet could be ignored and then graphic novels would be intershelved with other like topics. Here is a picture of what we are envisioning:

Classification versus Signage. The top level categories have nothing to do with signage. This is particularly true with children’s books, which can be grouped/displayed as the library desires (e.g. picture books, infants, board books, etc.).
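
To make the optional-facet idea concrete, here is a rough Python sketch of how a library might either pull one format out into its own section or ignore the facet and intershelve everything by topic. The function name `facet_shelf_key`, the record fields, and the sample titles are all invented for illustration; nothing here is an agreed part of the OSC.

```python
def facet_shelf_key(book, pull_out_formats=()):
    """Sort key: shelve by topic, unless the book's format is pulled out.

    Books whose format appears in `pull_out_formats` sort into their own
    section after the topical shelves; all others are intershelved by topic.
    """
    if book["format"] in pull_out_formats:
        return (1, book["format"], book["topic"], book["title"])
    return (0, book["topic"], "", book["title"])


# Invented sample records, for illustration only.
books = [
    {"title": "A History Comic", "topic": "History", "format": "Comics"},
    {"title": "A History Text", "topic": "History", "format": "Book"},
    {"title": "A Pet Guide", "topic": "Pets", "format": "Book"},
]

# Facet ignored: the comic sits with the other History books.
intershelved = [b["title"] for b in sorted(books, key=facet_shelf_key)]

# Facet used: comics are pulled out into their own section.
pulled_out = [
    b["title"]
    for b in sorted(books, key=lambda b: facet_shelf_key(b, ("Comics",)))
]
```

The same records serve both libraries; only the sort key changes, which is the sense in which the facets are optional.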

We will be posting an updated version of the top levels very soon, so stay tuned!

Labels: Open Shelves Classification, OSC

Monday, February 2nd, 2009

OSC gets the once-over at ALA in Denver

As most of you know, back in July the Open Shelves Classification was conceived as a free, crowdsourced alternative to the Dewey Decimal System. The Group has been very active during initial development, and the top levels are being heatedly debated.

David, Tim and I held an OSC open-discussion at the American Library Association (ALA) conference in Denver. A great group of people participated in a lively debate about the project.

To summarize:

There was some room confusion with the Marriott and, unfortunately, many people left before it was all figured out.

10 people attended: Tim, Laena, David, a mix of public librarians, academic librarians, and one interested non-librarian. The librarians were catalogers, reference librarians, and one library director.

Comments during the meeting included:

The random works feature is not that useful because half of all the works are fiction and fiction is not broken out at the top level.

If a public library may reasonably want to aggregate at a certain level (e.g. fiction or science) then it should exist as a top level. No one aggregates at non-fiction, hence it is not useful.

Working on the second level for fiction should happen sooner rather than later.

Children’s books are a challenge.

Perhaps using an audience facet would help (for example, CH, YA)?

  • Yes, but the topics of some books are hard to determine. Should they be put in fiction? If so, a scope note is needed.
  • Speaking of which, there is no good way of dealing with series when written by separate authors, like Spongebob Books.

How should series be handled in a classification?

The Darien library is reorganizing their collection, particularly children’s books, in interesting ways (here, you can listen to Gretchen Hams tell you all about it).

For OSC to be successful, it must be easy to implement for public libraries.

It must be inexpensive to go from DDC to OSC.

A crosswalk is essential!

  • There needs to be a way to determine how much space is needed ahead of time to move the books around.
  • It must be easy to print labels.
  • Backstage Library Works was a company that moved Duke University Libraries from DDC to LC, so there must be models out there on how to do this.
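
As a rough illustration of the crosswalk and space-planning points above, a sketch in Python might look like the following. The DDC-to-OSC mappings, the function names, and the figure of 30 books per shelf are all placeholder assumptions, not an agreed crosswalk, and the code assumes DDC numbers written with a three-digit class (e.g. 004.6, 599.75).

```python
from collections import Counter

# Hypothetical crosswalk from DDC hundreds classes to OSC top levels.
# These mappings are placeholders, not an agreed OSC crosswalk.
CROSSWALK = {"200": "Religion", "500": "Science", "700": "Art", "900": "History"}


def osc_for(ddc_number):
    """Map a DDC number such as '599.75' to an OSC category via its hundreds class."""
    hundreds = ddc_number.strip()[0] + "00"
    return CROSSWALK.get(hundreds, "Unmapped")


def shelf_space(ddc_numbers, books_per_shelf=30):
    """Estimate whole shelves needed per OSC category (assumed books per shelf)."""
    counts = Counter(osc_for(n) for n in ddc_numbers)
    # -(-n // k) is ceiling division: a partly filled shelf still counts.
    return {cat: -(-n // books_per_shelf) for cat, n in counts.items()}


# Invented collection: 45 science books, 10 religion books, 1 unmapped.
collection = ["599.75"] * 45 + ["231"] * 10 + ["004.6"]
estimate = shelf_space(collection)
```

Anything that lands in "Unmapped" flags a gap in the crosswalk that would need a human decision before books are moved.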

An audience facet would be a good way to handle reading level as well, either by grade or age.

  • Example: 0-1, 1-3, 9-10, etc.
  • There is a tension between having too many optional facets and universality.

The facets need to transcend stickering, the current practice in most public libraries.

We need a reality check before getting too far down the road with proposed schedules for OSC: will it work in an actual library?

  • We could upload a library’s MARC records into LT and try it there virtually before asking a library to use it.
  • Two potential public libraries were listed as testing grounds.

So far, the top levels testing on LibraryThing has provided the following results:

  • 56,641 acts of classification
  • By 1000+ users
  • On 22,000+ works

What are the biggest tags in LibraryThing, can we use those to determine the levels?

  • They were looked at and evaluated, hence True Crime is a top level.
  • This can’t really be done in an automated way.

What is the product plan for OSC?

  • The data is open source & free.
  • If people want to package services around the data (such as reclassifying books for you), then that is a possibility, but we do not see this developing for at least a year or so.

What does “shelf-ready” mean?

  • A vendor puts on labels, dust jackets, tattle tape, creates catalog records for a public library.
  • Different people at the meeting had differing levels of success with outsourcing their books to be made shelf-ready by vendors.

Is bleed over between categories in OSC a bug or a feature?

  • Memoirs/Autobiographies was seen as a bug.
  • Others such as Pets/ScienceAnimals were not seen that way.

Putting categories in an order may help people’s confusion of where to put things.

  • This is called “flow” in bookstores.
  • E.g. Cooking—Health—Sports or Biography—History—Poly Sci

Confusion arose over facets.

  • You add and delete depending on the library’s needs.
  • Huge collection? Use them all. Small and only need Science and Religion? Go ahead, the system is flexible.

The top level testing will stop and the levels will begin to be re-worked this week.

How should Art, Architecture, Design, and Photography be handled?

  • After much discussion, the consensus was reached that Art, Architecture, and Design should be separate top level categories, but that Photography would go under Art.

The first test round has been closed. Visit the Open Shelves Classification group for details.

Meeting like this was great and very helpful in making OSC usable. Another meeting is planned in New York for early April–we’ll keep you posted!

Labels: Open Shelves Classification, OSC

Tuesday, January 20th, 2009

Open Shelves Classification: First draft live and at ALA Midwinter

If you’re at ALA Midwinter in Denver on Saturday, come talk about this interesting new project. See below for details.

Back in July I blogged to start something called the Open Shelves Classification, a free, crowdsourced alternative to the Dewey Decimal System, and created a Group for it. Soon afterward two librarians, Laena M. McCarthy of the Pratt Institute and David Conners of Haverford, took over leadership of the project. For the past six months they and a growing contingent of LibraryThing members, some librarians, some not, have been working to come up with basic principles and working on pieces and on the numbering system. They’ve also done some interesting work testing the proposed top level against real library records. Much of their work is collected on the Open Shelves Classification Wiki. Laena did a nice post on the OSC on the Public Library Association blog.

The OSC team has reached some agreement on a first draft of the “top level categories,” some fifty categories that, it is hoped, all books fit into somewhere. And you are invited to help classify works in LibraryThing!

Want to help? Go to a work page in LibraryThing and scroll down to the bottom. You’ll find a chart of the top-level categories. If you see a good match, click on it. You’ll be prompted to say whether you know the book yourself or not. And then you’ll get to see how your classification vote matches up with everyone else’s on the site.

You can classify anything in LibraryThing. If you want to help the most, however, click the “Find a random work” link here or below the classification chart. It’ll take you to a random work, but also contrive to get multiple members classifying the same works. The idea is that it’ll give us a good idea what categories are easy and obvious, and which are causing doubt.
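
For the curious, here is a minimal Python sketch of how votes on the same work could show which categories are obvious and which cause doubt. It is not the code LibraryThing runs; the function name `agreement`, the works, and the votes are invented for illustration.

```python
from collections import Counter


def agreement(votes):
    """Return (top_category, share_of_votes) for one work's list of votes."""
    counts = Counter(votes)
    top_category, top_count = counts.most_common(1)[0]
    return top_category, top_count / len(votes)


# Invented votes for two works; the category names are placeholders.
votes_by_work = {
    "work A": ["Cooking", "Cooking", "Cooking", "Cooking"],
    "work B": ["Science", "Pets", "Science", "Gardening"],
}
for work, votes in votes_by_work.items():
    category, share = agreement(votes)
    print(f"{work}: {category} ({share:.0%} agreement)")
```

A work where everyone picks the same category suggests an easy call; a work where the top choice gets only half the votes suggests the categories overlap or need a scope note.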

Whatever you find, come and talk about it on the Open Shelves Classification group.

In Denver on Saturday? Laena and David are going to be at the ALA Midwinter show in Denver this weekend. (So are Sonya, Casey and I.) To move the OSC along we reserved a conference room at the Courtyard Marriott (Google Maps) from 1-3pm on Saturday, January 24th. Anyone at ALA is invited to come, as indeed are regular LibraryThing members–the Courtyard is outside the velvet rope.

Labels: ala, ALA midwinter, alamw2009, Open Shelves Classification, OSC

Sunday, December 21st, 2008

uClassify library mashup? (with prize!)

I keep up with the Museum of Modern Betas* and today it found something wonderful: uClassify.

uClassify is a place where you can build, train and use automatic classification systems. It’s free, and can be handled either on the website or via an API. Of course, this sort of thing was possible before uClassify, but you needed specialized tools. Now anyone can do it—on a whim.

Their examples are geared toward the simple:

  • Text language. What language is some text in?
  • Gender. Did a man or a woman write the blog? It was made for genderanalyzer.com (It’s right only 63% of the time.)
  • Mood.
  • Which classical author is your text most like? Used on oFaust.com (this blog is Edgar Allan Poe).

Where did I lose the librarians—mood? But wait, come back! The language classifier works very well. It managed to suss-out Norwegian, Swedish and Dutch reviews of the Hobbit.** So what if the others are trivial? The idea is solid. Create a classification. Feed it data and the right answer. Watch it get better and better.

Now, I’m a skeptic of automatic classification in the library world. There’s a big difference between spam/not-spam and, say, giving a book Library of Congress Subject Headings. But it’s worth testing. And, even if “real” classification is not amenable to automatic processes, there must be other interesting book- and library-related projects.

The Prize! So, LibraryThing calls on the book and library worlds to create something cool with uClassify by February 1, 2009 and post it here. The winner gets Toby Segaran’s Programming Collective Intelligence and a $100 gift certificate to Amazon or IndieBound. You can do it by hand or programmatically. If you use a lot of LibraryThing data, and it’s not one of the sets we release openly, shoot me an email about what you’re doing and I’ll give you the green light.

Some ideas. My idea list…

  • Fiction vs. Non-Fiction. Feed it Amazon data, Common Knowledge or LT tags.***
  • DDC. Train it with Amazon’s DDC numbers and book descriptions. Do ten thousand books and see how well it’s guessing the rest.
  • Do a crosswalk, e.g., DDC to LCC, BISAC to DDC, DDC to Cutter, etc.
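
To show what "feed it data and the right answer" means in practice, here is a small sketch of the Fiction vs. Non-Fiction idea in plain Python. It is a generic naive Bayes classifier written for illustration, not uClassify's API or necessarily its algorithm, and the four training texts are invented; real input would be tags or book descriptions in the thousands.

```python
from collections import Counter, defaultdict
from math import log


class NaiveBayes:
    """A tiny multinomial naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior, then add a log likelihood per word.
            score = log(self.doc_counts[label] / total_docs)
            for word in words:
                count = self.word_counts[label][word] + 1
                score += log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration.
training = [
    ("a novel of love and betrayal", "fiction"),
    ("a gripping mystery novel", "fiction"),
    ("a history of the roman empire", "non-fiction"),
    ("a practical guide to gardening", "non-fiction"),
]
model = NaiveBayes()
for text, label in training:
    model.train(text, label)

guess = model.classify("a mystery novel")
```

The test in the DDC idea above is the same loop at scale: train on ten thousand labeled books, hold the rest back, and count how often the guess matches the known answer.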

Merry data-driven Christmas!


*A website that tracks new “betas.” Basically, it tracks new web 2.0 apps. It also keeps tabs on their popularity, according to Delicious bookmarks. LibraryThing is now number 12, beating out Gmail. Life isn’t fair.
**Yes, we’re going to get it going for reviews on the site itself. Give us some time. Cool as it is, we’re pretty busy right now. Note: You can’t give it the URL alone. You have to give it the text of the review.
***We may do this with tags. We already do it very crudely, using it only for book recommendations.

Labels: Dewey Decimal Classification, Open Shelves Classification, uclassify

Wednesday, August 6th, 2008

Open Shelves Classification: Update and Summary

Note: This post was created by David and Laena, but reposted by me for a stupid technical reason. (Tim)

Hello Librarythingers, librarians and classification fans, we are happy to join you as facilitators of this exciting project! To learn more about us, see Tim’s introduction. Hadrian’s library (above) seemed an appropriate illustration, as we strive to create a new system upon the building blocks of the old.

To reiterate the initial goals of the project, Open Shelves Classification (OSC) is a free, “humble,” modern, open-source, crowd-sourced replacement for the Dewey Decimal System.

It will also be:

  • Collaboratively written. The OSC itself should be written socially–slowly, with great care and testing–but socially. (This is already underway via the group Build the Open Shelves Classification and the LibraryThing Wiki.)
  • Collaboratively assigned. As each level of OSC is proposed and ratified, members will be invited to catalog LibraryThing’s books according to it. (Using LibraryThing’s fielded bibliographic wiki, Common Knowledge.)

And include:

  • Progressive development. Written “level-by-level” (DDC’s classes, divisions, etc.), in a process of discussion, schedule proposals, adoption of a tentative schedule, collaborative assignment of a large number of books, statistical testing, more discussion, revision and “solidification.” This has already begun.
  • Public-library focus. LibraryThing members are not predominantly academics, and academic collections, being larger, are less likely to change to a new system. Also, academic collections mostly use the Library of Congress System, which is already in the public domain. This is also the place and audience that has demonstrated the most need for change (see BISAC and other non-Dewey conversions already underway).
  • Statistical testing. As far as we are aware, no classification system has ever been tested statistically as it was built. Yet there are various interesting ways of doing just that. For example, it would be good to see how a proposed shelf-order matches up against other systems, like DDC, LCC, LCSH and tagging. If a statistical cluster in one of these systems ends up dispersed in OSC, why?
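
As one hedged example of what such a statistical test could look like, the Python sketch below measures how widely the books of each class in an existing system are scattered across proposed OSC categories, using entropy as the measure of spread. The choice of entropy, the function name `dispersion`, and the sample assignments are our own illustration, not a method the project has adopted.

```python
from collections import Counter, defaultdict
from math import log2


def dispersion(pairs):
    """For each source class, measure how spread out its books are in OSC.

    `pairs` is an iterable of (source_class, osc_category) tuples, one per
    book. Returns {source_class: entropy in bits}; 0.0 means every book in
    that class landed in a single OSC category.
    """
    by_class = defaultdict(Counter)
    for source_class, osc_category in pairs:
        by_class[source_class][osc_category] += 1

    result = {}
    for source_class, counts in by_class.items():
        total = sum(counts.values())
        result[source_class] = sum(
            (n / total) * log2(total / n) for n in counts.values()
        )
    return result


# Made-up assignments: (DDC hundreds class, proposed OSC top level).
sample = [
    ("500", "Science"), ("500", "Science"),
    ("600", "Cooking"), ("600", "Pets"), ("600", "Gardening"), ("600", "Health"),
]
scores = dispersion(sample)
```

A high score does not mean the OSC is wrong; it marks a cluster worth a closer look and a conversation about why it ended up dispersed.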


Where are we now? Since its inception, there has been consistent and productive discussion on the LibraryThing group Build the Open Shelves Classification, and circeus began an excellent wiki Open Shelves Classification that summarizes the current OSC consensus. The wiki is where the work will be staged as it is developed by all of us. So far, the wiki includes consensus on materials that must be included, call number requirements and proposed scheme, and the choice of top-level classes.

Where do we go from here? We feel that the most important issues to determine are:

  • Top-level classes. Findability is key. Terms need to be familiar and clear (not abstract) and relevant to the public library audience and their needs, with roughly 12-15 categories in all. Library data would be very helpful here! (OSC is focusing more on task (what people find: history, gardening, sci-fi) versus audience (who is finding: children, women, dogs) when determining top-level terms.)
  • Alpha-numeric decisions and punctuation. TBD. A numbered system that doesn’t require equal digits is so far the most popular format (10.6.245.20). As for punctuation, the debate continues–dots or dashes?
  • Factors to be determined locally or at a later stage of development. We need to be as focused and specific in our tasks as possible, and there are many decisions we will not be undertaking. (For example, Cutter numbers and possibly non-book materials.)
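
To illustrate why a numbered system without equal digits still shelves cleanly, here is a minimal Python sketch that sorts dotted call numbers segment by segment. The function name `call_number_key` and the sample call numbers are invented for illustration; the separator is a parameter because the dots-or-dashes question is still open.

```python
def call_number_key(call_number, sep="."):
    """Turn a call number like '10.6.245.20' into a tuple of integers.

    Tuples of integers compare segment by segment, so '10.6' files before
    '10.12' even though the plain string '10.12' would sort first.
    """
    return tuple(int(part) for part in call_number.split(sep))


# Hypothetical call numbers, for illustration only.
books = ["10.12", "10.6.245.20", "2.40", "10.6.3"]
shelf_order = sorted(books, key=call_number_key)
print(shelf_order)  # ['2.40', '10.6.3', '10.6.245.20', '10.12']
```

Either punctuation choice works the same way: `call_number_key("10-6", sep="-")` yields the same kind of key.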

David and I are simply facilitators, and we need LibraryThing Members to help monitor threads and contribute valuable content. Please comment below if you want to volunteer to monitor a particular thread to make sure we do not miss anything. Also, people should continue to add content to the wiki as consensus emerges from the threads. Although theoretical discussion is fascinating, examples from your library or your personal experience are what will make the OSC usable.

We look forward to working with our fellow LibraryThing members!

Labels: Open Shelves Classification, OSC

Tuesday, August 5th, 2008

Open Shelves Classification: Welcome Laena and David

Back in July I proposed the Open Shelves Classification (OSC), a new, free, crowdsourced replacement for the Dewey Decimal System. I also created a group to start in on the project.

The proposal included a call for a volunteer to lead the group. I was happy to write the software, and members would create the OSC, but someone with a library degree was needed to shepherd the project and make the occasional tough decision.

I’ve found two: the LIS team of Laena McCarthy and David Conners. It turns out, I already knew them. Abby and I met with Laena and David, back at ALA 2007, when they were MLS students doing a joint LibraryThing-related project called Folksonomies in Action. They impressed us then. It was extraordinary to talk to librarians with a deep understanding and creative take on the ideas LibraryThing was exploring. Since then Laena and David have started promising careers as librarians and professors. So, after receiving word they were interested in the project, we are only too happy to bring them on.

Laena M. McCarthy (user: laena). Laena is currently an Assistant Professor and Image Cataloger at the Pratt Institute in NYC. Her bio contains the priceless bit:

“Previously, she worked in Antarctica as the world’s Southernmost librarian, where she provided a remote research station with access to information. She incorporated into the library the first permanent art gallery in Antarctica.”

Laena’s teaching and research focus on the application of bottom-up, usability-centric design and collaboration. She is currently researching image tagging, FRBR for works of art & architecture, and information architecture. Her work has been published in Library Journal and the forthcoming Magazines for Libraries 2008.

In her free time, among other things, she can be found making jam, competing in food competitions, scuba diving and writing.

David Conners (user: conners). David is the Digital Collections Librarian at Haverford College in Pennsylvania. At Haverford, David works to make the College’s unique materials, such as the first organized protest against slavery in the New World, available online. He also oversees the College’s oral history program and the audio component of Special Collections exhibits such as “A Few Well Selected Books.”

David’s research interests include subject analysis, FRBR, and, occasionally, doped ablators. His work has been published in Library Journal, The Serials Librarian, and Physics of Plasmas.

The torch is passed! From this point on, it’s their project to direct. But we’re in agreement on their role: They aren’t royalty, they’re facilitators. They’re there to listen and to encourage conversation. They’re there to guide things toward consensus. They’re there to see that the project stays on track and true to its goals. They’re there to propose forking the project or moving it elsewhere, if that’s what it needs and the community wants it.

Laena and David are doing this for fun and interest. As a fun side-project with no financial component (OSC is by definition public domain in every respect), we can’t pay them. But we’ve promised to help pay their way to LIS conferences, if someone wants them to talk about it. (At least one group already does.) And there’s the hope that, if OSC can accomplish its goals, they will have helped create something highly beneficial for libraries and library patrons everywhere.

If you’re interested in the project, come join the group and find out more.

Labels: DDC, dewey decimal, Dewey Decimal Classification, open data, Open Shelves Classification, OSC

Tuesday, July 8th, 2008

Build the Open Shelves Classification

This mural is said to depict Dewey and the railroad service he gave to Lake Placid, FL. It’s time to throw Dewey under the train.

I hereby invite you to help build the Open Shelves Classification (OSC), a free, “humble,” modern, open-source, crowd-sourced replacement for the Dewey Decimal System.

I’ve been speaking of doing something like this for a while, but I think it’s finally going to become a reality. LibraryThing members are into it and after my ALA panel talk, a number of catalogers expressed interest too. Best of all, one library director has signed on as eager to implement the system, when it becomes available. Hey, one’s a start!

The Call. I am looking for one-to-five librarians willing to take leadership on the project. LibraryThing is willing to write the (fairly minimal) code necessary, but not to lead it.

As leaders, you will be “in charge” of the project only as a facilitator and executor of a consensus. Like Wikipedia’s Jimmy Wales, your influence will depend on listening to others and exercising minimal direct power.

For a smart, newly-minted librarian, this could be a big opportunity. You won’t be paid anything, but, hey, there’s probably a paper or two in it, right?

Why it’s necessary. The Dewey Decimal System® was great for its time, but it’s outlived that. Libraries today should not be constrained by the mental models of the 1870s, doomed to tinker with an increasingly irrelevant system. Nor should they be forced into a proprietary system—copyrighted, trademarked and licensed by a single entity—expensive to adopt and encumbered by restrictions on publishing detailed schedules or coordinating necessary changes.

In recent years, a number of efforts have been made to discard Dewey in favor of other systems, such as BISAC, the “bookstore system.” But none have proved good enough for widespread adoption, and license issues remain.

The vision. The Open Shelves Classification should be:

  • Free. Free both to use and to change, with all schedules and assignments in the public domain and easily accessible in bulk format. Nothing other than common consent will keep the project at LibraryThing. Indeed, success may well entail it leaving the site entirely.
  • Modern. The OSC should map to current mental models–knowing these will eventually change, but learning from the ways other systems have and haven’t grown, and hoping to remain useful for some decades, at least.
  • Humble. No system–and least of all a one-dimensional shelf order–can get at “reality.” The goal should be to create a something limited and humble–a “pretty good” system, a “mostly obvious” system, even a “better than the rest” system–that allows library patrons to browse a collection physically and with enjoyment.
  • Collaboratively written. The OSC itself should be written socially–slowly, with great care and testing–but socially. (I imagine doing this on the LibraryThing Wiki.)
  • Collaboratively assigned. As each level of OSC is proposed and ratified, members will be invited to catalog LibraryThing’s books according to it. (I imagine using LibraryThing’s fielded bibliographic wiki, Common Knowledge.)

I also favor:

  • Progressive development. I see members writing it “level-by-level” (DDC’s classes, divisions, etc.), in a process of discussion, schedule proposals, adoption of a tentative schedule, collaborative assignment of a large number of books, statistical testing, more discussion, revision and “solidification.”
  • Public-library focus. LibraryThing members are not predominantly academics, and academic collections, being larger, are less likely to change to a new system. Also, academic collections mostly use the Library of Congress System, which is already in the public domain.
  • Statistical testing. To my knowledge, no classification system has ever been tested statistically as it was built. Yet there are various interesting ways of doing just that. For example, it would be good to see how a proposed shelf-order matches up against other systems, like DDC, LCC, LCSH and tagging. If a statistical cluster in one of these systems ends up dispersed in OSC, why? 

I have started a LibraryThing Group, “Build the Open Shelves Classification.” Members are invited to join, and to start working through the basic decisions.

Labels: DDC, Dewey Decimal Classification, Open Shelves Classification, OSC