In a recent AP article about mass digitization at Michigan (available here via Salon), my colleague John Wilkin was amusingly misquoted as characterizing some comments of Brewster Kahle’s as “theoretical,” when John meant “polemical.” John has a nice blog post on the subject, with responses and rejoinders from both Brewster and Carl Malamud. The question at hand is a little bit theological (with traces of both theoretical and polemical). Just how open do you have to be to be called “open,” and, much to the point, is it reasonable to call something that can be read anywhere in the world by anyone with an internet connection “locked up”?
I commend the discussion to you. It’s partly about the perfect being the enemy of the good (always a problem in policy making).
As a researcher, I find Google Books useful but full of inescapable flaws: some books are not available in full, some not at all, and some only in a worthless form–“snippets.” Plus, I don’t trust its cataloging. So, yes, there’s a great deal “closed up” about Google.
As for Michigan, I am not a student there, so while your services sound wonderful for your patrons, how are they relevant to anyone else, other than as an example of what other libraries could do? University libraries can provide terrific resources to those with privileges, but isn’t that just an archipelago of closed shops?
April 27, 2008 @ 10:06 am
Mary,
Neither Michigan nor Google can show much of work that is in copyright. Google shows snippets; UM permits search but does not display text. These works are closed in order to comply with copyright law and there is not much we can do about it.
The discussion among John, Brewster, and Carl is about work in the public domain, and you should be able to see the entire text of the public domain works that we have on line at lib.umich.edu, whether or not you are affiliated with the University of Michigan.
April 27, 2008 @ 10:36 am
> and you should be able to see the entire text
> of the public domain works that we have on line
well, we can “see” it, but only _one_page_at_a_time_…
so as far as downloading the text for an entire book,
that’s extremely difficult to do, unless we automate it.
and wilkin says that, were such automation to be done,
you would shut it down, per your contract with google.
so no, i don’t think you can really claim your books are
“open”. you might not have “locked them up”, but you
make them so inaccessible, you might as well have…
so if _anyone_ is engaging in polemics here, it’s umich.
it’s neither here nor there, because google itself is
making the scans available, in full, in one download,
for (most) public-domain books. so we can do the
o.c.r. on these scans ourselves, if we want the text.
and it’s for this very reason that i myself have disputed
brewster’s take on the situation as being inaccurate…
but honesty compels me to point out that you have not
been completely honest yourself in describing things…
-bowerbird
April 27, 2008 @ 9:31 pm
oh, and just so you don’t forget, and think that i forgot,
or never noticed… yes sir, i am quite well aware that
umichigan is _way_ ahead of the pack on the issues of
sharing access to these books with the general public.
the taxpayers of michigan have — through the effect of
decisions the u.m. library has made on their behalf —
gifted that library to the whole world via cyberspace,
and that is a present of immense worth to the world…
they didn’t keep their great library to themselves, they
_shared_ it with us, and for that, michigan can feel _proud_.
along with the new york public library — and allow me
to extend similar credit to the taxpayers of that city —
umichigan is blazing trails for other libraries to follow.
have we noticed that you two are alone on the trail?
yes we have.
and we hope other libraries will soon join you.
have we noticed the heads of other libraries involved
with google have not yet engaged us in conversation,
like you have, right here?
you bet we have.
and we’re waiting to have that conversation with them,
and very glad that we are already having it with you…
so you can be assured that we do appreciate umich,
and what you’ve done, we appreciate you very much.
but “open” is open, and “not open” is not open, and
we know what “open” means when it involves text…
it means pulling that text into our word-processors
and _playing_ with it. running concordances on it.
computing word-frequencies. playing reg-ex matches,
search-and-replace games. reflowing the paragraphs,
changing fonts, blowing the text up to 98-point type.
mixing up the chapters so they tell a different story.
all this, and more, is what it means for text to be open.
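as a rough illustration of the kind of "play" listed above, here is a minimal python sketch, assuming the full text of a book is already in hand as one plain string. the function names and the sample sentence are made up for the example; they are not from any particular tool or library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


def concordance(text, keyword, width=2):
    """List each occurrence of keyword with up to `width` words of context per side."""
    words = tokenize(text)
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            left = words[max(0, i - width):i]
            right = words[i + 1:i + 1 + width]
            hits.append(" ".join(left + [word] + right))
    return hits


# hypothetical sample text standing in for a whole public-domain book
sample = "The public domain belongs to the public. Open text is text we can play with."

freqs = word_frequencies(sample)
kwic = concordance(sample, "text")

# a simple reg-ex search-and-replace on whole words
replaced = re.sub(r"\bpublic\b", "PUBLIC", sample)

print(freqs.most_common(3))
print(kwic)
print(replaced)
```

none of this is possible one page-image at a time; each step needs the whole text as a single string.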
if you make us look at the text through a shop-window,
a different window for each page, sure we can “see” it,
but it’s really difficult for us to feel like it is _ours_…
and that’s the thing… public-domain text _is_ ours…
-bowerbird
May 1, 2008 @ 2:03 am
so i am here, on this may day, to say that
we would like our text, please. and we’re
willing to be worker-bees. we fully intend
to accept our responsibilities as _owners_
and add value to that text. our first action
will be to help umichigan correct the o.c.r.
but to proof that text, we must collect it…
so we’ll need to know exactly what kinds of
“automatic” downloading you’re forbidding.
we will need very detailed specifics on this.
i’ll take up the issue over on wilkin’s blog,
since carl malamud has already asked the
question directly there about downloading,
but if you as the head of the library system
want to give him some direction, please do.
-bowerbird
May 1, 2008 @ 2:13 am
i tried many times to post this on john’s blog,
but it won’t stick. (and now it reports it’s a
duplicate.) perhaps you can pass this along?
this post is in reply to carl’s message about a
“democracy club” that would re-mount books.
***
i’ll first say that this post is not “hypothetical”,
as carl qualified his. not at all. this is _real_…
i intend to re-mount public-domain books…
google mounted ’em, umichigan mounted ’em,
and i’ll mount ’em too, my duty as a member of
the public, to whom the public-domain belongs.
i will mount scans — from google — and text,
which i’ll get using multiple methods, such as
downloading it from google, umichigan, etc.
if you’d like the chance to discuss this with me,
do begin. we can do it here, or any public site
of your choosing. i’d like to learn about your
“powerful options” for “in situ” use of books,
and your “collection builder” tool, as i’d love to
“leverage your investment in permanent curation”.
i think i can help you with that, and will enjoy it,
especially your call to help improve your books,
as i am sure that this is something we can do…
at the same time, i will mount the books myself,
and i will provide tools for others to do it too…
so we need to view this discussion as an attempt
to find _synergy_ in our efforts. and, in that vein,
i _reject_ your implication that remounting books
on other sites is “dispersing the effort to copies in
multiple places”. no, to the contrary, i believe that
lots of copies keeps stuff safe, so is a good thing.
i’ll cooperate to make sure improvements i make
find their way to your books as well, _providing_
you will make reasonable efforts to accept them…
i repeat: this is _not_ a “hypothetical” discussion.
my intentions are concrete; i have already posted
public-domain books, like “books and culture”,
which was google’s first public-domain prototype.
i have even documented how i cleaned the o.c.r.
(took me just one hour) i’d scraped from your site.
i intend to host thousands of books myself, and
show others (with greater resources) how they too
can host tens of thousands more, up to the level
where we are collectively hosting millions of ’em.
since my intentions are firm, this dialog must be
specific. i need to know _exactly_ what you mean
by “automated downloading” that you will outlaw.
i will not flout your rules, i will live within them;
but to do that, i’ll need to know them _precisely_.
so we should begin by discussing “scraper” tools,
like the ones i build. what demands do you make
to keep such programs inside your boundaries?
-bowerbird
May 9, 2008 @ 3:05 pm