Microsoft Exits the Mass Digitization Business

May 31st, 2008

Last week Microsoft announced that it will cease its Live Search Books program and the associated programs of mass digitization that it has been undertaking with many libraries. The response in the library world has generally been one of resigned sadness that the only big player other than Google is getting out of the free (to the libraries) mass digitization business. From an article in the Chronicle of Higher Education:

“Microsoft was a little slower off the mark than Google,” says Anne R. Kenney, university librarian at Cornell University. Her library has supplied both Microsoft and Google with books and articles for digitization. “It would have meant an awful lot of additional investment in this area for Microsoft to be a real competitor.”

In the same article, I am quoted as saying “The more the merrier. I don’t like a monopoly, and I like it when there’s lots of money behind an extremely important project.” I continue to wish that there were folks with deep pockets lining up to provide free digitization of the world’s library collections. Alas, there is no one in line that I know of, and with Microsoft’s departure, the only serious player is Google.

Speaking of Google (as I find myself doing rather frequently), a recent posting on Ars Technica includes the following remark, which is misleading in several ways:

If people think that corporations are the right way to access the history of human discourse, [Brewster] Kahle says they’re in for “a series of very rude shocks.” (The University of Michiagn (sic), which has thrown in its lot with Google, does not agree.)

I want to emphasize, yet again, that I completely agree with Brewster Kahle that it would be a very bad thing if a single corporation were in control of the cultural record. Indeed, it would be bad if, as is the case with much of audio and video, the control were divided up amongst several corporations. Nonprofit organizations, emphatically including research libraries, are the natural stewards of information that will be of value to society for the indefinite future, precisely because we are driven by a mission of preservation and access, rather than by profit. Good thing, then, that the University of Michigan and other universities whose collections are being digitized by Google continue to hold the original copies of their print works, and also receive and preserve copies of the image files and associated text files that are produced by Google’s nondestructive scanning of these works.

I will miss Microsoft, and I hope that others will take its place – again, the more the merrier. In the meantime, the University of Michigan Library now has well over a million digitized books in its catalog, with the number growing by thousands every day. Visit us online at www.lib.umich.edu. Our catalog will allow search of all of the digitized works, and full view of those that are in the public domain.

Will You Still Love Me Tomorrow?

May 3rd, 2008

Actually, the title of the song is “Will You Love Me Tomorrow?” It was written by Carole King and Gerry Goffin, and hit number one on the charts, sung by the Shirelles, in 1961. King covered it herself in the album, Tapestry, ten years later. The King version is accompanied by piano, instead of violins, leaves out the sha-na-na stuff, and is, although the words and notes are identical, a good bit sadder. And there is a third version, by Dave Mason, which closes the third episode of Studio 60 on the Sunset Strip, a defunct TV show by Aaron Sorkin that I like a lot. The Mason version is sort of in-between, but closer to the Shirelles, and makes an episode that otherwise ends on an up note bittersweet, which is just what Sorkin wants to do. But I digress.

What leads me to share this rather old news with anyone who might be reading this entry is that today’s technology allows me to check on all three versions in the privacy of my own home, lifting not much more than a finger. I happen to have the King version on vinyl and Studio 60 on DVD, but all three versions of the song are available on iTunes.  Studio 60 is not available legally on the web, leading one to wonder what NBC is thinking and nicely illustrating the point that the fruits of information technology are limited more by the practices of content-owners than by the technology itself.

But the technology, even the simple stuff that I am talking about here, is remarkable. When King and Goffin wrote the song, and when King sang it again in 1971, it was inconceivable that a family argument over whether a song was optimistic or pessimistic could be resolved (well, not resolved, but greatly enriched) via home access to a variety of media, all at reasonably high fidelity and low cost.  I’m proud to have sons who care about such matters and I continue to be amazed (as they are not) at what we can do.

By the way, it’s a great song, in all three versions.

John Wilkin and others on Openness and its opposites

April 26th, 2008

In a recent AP article about mass digitization at Michigan (available here via Salon), my colleague John Wilkin was amusingly misquoted as characterizing some comments of Brewster Kahle’s as “theoretical,” when John meant “polemical.” John has a nice blog post on the subject, with responses and rejoinders from both Brewster and Carl Malamud. The question at hand is a little bit theological (with traces of both theoretical and polemical). Just how open do you have to be to be called “open,” and, much to the point, is it reasonable to call something that can be read anywhere in the world by anyone with an internet connection “locked up”?

I commend the discussion to you. It’s partly about the perfect being the enemy of the good (always a problem in policy making).

On choosing a Creative Commons License

April 22nd, 2008

I recently changed the Creative Commons license on this blog from Attribution-Non Commercial to Attribution, for a number of reasons.

My reasons are all related to a general point of view about commerce, one that is highly unoriginal (having, famously, been well articulated by Adam Smith in 1776) but powerful nonetheless. The profit motive often leads to great things, and also to good small things. It’s useful to have people out there trying to make money, especially if they are trying to make money by creating works of value, rather than by defending business models and manipulating legislatures to preserve monopolies or squelch competition. As I cannot imagine anything that I write on this blog providing the key to anyone’s predatory monopolies, the remote possibility that what I write could be used in a way that would generate monetary value carries with it the associated possibility that it will be of social or personal value to a user or a reader. The more use the better, and the more that people are looking for customers and users, the better.

I also have a more political motive. I have been known to criticize the behavior of publishers from time to time, but I would not want it thought that I am generally opposed to commerce, or commercialization, and I fear that some people involved in commerce see the “Non Commercial” license as synonymous with “anti-commerce.” It’s okay with me if someone makes some money from my work, even when I don’t. And if the new use is clever or innovative, that’s even better.

The second reason, also closely related, has to do with my attitude towards copyright law. If you believe, as I do, that the purpose of copyright is to “Promote the progress of science and the useful arts”, then it is more important that the work be out in the world being read, and contributing to a larger discourse, than that strangers not be able to make money from it. One maximizes the influence of the work by maximizing potential uses of the work, recognizing that commercial uses have just as much power to promote progress as non-commercial uses, and recognizing that the constitutional basis of copyright — authorizing Congress to grant monopolies for a limited period — contemplated such uses from the get go. (Of course, the limited period has become a bad joke.)

I should point out that under this strategy – the maximizing influence strategy – one would never sign away exclusive rights to one’s own work, because exclusive rights drastically limit the potential distribution channels, and the potential impact.

So, if anyone can make money from this posting, have at it!

Oxford, Cambridge and Sage Sue Georgia State

April 16th, 2008

It is with dismay that I read in today’s New York Times that three distinguished academic presses, Oxford, Cambridge, and Sage, are suing Georgia State for copyright infringement with regard to course websites. I cannot know the merits of the case, but two points are telling. One is that the transaction seems to be between attorneys for the presses and Georgia State, not between the leadership of the universities. For all of the flowery language that we often hear from university presses about the importance of a robust nonprofit publishing sector in service to the academy, the issue here is plainly about the profits of the “nonprofit” publishing sector. Perhaps I am wrong, and the Vice-Chancellors of Oxford and Cambridge have been in touch with the President of Georgia State to discuss the missions of learning and teaching, but I’d bet not.

The second point is that, according to the Times, Cambridge University Press licenses pages for electronic reserves at 17 cents per student per page, for up to 20 percent of a book. The marginal cost to Cambridge of permitting such use is the billing cost, so the 17 cents is essentially all profit. If a student is willing to go to the effort, of course, she can take the book out of the library, and photocopy the pages for five or six cents each. The photocopy alternative is not as useful to the student as the scanned version on the course site, but note that Cambridge bears exactly none of the costs of making the scanned version available; Georgia State bears that cost.
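To make the arithmetic concrete, here is a rough sketch in Python. The per-page prices are the ones quoted above; the excerpt length and class size are purely hypothetical numbers chosen for illustration, not figures from the Times or from the lawsuit.

```python
# Prices from the post, in whole cents to avoid rounding noise.
LICENSE_CENTS_PER_PAGE = 17          # e-reserves license, per student per page
PHOTOCOPY_CENTS_PER_PAGE = (5, 6)    # rough low and high photocopy price per page

# Hypothetical course reading (assumed for illustration only).
PAGES_IN_EXCERPT = 40
STUDENTS_IN_CLASS = 30


def cost_per_student(pages, cents_per_page):
    """Cost in cents for one student to obtain the excerpt."""
    return pages * cents_per_page


licensed = cost_per_student(PAGES_IN_EXCERPT, LICENSE_CENTS_PER_PAGE)
copy_low = cost_per_student(PAGES_IN_EXCERPT, PHOTOCOPY_CENTS_PER_PAGE[0])
copy_high = cost_per_student(PAGES_IN_EXCERPT, PHOTOCOPY_CENTS_PER_PAGE[1])
licensed_class_total = licensed * STUDENTS_IN_CLASS

print(f"Licensed, per student:  ${licensed / 100:.2f}")
print(f"Photocopy, per student: ${copy_low / 100:.2f} to ${copy_high / 100:.2f}")
print(f"Licensed, whole class:  ${licensed_class_total / 100:.2f}")
```

Under these assumed numbers the license costs each student roughly three times what the photocopier would, and the gap scales directly with the length of the excerpt and the size of the class.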

Digital technologies have the capability of greatly reducing the overall social cost of making scholarly materials available to college students. Cambridge’s mission statement would seem to suggest something other than the lawsuits as a principal mode of engagement with other institutions of higher learning: “The mission of the University of Cambridge is to contribute to society through the pursuit of education, learning, and research at the highest international levels of excellence.”

Things have come to a pretty pass when academic institutions sue each other over academic matters. Even if the publishers prove to be right on the merits, the lawsuit ought to be the last resort, and student use of academic materials produced by academic institutions ought to be priced at something like marginal cost, rather than at the price that maximizes profit. And one wonders why three rich and distinguished institutions would go after an urban university that is much less well-resourced.

The Michigan of the East goes Open Access

February 16th, 2008

Since everyone else is talking about the new open access mandate from Harvard’s Faculty of Arts and Sciences, I figure I might as well jump in, too.

There are any number of details that will have to be worked out before we know how the mandate will be implemented, and we will probably never know the precise effect on the world of scholarly publishing. But the vote of the Harvard Faculty of Arts and Sciences makes a point that should be widely applauded in the academy. Harvard University Librarian Robert Darnton put it well in his op-ed before the faculty vote:

The motion before the FAS provides a way to realign the means of communication in a way that will favor learning. It will be a first step toward freeing scholarship from the stranglehold of commercial publishers by making it freely available through our own university repository. Instead of being the passive victims of the system, we can seize the initiative and take charge of it.

What almost all faculty care about almost all of the time is the dissemination and use of their work, not its commercial consequences. We have always known this, of course, although organizations that purport to speak for the interests of authors frequently place inordinate emphasis on authors’ commercial interests. What the Harvard faculty has done is give us all a visible and powerful affirmation that what really matters is academic work itself, and not the profitability of particular industries that have grown up around it.

Faculty time and effort, in research, writing, and reviewing, are by far the most valuable ingredients of scholarly publication, and there is enormous scope for universities and faculties to reclaim publication and the associated profits from commercial enterprises. The problem of limited, over-priced access to scholarship is a big one, and the more different ways we try to fix it, the better our chances that a few of them will work. The declaration by Harvard’s faculty focuses on one strategy — mandated (or at least default) deposit into institutional repositories. But more important than the choice of strategy, the declaration reminds us of how much is at stake and why it matters.

It is somewhat troubling that some academic publishers and academic societies have expressed concern that the Harvard mandate will put them at mortal risk, while merely trimming the profits of the big commercial publishers. Plainly, we in the academy have an interest in robust nonprofit scholarly publishing, but we should not fall for the idea that the only way for nonprofit publishing to survive is through policies that assure huge profits to the big players. (There is an analogy to agricultural policy here. In the name of preserving the “family farm,” governments around the world provide billions in subsidy to agribusiness.)

For now, let me repeat that the big news in the Harvard vote is that it helps all of us to focus on the main point — which is that scholarly publishing, through a variety of mechanisms, is first and foremost about making scholarship public, not making money. So, strange as it may sound coming from Ann Arbor: Go Crimson!

A Letter to the Editor of the New York Times

February 15th, 2008

I’ve always thought of blog posts as basically being open letters to some editor or other. In this case, I attempt to take the New York Times to task for coding Hillary Clinton as the winner of the Democratic primaries in Michigan and Florida. In both states the Democratic National Committee promised NOT to seat any delegates elected in those primaries. None of the major candidates campaigned at all in either state; Kucinich was the exception that proved the rule in Michigan.  Mike Gravel was on the ballot, but didn’t campaign, or if he did I missed it.

Anyhow, here is my letter:
In today’s (Feb. 15) Times, you credit Senator Clinton with having won primaries in Michigan and Florida. Well, yes, but those primaries were essentially uncontested, because the Democratic National Committee, in an effort to prevent large and diverse states from voting before Feb. 5, ruled that delegates from Michigan and Florida would not be seated and candidates who campaigned actively would be punished. (Obama and Edwards were not on the ballot in Michigan, and the Party did not even count write-in votes.) It is bad enough to have been deprived of my franchise. The Times should not compound the insult by mischaracterizing the event. I don’t know if your delegate counts include Michigan and Florida. If they do, the counts are corrupt. In any case, Michigan and Florida should be marked on your map as “no contest and no delegates” rather than as victories for Senator Clinton.

Paul N. Courant

One Million Digitized Books

February 2nd, 2008

Today the University of Michigan Library is celebrating a significant milestone: We have just put the one millionth book digitized from our collections online. (I recommend clicking on the link. The page is pretty cool.) As far as I know, Michigan is the first library to have one million books from its own collections digitized and available for search (and, when in the public domain, available for viewing).

One million is a big number, but this is just the beginning. Michigan is on track to digitize its entire collection of over 7.5 million bound volumes by early in the next decade. So far we have only glimpsed the kinds of new and innovative uses that can be made of large bodies of digitized books, and it is thrilling to imagine what will be possible when nearly all the holdings of a leading research library are digitized and searchable from any computer in the world.

Yesterday the Library had a party to recognize all the people who made this milestone possible. A lot of books have to be barcoded, moved, and moved again in order for a project like this to work, and there are many parts of the process where people could simply have questioned whether the effort was worth it. To the enormous credit of our library, there has been tremendous enthusiasm for both the work and its purposes. We all eagerly await (and it won’t be long) the next million, and the millions after that.

MPAA Bad, Universities Good

January 25th, 2008

From yesterday’s Chronicle of Higher Education:

In 2005, when the Motion Picture Association of America stepped up its campaign against college movie pirates, officials with the trade group said that 44 percent of the film industry’s domestic losses were the result of illegal downloads on campus networks.

That statistic — which came from a report by a research firm called L.E.K. — was certainly striking. But it was also wrong, MPAA officials now say. According to the Associated Press, a “human error” compromised the study: In fact, the MPAA says, just 15 percent of the movie industry’s domestic losses can be attributed to campus piracy.

Like most humans, I am overwhelmingly sympathetic to human error. I am less sympathetic to using data that are subject to error to push people around and to lobby for draconian legislation, while refusing to make the underlying data available for study and examination. The University of Michigan asked the MPAA for their study years ago, and has also asked the Recording Industry Association of America (RIAA) to provide the data upon which they base their own remarkable claims about the prevalence of file sharing of copyrighted materials on college campuses. This university, and many others, have great expertise in the analysis of such data. In an important sense, it’s what we do. We also have a culture of openness, in which we allow others to examine and criticize our work.

The MPAA reports that it is going to have an independent third party check the original study and report on it. Here’s an idea: Why don’t they simply make the study public and let the world have at it?

Partly as a result of the deeply flawed MPAA study, Congress asked the University of Michigan to respond to questions about file sharing. We post such things, of course, and the questions and answers are available on our copyright website under “House Judiciary Committee Survey of University Network and Data Integrity Practices”.

And, as long as I’m talking about our friends in the big media companies, it’s worth noting that AOL, a much bigger Internet Service Provider than all of the colleges and universities put together, is owned by Time Warner, which in turn is a member of both the RIAA and the MPAA. Surely a great deal of illegal filesharing is undertaken by AOL users. It is puzzling that RIAA and MPAA want colleges and universities to employ mechanical measures that would restrict what their students can do, but they have not pressed AOL to impose the same restrictions. (Actually, it’s not puzzling at all, but it ought to be.)

Recessions and Libraries

January 23rd, 2008

In this post, I get to be both an economist and a librarian. I want to argue that recessions pose at least two kinds of problems for academic libraries, one of them quite obvious, the other one less so.

The obvious problem is that recessions bring with them reductions in income – the stuff that state legislatures and student households use to support universities, and wealth – the stuff that constitutes university endowments. Much has been written recently about the terrific endowment growth that universities experienced during fiscal year 2007. Well, the stock market has been falling quite sharply for the last several months, and I’ll bet that the number of universities whose endowments grow appreciably in the current fiscal year will be fairly small. So, the sources of the money that we spend on collections and services are likely to be under stress in the next year or so, and libraries will get to share in some of the pain that our institutions will experience. (There is a longer discussion, that I will provide at some point, about the problems that arise when institutions that collect with an eye to the needs of users over years and decades have to deal with the vagaries of budgets that bounce around from year to year. Briefly, it would be good for both us and our universities to try to smooth out the effects of the business cycle, but that’s not easy to do.)

The less obvious problem has to do with the indirect effects that a recession will have on the behavior of publishers and media companies as they continue to press Congress for protection against all and sundry, most emphatically including libraries. Bill Patry has a number of nice discussions of the remarkably disingenuous rhetorical turn undertaken by publishers, the RIAA, and other representatives of copyright holders as they cloak simple greed in the language of the moral high ground.

Claims that copyright involves human rights or is a property right are based on the theory that copyright is also a natural right — a right that exists independent of legislative enactment, even if there are legislative enactments. In the United States, copyright is not a natural right, since the Supreme Court has said so twice, first in 1834 in Wheaton v. Peters, and then in 1932 in Fox Film Corp. v. Doyal. Yet, rhetoric based on a natural rights basis for copyright are behind all the claims that those who use copyrighted works without permission are thieves or pirates. If copyright is instead a limited privilege that parcels out limited control to copyright owners, one might view issues differently. [Patry Copyright Blog, Jan. 18, 2008]

What does this have to do with recessions? Well, one of the things that happens when times are tough is that those who are having tough times seek public relief. Quite appropriately, Congress and the President are now working on developing a stimulus package to aid the economy as a whole, and the Federal Reserve has just implemented a cut in interest rates designed to forestall a recession. But individual industries will also seek specialized relief, and will attribute their problems to causes for which they have favorite cures. The favorite cure for the media companies, of course, is ever-tighter intellectual property laws, with ever-greater limitation (or at least a climate of fear) around legitimate fair uses. Just watch, if a recession unfolds, as the media go back to Congress and ask for protection against the public and the libraries, even though the causes of their current problems are changes in technology to which they have adapted badly, as well as the recession itself.