title: Buckets: Smart Objects for Digital Libraries
authors: Michael L. Nelson
date: August, 2000
report: PhD Dissertation

Discussion of digital libraries (DLs) is often dominated by the merits of various archives, repositories, search engines, search interfaces and database systems. While these technologies are necessary for information management, information content and information retrieval systems should progress on independent paths and each should make limited assumptions about the status or capabilities of the other. Information content is more important than the systems used for its storage and retrieval. Digital information should have the same long-term survivability prospects as traditional hardcopy information and should not be impacted by evolving search engine technologies or vendor vagaries in database management systems.

Buckets are an aggregative, intelligent construct for publishing in DLs that allows the decoupling of information content from information storage and retrieval. Buckets exist within the Smart Objects and Dumb Archives model for DLs in that we "push down" many of the functionalities and responsibilities traditionally associated with archives (making the archives "dumber") into the buckets (making them "smarter"). Some of the responsibilities assigned to buckets are the enforcement of their terms and conditions, and the maintenance and display of their contents. These additional responsibilities come at the cost of storage overhead and increased complexity for the archived objects. However, tools have been developed to manage the complexity, and storage is cheap and getting cheaper; the potential benefits buckets offer DL applications appear to outweigh their costs.

We describe the motivation, design and implementation of buckets, and introduce two modified forms of buckets: a "dumb archive" (DA) and the Bucket Communication Space (BCS). The DA is a slightly modified bucket that performs simple set management functions. The BCS, also a modified bucket, provides a well-known location for buckets to gain access to centralized bucket services, such as similarity matching, messaging and metadata conversion. We also discuss lessons learned from using buckets in the NCSTRL+ and Universal Preprint Server (UPS) experimental digital libraries. We conclude with comparisons to related work, a discussion of possible areas for future work involving buckets, and the impact of buckets to date among early adopters.
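
The division of labor described above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the class names `Bucket` and `DumbArchive`, their methods, and the user-list access rule are all hypothetical, chosen only to show a "smart" object that enforces its own terms and conditions and displays its own contents, alongside a "dumb" archive that does nothing beyond simple set management.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical smart object: carries its content, metadata, and access rules."""
    bucket_id: str
    metadata: dict
    elements: dict = field(default_factory=dict)      # element name -> content bytes
    allowed_users: set = field(default_factory=set)   # assumption: empty set = open access

    def add_element(self, name: str, content: bytes) -> None:
        self.elements[name] = content

    def _check_terms(self, user: str) -> None:
        # The bucket itself, not the archive, enforces its terms and conditions.
        if self.allowed_users and user not in self.allowed_users:
            raise PermissionError(f"{user} may not access {self.bucket_id}")

    def display(self, user: str) -> list:
        # The bucket is responsible for listing (displaying) its own contents.
        self._check_terms(user)
        return sorted(self.elements)

    def get_element(self, user: str, name: str) -> bytes:
        self._check_terms(user)
        return self.elements[name]


class DumbArchive:
    """Hypothetical dumb archive: only tracks which buckets belong to the set."""

    def __init__(self) -> None:
        self._ids = set()

    def add(self, bucket_id: str) -> None:
        self._ids.add(bucket_id)

    def remove(self, bucket_id: str) -> None:
        self._ids.discard(bucket_id)

    def list_ids(self) -> list:
        return sorted(self._ids)


# Usage: the archive knows only identifiers; access control lives in the bucket.
bucket = Bucket("example-bucket-1", {"title": "Example report"}, allowed_users={"alice"})
bucket.add_element("report.pdf", b"placeholder content")
archive = DumbArchive()
archive.add(bucket.bucket_id)
```

In this sketch the archive never inspects bucket contents or permissions, so swapping it for a different storage or retrieval system would not change how a bucket protects or presents what it holds.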

id: ncstrlplus.larc//nelson-phd
notes: The journal model is the Journal of the American Society for Information Science (JASIS), which uses the APA citation style.
Written Dissertation:

Dissertation (MS Word) (810 KB)

Dissertation (PDF) (7 MB)

Presentation Slides:

Presentation Slides (MS Powerpoint) (1 MB)

Presentation Slides (PDF) (2 MB)

Demos:

Bucket Demo (Quicktime) (12 MB)

Dumb Archive (DA) Demo (Quicktime) (10 MB)

Bucket Communication Space (BCS) Demo (Quicktime) (13 MB)

Other Material:

PhD Proposal

MS Thesis (PDF)

All previous publications