memento2009

Summary

Herbert Van de Sompel, Michael L. Nelson, Robert Sanderson, Lyudmila L. Balakireva, Scott G. Ainsworth, and Harihar Shankar, "Memento: Time Travel for the Web," arXiv:0911.1112, November 2009, 14 pp.

Abstract

The Web is ephemeral. Many resources have representations that change over time, and many of those representations are lost forever. A lucky few manage to reappear as archived resources that carry their own URIs. For example, some content management systems maintain version pages that reflect a frozen prior state of their changing resources. Archives recurrently crawl the web to obtain the actual representation of resources, and subsequently make those available via special-purpose archived resources. In both cases, the archival copies have URIs that are protocol-wise disconnected from the URI of the resource of which they represent a prior state. Indeed, the lack of temporal capabilities in the most common Web protocol, HTTP, prevents getting to an archived resource on the basis of the URI of its original. This turns accessing archived resources into a significant discovery challenge for both human and software agents, which typically involves following a multitude of links from the original to the archival resource, or of searching archives for the original URI. This paper proposes the protocol-based Memento solution to address this problem, and describes a proof-of-concept experiment that includes major servers of archival content, including Wikipedia and the Internet Archive. The Memento solution is based on existing HTTP capabilities applied in a novel way to add the temporal dimension. The result is a framework in which archived resources can seamlessly be reached via the URI of their original: protocol-based time travel for the Web.

Bibtex entry

@ARTICLE { memento2009,
    ABSTRACT = { The Web is ephemeral. Many resources have representations that change over time, and many of those representations are lost forever. A lucky few manage to reappear as archived resources that carry their own URIs. For example, some content management systems maintain version pages that reflect a frozen prior state of their changing resources. Archives recurrently crawl the web to obtain the actual representation of resources, and subsequently make those available via special-purpose archived resources. In both cases, the archival copies have URIs that are protocol-wise disconnected from the URI of the resource of which they represent a prior state. Indeed, the lack of temporal capabilities in the most common Web protocol, HTTP, prevents getting to an archived resource on the basis of the URI of its original. This turns accessing archived resources into a significant discovery challenge for both human and software agents, which typically involves following a multitude of links from the original to the archival resource, or of searching archives for the original URI. This paper proposes the protocol-based Memento solution to address this problem, and describes a proof-of-concept experiment that includes major servers of archival content, including Wikipedia and the Internet Archive. The Memento solution is based on existing HTTP capabilities applied in a novel way to add the temporal dimension. The result is a framework in which archived resources can seamlessly be reached via the URI of their original: protocol-based time travel for the Web. },
    ARXIVID = { 0911.1112v2 },
    AUTHOR = { Van de Sompel, Herbert and Nelson, Michael L. and Sanderson, Robert and Balakireva, Lyudmila L. and Ainsworth, Scott G. and Shankar, Harihar },
    JOURNAL = { arXiv },
    KEYWORDS = { Digital Libraries,Information Retrieval,Memento },
    MONTH = { nov },
    PAGES = { 14 },
    TITLE = { Memento: Time Travel for the Web },
    URL = { http://arxiv.org/abs/0911.1112 },
    YEAR = { 2009 },
    PUBDATE = { 200911 },
}

Page last modified on March 01, 2014, at 11:00 AM