This document is http://videdot.com/ideas/fight-404

Fighting the spread of 404 Not Found

What practices and tools can we put to use to stem the spread of Web resources vanishing behind a 404 Not Found response?

Firstly, better management of the locations we create and publish. This broadly amounts to thinking about where we put things and decoupling what things are from how we serve them today. Thus, http://example.org/cgi-bin/flibble/v2.1/serveResult.pl?document_id=32315 is a "bad" name for a document, and is likely to be vulnerable to various changes in the server-side technologies used to serve that document. This argument is best summed up in the Berners-Lee rant on the topic.
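As a rough illustration of that decoupling, the sketch below keeps a stable public path in front of the script-based address from the example above. The public path /documents/flibble-report, the PUBLIC_PATHS table and the backend_url function are all hypothetical names invented for this sketch; only the cgi-bin address and the document id come from the example.

```python
from urllib.parse import urlencode

# Hypothetical table: stable public path -> internal document id.
# The public path is what we publish; it says nothing about how we serve it.
PUBLIC_PATHS = {
    "/documents/flibble-report": 32315,
}

# Today's implementation detail. If the serving technology changes,
# only this value changes, and the published paths stay the same.
BACKEND_SCRIPT = "/cgi-bin/flibble/v2.1/serveResult.pl"


def backend_url(public_path):
    """Return the internal URL currently serving public_path, or None if unknown."""
    doc_id = PUBLIC_PATHS.get(public_path)
    if doc_id is None:
        return None
    return BACKEND_SCRIPT + "?" + urlencode({"document_id": doc_id})
```

If the Perl script is later replaced, only BACKEND_SCRIPT (or the body of backend_url) needs to change, and links to the public path keep working.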

Secondly, in domains where we assert some control we can be diligent in tracking the 404s we serve, and build the appropriate wiring to send people merrily on their way. More to follow, particularly about specific tools and packages to manage the logs and subsequent redirection tables.
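Until those specific tools are written up, here is a minimal sketch of the idea, assuming access logs in the Common Log Format and a hand-maintained redirection table. The names count_404s, REDIRECTS and resolve, and the old path in the table, are hypothetical and only for illustration; this is not a reference to any particular package.

```python
import re
from collections import Counter

# Pulls the request path and status code out of a Common Log Format line,
# e.g. ... "GET /some/path HTTP/1.0" 404 209
LOG_LINE = re.compile(r'"(?:[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')


def count_404s(lines):
    """Count how often each path was answered with a 404."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts


# Hypothetical redirection table: old location -> current location.
REDIRECTS = {
    "/old/fight-404.html": "/ideas/fight-404",
}


def resolve(path):
    """Return (status, location): a 301 to the new home if known, else 404."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, None
```

The paths that count_404s reports most often are the first candidates for a new entry in the redirection table.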

Lastly, for domains that have fallen into disuse, perhaps because a company or organisation has folded or has simply not kept up its registration of the domain, we can seek solutions to move that domain into some permanent archive. This may require clever diplomatic approaches to how we assign names and numbers, in order to provide long-term names. More to follow, particularly about timeless domains and URNs, plus how projects like archive.org fit into this broad world view.

References to follow, especially with respect to academic publishing.
