This document is http://videdot.com/old/project-report.html

Project Report

Final Report for MEng Part IV Degree Project in Computing at Imperial College, London.

If you are thinking about printing this document, there is no separate 'printable version' but you should read the printing guidelines at the bottom.

Abstract

The aim of this project, videdot, has been to integrate a number of developments in various fields into a single appliance-like package.

videdot takes advantage of the organ-bank of cheap PC hardware by tackling the key problems in software, and merely presuming that appropriate codecs and drivers will be available for the latest hardware of the day. Thus, as everything gets faster, cheaper or packaged into cheap hardware this system will see the benefits of these advances.

The drive here is towards more open use of the network. videdot uses collaborative filtering to suggest and speculatively record useful shows; it supports direct interaction between users to provide sensible human suggestions; and it can be hooked into any available file-sharing network through a flexible search programming interface.

videdot is built on top of BeOS, a freely available, closed source operating system.

Acknowledgements

Ian Harries
My supervisor, for agreeing to supervise me, and offering useful insight, advice and guidance.
Mary
My wife, and a wondrous source of support, care and sanity in a crazy world.
Chris Jackson
For his differing viewpoint, which has challenged my often stubborn ideas with grace and understanding
Tom Coerkamp
For not getting too upset with random calls at all hours, and the free expression of ideas over beer.
Rob Chatley
For a clarity of expression I would do well to mimic.
Gef Howie
Who is never afraid to think differently, and cogently
Nic Hargreaves
For a breath of rationality
To friends and family as yet unmentioned
Who may have wondered where I disappeared to, my apologies
Adam Kirchoff
For developing an elegant solution for driving the VideoRecorder, instead of a hack.
Be, Inc.
For delivering an operating system which works
The Press Association
For agreeing to supply structured XML TV listings
Future i
For hosting and the opportunity to change the world
The entire ArsDigita London office
For their support, wisdom, a friendly working environment and patience with my very odd timekeeping
The World Wide Web Consortium
For enabling me to produce accessible documentation for the whole project, and helping make the Web somewhere where information can actually be found and used.
BeShare folk
For their responsiveness, in building a community and showing how searching across attributed files can be so potent
Jef Raskin
For persuading me how to care about user interfaces, moving me past knowing that I did care.
The staff and patrons of Finnegan's Wake and Esquires
For coping with all of us conversing in purely technical language, in an array of mental states.
Film and Television Programme Makers - especially jms
For giving so much blood, sweat and tears to tell their story to the world. Without them there would be no video to record.
NTL
For their demonstration of how you can break a user interface to an Electronic Programme Guide

1 Introduction

The aim of videdot is to produce an appliance which runs on a variety of cheap hardware, offering a simple installation to convert a relatively ordinary PC into a dedicated appliance.

videdot is not intended as an application for existing PC users, but is aimed instead at the more general home market. This project is aimed at replacing your current VCR, rather than offering a way to record television on your PC.

There are now a number of suppliers fitting standard PC-clone organ-bank parts into small, thin, quiet cases. With this, the ever-falling price of the necessary components for a PC, and the wisdom of dedicating a box to something which will likely benefit from stealing all of the processing power, it is quite viable to build a black box solution while still using off-the-shelf parts.

This project also focusses on the personal aspect of the problem. It ought to be possible to deliver more and better information to the user, and allow them to collaborate with other users, by making full use of the growing network.

There has been incredible growth of a number of peer-to-peer networks in recent times. Rather than attempting to build a new bespoke scheme for collaboration, videdot instead allows for add-ons to be written which could hook any videdot box into a much wider network of file- and information-sharing networks.

The rest of this report is structured as follows:

The Background chapter covers what is available or in the pipeline in this field. It also looks at the peculiar nature of obtaining reliable television listings and what we can hope to gain from using the network.

The Requirements chapter examines how the requirements were gathered, as well as describing them.

We then look at the overall design, and why various decisions were made in the Design chapter.

Implementational Detail covers the way things were done in considerably more detail, in particular looking at the way listings are handled locally, and the code that runs on the various supporting servers.

After a discussion of how the project was evaluated, and the results, this report closes with a section examining the achievements, difficulties and future directions for this work.

The User Guide, more detail about the underlying operating system, BeOS, and the format of the listings feed are attached as appendices.

2 Background

2.1 Current Products

There are a number of current computer-based video recorders available. We look here at a representative sample of the available appliances and services:

TiVo

TiVo, Inc.[tivo-web] is a services company, not a manufacturer. They write the software and run the service, while they license hardware manufacturing to a number of companies - Philips, Sony, Thomson, and Hughes. In the UK, presently, Thomson are the only manufacturer supplying the devices.

TiVo's main source of revenue is its subscription service, so it is not particularly interested in selling the boxes without the TiVo service. The boxes are still usable without service, but only with manual recordings, so many of the most interesting features disappear.

Under the subscription package the TiVo device dials up daily to fetch listings from TiVo. It also receives suggestions of useful programmes, which TiVo may speculatively try to record. Software upgrades are also delivered over the dial-up connection, both to subscribed boxes and to unsubscribed boxes that are still connected to the telephone line. There is no way to back out of an upgrade, which caused some consternation recently when a new upgrade disabled various features for non-subscribers, notably the ability to record shows manually. More information can be found on the Web, supplied by a user[fsk-tivo] and in a discussion forum[avs-tivo].

The feature TiVo is probably most noted for is the ability to pause live television. You can press pause, answer the telephone, keep watching just behind your recording and perhaps even catch up during advertisement breaks.

There are a number of things that the TiVo does which are not at all obvious at first glance. It is always recording, so it carries a thirty minute cache of whatever channel you are watching through it. Largely because of this it does not use PDC: the box contains a single tuner, so it cannot tune away to a particular channel to listen for PDC codes. Curiously, it does not use this buffer when starting a recording manually, although this is a planned future feature.

TiVo's closest competitor is ReplayTV, who offer a similar product, with certain differences in where they collect their revenue and a number of different features. A third party Website offers a feature-for-feature comparison.[tivo-replay]

There are quite complete discussions of TiVo's various features on the unofficial TiVo Frequently Asked Questions site.[tivo-faq]

RecordTV

RecordTV are presently not providing their streamed Internet Television service, while they "resolve legal issues". Their service revolved around allowing people to come to a website and mark off programmes to be recorded, and then return after they had aired to download the video file. Since launching in 1999 they have faced a number of legal challenges and may well be mired in legal issues for a long time to come. Although the scheme they use is very sensible in many ways, their case as a whole highlights the dangers of building a central point without having appropriate arrangements with anyone who may feel aggrieved.

HomeChoice

HomeChoice offer a streamed video service over an Asymmetric Digital Subscriber Line (ADSL) connection. They use a particular form of ADSL which is tailored to streaming video.

The programming they offer is not ordinary television, however. The licensing agreements they have with programme producers are for specific shows, rather than scheduled programme channels. Having clear licensing arrangements for the programming they distribute puts them at a distinct legal advantage, of course.

The other manner in which HomeChoice is interesting is in their interface. They allow the user to control the video being streamed to their set-top box, and watch whatever show they like at any time.

2.2 The Near Future

There are a number of developments on the near horizon too, although it is worth bearing in mind that, until a device is actually launched, no number of press releases stating that company X is working with company Y will make it available. The most interesting developments here are those that wrap network access, streamed media and locally captured content, especially given the obviously superior quality of capturing the digital stream directly within the set-top box, compared with converting and recapturing.

All of the network providers have shied away from providing set-top boxes which output in any digital form, with the obvious concerns for piracy. Systems have been developed and demonstrated in the past, but they are not widely available. Notably the Royal household is rumoured to have a Sky

Pace

Pace have licensed the Dreamcast technology from Sega for use in set-top boxes. With a large hard-disk Pace are pushing their PVR as both a games console and a video recorder. Games would be streamed to the PVR over a broadband connection.

Due: Spring 2002, expected price £200 - £400, but this price may well not be seen by end users where, for example, a cable company still owns the hardware.

TV Linux Alliance

A collaboration involving many of the big players of the PVR world to provide a consistent, well-known layer on top of the Linux kernel and existing TV APIs, targeted particularly at embedded television and PVR use.

http://www.TVLinuxAlliance.org/

Nokia

Nokia's Media Terminal is a Linux-based home entertainment device.

Development tools for the Media Terminal are due in the 3rd quarter of 2001, while Nokia have said that the device itself will be available 'in Sweden in the middle part of 2001 and later on in the year in Europe and North America.'

Nokia and OnDigital are working together in the UK to produce a Media Terminal for the OnDigital terrestrial digital platform.

Microsoft's X-Box

The postulated features for the X-Box are still in flux. It is certainly a popular rumour that Microsoft's new console will include PVR features. It will be interesting to watch how they intend to leverage the X-Box's network capabilities with their existing interests. Past experience suggests that they will tie them together very intimately.

Sky

There was much trumpeting last year, including a press release, that Sky were working with TiVo to produce a digibox with integrated PVR capabilities, to be delivered last autumn.

Sky are also providing TiVo's technical support and billing in the UK.

Separately, Sky will also be offering a box based on NDS' XTV, according to a report on the digital spy website.

It reportedly has a 40 gigabyte hard disk and is integrated with Sky's EPG, and it is suggested that it will be able to archive recordings onto tape. It is due to roll out under the 'Sky Plus' brand sometime in .

Digital Terrestrial

OnDigital, the UK's digital terrestrial provider, have various upcoming offerings which include PVR features, the most notable of which is the Nokia Media Terminal, described above.

Cable Companies

There are now practically two cable companies in the UK, NTL and Telewest. For their digital TV offerings, both are using Liberate's platform, on various hardware, complicated somewhat by the way they have built themselves out of smaller cable companies by acquisition. In particular, NTL still have two quite distinct digital platforms, their original one, and the one they use in old Cable and Wireless areas.

Liberate runs on a number of manufacturers' hardware, typically Pace for the boxes deployed by NTL and Telewest in the UK to date. They are also listed - along with most of the industry - as partners by NDS, whose XTV system Sky will be using as a recording set-top box.

2.3 Listings Supply

There are various sources of listings data. A common method, known as 'screen scraping', is simply to scrape this information from publicly accessible sources, usually on the Web.

There are also a number of companies distributing listings commercially. It is these companies, or their clients, who typically run the sites people might scrape the data from. As businesses, they can understandably be upset by this, which raises the possibility of legal problems.

In the interests of collecting well-structured, clean data I approached the Press Association for samples of their XML feed. They agreed to supply listings for the final stages of the project, to allow for user testing and demonstration. This provided a much cleaner source of listings data, which will remain stable even when people redesign their websites.

The reasoning behind this choice is discussed in Design Decisions : Listings Data below. There are also some oddities with the Press Association's XML feed. These are covered in Implementation Detail : Fetcher below, which describes the component that parses the listings and stores the information on the local machine.

At present there is no truly free source of television listings data, although there may be a persuasive argument that the individual channels should want people to know what they are showing and when, especially as more and more channels come on to the airwaves. There may also be pressure from programme makers for this as a form of promotion to drive more viewers to their shows. However, it is well beyond the scope of this project to attempt this sort of political effort and we will instead focus on what we can best do with what is available.

2.4 Programme Delivery Control

Programme Delivery Control (PDC) is a scheme used on terrestrial broadcasts in the UK to indicate delayed and overrunning programmes.

PDC works using Programme Ident Labels (PILs), which uniquely identify a programme by its start time within a certain window (around that day).

The PIL code is broadcast as part of the teletext data on analogue television. It is first sent just before the programme starts, and then once per second throughout the broadcast of a programme.

There are a few pitfalls with using PDC:

  1. You need to know the original intended start-time for a broadcast. This is problematic if a show moves before the listings feed is sent. This is a reason to have as good a listings feed as possible, with such changes marked up.
  2. In order to listen for the PDC information, you need to be tuned to the appropriate channel. Thus, without two tuners one of the biggest benefits disappears - you cannot know when to switch channels to see if it has started yet.
  3. There are occasions where using PDC will cause problems. It is not always well advertised that a channel will be broadcasting PDC codes reliably, or whether the signal strength will necessarily remain good. Using unreliable codes is substantially worse than just ignoring them.
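The first pitfall can be illustrated with a minimal sketch. It assumes, for illustration only, that a received label can be reduced to the month, day, hour and minute of the originally scheduled start; the names `label_for` and `programme_on_air` are hypothetical and are not part of videdot, which does not use PDC.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple

# Assumed simplification of a label: (month, day, hour, minute) of the
# ORIGINALLY scheduled start, not of the actual transmission time.
Label = Tuple[int, int, int, int]


def label_for(scheduled_start: datetime) -> Label:
    """Build the label we would expect to hear for a listed programme."""
    return (scheduled_start.month, scheduled_start.day,
            scheduled_start.hour, scheduled_start.minute)


def programme_on_air(listings: Dict[str, datetime],
                     received: Label) -> Optional[str]:
    """Return the listed programme whose scheduled start matches the label.

    Returns None when the listings do not contain the original start time,
    for example because the show moved before the feed was produced.
    """
    for title, scheduled in listings.items():
        if label_for(scheduled) == received:
            return title
    return None
```

A programme that overruns or starts late still carries the label for its listed start, so the lookup succeeds only if the listings hold that original time.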

videdot doesn't use PDC presently as the benefit to making it work is outweighed by the potential for problems and confusing the user. Moreover, PDC will be substantially less available in the near future, as it has not been carried forward on to any of the digital platforms.

2.5 Network Effects

There has been huge growth in the past year of various file-sharing networks. However there is more to peer-to-peer networking than copying music files.

One of the strongest factors in a service taking off online is how well it binds its members together. In particular, one attribute which it has been claimed aided Napster's swift rise was that ordinary, selfish behaviour had a beneficial effect for all. In Napster's case, when you download a file, the default behaviour is for you to thereafter be sharing it to the rest of the Napster-using public.[P2P-2001]

Letting people collaborate, and delivering more and better results to them for participating are both core aims of this project.

The literature discussing online communities is quite broad, but particular highlights include Howard Rheingold's classic text on the social interactions online[rheingold] and Philip Greenspun's examination of building collaborative communities on the Web[philg].

3 Requirements

3.1 Elicitation

The requirements for videdot were collected in a largely informal manner, and then written up into a sensible set of goals.

The main source of information was in observing people use existing systems, and talking to them about their desires and frustrations with them. The most oft-sounded outcomes from these discussions were:

3.2 Core Features

The choice to make videdot an appliance-style project, rather than a PC application, leads to a number of choices about the interface. It ought to work simply, with single, obvious, well-labelled buttons. It is also desirable for it to work from the keyboard alone, since this translates more directly to remote control presses than an interface which needs a mouse.

For an appliance, people have different expectations, particularly over power. You do not expect to have to shut down your VCR, and people are more likely to pull the wrong plug out for their HiFi than they are for their computer. So, it should work properly and smoothly after an interruption to power, which means keeping the necessary information on physical disk, rather than in memory, and using a filesystem which copes well with unannounced power outages. Boot times in general are also a factor; waiting to run a full filesystem check on a many-gigabyte hard disk is hardly an option.

The core features, as identified at the start of the project, were:

Be reliable and predictable
Addressing the main frustration people expressed with their more recent and complicated home-entertainment devices.
Flag programmes in the listings simply
By being able to look across several channels at once for a particular time and see what will be recorded
Choose channels
This could be done by having the user wade through page upon page of channel names. On the other hand, given the large number of channels available, and the manner in which they are sold, it makes sense to allow the user simply to choose the packages they have subscribed to.
Fetch listings for chosen channels
Which will need a good source of structured listings. It is anticipated that structuring an available source will be necessary or desirable.
Search in listings and recordings
Preferably using an incremental search method, so that while the search term is typed, more refined results are displayed progressively.
Annotate recordings with all available meta-information from the listings
Storing this information obviously allows for more potent searching.
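The incremental search requirement above can be sketched as follows. This is an illustration of the idea rather than videdot's search code: the function name and the choice of case-insensitive substring matching on titles are assumptions.

```python
from typing import Iterable, List


def incremental_search(titles: Iterable[str], typed: str) -> List[List[str]]:
    """Return the result list shown after each keystroke of `typed`.

    Each step narrows the previous step's results, which is valid because
    any title containing the longer term also contains its shorter prefix.
    """
    steps: List[List[str]] = []
    candidates = list(titles)
    for length in range(1, len(typed) + 1):
        term = typed[:length].lower()
        candidates = [t for t in candidates if term in t.lower()]
        steps.append(candidates)
    return steps
```

Narrowing the previous result set, rather than searching everything again, is what keeps the display responsive as each character arrives.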

3.3 Desirable Extras

At the outset of the project, a number of features were identified as desirable, rather than essential. In the Evaluation chapter we shall look more at which of these have been implemented.

As shall be seen in the Design chapter, even where these features have not been implemented, they have generally been accommodated, to allow for more features to be added over time, past the end of this project's life as an academic project.

Scheduling prioritised recordings, and resolving conflicts intelligently
To cope with the potential clash of scheduled recordings properly. Clashes are all the more likely given 'No Click Recording'.
'No Click Recording'
Schedule for recording anything which, from historical information, looks as though it will be of interest.
Record subtitles, and allow for searching within them
This is useful as archives of recorded programmes become larger. Being able to ask for a news programme, between certain dates and mentioning 'hijack' and be returned to the correct point in the bulletin is obviously a potent feature.
Pause live TV
Could be quite hard, particularly in ensuring the system can keep up on modest hardware.
Control over the quality/space tradeoff
So that a recording can take an appropriate amount of space given the show, or squeeze a recording into whatever space may be left.
Offline storage
Preferably retaining the metadata, so that powerful, local, fast searching is still possible

3.4 Using the Network

There are a number of desirable features to use the network to its full potential.

Accepting requests over the network to schedule recordings
These can simply be treated as suggestions or, by allowing authenticated suggestions and letting the local user choose whom they trust, can provide quite powerful scheduling.
Suggesting shows, programmes or series based on others' behaviour
Even a relatively crude measure for suggesting programmes automatically could offer substantial
Requesting the recording of a particular show from a neighbour
To cope with, say, recording two things at once with only one tuner. This is a step short of sharing everything across the network.
Broad sharing of everything between a number of neighbours
Via an existing peer-to-peer network, through our own centralised point where the searching occurs, or with a bespoke scheme.
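As an illustration of suggesting programmes from others' behaviour, the following sketch scores shows by how much the users who recorded them overlap with the local user's own history. It is a deliberately crude measure chosen for this example; the function name and data layout are assumptions, not the scheme used on the collaboration server.

```python
from collections import Counter
from typing import Dict, List, Set


def suggest(histories: Dict[str, Set[str]], user: str, limit: int = 3) -> List[str]:
    """Suggest shows recorded by users whose history overlaps with `user`'s.

    Each unseen show is credited with the size of the overlap between the
    two users' histories, so like-minded users count for more.
    """
    mine = histories.get(user, set())
    scores: Counter = Counter()
    for other, theirs in histories.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # nothing in common, so no evidence of shared taste
        for show in theirs - mine:
            scores[show] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [show for show, _ in ranked[:limit]]
```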

4 Design

4.1 Key Decisions

Building on the decisions taken in determining the target audience and their requirements, the key decisions taken early on in designing videdot were:

Choice of Operating System

In choosing which operating system to build on, the main factors were:

BeOS is an operating system developed by Be, Inc. which runs on x86 and PowerPC processors.

With the particular needs of this project, the filesystem is one of BeOS' strongest points. Microsoft's NTFS has - in theory - many of the desirable features of BFS, arbitrary attributes and searching, journalling and 64-bit addressing. However, these features are not core to Windows NT, which must be able to work on a straight FAT partition, so support for them is less well-developed. Only with the next release of Windows (XP) will the OS ship with a querying tool for searching across arbitrary attributes.

Linux has growing support for a variety of attributed filesystems, but again support is not fundamental to the system. BeOS' BFS queries are not only well-supported, but also surpass XFS, JFS, ext3fs, reiserfs and NTFS in supporting both node watching and live queries. This allows for a query to keep running, and when the result set changes, messages are sent to the appropriate thread.
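The behaviour of a live query can be sketched with a small in-memory model. This is a toy stand-in written for illustration, not the BeOS API: the class, its method names and the `record` attribute are all assumptions.

```python
from typing import Callable, Dict, Set

Attributes = Dict[str, object]


class LiveQuery:
    """Toy model of a live query: keeps a result set current and reports changes."""

    def __init__(self,
                 predicate: Callable[[Attributes], bool],
                 on_change: Callable[[str, str], None]) -> None:
        self.predicate = predicate
        self.on_change = on_change  # called with ("entered" or "left", entry name)
        self.matches: Set[str] = set()

    def entry_updated(self, name: str, attrs: Attributes) -> None:
        """Re-evaluate one entry after it is created or its attributes change."""
        matches_now = self.predicate(attrs)
        if matches_now and name not in self.matches:
            self.matches.add(name)
            self.on_change("entered", name)
        elif not matches_now and name in self.matches:
            self.matches.discard(name)
            self.on_change("left", name)

    def entry_removed(self, name: str) -> None:
        """Drop a deleted entry from the result set, if it was in it."""
        if name in self.matches:
            self.matches.discard(name)
            self.on_change("left", name)
```

The point of the model is that the interested party is told only when the result set changes, rather than having to run the query repeatedly.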

Powerful querying on the local system influences much of the rest of the design, and the BFS, and the depth of support for features like live queries are the biggest single reason for choosing BeOS for this project.

Another advantage to choosing BeOS in the short term is the possibility to move the project to BeIA in the future, for a more commercial implementation. BeIA is Be's Internet Appliance platform, which shares much of its underlying structure with BeOS, but with some more features which are particularly suited to Internet Appliances.

A more complete description of BeOS is in Appendix B. For information on BFS, the designer's own book on the subject is a superb resource.[giam]

The disadvantages to choosing BeOS are principally that:

Local processing and storage

There are three main possibilities for where to record and store the video with a networked computer-driven video recorder:

The first possibility requires a powerful bank of servers which would become a natural bottleneck in a popular system. It also has some legal problems, if the people operating the servers have not licensed the programmes appropriately. It is the sort of approach that HomeChoice have taken, generating revenue by offering licensed programming and films to subscribers. They have dealt with the lack of high capacity general network links by using a specialised network, dedicated to this purpose, running over BT's VideoStream form of ADSL.

The second possibility is attractive, especially if the network is made of various disparate nodes each of which contributes an appropriate amount of resource to its usage. Potentially it scales very well. However, at present the infrastructure to support such a scheme is not in place. The homes in the UK which come closest to sufficient network capacity are those with cable modems and some of the more expensive, high-capacity subscriber line services.

videdot is designed to fit the present and extend happily into a world with more substantial and widespread network connections. With the current state of networking, especially to the home, it makes a lot of sense to record what you want locally, and then keep finding new ways to share information with the rest of the world. There is a substantial gain from having many disparate, distributed nodes behaving independently. The system can, conceivably, support itself. Since peer-to-peer networking is being developed apace by others, it makes sense to allow for hooks into those networks from appliances. This way, videdot benefits from these developments, and avoids building a large architecture which must grow as the number of clients grows.

Listings Data

'Screen scraping' is the process of taking data intended for human eyes, and deconstructing it to infer structure to the content. Nowadays, it typically involves pulling apart websites which have the information you want and rewriting it in a more structured form. There are a number of problems with this approach.

Firstly, it is of questionable legality: most websites supplying listings explicitly prohibit doing this, and relying on it for a successful project could mean facing litigation. Secondly, it is utterly fragile. Whenever the website changes its design, the screen scraper breaks. What is worse, it may break in such a way that it is not at all obvious that the data now gathered is actually wrong.

This becomes all the more likely if the maintainers of the foreign website actually try to make screen scraping difficult. It would be quite simple to find ways to keep changing the information, without changing its presentation in a browser substantially, but making it fluid enough that the regular expressions governing the screen scraper generate utter nonsense.

This fragility leads screen scrapers to use a number of websites, to try and gauge the reliability of the information. While this will improve the quality and reliability of the results, it also raises the amount of work required in maintaining all of the patterns used to match against each site.

Instead, this project uses a suitable listings source, supplied in a structured form. The Press Association have supplied XML listings for the duration of the project, and for a commercial application of this project, would be paid a suitable fee for the supply of listings to the videdot clients. The DTD for the XML used is included in Appendix C. Other listings suppliers can be supported also, since the component which loads the listings into the filesystem is independent of the rest of the code. The only contract between the 'Fetcher' and the rest of videdot is the attributes attached to the listings files.

4.2 Overview

This section deals primarily with what is done, and gives an overall picture of how the client works. For more technical detail, please refer to the Implementational Detail section below.

Looking at a single installation of videdot, there are several major components:

Local videdot installation

The videdot application starts the whole procedure. On startup it ensures that the Recorder and Fetcher processes are running, starting them as necessary. The videdot application runs full-screen on the device, and provides the interface to the user.

The Fetcher background task periodically goes to fetch listings from the Internet, and stores them in attributed files in the filesystem.

The Recorder background task is invoked by the main application as necessary, with parameters which govern which channel is to be recorded, for how long and with what quality.

There is a separate thread in the application, the ScheduleBot, which queries the filesystem to find programmes which have been flagged to be recorded. It resolves any conflicts and fires off messages to the Recorder as needed. For programmes which are 'bumped', that is, deliberately not recorded in favour of a more important programme, the ScheduleBot pushes a request upwards to the collaboration server, which will ask any clients that can to record the show, so that it can then be found from the network in future.
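A minimal sketch of this kind of conflict resolution, assuming a single tuner and a numeric priority on each flagged programme, might look like the following. The greedy highest-priority-first rule, the `Flagged` record and its field names are assumptions made for illustration, not a description of the ScheduleBot's actual algorithm.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Flagged:
    title: str
    start: int     # minutes since midnight, a simplification for the example
    end: int
    priority: int  # hypothetical field: higher values win a clash


def overlaps(a: Flagged, b: Flagged) -> bool:
    """True when the two programmes need the tuner at the same time."""
    return a.start < b.end and b.start < a.end


def resolve(flagged: List[Flagged]) -> Tuple[List[Flagged], List[Flagged]]:
    """Split flagged programmes into (to_record, bumped) for one tuner."""
    to_record: List[Flagged] = []
    bumped: List[Flagged] = []
    # Consider the most important programmes first.
    for prog in sorted(flagged, key=lambda p: (-p.priority, p.start)):
        if any(overlaps(prog, kept) for kept in to_record):
            bumped.append(prog)  # candidates to request from the network
        else:
            to_record.append(prog)
    to_record.sort(key=lambda p: p.start)
    return to_record, bumped
```

The `bumped` list corresponds to the programmes for which a request would be pushed up to the collaboration server.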

The Player is only invoked when the user chooses to watch a recording. It is started with the location of the recording to be watched, and the point in the file to begin watching from. The Stop/Play/Forward/Rewind/Pause controls are still handled by the videdot application, so that keyboard and infra-red controls can be handled in one place. This control information is passed through to the Player using messages.

The other substantial part of this project is to support collaboration over the network. The Search Bots allow for a variety of modules to support your file network of choice. They can return search results to the search interface in the main application, and process remote requests as they see fit.

Each videdot installation collects its listings from a Listings Server somewhere on the network. To support different regions and localities properly it makes sense to allow for a number of distributed Listings Servers, so a pair of clients may connect to different Listings Servers.

Given this, there is some work in identifying two instances of a particular programme as the same show. For our purposes, we deal with this on the Collaboration Server, and do not try to associate one instance of a particular episode of a show with another. There are two reasons for taking this approach. Firstly, even when a channel shows the same episode of a programme twice, it is not necessarily the same content. With a number of shows there are two edits, one pre- and one post-watershed. Secondly, although two programmes are marked with the same title and episode name, it does not mean that they even relate to the same programme. In the recent British General Election, Party Political Broadcasts were labelled with episode names of the broadcasting party. It would be unfortunate for a keen politico to be outwitted by their video recorder thinking it was being intelligent by not recording the Conservative Party's second broadcast since it had recorded the first.

Collaboration is supported through two mechanisms to begin with, with the capability of adding new mechanisms over time through the Search Bots as described above. We have started with collaborative inferencing and direct user-to-user suggestions. These are described in more detail in the Implementational Detail : Servers section.

4.3 Listings Gathering

The Fetcher component is responsible for periodically getting new listings information from the listings server.

There are two parts to the Fetcher. The first fetches the listings from the listings server, the second takes the XML delivered and parses it into attributed files in the local filesystem.

The details stored in attributes align quite closely with the Press Association's data model.

We have:

Although the type fields are stored as text, they are generally multi-valued. That is, a programme could be '(Drama)(Action)', which is how it is represented in the Press Association's XML. This is preserved flat in the attribute, since BFS attributes are single name-value pairs. It is converted to a real list by the ProgInfo objects through which all manipulation of the listings is performed. There is more detail in the Implementation Detail : Fetcher section.
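
As an illustration of the conversion from the flat attribute to a real list, the following is a minimal sketch rather than the ProgInfo code itself. The function name ParseTypeList is hypothetical, and it assumes that entries are always wrapped in a single pair of parentheses and are never nested.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a flat attribute value such as "(Drama)(Action)" into its entries.
// Text outside parentheses, empty entries and an unterminated "(..." are
// all ignored. (Hypothetical helper, not taken from the videdot source.)
std::vector<std::string> ParseTypeList(const std::string& flat) {
    std::vector<std::string> types;
    std::string::size_type pos = 0;
    while ((pos = flat.find('(', pos)) != std::string::npos) {
        std::string::size_type close = flat.find(')', pos + 1);
        if (close == std::string::npos) {
            break;  // unterminated entry: stop rather than guess
        }
        if (close > pos + 1) {
            types.push_back(flat.substr(pos + 1, close - pos - 1));
        }
        pos = close + 1;
    }
    return types;
}
```

Keeping the flat form on disk means the attribute can still be matched by a filesystem query, while the list form is only built when a programme is actually loaded.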

4.4 Main Interface

The layout of the full screen interface has a central view with the current information, single-line status information at the top and bottom, a toolbar off to the left side and a details panel at the bottom, allowing the user to manipulate whatever the currently selected programme is. The central view shows either TV Guide-style listings or search results at any time.

Main application interface

The basic operation is quite simple. Wherever the user selects a programme, it is highlighted and the details appear in the bottom panel. This provides appropriate controls for the programme depending on whether it is flagged to be recorded, in the past, future or currently being shown and (for programmes in the past) whether we have a recording locally.

This consistency is in line with the User Interface design principle of consistent, monotonous gestures for achieving the same result. These principles, and an absence of modes from software, are strongly advocated by Jef Raskin in The Humane Interface[raskin2000]. By always manipulating programmes through the same bar, which never moves, and offering only the sensible options in a consistent manner, we can avoid generating unnecessary user confusion.

4.5 Scheduling

The ScheduleBot is responsible for watching the filesystem for listing entries which are flagged to be recorded.

It keeps a live query on the filesystem, and receives messages when programmes are flagged or unflagged.

It then builds a conflict-free list of programmes to record, based on the algorithm described in Implementation Detail : Scheduling. It also keeps a list of the programmes which have been bumped from being recorded.

The ScheduleBot also submits, periodically, all the interesting shows, and the bumped list, to the anonymous collaboration server, which will push a low-priority request to all clients to record the clashing shows. From the list of recorded shows, the collaboration server can infer other programmes which might interest this user, as described below in Collaboration.

4.6 Recording

Recording is handled by executing the configured recorder, passing in the appropriate information and leaving it to complete its task. This allows for suitable upgrading for different scenarios, for example for a set-top box where we can access the MPEG stream directly.

The earliest intended design for this project was quite elaborate, and would have involved collecting the teletext subtitles for programmes. This is still desirable, but falls beyond the scope of this project.

For simplicity, the final approach taken was to use the Be, Inc. supplied VideoRecorder application, and control it through scripting. This is covered in more detail in the following chapter.

4.7 Collaboration

Periodically, the system should post off the user's behaviour, and collect recommendations which it annotates on the locally stored filesystem.

This could be accomplished by a number of parts within the system, since the listings display and scheduling portions of the system are watching the filesystem for changes anyway. Thus, relevant changes will propagate through automatically.

To avoid building a database of intrusive information about users, videdot takes steps to both:

  1. Only keep behavioural information about some pseudonymous user. The information is not tied to any real-world details.
  2. Keep each session independent of any other. When a user requests suggestions they pass, with their request, the identities of the programmes they have recorded and watched in the past and a list of channels they can receive. They receive in return a list of programmes on any of those channels which may be of interest.

The greatest remaining intrusion is a third party snooping the traffic and identifying the user's IP address, and consequently - for many users - being able to connect one session to another, find their network provider or find their real location. This project does not attempt to address these concerns - should a user want greater privacy than this it would be simplest to anonymise their Web traffic by any of the many available means.

There are three forms of intrusion which it is worth paying attention to. The first is from third parties snooping on the particular data being passed back and forth from the Collaboration Server. It may be that by associating an individual with a certain programme it is possible to put them at a disadvantage, or in danger. This can be defended against with the use of encrypted links, for example by using HTTP over SSL or TLS. Secondly, it may be that being associated with using the service at all is a danger. Thus, detecting that there is a connection, albeit encrypted, between a certain server and client may be undesirable. There are a number of schemes which can do much to hide these sorts of patterns. They typically involve forwarding the request through a large number of intermediate servers, possibly re-encrypting at each hop. Thus, in order to trace the connection back an adversary would need to approach each server operator in turn. The last potential threat is that the server operator could infer information about a supposedly anonymous user. One approach to deal with this is to have all requests made by a conceptual cloud of clients, so that tracking which client made what request is obfuscated. Thus, with enough participating peers, any peer could disclaim responsibility for a particular request. If it originated from their machine, they can simply claim that they are participating in such a Crowd[Crowds].

Another useful feature is to push requests out to the network for any programmes which are set to record, but are not because they clash with another programme.

The ScheduleBot knows what is to be recorded, and what will not be due to clashes. So, it is well placed to submit the request to the collaboration server with the list of programmes that are considered interesting, and those which it would be useful for someone on the network to record.

So, the ScheduleBot receives from the collaboration server a list of programmes, each tagged with a priority, and those programmes are flagged in the local store.

Since the system also supports sending suggestions to well identified people, it would be unwise to do this through the same server if we wish to have any anonymity. For this reason, there are two collaboration servers in this design. In retrospect it would have been more desirable to implement this by including email features within the videdot application and simply sending appropriately formatted mails. Topologically, this is more appropriate, as it performs all of the store-and-forward features for free.

As it is we use the same authentication mechanism as for the listings server, and simply fetch back the list of programmes with a periodic HTTP GET.

5 Implementation Detail

5.1 videdot Application

Programme Info

The smallest, but most pervasive, elements in the whole application are the ProgInfo objects. They encapsulate the information held in the attributed files described in the Design : Listings Gathering section above. Each is instantiated with an entry_ref referring to the file in the filesystem.

They then read all the information out of the attributes and present an interface to return information about the underlying programme. Updating the status of the programme file on disk is also wrapped up by the ProgInfo object. However, since the same programme file may be referenced by two ProgInfo objects at once there is also a mechanism to inform them of updates.

Should the owner of a ProgInfo be concerned about changes to the underlying file in the filesystem - as is generally the case - then it must inform the created object of such changes, or ask it to re-read the filesystem entry. Since everything that creates ProgInfo objects does so based on live queries from the filesystem, it is very efficient for the owner to collect the pertinent changes for all of the ProgInfo objects it owns. Live queries are more fully described in Appendix B.

Listings Display

Displaying the listings is

Detail View

This displays the long, formatted string from the ProgInfo currently selected in any other view. It also has a simple control to manipulate the programme entry. This depends on whether the programme is in the past or not, whether it is flagged to be recorded or not and, if it is, whether the ScheduleBot has bumped it or not. ProgStatusView is the class responsible for both displaying the status text and setting the corresponding action. The possibilities, from VProgStatusView.cpp, are:


if (fProgInfo->IsInPast()) {
  if (fProgInfo->IsRecorded()) {
    SetText("Recorded!\nPlay It");
    SetAction(V_COMMAND_PLAY);
  } else {
    SetText("Missed it?\nSearch\nnetwork");
    SetAction(V_COMMAND_SEARCH_NETWORK);
  }
} else if (fProgInfo->IsOnNow()) {
  if (fScheduleBot->PresentlyRecording(fProgInfo)) {
    SetText("On now!\nCurrently\nRecording\nStart\nWatching?");
    SetAction(V_COMMAND_IMMEDIATE_PLAYBACK);
  } else {
    SetText("On now!\nRecord\nFrom\nHere");
    SetAction(V_COMMAND_IMMEDIATE_RECORD);
  }
} else {
  // In the future
  if (fProgInfo->IsFlaggedToRecord()) {
    if (fScheduleBot->WillBeRecorded(fProgInfo)) {
      SetText("(Record)\nWill be\nRecorded\nSee Plan");
    } else {
      SetText("(Record)\nClashes!\nSee Plan");
    }
    SetAction(V_COMMAND_SHOW_RECPLAN);
  } else {
    SetText("Want it?\nSet to\nRecord");
    SetAction(V_COMMAND_MANUAL_FLAG_TO_RECORD);
  }
}

Search Results

Search results are shown in a simple table which can be sorted by any of its attributes by selecting the heading of the column. When any result is selected the detail view in the bottom panel is updated.

Results are split into pages with 'Next' and 'Previous' controls to step between the pages. Resorting by a different column keeps the currently selected programme in view, or moves to the top of the list if there is no selected programme.

The results view collects new results as messages from all of the running Search Bots, which the user can control through a field of check boxes. Each result message is one of:

V_SEARCH_RESULT_NEW
id which is an identifier the search bot knows it as
prog which is an instantiated ProgInfo object, referencing either a local listings entry, or holding whatever fields the SearchBot can fill in.
V_SEARCH_RESULT_GONE
id as before
denotes that the previously added result has gone away
V_SEARCH_RESULT_UPDATE
id as before
With an optional new prog, which has updated information. If there is no prog field then the search results view will ask the existing ProgInfo to rebuild its information from the filesystem entry.
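
The handling of these three messages can be sketched as follows. This is an illustrative model only: the real view receives BeOS messages and holds ProgInfo objects, whereas here ResultKind, ResultProg, ResultMessage and ResultsModel are hypothetical stand-ins, and results are keyed by id alone on the assumption of a single Search Bot.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for the V_SEARCH_RESULT_* constants and ProgInfo.
enum ResultKind { RESULT_NEW, RESULT_GONE, RESULT_UPDATE };

struct ResultProg {
    std::string title;
};

struct ResultMessage {
    ResultKind kind;
    int id;           // identifier the search bot knows the result by
    bool hasProg;     // whether a prog field accompanies the message
    ResultProg prog;
};

class ResultsModel {
public:
    // Apply one message. Returns true when the caller should ask the
    // existing entry to rebuild itself from the filesystem.
    bool Apply(const ResultMessage& msg) {
        switch (msg.kind) {
        case RESULT_NEW:
            fResults[msg.id] = msg.prog;
            return false;
        case RESULT_GONE:
            fResults.erase(msg.id);
            return false;
        case RESULT_UPDATE:
            if (fResults.count(msg.id) == 0) {
                return false;  // update for an unknown id: ignore it
            }
            if (msg.hasProg) {
                fResults[msg.id] = msg.prog;
                return false;
            }
            return true;  // no prog supplied: re-read from the filesystem
        }
        return false;
    }

    std::size_t Count() const { return fResults.size(); }

    std::string TitleOf(int id) const {
        std::map<int, ResultProg>::const_iterator it = fResults.find(id);
        return it == fResults.end() ? std::string() : it->second.title;
    }

private:
    std::map<int, ResultProg> fResults;
};
```

With several Search Bots running at once, the key would need to combine the bot's identity with the id, since each bot only guarantees uniqueness among its own results.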

Recording Plan View

This is a relatively lightweight view. It simply creates an extended search result view which has an extra control to raise or lower the priority of each result, including reducing it to zero - in which case the entry will disappear from view.

The entries shown are simply any which are flagged to be recorded at all. They are sorted by date order (by default). The results which the ScheduleBot says will not be recorded are flagged as such, so that the user can reorder their preferences if they wish.

That entries can be made to disappear leads to a slight discontinuity in the interface. There is no way to simply undo an action - at present - although it is a desirable feature.

Preferences

A small number of preferences are stored in the filesystem. As is the convention on BeOS, these are stored in the user's home directory in the config/videdot/ directory.

They are configured in the Setup page from the control bar, which allows for the following configuration options:

In the future, this will also be where channels or channel-packages are chosen.

There is also an 'Advanced' configuration panel, which is where various settings which should not need changing in ordinary operation can be changed.

Searches

5.2 Scheduling

This is most succinctly described with reference to the code. The key function is from ProgInfo, in VProgInfo.cpp:

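// Comparison callback used when sorting the list of flagged programmes.
// Note that every branch returns the result of a comparison (0 or 1).
int ProgInfo :: CompareRecordPriority(const void *prog1, const void *prog2) {
  // Each argument points at a list slot holding a ProgInfo pointer
  ProgInfo *p1 = (* (ProgInfo **)prog1);
  ProgInfo *p2 = (* (ProgInfo **)prog2);

  int pri1 = p1->GetRecordPriority();
  int pri2 = p2->GetRecordPriority();
  if (pri1==pri2) {
    // Same priority: fall back to the 'value' of the flagging user
    int auth1 = p1->GetAuthValue();
    int auth2 = p2->GetAuthValue();
    if (auth1==auth2) {
      // Still tied: compare on duration
      return (p1->GetDuration()<p2->GetDuration());
    } else {
      return (auth1>auth2);
    }
  } else {
    return (pri1>pri2);
  }
}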

The ScheduleBot collects a list of desirable programmes (those that are flagged at all) and sorts them with the above function. Then, we have a list of most important first. We work our way down from the top of the list, adding to the recording schedule any programme which does not clash with any already in the schedule. This is a relatively elegant solution to the problem and leaves us with a conflict-free list, with reasonable resolution for each conflict.

Let us look again at what CompareRecordPriority looks for. We have two measures of how important it is to record a programme, the numeric priority flag and the 'value' of the user who flagged it.

The numeric priority is most important, and only if that is even do we look to the user's importance. Finally, we will favour the shorter programme, since it is likely to cause fewer clashes. Refining this function would be useful, but as is it provides sensible results in most cases.
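
To make the ordering and the greedy pass concrete, here is a self-contained sketch using the standard library rather than the BeOS list classes. It is not the ScheduleBot code: the Candidate struct and the names MoreImportant, Clashes and BuildSchedule are hypothetical, times are assumed to be seconds since the Unix epoch, and a single tuner is assumed, so any overlap counts as a clash.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the fields the scheduler reads from a ProgInfo.
struct Candidate {
    std::string title;
    long start;      // seconds since the Unix epoch
    long duration;   // seconds
    int priority;    // numeric priority flag
    int authValue;   // 'value' of the user who flagged it
};

// True if a should be considered before b: higher priority first, then the
// more valued user, then the shorter programme.
bool MoreImportant(const Candidate& a, const Candidate& b) {
    if (a.priority != b.priority) return a.priority > b.priority;
    if (a.authValue != b.authValue) return a.authValue > b.authValue;
    return a.duration < b.duration;
}

// Two programmes clash if their time ranges overlap at all.
bool Clashes(const Candidate& a, const Candidate& b) {
    return a.start < b.start + b.duration && b.start < a.start + a.duration;
}

// Greedy pass: walk the list from most important to least, keeping anything
// that does not clash with what has already been kept.
void BuildSchedule(std::vector<Candidate> wanted,
                   std::vector<Candidate>& schedule,
                   std::vector<Candidate>& bumped) {
    schedule.clear();
    bumped.clear();
    std::stable_sort(wanted.begin(), wanted.end(), MoreImportant);
    for (const Candidate& c : wanted) {
        bool clash = false;
        for (const Candidate& kept : schedule) {
            if (Clashes(c, kept)) {
                clash = true;
                break;
            }
        }
        (clash ? bumped : schedule).push_back(c);
    }
}
```

A stable sort is used so that programmes which tie on all three measures keep the order in which they were flagged. The greedy pass is quadratic in the worst case, which is one reason a full rebuild on every change becomes unattractive as the list grows.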

An alternative approach was considered, which would look to build the total sum weightings of each possible resolution list. In practice the number of cases where it made any difference was minimal if we instead focus on setting sensible priorities for programmes and those asserting them.

There is also a substantial amount of code to efficiently mimic the above behaviour when we have a new arrival or departure. For performance reasons, as the list becomes large it is obviously advisable not to rebuild it completely at each change. The intricacies of this code are less interesting than the outcome, that we respond in good time to requests.

5.3 Fetcher

As described in the Servers section below, the Listings server returns a page with a number of links to the suitable listings for this client. To keep things simple, we invoke the wget command, and ask it to spider to a depth of one from this page. This then fetches, sequentially, all of the listings we want into the directory we passed to wget.

We then invoke the more substantial part of the Fetcher, which uses the Apache Simple API for XML (SAX) to parse and digest each listings file. The XML is described in Appendix C, but there were a number of curious features to the listings:

Even with these anomalies, using officially structured data is still far better than coping with data which has been reverse engineered to put some structure back in. I am genuinely grateful to the Press Association for their help in this regard.

If the Fetcher cannot see the network, it begins by calling a network-up script, then tries again. If it started up the network connection this way it also kills it with a corresponding network-down script. This is not entirely elegant, but is sufficient for most of the usual cases. The scripts go so far as to do reference counting, so they nest appropriately, but that is all.
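
The reference counting mentioned above is done in the scripts themselves. The idea can be illustrated in C++ as follows, where NetworkRef is a hypothetical class and the two callbacks stand in for the network-up and network-down scripts: the connection is only raised by the first user and only dropped by the last.

```cpp
#include <cassert>
#include <functional>

// Illustrative only: counts nested users of the network connection and
// runs the 'up' action for the first and the 'down' action for the last.
class NetworkRef {
public:
    NetworkRef(std::function<void()> up, std::function<void()> down)
        : fUp(up), fDown(down), fCount(0) {}

    void Acquire() {
        if (fCount++ == 0) {
            fUp();  // first user: bring the connection up
        }
    }

    void Release() {
        assert(fCount > 0);  // releasing without acquiring is a bug
        if (--fCount == 0) {
            fDown();  // last user: take the connection down
        }
    }

    int Count() const { return fCount; }

private:
    std::function<void()> fUp;
    std::function<void()> fDown;
    int fCount;
};
```

This covers the nesting case only. As noted above, it gives no help when the connection fails for reasons outside the Fetcher's control.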

5.4 Recorder

The original intention was to write a new standalone recording application, which received messages

While a substantial amount of work has been done on this new video recording application, insufficient time was allocated for completing it. The recording in the first version of the software is done by driving Be, Inc.'s VideoRecorder application through scripting. It is anticipated that this component will be replaced in due course, to provide some of the extended features detailed in the Conclusions : Future Directions section.

Making this switch from passing messages of a certain signature to simply executing an appropriate script or application with parameters, provides another benefit. It allows for simpler arbitrary recording schemes. It provides a simple interface to using other recording applications which may be written, for example to poke or hack a set-top box and retrieve the MPEG stream.

The Recorder wrapper delivered performs the following steps:

  1. Translate the passed in quality setting to a full configuration format, consisting of quality level, resolution, colourspace, encoding and file format
  2. Put all of this information in VideoRecorder's initial configuration, together with the filename of the recording file
  3. Invoke VideoRecorder and tell it to start recording.
  4. Sleep for the duration of the recording, then stop VideoRecorder

The invocations and passing of information to VideoRecorder use a scripting language for BeOS called hey, written by Attila Mezei. The hey scripts used for this project are based on those written by Adam Kirchoff and published through BeNews.[AdamK]

5.5 Servers

There are several servers used by a typical videdot user:

This only describes the services offered by these conceptual machines. They need not be machines in their own right, but preserving a sensible separation of their concerns is a well motivated design goal. For the purposes of privacy, however, it is desirable that the services which involve identification of the remote client are kept strictly separate from the anonymous collaboration server.

This could be achieved by separating the machines physically and having separate operators, or by an appropriate partitioning of information even within the same physical server. If any of the servers are performing even moderate logging, it would not be too difficult to correlate information between the different services should the anonymous services be colocated with the authenticated services. Should the clients be using the techniques mentioned in the design section to anonymise their access then this may be less of a concern.

Listings Server

Diagram showing listings server distributing listings to two clients

Each night the Press Association FTP the coming week's listings to the listings server. These listings are then returned to a client making an appropriate request. Since the Fetcher simply uses wget to fetch the listings, the listings server can publish its listings at any URI it pleases, using whatever the most appropriate protocol is.

For our immediate needs, using simple authentication over HTTP is sufficient. Note that this means that the password is passed in plain text over the Internet. This is appropriate for testing purposes, and avoids needing to acquire - or invent - a signed certificate for HTTP over SSL or TLS, for example.

When considering how secure this ought to be it is worth taking a holistic approach to security. An approach such as Bruce Schneier's "Attack Trees"[schneier2000] is a good starting place. It is always worth considering what the cheapest attack for a system is, and working on making that harder, rather than securing some part of the system utterly while leaving other parts very exposed. In the present arrangement, one of the weakest links must surely be that the main listings feed is being sent insecurely over FTP.

So the videdot box would make a request like:


http://horace:password@listings.videdot.com/

On receiving such a request, the listings server, listings.videdot.com in this case, can see that this is 'horace' connecting. It looks up in its local database the last day for which 'horace' has collected listings, and sends back a page linking to all available, newer listings.

For each subsequent request that the videdot box makes, the listings server returns the appropriate day's listings file, and also updates the record of the latest day yet collected.

Collaboration Server

Diagram showing collaboration server collecting opinions from a number of clients, and passing recommendations to another

This is another box which accepts requests over HTTP, and responds with a list of recommendations. There are two requests that the videdot box can make:

RATE
Together with a programme identifier, marks this as a programme the user likes (or dislikes). The server only ever takes the most recent rating for a programme.
SUGGEST
The server collects together the most up-to-date suggestions, and returns them to the client.

Programmes are identified by taking their title, channel, time of broadcast and episode title together. For whole-series suggestions, the last two fields are not sent.

On the server side, a locally unique identifier is generated for each programme, and this user's assertion is stored.

The information is returned as slash-delimited lines such as:

42/BBC2/992891973/Buffy+the+Vampire+Slayer/Homecoming

This signifies that the system has suggested, with a score of 42, that we watch the episode Homecoming of Buffy at the allotted time. For a series suggestion the third and last fields are left blank. The score is used on the client side to flag the programme as appropriate. Note that we do not take any notice of which channels a particular user has subscribed to. This allows the client to let the user know about high-scoring programmes that they cannot receive, and the user could find the programme from the network, a friend, or perhaps alter their subscription package as appropriate. The initial client does not do this at present, but, with the right interface, this could be a useful feature, particularly as the number of channels or pay-per-view events increases.
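
On the client side, a line in this format could be taken apart as in the sketch below. The Recommendation struct and the function names are hypothetical, not taken from the videdot source. The sketch assumes exactly five fields per line, URL-encoded text fields (as for the Messaging Server described later), and a blank time field read as zero for a series suggestion.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical holder for one parsed line.
struct Recommendation {
    double score;
    std::string channel;
    long at;               // seconds since the Unix epoch; 0 when blank
    std::string title;
    std::string episode;   // empty for a whole-series suggestion
};

// Decode '+' to a space and %XX to the byte it names.
std::string UrlDecode(const std::string& in) {
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size()
                   && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
                   && std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(
                std::strtol(in.substr(i + 1, 2).c_str(), nullptr, 16));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}

// Split on '/', keeping empty fields so blank positions are preserved.
std::vector<std::string> SplitFields(const std::string& line) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type slash = line.find('/', start);
        if (slash == std::string::npos) {
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, slash - start));
        start = slash + 1;
    }
    return fields;
}

// Returns false if the line does not have exactly five fields.
bool ParseRecommendation(const std::string& line, Recommendation& out) {
    std::vector<std::string> f = SplitFields(line);
    if (f.size() != 5) {
        return false;
    }
    out.score = std::strtod(f[0].c_str(), nullptr);
    out.channel = UrlDecode(f[1]);
    out.at = std::strtol(f[2].c_str(), nullptr, 10);
    out.title = UrlDecode(f[3]);
    out.episode = UrlDecode(f[4]);
    return true;
}
```

Because a literal slash inside a title would be percent-encoded by the sender, splitting on '/' before decoding keeps the field boundaries unambiguous.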

The data model used is as follows (in SQL, as the most reasonable, broadly understood syntax):

CREATE TABLE anon_users (
 user_id      integer PRIMARY KEY,
 last_access  timestamp
);

CREATE TABLE programmes (
 prog_id      integer PRIMARY KEY,
 channel      varchar(64) NOT NULL,
 at           timestamp,
 title        varchar(255) NOT NULL,
 episode      varchar(255)
);

CREATE TABLE programme_user_links (
 user_id      integer REFERENCES anon_users,
 prog_id      integer REFERENCES programmes,
 degree       float CHECK (degree>=-1.0 AND degree<=1.0)
);

CREATE TABLE recommendations (
 user_id      integer NOT NULL REFERENCES anon_users,
 prog_id      integer NOT NULL REFERENCES programmes,
 score        float NOT NULL
);

CREATE TABLE user_user_scores (
 from_user    integer REFERENCES anon_users,
 to_user      integer REFERENCES anon_users,
 score        float
);

So we have tables to store:

By storing when a user last accessed the system we can clear out very old users as desired. Allowing full anonymity may mean that someone receives suggestions based on their own past opinions. This ought not to be a problem, as there should still be other users on the system, and we are taking the sum of all of the opinions. The worst that could happen is that the useful suggestions will have lower ratings than ones the user has already expressed an opinion over. Since we only return new programmes anyway, this will not harm the results we serve to the user.

user_user_scores is probably the most interesting of these tables. It stores a measure of how useful one user's opinions are to another. That is, how much correlation there is between the things they have both 'rated'.


CREATE VIEW user_recommendations
AS
SELECT
	uus.from_user as user_id,
	pul.prog_id as prog_id,
	sum(uus.score*pul.degree) as score
FROM
	user_user_scores uus,
	programme_user_links pul
WHERE
	uus.to_user = pul.user_id
GROUP BY
	pul.prog_id, uus.from_user
;

This view is the one from which we actually drag out the useful information for the user when they need it. If user_user_scores rates how useful this other user's opinion is, then uus.score*pul.degree, that is how useful they are to us times their opinion, ought to describe how much we will like the show. By summing this across all users, we have a measure of how useful this programme is to us.

The astute reader will have noticed we haven't actually worked out the contents of user_user_scores. Since PostgreSQL (in version 7.0 at least) will not allow a view to calculate a composite function of a composite function from another view, we implement this as a function in the database.

CREATE FUNCTION recalculate_scores_for_user(int4)
RETURNS int4 AS '
	DELETE FROM user_user_scores WHERE from_user = $1;
	INSERT INTO user_user_scores
		(SELECT
			pu1.user_id AS from_user,
			pu2.user_id AS to_user,
			sum(pu2.degree*pu1.degree) AS score
		FROM
			programme_user_links pu1,
			programme_user_links pu2
		WHERE
			pu1.user_id = $1
		AND
			pu1.prog_id = pu2.prog_id
		AND
			pu1.user_id != pu2.user_id
		GROUP BY
			pu1.user_id, pu2.user_id
		);
	-- now return a dummy value since psql 7.0 does
	-- not allow for void functions
	SELECT 1 AS ignore_me;
'
LANGUAGE 'sql';

This does two things. Firstly, it drops all of the links from this user to any other. It then inserts the contents of the inner select statement into the user_user_scores table.

For each programme which both users have rated, the product of their degrees will be positive where they agree and negative where they disagree, and larger in magnitude the more strongly both feel. Thus we have a measure, by summing these, of how much these people agree on the programmes they have both rated, and thus of how much consensus there is between them.
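
The same arithmetic can be followed outside the database. The sketch below mirrors the function and the user_recommendations view in plain C++. It is an illustration only: the Ratings type and the function names are hypothetical, and like the view it does not filter out programmes the user has already rated.

```cpp
#include <cassert>
#include <map>

// ratings[user_id][prog_id] = degree, in the range -1.0 to 1.0.
typedef std::map<int, std::map<int, double> > Ratings;

// Mirrors recalculate_scores_for_user: for every other user who shares at
// least one rated programme, sum the products of the two degrees.
std::map<int, double> UserUserScores(const Ratings& ratings, int fromUser) {
    std::map<int, double> scores;
    Ratings::const_iterator me = ratings.find(fromUser);
    if (me == ratings.end()) {
        return scores;
    }
    for (const auto& other : ratings) {
        if (other.first == fromUser) {
            continue;
        }
        bool shared = false;
        double sum = 0.0;
        for (const auto& mine : me->second) {
            auto theirs = other.second.find(mine.first);
            if (theirs != other.second.end()) {
                sum += mine.second * theirs->second;
                shared = true;
            }
        }
        if (shared) {
            scores[other.first] = sum;
        }
    }
    return scores;
}

// Mirrors the user_recommendations view: for each programme, sum over the
// other users of (how useful they are to us) * (their opinion of it).
std::map<int, double> Recommend(const Ratings& ratings, int fromUser) {
    std::map<int, double> result;
    std::map<int, double> scores = UserUserScores(ratings, fromUser);
    for (const auto& s : scores) {
        const std::map<int, double>& theirs = ratings.find(s.first)->second;
        for (const auto& rated : theirs) {
            result[rated.first] += s.second * rated.second;
        }
    }
    return result;
}
```

As a worked example, suppose user 1 rates programme 10 at 1.0 and programme 11 at -1.0, user 2 rates programmes 10, 11 and 12 at 1.0, -0.5 and 1.0, and user 3 rates programmes 10 and 12 at -1.0 and 0.5. User 1 then scores user 2 at 1.0 + 0.5 = 1.5 and user 3 at -1.0, so the unseen programme 12 is recommended to user 1 with a score of 1.5 * 1.0 + (-1.0) * 0.5 = 1.0.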

Messaging Server

Diagram showing messaging server passing on a message from one client videdot box to another

This exists principally to support passing recommendations from one person to another. To do this someone only needs to know the 'nickname' of the person they wish to suggest a programme to. On entering this it sends a request to the Messaging Server, again as a suitably formatted HTTP GET:


http://horace:password@users.videdot.com/jemmima/suggest/BBC2/992891973/
                                         Buffy+the+Vampire+Slayer/Homecoming

So our user horace would like his friend jemmima to know that Homecoming, one of her favourite Buffy episodes will be on at a certain time (in seconds since the Unix epoch) on BBC2. Note that ordinary URL encoding is used, where space becomes + and any other reserved character is replaced by its percent-hex-hex encoding (e.g. %3A represents ':'). It is then up to jemmima's videdot box to cope with this request when it receives it. That is discussed above. As far as the server is concerned it need only deal with the sort of request above, store the requests appropriately (in our case in a Postgres database) and return them to the appropriate user.
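
Building the path part of such a request might look like the following sketch. UrlEncode and BuildSuggestPath are hypothetical names, and the encoder is deliberately conservative: it percent-encodes every byte other than letters, digits, '-', '_' and '.', which is a superset of the reserved characters. The host and credentials are left to whatever makes the HTTP request.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <string>

// Encode one field: space becomes '+', unreserved characters pass through,
// and everything else becomes %XX in upper-case hex.
std::string UrlEncode(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.') {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            char buf[4];
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Assemble /<nickname>/suggest/<channel>/<time>/<title>/<episode>.
std::string BuildSuggestPath(const std::string& nickname,
                             const std::string& channel,
                             long at,
                             const std::string& title,
                             const std::string& episode) {
    return "/" + UrlEncode(nickname) + "/suggest/" + UrlEncode(channel) + "/"
         + std::to_string(at) + "/" + UrlEncode(title) + "/"
         + UrlEncode(episode);
}
```

Encoding each field separately, before the slashes are added, is what allows a title containing a slash to survive the trip.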

A user request is of the form:

http://horace:password@users.videdot.com/retrieve

Which returns the list of programmes formatted as for the Anonymous Collaboration server (slash delimited fields, URL-encoded).

This could be extended to include more person-to-person messaging. However, it would make more sense to achieve this with a well-supported messaging architecture - such as Jabber.

6 Evaluation

6.1 User Testing

The interface was developed with user cooperation along the way. As a result, the phrasing of captions was changed in a number of places, clearer demarcation (the raised edge) was added between programmes in the listing view, the manner of highlighting various facts was changed and type-colouring was added.

There are some remaining quirks in the interface; the following are condensed from user comments:

6.2 Resilience Tests

As an appliance it is vital that the system can cope under unexpected conditions. The main test we seek to pass is coping with unexpected power interruptions during recordings.

The underlying operating system helps us in this goal: the Be File System (BFS) is journalled, so we avoid any need to run a block-by-block check even when the filesystem was not unmounted cleanly.

The design of the listings files keeps all information in the file system, and any change of state is written out immediately. Thus, when we restart all we need to find is in the filesystem already.

The final challenge is in appending cleanly to the end of the file, and cleaning up the break in the media file. At the time of writing, the system successfully creates two movie files, the first of which is inappropriately terminated (since it was being written as the system went down). With a suitable player, we can play up to the join, and then skip to the other file. A cleaner solution would be to mend and join the files into a single movie with a discontinuity at the break. There is a suitable place to add this, but it remains a possible future extension.

6.3 Collaboration Results

This is one part of the project where proper evaluation has not been possible. More time would have been needed watching a substantial community of interacting users. This is discussed further in the Conclusions.

7 Conclusions

7.1 Achievements

The key achievement of this project has been in building the right framework. While some of the individual components are still quite basic, the overall architecture provides the right platform for ongoing development.

The fundamental choices made are sound ones. Choosing to build on BeOS, and in particular on the features of the Be File System, allows for very flexible, powerful and most importantly fast local searching. Keeping all of the meta-information in attributed files, and running live queries over them, keeps the whole architecture very clean.

Choosing to collect listings from a recognised provider, rather than reverse-engineering the structure, gives a considerably cleaner and more stable stream of data.

7.2 Challenges

The most significant obstacle that was overcome in building this project was in dealing with the 'structured data'.

I learnt the lesson quickly that putting some relatively well-defined structure around some data does not necessarily make it any cleaner. This is discussed in detail in the Implementation Detail : Fetcher section above, but in short there were a number of unintuitive or simply confusing elements packed into the supplied listings data.

That said, I still believe that taking officially structured data was the right approach, and the problems associated with making screen-scraping work reliably would be far worse than the few quirks to the Press Association's listings feed.

The other sizeable challenge in a project such as this is determining the appropriate scope. It is always very tempting to plan a very grand design, which addresses any number of perceived shortcomings of existing systems. However, there was probably never enough time available to implement some of the features that were intended at the outset.

An earlier decision to write simple scripts to handle much of the housekeeping would have allowed much more time to focus on the bolder aspects of the project - making collaboration work effectively and actually building hooks into a number of information-sharing systems.

In retrospect, it may have been wise to avoid recording video directly at all, and instead to have delivered a highly collaborative listings interface and then studied in more depth, over a suitable period of time, the behaviour of a community of users. Without such a long period of analysis of real use, there cannot be a very rich analysis of the benefits of the collaborative portion of the project. Mitigating this, the approach to allow for plug-ins for a variety of networks will allow for exactly this sort of analysis in the future. Moreover, by avoiding tying ourselves to any one implementation, it will be possible to ensure that videdot remains useful whatever the topically prevailing collaborative network.

7.3 Future Development

As it is, the basis for a reasonable and potent system is now in place, and it will be taken forward from here. There has been some commercial interest in using this system, in an embedded environment - in this case within a television set. In the transition from academic to real-world work there are a few areas which need to be addressed. There are also elements which have always been desirable, but there was never time to do all of them.

Enhancements

Capture subtitles from broadcast shows, add a search bot to search across them and use a player which understands them. The format for the attributed files accommodates storing subtitles. Not including this was largely a matter of time. It should also be handled in concert with handling digital television broadcasts effectively, to avoid being too tightly linked to traditional analogue subtitles.

Direct digital manipulation. It would be a phenomenal boon to capture the MPEG stream directly from a set-top box. The benefit in quality would be astounding. The notion that a digital, well-encoded signal should be taken all the way to RF, and then captured with a low-cost card and re-encoded is somewhat galling. On the other hand, it would be reasonable to assume that the digital television providers would take reasonable measures against anyone doing this. There are two main obstacles - the technical effort of getting to the signal and the legal effort of showing that this is fair use in the home. It certainly deserves more consideration for any future project along similar lines to this one.

More graceful handling of dial-up connections. This would be another relatively simple enhancement, but a costly one in terms of the effort needed to debug it in many places. The network-up and network-down scripts approach is simple and covers most of the usual operation, but does not provide useful feedback to the user for any of the odd cases.

Add-ons

Listings sources
Allowing for different sources, covering different territories and alternative sources of the information.
This would mean augmenting the Fetcher component of videdot. Once more sources are identified this could be dealt with cleanly using a plug-in scheme for the parsing performed by the Fetcher.
Alternative encoders
The system will accept any BeOS encoder or decoder, and they would be available to the Recorder. However, this is perhaps not the most friendly interface if the box is intended to be targeted at non-technical users. Be have developed a Management Administration Platform (MAP) for their Internet Appliance platform, BeIA, which deals with this issue amongst others. It may be worthwhile moving all of videdot over to BeIA, which has only a marginally different API to BeOS.
Alternative search bots
The implemented design can be extended to cater for any network search, but time precluded implementing an interface to - say - Gnutella or a similar network.
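The plug-in scheme suggested above for the Fetcher's parsing could look something like the following. This is a minimal sketch in Python rather than the project's own code: the names `register_parser`, `PARSERS` and `fetch`, and the toy pipe-separated feed format, are all invented for illustration and are not part of videdot.

```python
from typing import Callable, Dict, List

# Hypothetical record type: one dict per programme, keyed by field name.
Programme = Dict[str, str]
Parser = Callable[[str], List[Programme]]

# Registry mapping a listings source name to the function that parses it.
PARSERS: Dict[str, Parser] = {}


def register_parser(source: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser plug-in under a source name."""
    def decorator(func: Parser) -> Parser:
        PARSERS[source] = func
        return func
    return decorator


@register_parser("pipe-separated")
def parse_pipe_separated(raw: str) -> List[Programme]:
    """Toy format invented for this sketch: 'channel|time|title' per line."""
    programmes = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        channel, time, title = line.split("|", 2)
        programmes.append({"channel": channel, "time": time, "title": title})
    return programmes


def fetch(source: str, raw: str) -> List[Programme]:
    """Dispatch raw feed text to whichever plug-in handles the source."""
    if source not in PARSERS:
        raise ValueError(f"no parser plug-in registered for {source!r}")
    return PARSERS[source](raw)
```

Adding a new territory or supplier would then mean writing one more decorated function, leaving the rest of the Fetcher untouched.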

New Features

There are a few key features which are needed to turn this project from a video recorder to a useful Home Entertainment appliance.

Email & Web
The basics of a networked browsing device. With the approach taken, such features could be added through BeOS' replicants without too much interference. Replicants are discussed in the appendices.
Extension to more media
In particular DVD and digital audio. This is an obvious direction, and with the development of a number of simple, free and open projects for BeOS should not be too hard. The hardest part will be in making it work simply, straight from installation, and keeping the interface consistent.
Ideally, on insertion of a CD the machine should be able to rip the contents from it and start encoding the tracks to the digital format of choice. For DVDs, simply activating an option in the control bar to start playing and then executing the appropriate player is all that is required.
Remote control
The existing system is quite controllable from the keyboard. With only a few advances, and with available software to map infra-red remote control signals into keypresses, full remote control is within reach.
Very remote control
Allowing remote web access to each videdot box was always an intended aim for this project, but was cut due to time pressures. It requires a much fuller authentication procedure than is implemented presently. The system already caters for scheduling recordings of differing priority, which is how the suggestions feature works. Individual remote users would just need to be tied to a certain priority.
Full multi-channel support
The delivered implementation sticks to terrestrial channels only, principally to avoid the complications of controlling external set-top boxes in changing channels. The listings component already supports multiple channel-groupings (although it is hidden in the interface), which makes adding full multi-channel support largely a problem for the Recorder component.
New player and recorder applications
To handle things like subtitles, and TV-recording specific features, like advert-skipping during playback
Archiving
The scheme defined, with the meta-information separated from the bulk of the data is ideal for storing the bulk data offline, archived onto an appropriate removable medium. Time precluded implementing the interface to such a solution. By keeping the meta-information accessible, local searching can remain potent, and simply prompt for the appropriate volume when needed.
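The idea under 'Very remote control' of tying each remote user to a priority can be illustrated with a small scheduling sketch. It is written in Python and is not the delivered scheduler: it assumes a single tuner, times expressed as minutes since midnight, and a scale on which a larger number means a higher priority. The `Request` type and `schedule` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Request:
    title: str
    start: int      # minutes since midnight
    end: int        # minutes since midnight, exclusive
    priority: int   # larger number wins; this scale is an assumption


def overlaps(a: Request, b: Request) -> bool:
    """Two recordings clash if their time ranges intersect."""
    return a.start < b.end and b.start < a.end


def schedule(requests: List[Request]) -> List[Request]:
    """Greedy: accept requests from highest priority down, skipping clashes."""
    accepted: List[Request] = []
    for req in sorted(requests, key=lambda r: (-r.priority, r.start)):
        if not any(overlaps(req, kept) for kept in accepted):
            accepted.append(req)
    return sorted(accepted, key=lambda r: r.start)


requests = [
    Request("Film", 20 * 60, 22 * 60, priority=10),          # local user
    Request("Remote pick", 21 * 60, 21 * 60 + 30, priority=5),  # remote user
    Request("Late show", 22 * 60, 23 * 60, priority=1),      # suggestion
]
planned = schedule(requests)
```

Here the remote user's request clashes with the local user's film and is dropped, while the low-priority suggestion is still recorded because nothing else wants that slot.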

Schedule

A multi-channel version of videdot should be available by the end of the summer, with add-ons for at least one file-sharing network, and automatic CD and DVD handling.

See http://download.videdot.com/ for the latest information.

This document is online in HTML/CSS at http://videdot.com/report/, which also has links to the full source code for the project.

8 References

Books

Practical File System Design with the Be File System
Dominic Giampaolo
Morgan Kaufmann
ISBN: 1558604979
Peer-to-Peer
Edited by Andy Oram
O'Reilly & Associates
ISBN: 059600110X
Virtual Communities
Howard Rheingold
Philip and Alex's Guide to Web Publishing
Philip Greenspun
Also online at http://www.arsdigita.com/books/panda/

Websites

[tivo-web]
Official TiVo Website
http://www.tivo.com/
[tivo-faq]
Unofficial (but very useful) TiVo Frequently Asked Questions
http://www.tivofaq.com/
[fsk-tivo]
TiVo User Page
http://www.cse.buffalo.edu/~fsk/tivo.html
[avs-tivo]
AVS Forum Discussion on TiVo 'upgrade'
http://www.avsforum.com/ubbtivo/Forum1/HTML/007306-7.html
[replay-tivo]
ReplayTV vs. TiVo Comparison
http://ourworld.compuserve.com/homepages/elund/ptv.htm

Appendices

A User Guide

The user guide is now online at http://videdot.com/user-guide

B About BeOS

BeOS is a closed source, but free (as in beer, rather than speech) operating system produced by Be Inc. Its strengths are particularly in handling media well, with a lightweight, strongly multithreaded architecture, which has good message handling facilities. Another strength, which is of particular interest for this project, is the Be File System (BFS), which is journalled, indexed and attributed. BeOS allows programs to run 'live queries', and be sent messages when there are changes.

Attributes and Queries

There are a number of attributed file systems - SGI's XFS, reiserFS, ext3, IBM's JFS as well as BFS. The advantage to them, in general, is that files can be annotated with the appropriate information, and then found using queries based on those annotations. The key difference with attributed filesystems is that they allow a relatively free form of annotation, and can usually index the attributes to make searches over indexed attributes fast.

For example, BeOS ships with a People address book 'database', in which any person is a single empty file, with attributes describing their name, address, email, etc. You can then make searches over these files with anything that can query the file system. Thus, a mail client can just ask the file system for any person files anywhere on the file system where the email address is filled in, and present a list of those to the user however it pleases.

Moreover, BeOS supports live queries, which means that when running a query you can be notified of changes to anything within the set of results (new entries, entries that no longer satisfy the query). This is all based on message passing, rather than polling so is very efficient. It is the appropriate way to run the searches needed for this project.
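The behaviour of a live query can be illustrated with a toy model. The sketch below is plain Python and does not use the BeOS API: `AttributeStore`, `live_query` and the event names are invented stand-ins for an attributed file system that notifies listeners when a file enters or leaves a result set.

```python
from typing import Callable, Dict, List, Tuple

Attributes = Dict[str, str]
Predicate = Callable[[Attributes], bool]
Listener = Callable[[str, str], None]  # called with (event, path)


class AttributeStore:
    """Toy stand-in for an attributed file system with live queries."""

    def __init__(self) -> None:
        self.files: Dict[str, Attributes] = {}
        self.live: List[Tuple[Predicate, Listener]] = []

    def query(self, predicate: Predicate) -> List[str]:
        """One-off query: paths whose attributes satisfy the predicate."""
        return sorted(p for p, a in self.files.items() if predicate(a))

    def live_query(self, predicate: Predicate, listener: Listener) -> List[str]:
        """Return current matches and keep notifying listener of changes."""
        self.live.append((predicate, listener))
        return self.query(predicate)

    def set_attributes(self, path: str, attrs: Attributes) -> None:
        """Create or update a file, pushing changes to live queries."""
        old = self.files.get(path)
        self.files[path] = dict(attrs)
        for predicate, listener in self.live:
            matched_before = old is not None and predicate(old)
            matches_now = predicate(attrs)
            if matches_now and not matched_before:
                listener("entry_created", path)
            elif matched_before and not matches_now:
                listener("entry_removed", path)


store = AttributeStore()
store.set_attributes("/people/alice", {"name": "Alice", "email": "alice@example.com"})
store.set_attributes("/people/bob", {"name": "Bob", "email": ""})

events: List[Tuple[str, str]] = []
initial = store.live_query(lambda a: bool(a.get("email")),
                           lambda event, path: events.append((event, path)))

store.set_attributes("/people/bob", {"name": "Bob", "email": "bob@example.com"})
store.set_attributes("/people/alice", {"name": "Alice", "email": ""})
```

The listener is only called when the result set changes, which is the property that makes the approach cheaper than re-running the query on a timer.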

More information about BFS, and the design decisions in making it, are given by its designer, Dominic Giampaolo, in his book.

Threads

BeOS naturally creates a lot of threads, which are fortunately quite light. Creating a window, a view or a number of other constructs will implicitly create a thread. This pervasive multithreading means the system in general stays more responsive and is particularly useful on multi-processor machines.

What does this mean for developing under BeOS? In particular it means accessing resources properly, understanding the potential for deadlocks and synchronising access appropriately. Fortunately messages are well supported in BeOS, and are perfect for this task.
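As an illustration of why messages suit this task, the following Python sketch has a single thread own some state while other threads only post messages to it, so no explicit locking of that state is needed. It mimics the general idea of a message loop and is not the Be API: the `Looper` class here and its method names are hypothetical.

```python
import queue
import threading
from typing import Callable, Dict, List


class Looper:
    """Toy message loop: one thread handles all messages, in arrival order."""

    def __init__(self) -> None:
        self._inbox: queue.Queue = queue.Queue()
        self._handlers: Dict[str, Callable[[object], None]] = {}
        self._thread = threading.Thread(target=self._run, daemon=True)

    def add_handler(self, what: str, handler: Callable[[object], None]) -> None:
        self._handlers[what] = handler

    def start(self) -> None:
        self._thread.start()

    def post(self, what: str, payload: object = None) -> None:
        """Safe to call from any thread; the queue does the synchronising."""
        self._inbox.put((what, payload))

    def quit_and_wait(self) -> None:
        self._inbox.put(("_quit", None))
        self._thread.join()

    def _run(self) -> None:
        while True:
            what, payload = self._inbox.get()
            if what == "_quit":
                break
            handler = self._handlers.get(what)
            if handler is not None:
                handler(payload)


# Only the looper thread ever appends to this list.
recorded: List[object] = []
looper = Looper()
looper.add_handler("record", recorded.append)
looper.start()

senders = [threading.Thread(target=looper.post, args=("record", n)) for n in range(5)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()
looper.quit_and_wait()
```

Because every change to `recorded` happens on the looper's thread, the senders cannot interfere with each other, whatever order they run in.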

Media Kit

The BeOS Media Kit works using Producer/Consumer nodes. Thus you have a conceptual stream of data running from some source, in our case the video capture card, through nodes eventually to a consumer node which writes out the media file on disk.
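The shape of such a chain can be sketched with Python generators. This is only a conceptual model, not the Media Kit API: the node names are invented, the 'frames' are dummy byte strings, and the 'encoder' simply truncates each frame.

```python
import io
from typing import BinaryIO, Iterable, Iterator


def capture_node(frame_count: int) -> Iterator[bytes]:
    """Producer: stands in for the capture card, emitting dummy raw frames."""
    for index in range(frame_count):
        yield bytes([index % 256]) * 8


def encoder_node(frames: Iterable[bytes]) -> Iterator[bytes]:
    """Intermediate node: consumes raw frames and produces 'encoded' ones.

    The 'encoding' here just keeps the first two bytes of each frame.
    """
    for frame in frames:
        yield frame[:2]


def file_writer_node(frames: Iterable[bytes], sink: BinaryIO) -> int:
    """Consumer: writes every buffer it receives; returns bytes written."""
    written = 0
    for frame in frames:
        written += sink.write(frame)
    return written


# Connect the nodes: capture -> encoder -> writer.
sink = io.BytesIO()
total = file_writer_node(encoder_node(capture_node(3)), sink)
```

Each node only knows about the stream handed to it, so an encoder can be swapped without touching the source or the writer.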

Messages

A number of objects in the Be API are derived from the BLooper class, most notably BApplication and BWindow.

Key points to note about messaging are:

Views

The BView class is derived from BHandler, and so can be registered with a BLooper to process messages. Views are the principal means of displaying information to the user, and a number of useful view classes are defined in Be's API.

Much more useful information about these topics is covered in the Be Book, which is essential reading in order to implement anything in BeOS properly.

Useful BeOS resources:

C Listings Feed Format

Document Type Definition

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited by Gary Bielby "PA" Listings -->
<!-- DTD generated for TV Listings-->
<!ELEMENT attr (#PCDATA)>
<!ELEMENT cast (#PCDATA)>
<!ELEMENT certificate (#PCDATA)>
<!ELEMENT channel (#PCDATA)>
<!ELEMENT channel_data (prog_data+)>
<!ATTLIST channel_data 
       title CDATA #REQUIRED
>

<!ELEMENT detail (#PCDATA)>
<!ELEMENT director (#PCDATA)>
<!ELEMENT duration (#PCDATA)>
<!ELEMENT episode (#PCDATA)>
<!ELEMENT film_year (#PCDATA)>
<!ELEMENT genre (#PCDATA)>
<!ELEMENT listings_date (channel_data+)>

<!ATTLIST listings_date
       date CDATA #REQUIRED
>

<!ELEMENT prog_data (channel, time, duration, title, episode?, type?, genre?, film_year?, certificate?, director?, cast?, detail?, attr?)>
<!ELEMENT time (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT tv_data (listings_date)>
<!ELEMENT type (#PCDATA)>

Notes


Each programme contains the following information:


1)  <channel>         Channel Name
2)  <time>            Broadcast Time
3)  <duration>        Duration
4)  <title>           Programme Title
5)  <episode>         Episode Name/Number
6)  <type>            Programme Category. This will be one or more of the following:

		(Adult)
		(Animation)
		(Arts)
		(Business)
		(Children's)
		(Comedy)
		(Consumer)
		(Cookery)
		(Dance)
		(D.I.Y.)
		(Documentary)
		(Drama)
		(Educational)
		(Factual)
		(Film)
		(Gardening)
		(Holiday)
		(Light)
		(Motoring)
		(Film)
		(Music)
		(News)
		(Quiz)
		(Sci-fi)
		(Soap)
		(Sport)
		(Sports Related)
		(Live Sports)
		(Talk Show)
		(Weather)
		(Wildlife)

7)  <genre>           Film Genre (Films only). One of the following:

    Comedy
    Drama
    Mystery
    Thriller
    Musical
    Adventure
    Erotic Drama
    Western
    Science Fiction
    Fantasy
    Horror
    Animation
    Biopic
    Documentary

8)	<film_year> 		Film Year (Films only)
9)	<certificate>   Film Certificate (only on Satellite films)
10)	<director>      Film Director
11)	<cast>          Cast
12)	<detail>        Synopsis
13)	<attr>          This will include one or more of the following:

	  S = programme is in stereo
	  T = programme carries Teletext or Ceefax subtitles
	  R = programme is a repeat
	  B = programme is broadcast in Black and White
	  P = this is a broadcast premiere
	  L = this is the last in a series of showings of this programme
	  D = programme is being broadcast in surround sound
	  W = programme is being shown in widescreen format
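As an illustration of reading a feed that follows the DTD above, here is a minimal Python sketch using the standard library's XML parser. The sample document is invented for this example and is not real Press Association data; in particular the date and time formats, and the assumption that the attr flags arrive as a run of single letters, are guesses. The short flag descriptions paraphrase the notes above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Single-letter flags from the notes, with abbreviated descriptions.
ATTR_FLAGS = {
    "S": "stereo",
    "T": "subtitles",
    "R": "repeat",
    "B": "black and white",
    "P": "premiere",
    "L": "last showing",
    "D": "surround sound",
    "W": "widescreen",
}

# Invented sample; the element structure follows the DTD, the values do not
# come from any real feed.
SAMPLE = """<tv_data>
  <listings_date date="01/06/2001">
    <channel_data title="Example One">
      <prog_data>
        <channel>Example One</channel>
        <time>19:00</time>
        <duration>30</duration>
        <title>Example News</title>
        <type>(News)</type>
        <attr>ST</attr>
      </prog_data>
    </channel_data>
  </listings_date>
</tv_data>"""


def parse_listings(xml_text: str) -> List[Dict[str, object]]:
    """Flatten every prog_data element into a dict of its child elements."""
    root = ET.fromstring(xml_text)
    programmes: List[Dict[str, object]] = []
    for day in root.iter("listings_date"):
        for prog in day.iter("prog_data"):
            record: Dict[str, object] = {
                child.tag: (child.text or "").strip() for child in prog
            }
            record["date"] = day.get("date")
            # Decode known flag letters; anything unrecognised is ignored.
            record["flags"] = [
                ATTR_FLAGS[c] for c in str(record.get("attr", "")) if c in ATTR_FLAGS
            ]
            programmes.append(record)
    return programmes


programmes = parse_listings(SAMPLE)
```

Optional elements such as episode or genre are simply absent from a record when the feed omits them, so callers should look them up with a default.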

Printing Guidelines

This document is best printed using a browser which understands Cascading Style Sheets. There are specific suggestions for printing in the style sheet.

For the pretty version, you may wish to replace the first page with the one I have (to be online shortly), and simply don't print (this) last page, since once you have printed it you shouldn't need instructions on how to do so.

If you would rather a postscript version, mail ash@videdot.com, and I can make one available.