Find a SECReT supervisor
Information for overseas students
View SECReT animation
Download SECReT brochure

Secure digital archive search using a probably approximately correct architecture

22 February 2012

Sami Richardson

There is a growing interest in digital archives. A number of digital archive projects exist that use peer-to-peer (P2P) networks to provide long-term, persistent storage of digital content. Several are designed to withstand censorship attempts. They allow retrieval of known items, but the problem of indexing content and providing a secure search service, free of any controlling authority, remains unsolved. One solution might be to use the same P2P network to also host the search service.

We evaluate the security of a Probably Approximately Correct (PAC) architecture. Such an architecture could be used to handle search of digital archive content on a P2P network. PAC search involves dividing up the document search index, distributing these partial indices randomly across nodes (computers), and performing queries by consulting a random subset of partial indices. Documents can be allocated to partial indices uniformly randomly, or in proportion to their query popularity. The former provides security against censorship by ensuring all documents are indexed on a similar number of nodes, and the latter improves search performance but results in the placement of unpopular documents on relatively few nodes, increasing the risk of censorship.

We propose a hybrid combination of uniform and non-uniform PAC search to achieve security and performance. Four different attacks are modelled and their effectiveness assessed. With our attack models, and knowledge of the resources of an adversary, it would be possible to design a PAC search system that has a known probability of resisting attack.