Participatory Databases

"Arguing in My Spare Time," No. 5.05

by Arnold Kling

February 20, 2002

Even my closest friends and associates do not know that I am the one to call if somebody in Silver Spring has their air conditioning break down. After all, there is nothing in my background that suggests that I could fix an air conditioner, and indeed I never have.

What I do have that can help with broken air conditioning is the name and phone number of a terrific repairman. Every time we have recommended his service to someone, the recipient has been grateful.

It makes sense that total strangers would not call me about an air conditioning repairman. Why should they trust me? But my friends and associates could take advantage of this information, if they had a way to get to it.

This is one potential use for a participatory database. I will describe the specific application later in this essay. First, I want to explain the general concept of a participatory database.

Participatory vs. Adversarial

In general, I am talking about databases with information on many people, or members. Databases enhance convenience. Databases at the Motor Vehicle Administration make it easier to track down cars that participate in hit-and-run accidents. Databases at credit bureaus make it easier to approve transactions at stores.

However, people are suspicious of databases. There is a book called Database Nation, which I have not read, that I believe raises concerns about the growth of databases.

I believe that there is a distinction to be made between participatory and adversarial databases. Participatory databases are databases that serve the needs and interests of members, who contribute their data openly and voluntarily. Adversarial databases serve the needs and interests of owners, who obtain their data surreptitiously, often against the preferences of their members.

                 | Participatory                       | Adversarial
Examples         | Paypal; Instant Messenger           | telemarketers' lists; a totalitarian government's list of "enemies of the state"
Data is used by  | list members                        | list owners
Data is gathered | openly and voluntarily from members | secretly and without members' permission

Participatory databases are used by members at their own convenience. Instant messenger is a service that allows online conversations, but its value depends on the database of screen names. Paypal is a service for making small payments between individuals, but its value depends on the database that links individual email addresses to credit card accounts. One of auction site eBay's features is a list of ratings of sellers. Those ratings constitute a participatory database.

Adversarial databases are used by their owners for their own advantage, often to the detriment of people in the database. Telemarketers and email spammers are among the most notorious offenders. Many grocery stores offer discounts to customers who allow their purchases to be tracked using affinity cards. My presumption is that these are adversarial databases, in that the long-run intent of the grocery stores is to figure out ways that they can get more profits out of each customer.

As it stands now, the relationship between consumers and credit bureaus is purely adversarial. If it were not for the laws governing "fair use," credit data would be sold to anyone willing to pay for it.

In books like Net Worth and Permission Marketing, popular consultants have articulated a vision in which databases supposedly confer a decisive strategic advantage on the corporations that capture them, and consumers will be grateful for the targeted marketing that results. However, these databases strike me as adversarial, in that the information is going to empower database owners rather than consumers.

I believe that Microsoft conceives of its Passport authentication service as a participatory database. Members will benefit by being able to transact with one another more openly and securely. However, the media and Microsoft's opponents are treating Passport as if it were an adversarial database, implying that a consumer who signs up for Passport will be subject to manipulation by Microsoft. What I suggested in The Passport Intifada was that Microsoft needs to find a partner with a better reputation in order to market Passport.

Why Participatory Databases Are Now Economical

In the past, databases were more likely to be adversarial than participatory. The reason is that the cost of data storage and retrieval used to be high. If the cost of maintaining a database is high, then it becomes nearly impossible to support the database on the basis of membership dues or fees. Instead, the economics tend to work only if the database provides large benefits to its owners. This leads to adversarial databases.

The cost of storing and searching through large databases is plummeting, making many applications economical that were prohibitive just a few years ago. I remember worrying about data storage costs five years ago when Homefair.com was considering an application involving data stored by Census Tract. To save on data storage, we aggregated from Census Tracts to zip codes. Now, that seems foolish. Data storage is cheap.

Another expense involved in developing a database is the cost of data acquisition. However, if the members are connected to the Internet, then this cost can be very low. Once someone fills out a form, the database can capture the information without a human having to read or transcribe the data.

Because the cost of developing and maintaining databases is falling, the economics now favor participatory databases. There are large potential benefits from information sharing, and the participatory model makes it easier to share the benefits without having consumers hand over power to exploitative organizations.

Credit bureaus could be structured to be participatory, by giving individuals more control over how their data is used. My guess is that if there were no credit information system in existence now, the system that would emerge with today's cost structure would be participatory rather than adversarial.

I Know a Great...

The benefit of participatory databases comes from speeding up information sharing processes by an order of magnitude. The processes that could be faster include making a payment, registering for a web site, and finding trustworthy information.

One entrepreneur who is working on a participatory database is Duncan Work of Net Deva. His company is trying to put together a system to capitalize on the truism that "It's not what you know, it's who you know."

After I met with Duncan, I suggested that he could provide a stripped-down version of his application, which one might call "I Know a Great..." The idea is that each participant in the database would supply two lists.

  1. A list of friends and associates that the participant trusts. These would be people who I am confident would be motivated to help me if I asked them to recommend an expert.
  2. A list of experts that the participant knows. These would be people or information sources whose expertise I respect and who I know well enough to provide an introduction. For example, I know:

The application of the participatory database would consist of linking the two lists. For example, suppose that I need to find an expert in voice recognition software. Here is what would happen.

  1. I would submit my query to the "I Know a Great..." system.
  2. The system would start with my list of trusted friends and associates. It would try to find a trusted friend or associate who lists an expert in voice recognition software among their expert contacts.
  3. Next, the system would look at the second layer--the friends and associates of my friends and associates. It would try to find an expert in voice recognition software from their list of expert contacts.
  4. Once I have located someone who knows an expert, I can use the chain of connections to obtain an introduction to that person. (The longer the chain, the more tenuous the connection, and the more difficult it becomes to trust the match.)
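
The steps above amount to a breadth-first search outward from the person asking. Here is a minimal sketch in Python, not a description of Net Deva's actual system. The function name find_expert, the max_depth parameter, and all of the people and data in the example are hypothetical and chosen only for illustration.

```python
from collections import deque

def find_expert(asker, topic, trusted, experts, max_depth=2):
    """Search the asker's trust network, nearest layer first.

    trusted: maps each participant to the list of people they trust.
    experts: maps each participant to a dict of {topic: expert they know}.
    Returns the chain of introductions ending with the expert, or None.
    All names and the max_depth limit are assumptions for this sketch.
    """
    queue = deque([(asker,)])   # each entry is a chain starting at the asker
    seen = {asker}
    while queue:
        chain = queue.popleft()
        person = chain[-1]
        # Skip the asker's own expert list; check everyone reached through trust.
        if len(chain) > 1 and topic in experts.get(person, {}):
            return chain + (experts[person][topic],)
        # Only extend the chain while it is within the allowed number of layers.
        if len(chain) - 1 < max_depth:
            for friend in trusted.get(person, []):
                if friend not in seen:
                    seen.add(friend)
                    queue.append(chain + (friend,))
    return None

# Hypothetical data: who trusts whom, and which experts each person knows.
trusted = {
    "arnold": ["beth", "carl"],
    "beth": ["dana"],
    "carl": [],
    "dana": ["erin"],
}
experts = {
    "arnold": {"air conditioning repair": "the repairman"},
    "dana": {"voice recognition software": "Dr. V"},
    "erin": {"tax law": "Ms. T"},
}

chain = find_expert("arnold", "voice recognition software", trusted, experts)
print(chain)
```

Because the search goes layer by layer, the first match found is also the shortest chain, which matters given that longer chains are harder to trust. The max_depth limit is one simple way to refuse matches that are too tenuous.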

Other Applications

Participatory databases have other potential applications. Some of these are very important.

  1. Non-terrorists.

    I know some people well enough to be very confident that they are not terrorists. I conjecture that if you could pool all of the information that non-terrorists have about one another, you could have a database that could be used to make it much easier to deal with security at airports, sporting events, and so forth.

    How would you keep terrorists from infiltrating the non-terrorist database? It would be important to develop filtering methods to ensure that recommendations come from reliable sources.

    Not everyone's recommendations would be given equal weight. The recommendations from people whose decency can be strongly confirmed would be given more weight than those from people whose ratings are provisional.

  2. Non-spammers.

    It seems that blocking spam is very difficult. Spammers tend to use email accounts that are new or phony. However, the list of non-spammers might be more stable.

    It may be possible to create a participatory database consisting of people who do not send unsolicited commercial email. Internet service providers could filter out email that does not come from participants in that database.
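
The non-spammer filter can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name filter_mail, the message format (a dict with a "sender" field), and the example addresses are all hypothetical, and the sketch takes the sender address at face value, which a real service could not do because sender addresses are easy to forge.

```python
def filter_mail(messages, non_spammers):
    """Keep only messages whose sender is in the non-spammer database.

    messages: list of dicts, each with a "sender" email address (assumed format).
    non_spammers: iterable of addresses of database participants.
    """
    # Normalize addresses so that case and stray spaces do not cause misses.
    allowed = {addr.strip().lower() for addr in non_spammers}
    return [m for m in messages if m["sender"].strip().lower() in allowed]

# Hypothetical participants and incoming mail.
non_spammers = ["friend@example.org", "colleague@example.com"]
inbox = [
    {"sender": "Friend@Example.org", "subject": "Lunch?"},
    {"sender": "deals@example.net", "subject": "Act now"},
    {"sender": "colleague@example.com", "subject": "Draft attached"},
]

kept = filter_mail(inbox, non_spammers)
print([m["subject"] for m in kept])
```

The point of the design is that the list of participants is the stable thing to maintain, since spammers' addresses keep changing.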

Conclusion

Shared information systems offer large potential benefits. However, this potential tends to go unrealized. One reason is that historically databases have been adversarial, so that people are not accustomed to obtaining benefits from shared databases. Another reason is that there are network effects, so that the benefits of a database may be small until it reaches critical mass. Yet another reason is that there may be free rider issues--ways for a non-participant to enjoy the benefits without the cost.

A participatory database is one aspect of what David Brin calls The Transparent Society. That is a society where it is taken as given that surveillance will be feasible. Under those circumstances, checks against abuse come from making information broadly available, rather than limiting knowledge to a few large corporations or government agencies. One of the reasons that people may have difficulty buying into the transparent society is that they view databases through the framework of the adversarial model. The participatory model is a relatively new concept. I think that few people appreciate its significance.

All of the challenges to developing participatory databases are reduced by the availability of low-cost data storage and Internet communication. Now, instead of building databases to take power from one group and give it to another, we can develop databases where the distribution of benefits is more democratic. I predict that we will see continued growth in this area.