Wednesday, October 15, 2008

Information Management

Monday, April 9, 2007, 23:15
This news item was posted in Spotlight category and has 1 Comment so far.

The Problem

In one way or another, we are all hopelessly disorganized in our digital lives. Our inboxes overflow with email. Our documents are scattered across half a dozen hard drives, none of them backed up.
This might be because we are lazy, but there’s an old adage in software development that says laziness is a virtue. By laziness, we mean only avoiding unpleasant work. For a programmer, the most tedious work is work that could be done by a program. Rather than spend an hour on a repetitive task, a programmer will spend 59 minutes writing a program to complete the task in 30 seconds. Our laziness is justified, because software should be doing most of the work of information management for us.

There are plenty of great information management tools out there. Certainly, iPhoto has made it easier to organize digital photos. Flickr and Del.icio.us have popularized tagging—organizing items by simply marking them with keywords—and created a new way to navigate large amounts of data. And iTunes is a definite improvement over manually organizing MP3s into folders.

But as helpful as these applications are, they can be frustrating to use, because each one implements a slightly different set of features, even though they are basically solving the same information management problems. For example, iPhoto allows you to tag a photo with keywords, but iTunes doesn’t allow you to do the same thing for a song. Subtle incompatibilities like this can contribute to a frustrating user experience.

Even worse than slight incompatibilities between applications is that they often support entirely different data models. With so many different applications to manage our data, we have to keep track of several different data models, and it’s easy to get confused. For instance, when browsing photos, you might see a photo that you want to send to a friend. In both Picasa and iPhoto, a button allows you to email the photo to them. But you can’t do the same thing with a song in iTunes, or a bookmark in Firefox. What’s so different about each of those things? Unfortunately, this data lives in a balkanized world, and what we are allowed to do with the data depends on what form it is in.

This balkanization of our data also makes it more difficult to find things. Before being able to search for something, you have to know what form the data is in, so that you can search in the right application. Did I store it as a Del.icio.us bookmark? Did someone email it to me, or was it in an instant message?

Another usability problem occurs when trying to share data between applications. A really simple example: a friend asks you to email him the photos from your last trip together. You have no problem finding the photos in Picasa, because you’ve got an album for the trip. But if you try to email them to your friend from your Yahoo! Mail account, you’ll have to browse through the file system to find the files. Even though you have the photo up in Picasa (Right there! That one!!), you can’t communicate that in an intuitive way to the web browser.

All of these problems arise because, when we use many different specialized applications for personal information management, the data is segregated based on its form. Using the term segregated isn’t an exaggeration: in some ways, the data is literally not allowed to mix together.

In short, there are several usability problems caused by the fact that we use many different specialized applications for managing our data. We can become frustrated and confused by incompatible data models and inconsistent features. It’s harder to find the information we are looking for, because we have to remember what form the data is in. Communication between applications is awkward because they don’t speak the same language. The data is stuck in silos, segregated by its type. This prevents us from using perfectly natural ways of organizing our data.

Working towards a Solution

Now that we’ve established what the problem is, the question is: what can we do to fix it? Obviously we can’t expect a single application to support all of our needs. We still need specialized software like iPhoto for managing photos, and Gmail for email. The problem is not really with the applications themselves, but with the platform they’re built upon.

In software terms, a platform is a collection of common routines, and a set of interfaces allowing applications to use the routines. Normally, an application is built directly on the routines provided by the operating system. Developers and designers have long understood that an inconsistent user interface is difficult to use, so the UI is built into the platform, resulting in applications that mostly look and feel the same. In order to achieve the same kind of consistency with information management features, we need a platform designed for the manipulation of rich information.

While the amount of information that the average person deals with has increased dramatically in the last 20 years, file systems have hardly changed at all.

All modern operating systems do in fact provide a common way to manage information: the file system. Unfortunately we are still stuck with the old file and folder model. The problem with this model is that an increasing amount of data just doesn’t fit into it. For example, a single email usually does not correspond directly to a file on the local disk. Another example is bookmarks—many people collect and organize hundreds of bookmarks, but a bookmark is not a first-class object like a file.

In a broad sense, we need a new information management platform which is really just a new kind of file system, based on the needs of today’s users. We need a system that will make it easier to manage and navigate the large amounts of rich and diverse information that people deal with every day.

To recap, there are five distinct usability problems, all caused by the fact that we use many different specialized applications for managing our data:
1. Inconsistent features between applications;
2. Incompatible data models;
3. Difficulty finding data, because we have to know where to look based on the type of the data;
4. Awkward sharing of data between applications;
5. Inability to mix different types of data together.

The key requirements for a framework to be successful:

A useful and usable framework
Only if it’s actually used can an information management framework help solve the problems identified here. The framework must be easy for application developers to build upon, and it must be useful enough to be worth their effort. By building on this framework, application developers would be able to focus on the core functionality of their applications, rather than wasting their time reinventing common information management features.

Extensible for new kinds of data
By having applications build upon this framework, we eliminate the problem of having incompatible data models. But the platform must be extensible so that it can handle new types of data. The reason that we have to deal with the different data models of specialized applications is that the existing platform (the file system) is not suited to managing the rich data that today’s applications require. If the framework is not built from the ground up to be extensible, we will quickly find ourselves in the same situation we are in now: trying to do today’s job with yesterday’s tools.
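
To make the idea of extensibility a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the names `Item` and `KindRegistry`, and the idea that a kind is described by a set of required fields, are assumptions made for illustration rather than part of any existing platform. The point is only that a new kind of data can be added without touching the core.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Item:
    """A generic piece of information; its kind decides which fields it carries."""
    kind: str
    fields: Dict[str, Any] = field(default_factory=dict)


class KindRegistry:
    """Hypothetical registry mapping a kind name to its required field names."""

    def __init__(self) -> None:
        self._required: Dict[str, Set[str]] = {}

    def register(self, kind: str, required_fields: Set[str]) -> None:
        # Registering a kind needs no change to Item or to the registry itself.
        self._required[kind] = set(required_fields)

    def create(self, kind: str, **fields: Any) -> Item:
        if kind not in self._required:
            raise KeyError(f"unknown kind: {kind}")
        missing = self._required[kind] - set(fields)
        if missing:
            raise ValueError(f"missing fields for {kind}: {sorted(missing)}")
        return Item(kind=kind, fields=dict(fields))


registry = KindRegistry()
registry.register("email", {"sender", "subject"})
registry.register("bookmark", {"url", "title"})
# A kind nobody planned for, added later by some application.
registry.register("podcast", {"feed_url", "episode"})
```

An application that invents a new kind of data calls `register` once, and from then on its items are ordinary `Item` objects like any other.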

Comprehensive search capability
The third problem is that it’s difficult to find data, because you have to know where to look depending on what form the data is in. If it’s in an email, you have to search in one place, but if it’s in a file on your hard drive, you have to search in another place.
While search is not the answer to all our information management problems, it is a very useful feature. Now that Google is a verb, most people are comfortable using search as a primary way to find data. A new platform for information management should provide advanced search capability. Apple has done the right thing by building Spotlight’s sophisticated search functionality into the operating system, and allowing applications to build upon it.
But in order for search to be truly effective, we need to be able to search all of our data at once, instead of having to search in each of the individual silos. Having a single framework for managing rich information means that it will be able to search through all different kinds of data, no matter what form it takes.
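
As a rough illustration of searching everything at once, the sketch below keeps emails, files, and bookmarks in a single list and runs one query over all of them. The `Item` type, the `search` function, and the sample data are all made up for this example, and a real platform would use an index rather than a linear scan.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Item:
    kind: str   # e.g. "email", "file", "bookmark"
    title: str
    text: str


def search(items: List[Item], query: str) -> List[Item]:
    """Return every item whose title or text contains the query, ignoring case."""
    needle = query.lower()
    return [
        item for item in items
        if needle in item.title.lower() or needle in item.text.lower()
    ]


# One store for every kind of data, instead of one silo per application.
store = [
    Item("email", "Re: trip photos", "Here are the pictures from Lisbon."),
    Item("file", "lisbon-itinerary.txt", "Day 1: Alfama. Day 2: Belem."),
    Item("bookmark", "Python docs", "https://docs.python.org/3/"),
]

hits = search(store, "lisbon")
```

Here `hits` contains the email and the file, found by a single query, without the user having to remember which application holds which.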

All data on equal footing
One of the problems with current information management systems is that it’s difficult, if not impossible, for different types of data to be mixed together. You can’t create a folder that contains an email, a photo album, and some bookmarks. This problem is also related to the problem of inconsistent features and data models. Things that can be done with one type of data, like a file on the file system, can’t necessarily be done to other kinds of data.
In other words, there is an artificial distinction between different types of data. What a bookmark, an email, and a text file all have in common is that they are distinct, discrete pieces of information. If the purpose of the file system is to allow the user to store and organize information, then it should be able to treat these kinds of items equally. All types of data must be on equal footing. Anything that can be done with a file—like copying, searching, or sorting—should be possible with other pieces of information. If all data is on equal footing, then it would be possible to have a folder containing several different types of data.
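
A small Python sketch shows what equal footing could look like in practice. The `Item` and `Folder` classes are hypothetical and deliberately tiny; the assumption is simply that every piece of information, whatever its kind, is the same sort of object, so that a folder can hold any mixture and the usual operations work on all of it.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    kind: str   # "email", "photo_album", "bookmark", ...
    name: str


@dataclass
class Folder:
    name: str
    items: List[Item] = field(default_factory=list)

    def add(self, item: Item) -> None:
        # No check on item.kind: any kind of item may live in any folder.
        self.items.append(item)

    def sorted_by_name(self) -> List[Item]:
        return sorted(self.items, key=lambda item: item.name.lower())


trip = Folder("Summer trip")
trip.add(Item("email", "Hotel confirmation"))
trip.add(Item("photo_album", "Beach photos"))
trip.add(Item("bookmark", "City guide"))

# Copying and sorting behave the same way regardless of what the folder holds.
backup = copy.deepcopy(trip)
names = [item.name for item in trip.sorted_by_name()]
```

Nothing in `Folder` knows or cares what kinds of items it contains, which is exactly the property the file system lacks today.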

Flexible organization features
The folder (or directory) is the most common organizational metaphor used on computers. Originally, this concept was designed to be analogous to a physical file folder, so a document could only ever be in one folder. But it often makes sense for a document to be in two different folders at the same time.

In information architecture, it’s good practice to support several paths to a piece of information. This is generally because we need to support many different users. But even with a single user, there are sometimes several different mental models involved.
The idea that an object could exist in multiple folders is known as multiple classification, and it has recently become popular in the form of tags. Flickr, Del.icio.us, and many other web services allow you to associate several keywords with your data. By doing so, you are indicating that the data falls into various categories, with the idea that this will help you or someone else more easily find the data later.
Providing support for multiple classifications is just one example, but in general, for a new information management platform to be successful, it must be flexible enough to allow you to organize your data however you want.
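
Multiple classification is easy to sketch in code. The `TagIndex` class below is an illustrative assumption, not how Flickr or Del.icio.us actually store tags: it maps each tag to the set of item identifiers that carry it, so one item can sit under as many tags as you like.

```python
from collections import defaultdict
from typing import Dict, Set


class TagIndex:
    """Hypothetical index from tag to the identifiers of items carrying it."""

    def __init__(self) -> None:
        self._items_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def tag(self, item_id: str, *tags: str) -> None:
        # The same item may be filed under any number of tags.
        for t in tags:
            self._items_by_tag[t.lower()].add(item_id)

    def find(self, *tags: str) -> Set[str]:
        """Return the items that carry every one of the given tags."""
        if not tags:
            return set()
        sets = [self._items_by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*sets)


index = TagIndex()
index.tag("photo-17", "lisbon", "family")
index.tag("email-4", "lisbon", "receipts")
index.tag("bookmark-9", "receipts")
```

Asking for `index.find("lisbon")` returns both the photo and the email, while `index.find("lisbon", "family")` narrows the result to the photo alone. Each tag behaves like a folder, except that an item is free to be in several of them at once.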

One Response to “Information Management”

  1. Rahul S said on Tuesday, April 17, 2007, 15:49

    This is quite interesting. I am pretty surprised and happy to know there exist several services that make life much simpler. In my opinion, for organizing music collections, a cataloguing software like Cathy combined with Mp3Tag does the job quite beautifully. Good article though!
