NICTA working on new search capabilities

Australian researchers may soon produce a new Internet search system with unprecedented search capabilities to help individuals and organisations retrieve information from the plethora of new data types now populating the net.

The National ICT Australia (NICTA) Interactive Information Discovery and Delivery Project (I2D2) recognises that in an environment where users expect to receive complex data, including video, images and other non-textual forms on a range of mediums including mobile phones, information discovery alone is not enough.

To overcome those limitations, researchers have been making significant headway on methods that combine understanding of natural language with spatial and temporal information about data stored on a network to resolve challenges in information discovery not addressed by current methods.

The project recognises that as the information net accumulates more and more types of data stored in pictorial, audio, video, or other non-textual forms, traditional methods of information retrieval become increasingly inadequate.

Since data doesn't become information unless it can be readily used, search technologies need to deliver it in a form practical to the user. Scaling down a complex diagram to fit a mobile phone screen is little help if the result is unreadable. That means information must be presented in ways that match the capabilities of the output device.

One technique under consideration is to make complex diagrams interactive, allowing users to explore them by expanding and collapsing sections as desired.
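As a rough illustration of the expand-and-collapse idea, the sketch below models a diagram as a tree and lists only the parts a small screen would currently show. The Node class, the visible_labels function and the sample diagram are hypothetical names chosen for this example, not part of the NICTA project.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a diagram; its children stay hidden while collapsed."""
    label: str
    children: list["Node"] = field(default_factory=list)
    expanded: bool = False


def visible_labels(node, depth=0):
    """Return the indented labels that are currently visible."""
    lines = ["  " * depth + node.label]
    if node.expanded:
        for child in node.children:
            lines.extend(visible_labels(child, depth + 1))
    return lines


# Hypothetical diagram, used only to exercise the sketch.
diagram = Node("Network", [
    Node("Servers", [Node("Web"), Node("Database")]),
    Node("Clients", [Node("Phone"), Node("Laptop")]),
])

diagram.expanded = True              # the user opens the top level
diagram.children[0].expanded = True  # the user drills into "Servers"

print("\n".join(visible_labels(diagram)))
```

Here "Clients" stays collapsed, so its two children are left off the screen until the user asks for them.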

The researchers say good information delivery relies on understanding the information being presented, and selecting the appropriate method for the presentation of that information.

Researchers in this program are investigating how an understanding of natural language can be used in combination with spatial and temporal information about data stored on the network to create better retrieval approaches, and are working to build efficient indexing mechanisms and query languages that take into account the better understanding of data.

The latest edition of NICTA News notes that massive information networks are mainly used to retrieve information stored, either explicitly or implicitly, on the network. Extracting information that is only implicit in large volumes of networked data is far more challenging than the technology Google uses to retrieve information stored explicitly.

"This problem, tackled by data mining, among other technologies, has developed into an important research direction.

"Current methods of information discovery are limited. While Google™ has been incredibly successful with its (relatively) straightforward approach of ranking information by the number of links that are made to it, the capacity of the underlying search mechanism to find information attached to keywords is limited," it says.
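The link-counting idea in that description can be sketched in a few lines. This is a simplified reading of the newsletter's wording, not Google's actual ranking method, and the function name and sample pages are assumptions made for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order pages by how many other pages link to them.

    links maps each page to the list of pages it links to.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for source, targets in links.items():
        counts[source] += 0          # keep pages that nobody links to
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] += 1
    return sorted(counts, key=lambda page: (-counts[page], page))


# Hypothetical four-page web, used only to exercise the sketch.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(rank_by_inbound_links(web))
```

Nothing in this ranking looks at what a page means, which is the limitation the newsletter goes on to describe.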

One problem is the ambiguity of keywords: a word with many meanings can be useless as a search term. Keyword matching also cannot handle queries that carry spatial or temporal constraints, such as 'Find the nearest book store'. "A better understanding of the content of the information to be stored and retrieved is crucial to more effective information discovery.

"Another challenge is to build efficient indexing mechanisms and query languages that take into account the better understanding of data," the newsletter says.
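To make the 'Find the nearest book store' case concrete, the sketch below combines a keyword filter with a distance calculation, which plain keyword matching cannot do. The place records, their coordinates and the function names are invented for illustration; a real system would need a proper spatial index rather than a linear scan.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest(places, keyword, lat, lon):
    """Return the closest place whose category contains the keyword."""
    matches = [p for p in places if keyword in p["category"]]
    if not matches:
        return None
    return min(matches,
               key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


# Hypothetical records; names and coordinates are made up.
places = [
    {"name": "Harbour Books", "category": "book store", "lat": -33.86, "lon": 151.21},
    {"name": "Far Books", "category": "book store", "lat": -33.80, "lon": 151.00},
    {"name": "Corner Cafe", "category": "cafe", "lat": -33.87, "lon": 151.21},
]

result = nearest(places, "book store", -33.87, 151.21)
print(result["name"] if result else "no match")
```

The cafe is closest to the user but is filtered out by the keyword, and the two book stores are then separated by distance alone.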

NICTA says the spread of sensor technologies for measurement and monitoring will drive exponential growth in the data delivered to people and machines.