Turn computers loose on your data, and you don't know what they'll come up with -- that's the whole point
Edmund X. DeJesus, Senior Editor
There's gold in your data, but you can't see it. It may be as simple (and wealth-producing) as the realization that baby-food buyers are probably also diaper purchasers. It may be as profound as a new law of nature. But no human who's looked at your data has seen this hidden gold. How can you find it?
Data mining lets the power of computers do the work of sifting through your vast data stores. Tireless and relentless searching can find the tiny nugget of gold in a mountain of data slag.
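To make the baby-food-and-diapers example concrete, here is a minimal sketch of the kind of sifting involved: counting which items turn up together in the same purchase. The purchase records, the function names, and the simple co-occurrence measure are all assumptions made for illustration. They are not drawn from any particular data-mining product, which would be working on far larger data with far more sophisticated methods.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records, invented purely for illustration.
baskets = [
    {"baby food", "diapers", "milk"},
    {"baby food", "diapers"},
    {"bread", "milk"},
    {"baby food", "diapers", "bread"},
    {"diapers", "milk"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always recorded in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
print(confidence(baskets, "baby food", "diapers"))
```

On this toy data the most frequent pair is baby food and diapers, and every basket containing baby food also contains diapers. A person could spot that in five records; the point of turning a computer loose is that the same count works unchanged on millions.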
In "The Data Gold Rush," Sara Reese Hedberg shows the already wide variety of uses for the relatively young practice of data mining. From analyzing customer purchases to analyzing Supreme Court decisions, from discovering patterns in health care to discovering galaxies, data mining has an enormous breadth of applications. Large corporations are rushing to realize the potential payoffs of data mining, both in the data itself and in marketing their proprietary tools.
In "A Data Miner's Tools," Karen Watterson explains the three categories of software that perform data mining. Query-and-reporting tools, in vastly simplified and easier-to-use forms, require close human direction and data laid out in databases or other special formats. Multidimensional analysis (MDA) tools demand less human guidance but still need data in special forms. Intelligent agents are virtually autonomous, are capable of making their own observations and conclusions, and can handle data as free-form as paragraphs of text.
"Data Mining Dynamite" by Cheryl D. Krivda shows how to facilitate the data-mining process. Data is handled far faster after it has been cleansed of unnecessary fields and stored in more convenient forms. Housing data in data warehouses reduces the load on production mainframes and supports client/server analysis. Parallel computing speeds the search process with multiple simultaneous queries. And any activity handling this volume of data requires consideration of physical storage options.
In the short term, the results of data mining will be profitable, if mundane, business applications. Micro-marketing campaigns will explore new niches. Advertising will target potential customers with new precision.
In the not-too-long term, data mining may become as common and easy to use as E-mail. We may direct our tools to find the best airfare to the Grand Canyon, root out a phone number for a long-lost classmate, or find the best prices on lawn mowers. The software will figure out where to look, how to evaluate what it finds, and when to quit. Our knowledge helpers may become as indispensable as the telephone.
But it's the long-term prospects of data mining that are truly breathtaking. Imagine intelligent agents being turned loose on medical-research data or on subatomic-particle information. Computers may reveal new treatments for diseases or new insights into the nature of the universe. We may well see the day when the Nobel prize for a great discovery is awarded to a search algorithm.