Data Mining


October 1995 / State Of The Art / Data Mining

Turn computers loose on your data, and you don't know what they'll come up with -- that's the whole point

Edmund X. DeJesus, Senior Editor

There's gold in your data, but you can't see it. It may be as simple (and wealth-producing) as the realization that baby-food buyers are probably also diaper purchasers. It may be as profound as a new law of nature. But no human who's looked at your data has seen this hidden gold. How can you find it?

Data mining lets the power of computers do the work of sifting through your vast data stores. Tireless and relentless searching can find the tiny nugget of gold in a mountain of data slag.
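The baby-food and diaper example is a co-occurrence pattern, and a small sketch can show the kind of sifting involved. The code below is only an illustration: the purchase records are invented, the function name `pair_rules` and the `min_support` threshold are assumptions, and it does not represent the method of any product discussed in this section. It counts how often pairs of items appear in the same basket and reports two standard measures: support (the share of all baskets that contain both items) and confidence (of the baskets containing the first item, the share that also contain the second).

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3):
    """Return (lhs, rhs, support, confidence) tuples for item pairs.

    Hypothetical helper for illustration; real tools use far more
    scalable algorithms over far larger data.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be interesting
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            rules.append((lhs, rhs, support, confidence))

    # Strongest rules first: by confidence, then support, then name.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Invented purchase records, one list per customer basket.
BASKETS = [
    ["baby food", "diapers", "milk"],
    ["baby food", "diapers"],
    ["bread", "milk"],
    ["baby food", "diapers", "bread"],
    ["diapers", "milk"],
]

if __name__ == "__main__":
    for lhs, rhs, support, confidence in pair_rules(BASKETS):
        print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data the strongest rule is "baby food -> diapers": every basket with baby food also holds diapers. A person could spot that in five baskets; the point of data mining is to find it in millions.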

In "The Data Gold Rush," Sara Reese Hedberg shows the already wide variety of uses for the relatively young practice of data mining. From analyzing customer purchases to analyzing Supreme Court decisions, from discovering patterns in health care to discovering galaxies, data mining has an enormous breadth of applications. Large corporations are rushing to realize the potential payoffs of data mining, both in the data itself and in marketing their proprietary tools.

In "A Data Miner's Tools," Karen Watterson explains the three categories of software used for data mining. Query-and-reporting tools, now vastly simplified and easier to use, require close human direction and data laid out in databases or other special formats. Multidimensional analysis (MDA) tools demand less human guidance but still need data in special forms. Intelligent agents are virtually autonomous, can make their own observations and draw their own conclusions, and can handle data as free-form as paragraphs of text.

"Data Mining Dynamite" by Cheryl D. Krivda shows how to facilitate the data-mining process. Data is handled far faster after it has been cleansed of unnecessary fields and stored in more convenient forms. Housing data in data warehouses reduces the load on production mainframes and supports client/server analysis. Parallel computing speeds the search process with multiple simultaneous queries. And any activity handling this volume of data requires consideration of physical storage options.

In the short term, data mining will pay off in profitable, if mundane, business results. Micro-marketing campaigns will explore new niches. Advertising will target potential customers with new precision.

In the not-too-long term, data mining may become as common and easy to use as E-mail. We may direct our tools to find the best airfare to the Grand Canyon, root out a phone number for a long-lost classmate, or find the best prices on lawn mowers. The software will figure out where to look, how to evaluate what it finds, and when to quit. Our knowledge helpers may become as indispensable as the telephone.

But it's the long-term prospects of data mining that are truly breathtaking. Imagine intelligent agents being turned loose on medical-research data or on subatomic-particle information. Computers may reveal new treatments for diseases or new insights into the nature of the universe. We may well see the day when the Nobel Prize for a great discovery is awarded to a search algorithm.


Copyright © 2005 CMP Media LLC