Clean Data In
May 1997
"Garbage in, garbage out ...," the opening sentence of "Take Your Data to the Cleaners" (January State of the Art), is somewhat ironic. The article certainly addresses what to do with "dirty" data, yet it doesn't address the "garbage in" part of the equation. If you're going to invest effort in cleaning your data, you should also control how you enter data in the first place. If your workers enter months as Jan., January, J, and 01, you clearly need to standardize input procedures. Training helps, but automating quality control might yield better results: Use formatted data fields with built-in error checking, pi
ck data from pick lists, verify data with lookup tables, and post-process inputs before committing them to the database. Controlling the "in" part of the equation makes the data more immediately usable, might increase productivity, and might even revea
l some properties of your data that you never suspected.
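As a rough illustration of the lookup-table idea, here is a minimal Python sketch of normalizing month entries before they reach the database. The names MONTH_LOOKUP and normalize_month are hypothetical, chosen only for this example, and the set of accepted spellings is an assumption rather than a rule from the article. It assumes the default English month names from the standard calendar module.

```python
import calendar

# Hypothetical lookup table: every accepted spelling maps to a month number.
# Assumes the default (English) locale for calendar's month names.
MONTH_LOOKUP = {}
for number in range(1, 13):
    full = calendar.month_name[number].lower()   # e.g. "january"
    abbr = calendar.month_abbr[number].lower()   # e.g. "jan"
    for key in (full, abbr, str(number), f"{number:02d}"):
        MONTH_LOOKUP[key] = number


def normalize_month(raw):
    """Return the month number (1-12) for a raw entry.

    Raises ValueError for anything not in the lookup table, so that
    ambiguous entries such as "J" are rejected instead of guessed at.
    """
    # Post-process the input: trim spaces, drop a trailing period, fold case.
    key = raw.strip().rstrip(".").lower()
    if key in MONTH_LOOKUP:
        return MONTH_LOOKUP[key]
    raise ValueError(f"unrecognized month: {raw!r}")
```

With this sketch, "Jan.", "January", and "01" all resolve to the same value, while "J" is refused at entry time because it could mean January, June, or July. Rejecting it there, rather than guessing later, is the point of controlling the "in" side.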
Geoff Hart
geoff-h@mtl.feric.ca