CIO
Sensible behaviours for nonsensical data
Sue Bushell  07 December, 2004 13:24:18

Data quality is critical to the success of any enterprise application. Systems from business intelligence to customer relationship management are destined to fail without high-quality data - the "garbage in, garbage out" theory.

George Preston scratched his head as he contemplated the road accident data from NSW. Something about it just did not make sense . . .

A director of software company Prometheus Information, Preston was preparing some educational and promotional material for the HealthWiz product his company produces for the Commonwealth Department of Health and Ageing. The subject of road accidents seemed to offer fertile ground, so he had started with an age-standardized comparison of hospitalization rates across all states. Next, with no particularly surprising comparisons standing out, he had moved on to prepare an age breakdown, expecting - as all anecdotal evidence leads us to believe - to find a peak in rates for young drivers. Yet while all other states were living up to those expectations, NSW was a distinctly different case.

Could there be something about the NSW P-plate system that explained why drivers in their 20s were hospitalized at half the rate that they are in other states? Perhaps, but this would not explain why elderly people and infants had hospitalization rates three to five times higher. Puzzled, Preston then turned to the geographic distribution of the rates for males and females, finding that the low hospitalization rates were a NSW-wide phenomenon applying to both sexes.

Concluding there must be something systematically different about how the data is collected or compiled in NSW, Preston tried further analysis, hoping it would reveal whether the difference was systematic across the whole of NSW, and whether there might be some obvious reason for it. There was not. Now, with nothing obvious standing out, he plans to ask NSW Health whether it can explain the difference.

"I'm working on trying to determine how much I want to rely on this information, so I'm looking for a measure of confidence of some kind," Preston says. "The data has got to have an internal consistency, and it should line up with known information, so that there's kind of external validity as well.

"How one uses that information really depends on whether you've got some kind of coherent explanation for how the information is actually being generated. If you don't have some mental model in your head, it's generally pretty hard to use information, I think. The really important point is that usually you're looking for some kind of signal in the midst of noise. You need to try and get a handle on what the difference is between the noise and the signal."

In some cases a statistical test can help distinguish signal from noise; in others the organization can adjust the data for known causal factors. What counts is putting all the data on a comparable basis.
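As a rough illustration of both ideas, the sketch below shows one common way to put rates on a comparable basis (direct age standardization) and one simple test for whether two rates differ by more than chance (a two-proportion z-test). The article does not say which methods Preston or HealthWiz actually use; the function names, age groups and figures here are assumptions made up for the example.

```python
import math


def age_standardized_rate(events_by_age, pop_by_age, standard_pop_by_age):
    """Direct age standardization: weight each age group's rate by a
    shared standard population so regions can be compared fairly."""
    total_standard = sum(standard_pop_by_age.values())
    weighted = sum(
        (events_by_age[group] / pop_by_age[group]) * standard_pop_by_age[group]
        for group in standard_pop_by_age
    )
    return weighted / total_standard


def rate_difference_z_test(events_a, pop_a, events_b, pop_b):
    """Two-proportion z-test. Returns (z, two-sided p-value) for the
    hypothesis that both populations share the same underlying rate."""
    rate_a = events_a / pop_a
    rate_b = events_b / pop_b
    pooled = (events_a + events_b) / (pop_a + pop_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / pop_a + 1 / pop_b))
    z = (rate_a - rate_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


if __name__ == "__main__":
    # Invented counts, purely illustrative: not real hospitalization data.
    z, p = rate_difference_z_test(100, 100_000, 200, 100_000)
    print(f"z = {z:.2f}, p = {p:.2g}")
```

A small p-value only says the gap is unlikely to be noise. It does not say whether the cause is real-world behaviour or, as Preston suspects in the NSW case, a difference in how the data was collected.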

Ultimately, it is a matter of applying common sense, Preston says. "The important thing is to actually pick up on the signal - the presence of the signal - and then you can work harder to try and get a better handle on what the signal is so you can focus your efforts around that aspect of the data quality, without worrying about the rest of it."

To alert users to how reliable its data is, the HealthWiz team has developed a warning system that can be set to trigger for any specific value, variable or category in the data. These warnings are authored in consultation with the data custodian who supplied the collection in question.
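The article gives no detail on how the HealthWiz warnings are implemented. Purely as a sketch of the idea, a warning rule could name the variable, category or value it applies to, with any field left blank acting as a wildcard. Every name and message below is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataWarning:
    """A custodian-authored caveat. A field left as None matches anything."""
    message: str
    variable: Optional[str] = None   # e.g. "hospitalization_rate"
    category: Optional[str] = None   # e.g. a state or an age group
    value: Optional[float] = None    # an exact value that should be flagged

    def applies_to(self, variable: str, category: str, value: float) -> bool:
        # The rule fires only if every field it specifies matches the data point.
        return all(
            expected is None or expected == actual
            for expected, actual in (
                (self.variable, variable),
                (self.category, category),
                (self.value, value),
            )
        )


def warnings_for(rules: List[DataWarning], variable: str,
                 category: str, value: float) -> List[str]:
    """Collect the messages of every rule triggered by one data point."""
    return [r.message for r in rules if r.applies_to(variable, category, value)]


# Hypothetical rules, invented for illustration only.
RULES = [
    DataWarning("Collection method differs; compare with caution.",
                variable="hospitalization_rate", category="NSW"),
    DataWarning("Value -1 is a placeholder for missing data.", value=-1.0),
]

if __name__ == "__main__":
    print(warnings_for(RULES, "hospitalization_rate", "NSW", 12.3))
```

Keeping the rules as plain data rather than code would let a data custodian review and amend them without touching the application itself.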

"I think in general you should be able to alert users to parts of the data that are stronger and weaker," Preston says.




