CSIRO brings open source data mining to business
- 15 January, 2004 07:19
With Linux and open source software, CSIRO’s mathematical and information sciences division can now model information for business benefit without relying on proprietary software, according to Dr Graham Williams, principal computer scientist for enterprise data mining.
“Data mining is all about building models of the world which can be used to reason about the world and identify things that can be used for business benefit,” Williams said. “And anytime data mining is about getting an answer now.”
After experiencing commercial data mining software, Dr Williams’ team now uses a variety of open source software running on the Debian GNU/Linux operating system.
“A lot of government departments use SAS for data mining,” he said. “At around $100,000 per seat per year it is a good product but once you get over the ‘woo’ features you hit a brick wall because you can’t customise it.”
For data mining, CSIRO uses a number of “toolkits”, including R, GNOME, and Python scripting.
“By taking the open source option we have data mining software that is free and can be modified,” Williams said. “Commercial software is available but there are quality assurance concerns about correct implementations, additional functionality is required for individual requirements, and who knows if they are going to be around in five years’ time.”
Williams cited the Health Insurance Commission (HIC) and the NRMA as organisations using CSIRO’s open source and custom-developed data mining applications to “identify groups of data according to certain characteristics”.
“Data mining is used at the NRMA for vehicle insurance premium setting which involves analysis of several million transactions annually,” he said. “At the HIC, some patients lodged all their Medicare claims at once creating a regular pattern of fraud. Hot spots are identified which are classified by clusters, rule induction, and then interestingness.”
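The sequence Williams describes (cluster, describe each cluster with a rule, then rank by interestingness) can be illustrated with a minimal Python sketch. It is not CSIRO's implementation. The data, the function names `kmeans_1d` and `hot_spots`, the use of a simple value range as a stand-in for rule induction, and the interestingness score (distance of a cluster's mean from the overall mean, in standard deviations) are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k, iterations=20):
    """Tiny 1-D k-means (illustrative only, not a production clusterer)."""
    ordered = sorted(values)
    # Deterministic initial centres: evenly spaced picks from the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Keep the previous centre if a cluster ends up empty.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]


def hot_spots(values, k=2):
    """Cluster, describe each cluster by a range 'rule', rank by interestingness."""
    overall_mean, overall_sd = mean(values), pstdev(values)
    spots = []
    for cluster in kmeans_1d(values, k):
        # Stand-in for rule induction: the value range that covers the cluster.
        rule = (min(cluster), max(cluster))
        # Assumed interestingness measure: how unusual the cluster is overall.
        score = abs(mean(cluster) - overall_mean) / overall_sd if overall_sd else 0.0
        spots.append({"rule": rule, "size": len(cluster), "interestingness": score})
    return sorted(spots, key=lambda s: s["interestingness"], reverse=True)


# Hypothetical data: number of claims lodged per visit, one value per patient.
claims_per_lodgement = [1, 2, 1, 1, 2, 1, 3, 2, 1, 2, 40, 38, 45]
for spot in hot_spots(claims_per_lodgement):
    print(spot)
```

With this made-up data, the small group of patients lodging dozens of claims at once is ranked first. In practice the clustering, rule induction and scoring would each be far richer than this sketch.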
CSIRO is working on its data mining activities with the Department of Health and Ageing’s research group, which has a “secure data mining facility”.
“The Department of Health and Ageing has a 200 CPU cluster running Debian Linux,” Williams said. “Debian is a stable server operating system that is easy to maintain and we also use it on desktops.”