Statoo Consulting
Statistical Consulting + Data Analysis + Data Mining Services
Switzerland


What is Data Mining?

`We are drowning in information but starved for knowledge.'
John Naisbitt



Data mining is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns or structures or models or trends or relationships in data to make crucial decisions.

What is meant by these terms?
  • `Non-trivial': it is not a straightforward computation of predefined quantities like computing the average value of a set of numbers.
  • `Valid': the patterns hold in general, i.e. they remain valid on new data in the face of uncertainty.
  • `Novel': the patterns were not known beforehand.
  • `Potentially useful': the patterns lead to some benefit to the user.
  • `Understandable': the patterns are interpretable and comprehensible.
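
The `valid' criterion can be illustrated with a small sketch. It is not a method taken from this page: the data are synthetic, and the helper names `fit_line` and `mean_squared_error` are hypothetical. A simple pattern (a straight line) is learned on one half of the data and then checked on the other half, which plays the role of new data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between observed and predicted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

# Learn the pattern on the first half, keep the second half as `new' data.
train_x, test_x = xs[:100], xs[100:]
train_y, test_y = ys[:100], ys[100:]
slope, intercept = fit_line(train_x, train_y)

train_mse = mean_squared_error(train_x, train_y, slope, intercept)
test_mse = mean_squared_error(test_x, test_y, slope, intercept)

# Baseline: ignore x and always predict the training mean of y.
baseline_mse = mean_squared_error(test_x, test_y, 0.0, sum(train_y) / len(train_y))

print(f"train MSE {train_mse:.2f}, held-out MSE {test_mse:.2f}, baseline {baseline_mse:.2f}")
```

If the held-out error stays close to the training error and well below the baseline, the pattern can reasonably be called valid in the sense above. A large gap would suggest that it only describes the data it was found in.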

Is data mining `statistical déjà vu'?

Statistics is the science of learning from data or turning data into knowledge.

Like statistical thinking and statistics, data mining is not only modelling and prediction, nor a product that can be bought, but a whole iterative problem-solving cycle/process that must be mastered through team effort.

`Coming together is a beginning. Keeping together is progress. Working together is success.'
Henry Ford


What distinguishes data mining from statistics?

Statistics traditionally is concerned with analysing primary (e.g. experimental) data that have been collected to check specific hypotheses (ideas). As such, statistics is `primary data analysis', top-down (confirmatory) analysis or `hypothesis evaluation or testing'.

Data mining, on the other hand, typically is concerned with analysing secondary (e.g. observational) data that have been collected for other reasons. As such, data mining is `secondary data analysis', bottom-up (exploratory) analysis, `hypothesis generation' or `knowledge discovery'.

The two approaches of learning from data or turning data into knowledge are complementary.
  • The information obtained from a bottom-up analysis, which identifies important relations and tendencies, cannot explain why these discoveries are useful and to what extent they are valid. The confirmatory tools of top-down analysis can be used to confirm the discoveries and evaluate the quality of decisions based on those discoveries.
  • Performing a top-down analysis, we think up possible explanations for the observed behaviour and let those hypotheses dictate the data to be analysed. Then, performing a bottom-up analysis, we let the data suggest new hypotheses to test.
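
The interplay of the two approaches can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, the helpers `pearson` and `permutation_p_value` are hypothetical names, and a permutation test stands in for whatever confirmatory tool would suit a real project. One half of the data is explored bottom-up to generate a hypothesis, and the other half is used top-down to test that single hypothesis.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, rng, n_perm=1000):
    """Share of random reshufflings of ys with a correlation at least as strong."""
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic data (an assumption): five candidate variables, only one related to the outcome.
rng = random.Random(1)
n_rows, n_vars = 200, 5
columns = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_vars)]
outcome = [0.8 * columns[2][i] + rng.gauss(0, 1) for i in range(n_rows)]

half = n_rows // 2

# Bottom-up (exploratory): let the first half of the data suggest a hypothesis.
candidate = max(
    range(n_vars),
    key=lambda j: abs(pearson(columns[j][:half], outcome[:half])),
)

# Top-down (confirmatory): test that one hypothesis on the untouched second half.
p_value = permutation_p_value(columns[candidate][half:], outcome[half:], rng)

print(f"hypothesis: variable {candidate} is related to the outcome, p = {p_value:.3f}")
```

Keeping the confirmation data separate matters here: testing the hypothesis on the same rows that suggested it would overstate the evidence, because the strongest of several candidate correlations was picked deliberately.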

Want to know more about the relation between data mining and statistics? Check out our paper entitled `Is Data Mining for Gold "Statistical déjà vu"?' or additional papers in our `Publications' section.

Want to know more about the relation between bioinformatics, data mining and statistics? Check out our paper entitled `Challenges in Bioinformatics for Statistical Data Miners' or additional papers in our `Publications' section.

Interested in our data mining services? Are you drowning in uncertainty and starving for knowledge? Interested in getting Statooed? Have a question about our data mining services? Contact us so that we can help you.


© 2001-2012 by Statoo Consulting, Switzerland. All rights reserved.
Statoo is a registered trademark of Statoo Consulting.
Last updated on July 04, 2011.
www.statoo.com/en/datamining/