footprint

Look at what people do with data mining, trace their footprint and learn something new...


Long Cases (where people painstakingly record every detail of their application)


Case ID | Title | Description | Sponsor/industry | Maturity | Techniques | Synopsis | Source
1 | Used car distribution system | Automation of used car distribution to auction sites to maximize profit | A large US car manufacturer | Production | Prediction, optimization, active learning | |
2 | Predict aircraft component replacement | Predict when to replace an aircraft component | Commercial airline (Air Canada) | Prototype | Supervised learning (KNN, C4.5, Naïve Bayes) | |
3 | Early warning system for vehicle-related quality data | Detect abnormal repair patterns, forecast future quality issues | DaimlerChrysler AG | Test operation | Probability modeling, anomaly detection | |
4 | Trading surveillance | Detect anomalous trading | One of the biggest banks in the US | Prototype | Ensemble of classifiers | |
5 | Mining the software change repository | Learn relationships among software entities from historic change records | Software maintenance | Prototype | Classification | |
6 | Data mining for discovering insurance risk | Risk group characterization by rule learning | Insurance | Test operation | Rule induction | |
7 | Churn analysis / lifetime value estimation | Profile and predict the consumers most likely to churn | Subscription-based business | Varies | Classification, survival analysis, etc. | |
8 | More cases
9 | More cases







Short Cases (a.k.a. anecdotes, which appear frequently as "success stories" in the marketing material of commercial tools)


Predictive Modelling in Action: Educational Marketing
A comparatively new application, but essentially the same framework as CRM...

Pets Surveillance
You are kidding... Surveillance is now such a popular application that even pets can't escape.


Find patterns and trends in unsolved criminal cases
It does help the police improve efficiency.


Growing Use of Data Mining in Businesses and Government

Data mining technology, which can scan unstructured data to identify trends or link users, and generate reports that answer analytical questions...





Educational Marketing



Noel-Levitz is an admissions consulting company established in 1973. It provides recruitment management services to universities to help them achieve goals in improved enrollment, increased student body diversity, and net operating revenue. Recruitment is an expensive process, and targeted recruitment is an opportunity to increase efficiency. One of its most successful offerings is ForecastPlus, an application developed on SAS Enterprise Miner. It builds a predictive model from enrollment factors such as location, academic ability, and college major. The model outputs a relative score that predicts which students are the right fit and how likely they are to enroll at that school. According to the story, the model often identifies students who are up to five times more likely than average to enroll at that school. In addition, the solution provides a web-based interface that enables schools to input their data, manipulate it, and submit it for scoring. This is an attractive feature for schools, as it removes the need to maintain their data stores and models in house.
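The "five times more likely than average" figure is what is usually called lift: the enrollment rate among the top-scored prospects divided by the overall enrollment rate. Here is a minimal sketch of that calculation, assuming scores and outcomes are already available. The function name lift_at_top and the sample data are hypothetical and are not taken from ForecastPlus.

```python
from typing import List, Tuple


def lift_at_top(scored: List[Tuple[float, bool]], fraction: float) -> float:
    """Enrollment rate in the top-scored fraction divided by the overall rate."""
    if not scored:
        raise ValueError("no scored prospects")
    # Rank prospects from highest to lowest model score.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(enrolled for _, enrolled in ranked[:top_n]) / top_n
    base_rate = sum(enrolled for _, enrolled in ranked) / len(ranked)
    if base_rate == 0:
        raise ValueError("no enrollments in the data")
    return top_rate / base_rate


# Hypothetical (score, enrolled) pairs; higher score means a better predicted fit.
prospects = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False),
    (0.50, False), (0.40, False), (0.30, False), (0.20, False), (0.10, False),
]

lift = lift_at_top(prospects, 0.2)
print(f"Lift in the top 20% of scores: {lift:.2f}")
```

A lift well above 1 in the top-scored group is what would justify concentrating recruitment spending on that group.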

(I call it an "outsourced data mining service". It is a well-justified business model: when the model is generic, the cost of model development can be shared by multiple clients to achieve economies of scale. It targets the market where a client's data mining needs are well defined but infrequent, so hosting a suite of data mining applications in house is not viable.)


Full Story


Pets Surveillance


Researchers at Purdue University's School of Veterinary Medicine are developing a pet surveillance system. The system uses a national pet database to perform spatial and temporal analysis with SAS to identify unexpected clusters of disease that could result from terrorism or chemical exposure. As dogs and cats are sensitive to four of the six class A biological agents identified by DHS, the researchers hope to identify disease quickly. The database has records of 160 million dogs and cats (60,000 weekly). The system would analyze information on 10 million office visits from seven tables that deal with pet demographic variables, physical examinations, lab tests, treatments, diagnoses, visits, and free-text medical notes. In addition, the system integrates with ArcGIS for mapping. Currently, they work on retrospective data; eventually they plan to have a web-based portal available to veterinarians and officials across the country. The system has also been adapted for pharmacovigilance: monitoring vaccine- and drug-associated adverse events and cancer occurrence among pets. The project is a collaborative effort among the university, CDC, DHS, and FDA.
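The story does not say which statistical method the Purdue system uses. As an illustration only, one of the simplest forms of temporal surveillance compares the latest weekly case count in each region against that region's own history and flags large deviations. The region names, counts, and threshold below are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def flag_unusual_regions(
    weekly_counts: Dict[str, List[int]], threshold: float = 3.0
) -> List[str]:
    """Flag regions whose latest weekly count is far above their own history."""
    flagged = []
    for region, counts in weekly_counts.items():
        history, latest = counts[:-1], counts[-1]
        if len(history) < 2:
            continue  # not enough history to estimate a baseline
        baseline = mean(history)
        # Fall back to 1.0 when the history is flat, to avoid dividing by zero.
        spread = pstdev(history) or 1.0
        if (latest - baseline) / spread > threshold:
            flagged.append(region)
    return flagged


# Hypothetical weekly counts of one diagnosis per region; the last entry is the current week.
counts = {
    "region_a": [4, 5, 3, 4, 5, 4, 15],
    "region_b": [10, 12, 9, 11, 10, 12, 11],
    "region_c": [0, 0, 0, 0, 0, 0, 1],
}

flagged = flag_unusual_regions(counts)
print(flagged)
```

A production system would also need to account for seasonality, reporting delays, and spatial adjacency between regions, none of which this sketch attempts.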


(What a serious impact pets are having on our lives! Somewhat amusing, but a reasonable effort. Even though pets won't seek medical help by themselves, the people who love cats will.)


Full Story



Criminal Cases


The West Midlands, U.K. police department works to identify key case patterns and trends in the hope of solving old cases and identifying new criminal behavioral patterns. The data mining models built with Clementine help them match unsolved cases with known perpetrators and target and catch repeat offenders.


Each West Midlands electronic case file contains physical descriptions of the thieves as well as their modus operandi (MO, used in police work to describe a criminal's characteristic patterns and style of work). While many cases lacking evidence were filed away, the department is now re-examining them. An analyst uses two Kohonen networks to cluster similar physical descriptions and similar MOs. The analyst then combines the clusters to see whether groups of similar physical descriptions coincide with groups of similar MOs. If there is a good match, and perpetrators are known for one or more of the offenses, it is possible the unsolved cases were committed by the same individuals. The analytical team further investigates the clusters, using statistical methods to verify the importance of the similarities. If clusters indicate the same criminal might be at work, the department is likely to re-open and investigate the other crimes. Or, if the criminal is unknown but a large cluster indicates the same offender, the leads from these cases can be combined and the case reprioritized. They are also investigating the behavior of prolific repeat offenders, with the goal of identifying crimes that seem to fit their behavioral pattern, hoping to discover unexpected connections to known perpetrators.
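The two-network idea can be sketched in a few lines. The code below is not the Clementine model from the story; it is a tiny one-dimensional Kohonen map written from scratch, and the numeric encodings of descriptions and MOs are invented for illustration. It trains one map on description vectors and one on MO vectors, then cross-tabulates the two cluster assignments.

```python
import math
import random
from collections import Counter
from typing import List


def nearest_unit(units: List[List[float]], x: List[float]) -> int:
    """Index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(units)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(units[i], x)),
    )


def train_som(data: List[List[float]], n_units: int,
              epochs: int = 50, seed: int = 0) -> List[List[float]]:
    """Train a small one-dimensional Kohonen map and return its unit weights."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        progress = epoch / epochs
        rate = 0.5 * (1.0 - progress)  # learning rate decays over time
        radius = max(n_units / 2.0 * (1.0 - progress), 0.5)  # neighborhood shrinks
        for x in rng.sample(data, len(data)):
            winner = nearest_unit(units, x)
            for i, unit in enumerate(units):
                # Units near the winner on the map move more than distant ones.
                influence = math.exp(-((i - winner) ** 2) / (2.0 * radius ** 2))
                for d in range(len(unit)):
                    unit[d] += rate * influence * (x[d] - unit[d])
    return units


# Hypothetical numeric encodings, one row per case (same case order in both lists).
descriptions = [[1.0, 0.0, 0.2], [1.0, 0.0, 0.2], [0.9, 0.1, 0.3],
                [0.0, 1.0, 0.8], [0.1, 0.9, 0.9]]
mos = [[1.0, 0.0], [1.0, 0.0], [0.9, 0.2], [0.0, 1.0], [0.1, 1.0]]

desc_map = train_som(descriptions, n_units=2)
mo_map = train_som(mos, n_units=2)
desc_labels = [nearest_unit(desc_map, d) for d in descriptions]
mo_labels = [nearest_unit(mo_map, m) for m in mos]

# Count how many cases fall into each (description cluster, MO cluster) pair.
table = Counter(zip(desc_labels, mo_labels))
for (desc_cluster, mo_cluster), n_cases in sorted(table.items()):
    print(f"description cluster {desc_cluster}, MO cluster {mo_cluster}: {n_cases} cases")
```

Cells of the cross-tabulation with many cases are the candidates worth a closer look: they group cases that resemble each other in both physical description and MO.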


Full Story


Growing Use of Data Mining in Businesses and Government

Eastman Kodak uses the technology to identify connections in its own and competitors' patent filings, and Mayo Clinic researchers scan doctors' notes to evaluate various treatments, the Seattle Times reports.

Companies backed by the CIA include Attensity and Intelliseek. Intelliseek scans web logs and e-mail list servers and has recently partnered with Factiva, which scans media reports, to offer “reputation insight”.
 
eAnalytics Portal: Programs that can simplify the implementation of data mining include FRx Software, which can connect directly to the user’s general ledger or practice management system without requiring the user to find someone to provide the connection, and Crystal Reports, which works with the XBRL standard language and tags data, enabling data mining programs to work more easily.

The Internal Revenue Service (IRS) uses the Reveal system to detect patterns of criminal activity, analyze intelligence and detect terrorist activities. Installed in February of this year, the Reveal system can query data from multiple sources.