
Comparison of Data Mining Techniques for Insurance Claim Prediction

This thesis investigates how data mining algorithms can be used to predict Bodily Injury Liability Insurance claim payments based on the characteristics of the insured customer’s vehicle. The algorithms are tested on real data provided by the organizer of a data mining competition. The data present a number of challenges, such as high dimensionality, heterogeneity and missing variables. The problem is addressed using a combination of regression, dimensionality reduction, and classification techniques.

Chapter 1
Introduction

With the increasing power of computer technology, companies and institutions can now store large amounts of data at reduced cost. The amount of available data is increasing exponentially, and cheap disk storage makes it easy to keep data that previously was thrown away. There is a huge amount of information locked up in databases that is potentially important but has not yet been explored. The growing size and complexity of databases make it hard to analyze the data manually, so there is a need for computational tools that can process these large amounts of data and extract valuable information.

In this context, Data Mining provides automated systems capable of processing the large amounts of data already present in databases. Data Mining is used to automatically extract important patterns and trends from databases, seeking regularities that can reveal the structure of the data and answer business problems. It includes learning techniques that fall into the field of Machine Learning. The growth of databases in recent years has brought data mining to the forefront of new business technologies [WF05].

To apply and develop new research ideas, data scientists need large quantities of data. Most of the time, however, valuable business data is not freely available, so a data expert cannot always get access to real data. Competitions are usually the occasion for data miners to access real business data and compete with other people to find the best technique to apply to it. The Kaggle website is a web platform where companies can post their data and have it scrutinized by data scientists. In this way, data experts can work with real datasets and solve problems, with the chance to win a prize offered by the company.

Degree: second-level (specialist) degree thesis

Faculty: Statistical Sciences

Author: Andrea Dal Pozzolo

Length: 81 pages.


This thesis has received 172 clicks since 24/01/2012.



Available as a PDF; the thesis can be consulted in digital format only.