Posts

Showing posts from August, 2012

Market Basket Analysis

Affinity analysis is a data analysis and data mining technique that discovers co-occurrence relationships among activities performed by (or recorded about) specific individuals or groups. In general, it can be applied to any process where agents can be uniquely identified and information about their activities can be recorded. In retail, affinity analysis is used to perform market basket analysis, in which retailers seek to understand the purchase behavior of customers. This information can then be used for cross-selling and up-selling, as well as for shaping sales promotions, loyalty programs, store design, and discount plans [Source: Wikipedia]. In today's post, we use fictitious data from a supermarket describing items bought together by a set of individuals, along with personal data that can be acquired through a loyalty scheme. We will use SPSS Modeler to identify relationships between items bought together so we can understand which items are typically
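To make the idea of co-occurrence concrete, here is a minimal sketch in plain Python rather than SPSS Modeler. It counts how often pairs of items appear in the same basket and derives the usual support, confidence, and lift figures for each pair. The function name `pair_rules` and the four baskets are made up for illustration; they are not the supermarket data set used in the post.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets):
    """Return {(lhs, rhs): (support, confidence, lift)} for every item pair."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), together in pair_counts.items():
        support = together / n  # share of baskets containing both items
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: share of baskets with lhs that also contain rhs
            confidence = together / item_counts[lhs]
            # lift: confidence relative to how common rhs is overall
            lift = confidence / (item_counts[rhs] / n)
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules


# Hypothetical baskets, invented for this example only.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
]
rules = pair_rules(baskets)
support, confidence, lift = rules[("butter", "bread")]
print(f"butter -> bread: support={support:.2f}, "
      f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items turn up together more often than their individual popularity would predict, which is the kind of relationship market basket analysis looks for.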

Predicting Telco Churn using Binomial Logistic Regression

In today's post, we will use a sample data set from a fictitious telecommunications company with the objective of predicting customer churn. Churn refers to the loss of customers to another company. A large percentage of the subscribers who sign up with a new wireless carrier every year come from another wireless provider and hence are already churners. In most telecom markets, it costs hundreds of dollars to acquire a new customer. So when a customer leaves, the company loses not only the future revenue from that customer but also the resources spent on acquiring them in the first place. Given this background, the ability to identify potential churners before they leave, and to make them an offer that gets them to stay, is extremely valuable. Let us see how we can do this using a Binomial Logistic Regression model in SPSS Modeler. To start with, we take our sample data set from a fictitious telco. The data contains 42 fields that i
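For readers who want to see what a binomial logistic regression does under the hood, here is a rough sketch in plain Python, not SPSS Modeler. It fits a one-predictor model by gradient descent and turns a predictor value into a churn probability. The single predictor (a tenure score scaled to the 0-1 range), the eight data points, and the names `fit_logistic` and `churn_probability` are all assumptions made up for illustration; the post's data set has many more fields, and SPSS Modeler uses its own estimation procedure.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, epochs=5000):
    """Fit p(churn) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted minus actual
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical toy data: scaled tenure and whether the customer churned (1) or stayed (0).
tenure = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
churned = [1, 1, 1, 0, 1, 0, 0, 0]

w, b = fit_logistic(tenure, churned)


def churn_probability(x):
    """Predicted probability of churn for a given scaled tenure."""
    return sigmoid(w * x + b)


print(f"short tenure: {churn_probability(0.1):.2f}, "
      f"long tenure: {churn_probability(0.9):.2f}")
```

In this toy data the fitted weight comes out negative, so customers with shorter tenure get a higher predicted churn probability. Those are the customers a retention offer would target first.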