Greetings! Welcome to Venky Rao's blog on Predictive Analytics, Geospatial Analytics and Visualization. This blog aims to present interesting analysis of geospatial data and to de-mystify predictive analytics for the layman. My blog is featured on: http://www.kdnuggets.com/ - Analytics and Data Mining Resources.
I created a web app that mapped recent earthquakes around the world based on data from the US Geological Survey (http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_week.csv). To understand the potential impact of these earthquakes, I added a layer that showed the population of the world's major cities (Esri's World's Cities layer). Some countries with large populations (China, India, Indonesia) lie in earthquake-prone areas. Some less populated countries (e.g. New Zealand) are also in earthquake-prone areas. The other observation is that there is significant and regular seismic activity in the Indian and Pacific Oceans, especially when compared to the Atlantic Ocean. Here is the web map that I created:
I also created an interactive web app based on this data. To view the web app in all its glory, go here: http://arcg.is/2llj9Qd
Lift and Gain Charts are a useful way of visualizing how good a predictive model is. In SPSS, a typical gain chart appears as follows:
In today's post, we will attempt to understand the logic behind generating a gain chart and then discuss how gain and lift charts are interpreted.
To do this, we will use the example of a direct mailing company. Let us assume that based on experience, the company knows that the average response rate on its direct mail campaigns is 10%. Let us further make the following assumptions:
* Cost per ad mailed = $1
* Return per response = $50
Additionally, let us assume that the company mails out ads in lots of 10,000. Based on these assumptions, if the company mails out 100,000 ads, a table summarizing the results it would obtain from this campaign is provided below:
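The baseline figures follow directly from the assumptions above. Here is a minimal Python sketch that recomputes them lot by lot, assuming responses are spread evenly across lots at the 10% average rate. The function name, parameter names and column names are my own illustrative choices and are not taken from SPSS:

```python
def baseline_campaign(total_ads=100_000, lot_size=10_000,
                      response_rate=0.10, cost_per_ad=1,
                      return_per_response=50):
    """Cumulative results of a mailing with no predictive model.

    Assumption: every lot responds at the same average rate, so each
    additional lot adds the same number of responses.
    """
    total_responses = round(total_ads * response_rate)
    rows = []
    for lot in range(1, total_ads // lot_size + 1):
        ads = lot * lot_size                      # cumulative ads mailed
        responses = round(ads * response_rate)    # cumulative responses
        cost = ads * cost_per_ad                  # cumulative mailing cost
        revenue = responses * return_per_response
        rows.append({
            "lot": lot,
            "ads_mailed": ads,
            "responses": responses,
            "cost": cost,
            "revenue": revenue,
            "profit": revenue - cost,
            # share of all responders reached so far (the baseline gain)
            "pct_of_responses": 100 * responses / total_responses,
        })
    return rows


if __name__ == "__main__":
    for row in baseline_campaign():
        print(row)
```

Without a model, each additional lot of 10,000 ads reaches another 10% of the responders, so the cumulative share of responses grows in a straight line. This is the diagonal baseline against which a gain chart is read.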
Now let us assume that the company uses SPSS Modeler to develop a predictive model using data from previous campaigns. "Response / No Response" is identified as the "target" fie…
In today's post, we discuss how to create a time series forecast using IBM SPSS Modeler. For the purposes of our exercise, we will use historical sales data at a SKU (stock keeping unit) level. This data is provided in a Microsoft Excel .xlsx file and must be in the following format:
In the image above, AAAAA through EEEEE are SKU numbers with the relevant monthly sales data provided in the respective columns. There is also a column that indicates the grand total of all SKUs sold in a month (AAAAA +...+ EEEEE + other SKUs not shown in the image above). The last column in the image above reflects the months for which historical sales data are provided.
As with any modeling exercise, we first insert a source node into the modeling canvas. Since our data is in the Microsoft Excel .xlsx file format, we insert an Excel source node as follows:
On exporting to a Table node, we see the output display as follows:
We then add a Filter node to select the five SKUs that we will be creati…
Using ArcGIS Online and some simple instructions from Dr. Pinde Fu of Esri, I re-created a simple web app for selecting restaurant locations in the USA. This web app allows users to choose between two competing locations for opening a full service restaurant, using analytics capabilities such as drive time based on historical traffic patterns and the latest demographic information for each location, including population, disposable income, etc.
Here is a screenshot of the results of the analysis of the service area for one of the locations, based on a 15-minute drive time under the usual traffic at 6 pm on a Friday: