How-to: RFM Analysis using Excel
Here’s a step-by-step tutorial to slicing-and-dicing your structured transaction data to uncover new customer segmentations using Microsoft Excel
I won’t bore you with the benefits of an RFM analysis.
You’re here because you understand the benefits of running an RFM analysis to weed out new customer segmentations you didn’t know existed. You want to drive greater personalization in your campaigns with these segmentations.
The following is a simple way of running an RFM analysis using structured customer transaction data from your CRM (or otherwise) and Microsoft Excel.
You’ll definitely want to use a more robust data tool if your datasets are large, but this is at least one way (of many) to build some very compelling lists for campaigns or an asset to use to build buy-in for marketing technology spend.
E-Commerce types sitting on a lot of customer transaction data will find value in this step-by-step guide to running an RFM analysis that has the potential to unlock very valuable (and neglected) customer segments.
Here’s how to do it, hopefully explained at the appropriate level of granularity.
1. Gather Customer Transaction Data
You will need transactional data sets tied to unique customer identifiers (email address or customer numbers), typically pulled from CRM and/or POS (point of sale) solutions; the data contained in this transactional data should be relational to unique customer identifiers, for example, connected to a customer ID or email address.
If you are starting with transactional data that can be tied to whatever unique customer identifier is used at your organization, you can easily build your own RFM model to backbone automated engagement scoring or run it to build some intriguing lists for campaigns.
Let us start with the possibility you are pulling customer transaction data that needs to be prepared for manipulation.
2. Prepare Transactional Data for Manipulation
If your data looks like it does above, you will want to pivot and run a formula or two in Microsoft Excel or another capable tool to manipulate the transactional data.
Select the entire data set and create a pivot table in a new tab, and order your table with the following checked for “Field Settings” and “PivotTable Fields,” as shown.
I cheated and did a V-Lookup to pull in the date of the transactions from the original data set referenced in the formula here:
Copy and paste the VALUES from the pivot table into new cells. Be sure to paste the values only – not the formula that outputs the value — just the values.
3. Manipulate Transactional Data in Excel
Using this clean data set (clean of formulas), you will want to pivot again to count, average and do some simple arithmetic, such that your pivot table looks like this (take note of the settings):
The “Count of Order Number” is a weighted FREQUENCY (F) value; “Average of Order Total $” is your weighted MONETARY VALUE (M) value for each Customer #.
Next calculate you’ll calculate a weighted RECENCY (R) value by creating a column of cells containing “Today’s Date” and subtracting the “Today’s Date” column from that of the “Max of Date of Purchase” cell in the pivot previous this one.
“Max of Date of Purchase” of course is the date of the customer’s most recent purchase. Make sure the column is set to ‘short date’. Subtract the “Today’s Date” column from the “Max Date of Purchase” one to get the number of days since last purchase.
4. Begin modeling with analysis of aggregate data
Create a new sheet with the columns in the graphic below, being sure to paste only the values into the cells. You should include the weighted R – Recency, F – Frequency, M – Monetary value numbers in the cells.
Note: These values are not very useful until we normalize them into 1 to 5 scoring dimensions (or whatever scoring dimensions you decide to use).
Copy and paste the R, F, and M column VALUES (such that the cells do not contain formulas) into a new table or sheet, along with the VALUES of the other columns listed in the graphic below. I pasted these into a new sheet – again containing raw values only (no formulas!)
Here’s how your data set should look in Excel:
I calculated averages for each of the RFM dimensions, along with standard deviation and percentage thresholds.
Here’s the actual I used to calculate the percentiles:
5. Analyze Data; Create RFM Scoring Thresholds
Now use this data to create scoring thresholds for your RFM Model. Here’s how mine looked, taking into account the data I calculated in the previous step, using a 1-5 scoring dimensions:
There are so many ways of creating the scoring thresholds for your RFM model, play around with your data some more to find it. Keep in mind first iterations aren’t perfect, but with an orientation toward continued improvement, it can get perfect. There’s also this this thing called Machine Learning.
6. Create Formulas for 1 to 5 RFM Scoring
You’ll want to create formulas such that whatever ‘weighted’ value is entered for R, F, or M is converted into a score 1 to 5, based on the thresholds you created in the previous step.
Here, I create another column called “Test Value”, where one can enter a test value used in a formula for another column we’ll create called “Formula”, that will spit out an RFM score 1-5.
Most critical to building this out in Excel is the formula you use to spit out RFM Scores. Here’s how my formulas look based on the RFM Scoring values in the chart above.
The Excel Formula based on the F-Frequency thresholds above looks like this (assume cell G12 is a test value indicating the number of purchases).
F - Frequency Formula:
=IF(AND(G12=2),1,IF(AND(G12>=3,G12<=5),2,IF(AND(G12>=6,G12<=8),3,IF(AND(G12>=9,G12<=12),4,IF(AND(G12>=13),5,0)))))
If # purchases = 2, the F-score output will be 1
If between 2 and 5, a score of 2
6 and 8 is a score of 3
9 and 12 is a score of 4
13 or more is a score of 5
R - Recency Formula:
M - Monetary Value Formula:
7. Assign RFM Values to Customers
Create a table as shown below, using the formulas you created for the R Score, F Score, and M Score columns. Use the # of orders, Average Order $, and Days Since Last Purchase (the weighted RFM values) as the variable used in the formula fields (R, F, M Score columns).
You’ll want to paste the formulas you created in the last step and update them accordingly.
Now add all the values up into a single RFM score; you can also weigh the dimensions differently according to your needs.
8. Analyze RFM Scoring for Possible Segmentations
Finally, copy and paste this data into a new table to begin analyzing how well the first iteration of your RFM model does weeding out your most valuable customers.
Of course, don’t concentrate just on the highest scores; it’s possible you have loyal but disengaged customers with low RFM scores that might be good targets for a campaign.
In theory, the customers that sort to the top with the highest RFM scores should be your most loyal customers. If you’re a business that’s membership-driven (i.e. paid vs free or member vs non-member), you’ll find plenty of opportunities for campaigns to convert non-members into members like we see here:
Using the inverse line of thinking, one could use the RFM model to find customers that are “Members” or paying-types but disengaged; they may look like this and be good targets for re-engagement campaigns:
Use the form below to download a spreadsheet that demonstrates how the RFM formula in this blog post works!