ShopEase is a fast-growing online retail platform and supermarket chain that offers a wide range of products, including clothing, groceries, sportswear, electronics, and home & garden essentials. Despite its rapid expansion, ShopEase has only scratched the surface of leveraging its customer data for strategic decision-making. With years of data collected from its sales records and customer purchase patterns, ShopEase embarked on a customer segmentation project. This project aims to analyze customer behavior and create actionable segments to personalize marketing campaigns, enhance customer engagement, and drive retention. By clustering customers based on their monetary value and purchase frequency, ShopEase is positioned to optimize its marketing strategies and improve overall customer satisfaction.
- The step-by-step implementation of this project can be seen in the Clustering Notebook.
This is a transactional data set that contains all the transactions occurring between 01/12/2010 and 09/12/2011. Many customers of the company are wholesalers.
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
InvoiceNo | ID | Categorical | a 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation | - | no |
StockCode | ID | Categorical | a 5-digit integral number uniquely assigned to each distinct product | - | no |
Description | Feature | Categorical | product name | - | no |
Quantity | Feature | Integer | the quantities of each product (item) per transaction | - | no |
InvoiceDate | Feature | Date | the day and time when each transaction was generated | - | no |
UnitPrice | Feature | Continuous | product price per unit | sterling | no |
CustomerID | Feature | Categorical | a 5-digit integral number uniquely assigned to each customer | - | no |
Country | Feature | Categorical | the name of the country where each customer resides | - | no |
For the analysis, the transactions are aggregated so that each row (or instance) represents a single customer, including their associated purchase history. For each customer, the aggregated data captures key behavioral information, such as their spending habits and purchase frequency. These attributes are used to understand and analyze customer behavior. The source dataset contains 536,642 rows and is available here.
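As a rough illustration of how transaction lines can be rolled up into one row per customer, here is a minimal pandas sketch. The column names come from the table above; the function name `build_customer_table`, the output column names, and the choice of the latest invoice date as the recency reference point are assumptions made for this example, not necessarily what the notebook does.

```python
import pandas as pd

def build_customer_table(tx: pd.DataFrame) -> pd.DataFrame:
    """Aggregate transaction lines into one row per customer (illustrative)."""
    tx = tx.assign(LineTotal=tx["Quantity"] * tx["UnitPrice"])
    # Assumption: recency is measured from the most recent invoice in the data.
    snapshot = tx["InvoiceDate"].max()
    customers = tx.groupby("CustomerID").agg(
        MonetaryValue=("LineTotal", "sum"),    # total spend
        Frequency=("InvoiceNo", "nunique"),    # number of distinct invoices
        LastPurchase=("InvoiceDate", "max"),   # date of latest purchase
    )
    customers["Recency"] = (snapshot - customers["LastPurchase"]).dt.days
    return customers.drop(columns="LastPurchase")

# Tiny made-up sample, only to show the shape of the result.
transactions = pd.DataFrame({
    "InvoiceNo": ["A1", "A2", "B1", "B1"],
    "CustomerID": [1, 1, 2, 2],
    "Quantity": [2, 1, 3, 1],
    "UnitPrice": [5.0, 10.0, 2.0, 4.0],
    "InvoiceDate": pd.to_datetime(
        ["2010-12-01", "2011-12-09", "2011-11-29", "2011-11-29"]
    ),
})
customer_table = build_customer_table(transactions)
```

Counting distinct invoices rather than rows keeps a multi-item basket from inflating a customer's purchase frequency.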
After data cleaning, 27.1% of the dataset was removed, consisting of transactions that did not represent customer purchases, such as postage charges, commissions, shipping costs, bank charges, and other similar expenses.
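A cleaning step along these lines could be sketched as follows. The exact rules used in the notebook are not stated here, so the filters below are assumptions: cancellations are dropped, only stock codes that look like product codes (five digits, optionally followed by letters) are kept, and rows with non-positive quantity or price are removed. The function name `clean_transactions` and the sample rows are hypothetical.

```python
import pandas as pd

def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows that look like genuine product purchases (illustrative)."""
    invoice = df["InvoiceNo"].astype(str)
    stock = df["StockCode"].astype(str)

    # Cancellations: invoice numbers starting with 'c' or 'C'.
    is_cancellation = invoice.str.upper().str.startswith("C")
    # Assumed product-code pattern: five digits, optional trailing letters.
    # Codes such as postage or bank charges do not match it.
    is_product = stock.str.match(r"^\d{5}[A-Za-z]*$")
    # Assumption: purchases have positive quantity and positive unit price.
    is_positive = (df["Quantity"] > 0) & (df["UnitPrice"] > 0)

    return df[~is_cancellation & is_product & is_positive].copy()

# Made-up rows covering each rule.
sample = pd.DataFrame({
    "InvoiceNo": ["536365", "C536379", "536370", "536544"],
    "StockCode": ["85123A", "22632", "POST", "BANK CHARGES"],
    "Quantity": [6, -1, 3, 1],
    "UnitPrice": [2.55, 1.85, 18.00, 15.00],
})
cleaned = clean_transactions(sample)
removed_share = 1 - len(cleaned) / len(sample)  # fraction of rows dropped
```

Tracking `removed_share` makes it easy to report how much of the data each rule set discards.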
Most customers have a low monetary value (below 50,000), with very few high-spending customers. The presence of outliers at very high monetary values suggests a small group of customers with significant spending power. The distribution is right-skewed: the bulk of customers cluster at the lower end, indicating that most spend relatively little.
The majority of customers have very few purchase occurrences (most are below 10 purchases), with a very small number of customers showing higher purchase frequencies (up to 200). This suggests that most customers are infrequent buyers, while a few might engage in repeated purchases.
The recency plot shows that most customers have made purchases recently (with recency close to 0), but a significant number of customers haven’t purchased in a long time. The distribution is also right-skewed, with most customers concentrated at low recency values (indicating recent activity).
These distributions represent the non-outlier dataset and are typical in RFM (Recency, Frequency, Monetary) analysis, where customer behaviors are often right-skewed because a small portion of customers contributes the majority of revenue, while most customers engage less frequently.
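The text does not say how outliers were separated from the rest of the customers. One common option for right-skewed RFM features is the interquartile range (IQR) rule, sketched below. The 1.5 multiplier, the column names, and the helper names `iqr_outlier_mask` and `split_outliers` are assumptions for illustration only.

```python
import pandas as pd

def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

def split_outliers(rfm: pd.DataFrame, columns=("MonetaryValue", "Frequency")):
    """Return (non_outliers, outliers); a row is an outlier if any column is."""
    mask = pd.Series(False, index=rfm.index)
    for col in columns:
        mask |= iqr_outlier_mask(rfm[col])
    return rfm[~mask], rfm[mask]

# Made-up customers: one extreme spender and one extreme repeat buyer.
rfm = pd.DataFrame(
    {
        "MonetaryValue": [100.0, 200.0, 300.0, 400.0, 90000.0, 250.0],
        "Frequency": [1, 2, 3, 4, 5, 150],
    },
    index=[11, 12, 13, 14, 15, 16],
)
regular, outliers = split_outliers(rfm)
```

Splitting first lets the bulk of customers be clustered without a handful of extreme values dominating the distance calculations.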
The K-Means clustering algorithm segmented customers into eight distinct groups (4 clusters from the non-outlier data and 4 clusters from the outlier data):
Characteristics: Average-spend customers who purchase relatively regularly; most are recent buyers.
Characteristics: Lower-value, infrequent buyers whose last purchase was moderately recent.
Characteristics: Very frequent, high-value customers, most of whom have purchased recently.
Characteristics: Low-value, infrequent buyers, many of whom are not actively purchasing.
Characteristics: Most are average spenders, frequent buyers, and recent purchasers.
Characteristics: Extreme spenders, very frequent buyers, and very recent.
Characteristics: Low spend, high frequency, and not recent buyers.
Characteristics: High spending, high frequency, and very recent buyers.
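For readers who want to reproduce a segmentation of this kind, the sketch below runs K-Means with four clusters on a small synthetic feature matrix using scikit-learn. Standardizing the features first, the `n_init` and `random_state` values, and the function name `cluster_customers` are assumptions; the synthetic data only stands in for the real monetary, frequency, and recency values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_customers(features: np.ndarray, n_clusters: int = 4, seed: int = 42) -> np.ndarray:
    """Scale the features, fit K-Means, and return one cluster label per row."""
    # Assumption: features are standardized so no single scale dominates.
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(scaled)

# Synthetic (monetary, frequency, recency) data: four well-separated groups
# of 25 customers each. These numbers are invented for the demo.
rng = np.random.default_rng(0)
centers = [[100, 2, 10], [100, 20, 10], [2000, 2, 10], [100, 2, 300]]
features = np.vstack(
    [rng.normal(loc=c, scale=[10, 0.5, 3], size=(25, 3)) for c in centers]
)
labels = cluster_customers(features)
```

Under the two-stage setup described above, the same function would be called once on the non-outlier customers and once on the outlier customers.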
The marketing and customer experience teams should consider the following targeted strategies based on the insights gained from the customer segmentation:
Based on the similarities in customer behaviors, merging clusters to optimize marketing strategies is recommended. The merged clusters allow for more streamlined targeting while maintaining distinct approaches for different customer groups. By reducing the number of segments, marketing efforts can be more focused and efficient.
Characteristics: Average spenders, frequent buyers, recent purchasers.
Both clusters represent customers who purchase frequently with average spend levels and recent activity. The behavior of these two segments is closely aligned, making it unnecessary to differentiate them for marketing purposes.
- Personalized Engagement Campaigns: Use data-driven insights to send tailored offers and personalized product recommendations tied to their past shopping behavior to maintain consistent engagement.
- Habit Reinforcement Programs: Implement a loyalty program that rewards frequent purchases with tiered benefits like discounts, early access to new products, or exclusive collections.
- Cross-sell & Upsell Tactics: Leverage their frequent buying habits by suggesting complementary or higher-value products with targeted promotions like "Complete Your Set" or "You Might Also Like."
Characteristics: High to extreme spenders, very frequent buyers, very recent.
Both clusters represent high-value, frequent purchasers with recent activity. Their distinction based on spending levels can be managed with similar high-end marketing strategies focused on exclusivity and VIP treatment.
- High-value VIP Perks: Provide white-glove services like personal shopping consultations, invitations to private events, or early access to high-end collections. Position them as premium members of an exclusive club.
- Exclusive Access & Flash Sales: Offer time-sensitive flash sales and priority access to limited-edition or high-end products to reward their spending and further drive engagement.
- Personalized Thank-Yous & Luxury Gifts: Send personalized thank-you notes or surprise luxury gifts with purchases to strengthen brand loyalty and emotional connection.
Characteristics: Low-spend, infrequent buyers, ranging from recent to dormant.
Both clusters exhibit low-spend and infrequent purchasing patterns, with the primary distinction being the time since their last purchase. Combining them into a single segment allows for a cohesive re-engagement strategy targeting low-value buyers.
- Reactivation & Urgency-driven Campaigns: Run FOMO-based campaigns with strong incentives like time-limited discounts or personalized offers to prompt immediate action. Messages like "We Miss You!" or "Flash Sale: 24 Hours Only!" can help drive conversions.
- Retargeting & Abandonment Campaigns: Deploy retargeting ads and abandonment recovery emails to remind them of viewed products or incomplete checkouts, encouraging them to complete their purchase.
- Win-Back Surveys & Content: Use surveys to understand why they’ve gone dormant and adjust marketing strategies accordingly. Share educational content or product guides to showcase the value of higher-end products and encourage re-engagement.
Characteristics: High spenders, frequent buyers, recent activity.
- Exclusive VIP Programs: Create exclusive access to upcoming collections, special events, or premium product lines to reward their loyalty and keep them engaged.
- Referral & Advocacy Incentives: Introduce a referral program that leverages their advocacy to bring in new high-value customers. Offer special bonuses for successful referrals.
- Personalized Experiences: Deepen emotional connections with personalized thank-you notes, surprise gifts, or customized offers based on their preferences.
Characteristics: Low spend, high frequency, not recent.
- Frequent Small Promotions: Keep them engaged with continuous, low-value promotions or rewards that align with their modest but frequent purchasing habits, such as "Buy More, Save More" offers.
- Subscription-based Programs: Launch subscription services that offer regular rewards or incentives for small but frequent purchases, encouraging long-term engagement.
- Re-Engagement Newsletters: Send regular newsletters featuring new product arrivals or trending items that match their buying patterns to reignite interest and purchasing activity.
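The merge from eight clusters to five marketing segments can be represented as a simple lookup table. In the sketch below, the cluster ids (`c1` to `c8`), the segment names, and the specific pairings are all placeholders: the text does not name the original clusters or state exactly which pairs were merged, so only the overall shape (three merged pairs plus two standalone segments) follows the description above.

```python
from collections import Counter

# Placeholder cluster ids mapped to placeholder segment names (hypothetical).
MERGE_MAP = {
    "c1": "steady_buyers", "c2": "steady_buyers",  # merged pair
    "c3": "vip", "c4": "vip",                      # merged pair
    "c5": "low_value", "c6": "low_value",          # merged pair
    "c7": "high_spenders",                         # kept as its own segment
    "c8": "frequent_low_spend",                    # kept as its own segment
}

def merge_segments(labels, merge_map=MERGE_MAP):
    """Map each original cluster label to its merged marketing segment."""
    return [merge_map[label] for label in labels]

# One made-up customer per original cluster.
original = ["c1", "c2", "c5", "c7", "c3", "c6", "c8", "c4"]
merged = merge_segments(original)
segment_sizes = Counter(merged)  # customers per merged segment
```

Keeping the mapping in one place makes it straightforward to revisit a merge later if two combined clusters start responding differently to campaigns.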
By applying these strategies, ShopEase can maximize customer engagement, increase purchase frequency, and foster long-term loyalty, ultimately driving sustained growth and commercial success.