Problem
In most retail businesses, salespeople visit stores regularly for various purposes like product promotion,
market analysis, customer relationship building, and more.
However, manually planning these visits,
considering the geographical locations, frequency of visits, and available time,
often lead to inefficient routes, wasted time, and unnecessary travel expenses.
So let's define our problem here with a clear statement...
Objective: Create an optimized scheduling system for a sales team spread across the country, attending various stores with different locations, frequency, and time duration constraints. The goal is to maximize the number of stores visited, minimize travel time, comply with working hours limits, and recommend locations for new hires if not all stores can be visited.
Data Available
These are the fixed inputs and known data that define the problem:
- Store Location - in geographical coordinates;
- Salesman Starting Location - (usually homes) in geographical coordinates;
- Store Frequency and Time Duration - for attendance;
- Travel Times and Distances - required to travel between different locations (provided by API);
- Working Hours Limit - for each salesperson per week (including travel and attendance time).
Decision Variables
These are the variables that we can control and change to solve the problem:
- Allocation of Salesmen - How many and which stores are assigned to each salespeople;
- Travel Routes - The travel paths considering stores order to minimize travel time;
- New Salesmen Locations - Locations to relocate or hire new salespeople if necessary.
Constraints and Assumptions
These are the limitations and rules that the solution must adhere to:
- Working Hours Limit - Salespeople can't exceed a fixed number of working hours per week;
- Salesmen Availability - A fixed number of salespeople with known locations;
- Store Requirements - The specific frequency and time duration needed for each store;
- Travel Constraints - Consideration of real-world travel conditions, including traffic, routes, etc.
Objective Function
This is the goal we're trying to achieve. Here, it's a multi-objective function:
- Minimize Travel Time - Reduce the total travel time for salespeople;
- Maximize Store Attendance - Ensure that the maximum number of stores is attended by salespeople;
- Optimize Hiring Locations - If not possible to visit all stores, find the best locations to hire new salespeople or relocate idle ones.
Proposed Solution
It's like trying to solve a complex jigsaw puzzle.
The solution I propose harnesses the power of data science to solve this problem efficiently.
Data Preparation -> Clustering -> Task Aggregation -> Allocation Model
Data Preparation
The first step in any data science project is preparing the data. We cleaned the unnecessary columns and saved the missing data. We then disentangled correlated rows and added coordinates, which were then saved into a dictionary.
Clustering
The next step involved running a clustering algorithm, which is like grouping similar things together. In this case, we grouped stores based on their locations to minimize travel distance. The clustering was done in two layers —an initial clustering throughout the whole country and an inner clustering inside the large clusters for task aggregation later.
The ideal number of clusters was determined using the Silhouette Coefficient, a measure of how similar an object is to its own cluster compared to other clusters.
Task Aggregation
The clusters created were then used to group tasks into daily work packages based on the proximity of locations, creating 'Task units.' The goal here was to establish tasks within a defined limit of working hours. Taking the edge Door and successively adding the closest, based on a distance matrix between all, until reaches the limit.
Allocation Model
Finally, the task allocation model was implemented using Google's OR-Tools, specifically the Minimum Cost Flow CP-SAT Solver. These tools helped to optimize the workforce allocation based on time and frequency.
API Endpoints
View Clusters Positions
GET /api/v1/view_clusters_positions/<division>Figure 1: Plot of the position of each clusters centroids.
View Tasks Cluster
GET /api/v1/view_tasks_cluster/<division>/<n_cluster>
Figure 1: Plot of the position of each clusters centroids.
View Allocation Cluster
GET /api/v1/view_allocation_cluster/<division>/<n_cluster>
Figure 1: Plot of the position of each clusters centroids.
View Allocation Cluster with Doors
GET /api/v1/view_allocation_cluster_with_doors/<division>/<n_cluster>
Figure 1: Plot of the position of each clusters centroids.
View Outliers
GET /api/v1/view_outliers/<division>/<n_cluster>Figure 1: Plot of the position of Doors and STEs outside of US territory.
View Allocation Division
GET /api/v1/view_allocation_division/<division>/<n_cluster>
Figure 1: Plot of the whole division allocation throughout the territory.
Results
We compared the new method with traditional ones using specific metrics like travel time, costs, and task completion rate. The new method outperformed the old ones, showcasing the power of data science in solving real-world problems.
Figure 1: .
Figure 1: .
Figure 1: .
Although this approach significantly improves workforce allocation, it's not without its limitations. Future improvements might include more sophisticated clustering techniques or evolutionary optimization algorithms.
In conclusion, the combination of clustering algorithms and Google's OR-Tools presents a robust solution to the complex problem of workforce allocation in sales. I hope that this method will encourage other data scientists to tackle similar problems in various industries.
References
For readers interested in learning more about this topic, I've compiled a list of the references and additional resources below. Happy exploring!
Google Machine Learning: Clustering Algorithms
Google OR-Tools: A Comprehensive Guide
Google OR-Tools: Assignment as Minimum Cost Flow Problem
Google OR-Tools: CP-SAT Solver