Mining Call Center Dialog Data
Free (open access)
A. Gilman, B. Narayanan & S. Paul
We consider the problem of mining conversations between customers and call center representatives to classify calls automatically into predefined categories. We analyze the conversations for speaker-dependent information content using several multi-class classification technologies. The data consist of 539 manually transcribed conversations belonging to 15 categories. Classifiers were built using Support Vector Machines (SVM), Naïve Bayes, Latent Semantic Analysis, Vector Space and K-Nearest Neighbor technologies. SVM classifiers performed consistently well, giving an accuracy of about 74% on the entire data set and about 92% when only the 4 largest classes are considered. Giving very high weight to either the customer's or the agent's part of the dialog results in poor accuracy, whereas nearly equal weight to the customer and the agent consistently provides the best results. This approach has the potential to identify cross-sell and up-sell opportunities in real time.

Keywords: text mining, classification, support vector machines, dialog mining, customer relationship management.

1 Introduction

Customer Relationship Management (CRM) strategy and implementation are essential for a business to retain its most valuable customers and boost overall profitability. Information related to the products, customers and transactions is aggregated in a CRM data warehouse. This data is then analyzed using traditional analytics to build models for customer segmentation, propensity for products, and cross-sell opportunities. It is also recognized that a lot of information about customer behavior and propensities may be gleaned from the conversations between call center service representatives and customers. Dialog Mining of these conversations – that may be transcribed to text automatically
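The speaker weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the customer and agent weights are applied, so the linear blend of per-speaker term counts below is an assumption, and the names `term_counts`, `weighted_features` and `customer_weight` are hypothetical. The toy call is invented for illustration and is not taken from the 539 transcribed conversations.

```python
from collections import Counter


def term_counts(utterances):
    """Count lowercase whitespace-separated tokens over a list of utterances."""
    counts = Counter()
    for text in utterances:
        counts.update(text.lower().split())
    return counts


def weighted_features(customer_turns, agent_turns, customer_weight=0.5):
    """Blend per-speaker term counts into one feature dictionary.

    customer_weight is a hypothetical parameter in [0, 1]; the agent side
    receives 1 - customer_weight. This linear blend is an assumption, not
    the weighting scheme of the paper.
    """
    if not 0.0 <= customer_weight <= 1.0:
        raise ValueError("customer_weight must be between 0 and 1")
    customer = term_counts(customer_turns)
    agent = term_counts(agent_turns)
    features = {}
    # A Counter returns 0 for terms a speaker never used.
    for term in set(customer) | set(agent):
        features[term] = (customer_weight * customer[term]
                          + (1.0 - customer_weight) * agent[term])
    return features


# Invented toy call, for illustration only.
customer_turns = ["my bill is wrong", "the bill shows a late fee"]
agent_turns = ["let me check the bill", "i will waive the fee"]
features = weighted_features(customer_turns, agent_turns, customer_weight=0.5)
```

Setting `customer_weight` to 1.0 or 0.0 discards one speaker entirely, which corresponds to the extreme weightings the abstract reports as giving poor accuracy. The resulting feature dictionaries could then be passed to whichever classifier is being compared.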