Event-Driven Call Transcript Classification Pipeline with Vertex AI and Gemini

A real-time machine learning solution for processing and classifying call center transcripts to drive personalized customer experiences for a leading custom closet company.

Python, Vertex AI, Gemini, Kubeflow, GCP, Pub/Sub
[Figure: Event-driven ML pipeline with Vertex AI, showing data flow from the call center to the processing pipeline]

Project Overview

This project involved designing and implementing an end-to-end event-driven machine learning pipeline for a leading custom closet company with retail locations across the United States. The system processes inbound call center transcripts by integrating a call tracking platform, Google Cloud Platform services, and Vertex AI with large language models for customized classification and analysis.

The solution captures call transcripts in real time from a call center tracking platform, automatically processes them using a Pub/Sub-triggered pipeline, and classifies them according to product interest, purchase intent, and appointment details. This enriched data fuels personalized customer communications and recommendations across digital touchpoints, with the goal of improving conversion rates and customer satisfaction.

Role & Impact

Led end-to-end design and implementation of the real-time ML pipeline as the sole data scientist on a cross-functional team. Personally handled all ML design, prompt engineering, and infrastructure orchestration. Enabled automated classification of thousands of weekly call transcripts, reducing reliance on inconsistent manual processing.

Business Challenge

The client faced several challenges processing the high volume of inbound sales calls:

  • Inconsistent reporting quality and metrics from customer support representatives across locations
  • No automated way to extract actionable insights from customer conversations
  • Manual follow-up processes that missed key sales opportunities
  • Lack of flexibility in tailoring conversation insights to specific business types and customer segments

Solution Architecture

We designed and implemented a fully event-driven architecture on Google Cloud Platform that processes call transcripts in real time:

[Figure: Event-driven architecture for real-time call transcript classification and processing]
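
As a rough illustration of the entry point of such a flow, the sketch below decodes a Pub/Sub-style event into a transcript record. It is a minimal sketch under stated assumptions, not the production handler: the function names and the `call_id` and `transcript` payload fields are hypothetical, and it assumes the call tracking platform publishes a JSON payload that arrives base64-encoded under the event's `data` key, as it does for Pub/Sub-triggered Cloud Functions.

```python
import base64
import json

# Hypothetical payload fields; the real message schema is not described in this write-up.
REQUIRED_FIELDS = {"call_id", "transcript"}


def parse_transcript_event(event: dict) -> dict:
    """Decode a Pub/Sub-style event into a transcript record."""
    if "data" not in event:
        raise ValueError("event has no 'data' field")
    # The message body is base64-encoded JSON.
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("payload is not a JSON object")
    missing = REQUIRED_FIELDS - set(payload)
    if missing:
        raise ValueError(f"payload missing fields: {sorted(missing)}")
    return {"call_id": str(payload["call_id"]), "transcript": payload["transcript"]}


def handle_event(event: dict, context=None) -> dict:
    """Entry point shaped like a Pub/Sub-triggered background function."""
    record = parse_transcript_event(event)
    # A real handler would hand the record to the classification pipeline here.
    return record


if __name__ == "__main__":
    message = {"call_id": "abc-123", "transcript": "Hi, I'd like a quote for a walk-in closet."}
    event = {"data": base64.b64encode(json.dumps(message).encode("utf-8")).decode("ascii")}
    print(handle_event(event))
```

Raising on malformed messages, rather than silently dropping them, lets the subscription's retry or dead-letter configuration decide what happens next, which is one way to avoid losing messages.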

Machine Learning Implementation

The core of the solution leveraged Vertex AI and Gemini large language models to analyze call transcripts:

  • Model Selection: Chose Gemini for its strong grasp of conversational context and nuance in call transcripts
  • Prompt Engineering: Used LangChain to build a specialized prompt for each classification task, tuned for transcript analysis
  • Multi-label Classification: Developed four distinct classification chains
  • Safety Mechanisms: Configured safety settings to enable business classification while preventing inappropriate content
  • Data Preprocessing: Implemented text normalization, special character removal, and standardization techniques
  • Validation System: Created a fuzzy matching validation process to correct classification outputs that didn't match predefined categories
  • Pipeline Automation: Built a three-step Vertex AI Pipeline that:
    1. Exports transcript data from BigQuery
    2. Processes transcripts through classification models
    3. Loads enriched data back to BigQuery for activation
  • Model Versioning: Established CI/CD processes for prompt updates
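
The data preprocessing step in the list above can be sketched roughly as follows. This is an illustrative sketch only: the filler-word list, the set of characters treated as "special", and the function name `clean_transcript` are assumptions, since the write-up does not enumerate the actual rules.

```python
import re
import unicodedata

# Hypothetical filler words; the actual disfluency list is not given in the source.
FILLERS = ("um", "uh", "er", "hmm")
_FILLER_RE = re.compile(r"\b(?:" + "|".join(FILLERS) + r")\b[,.]?\s*", re.IGNORECASE)
# Keep word characters, whitespace, and basic punctuation; everything else is "special".
_SPECIAL_RE = re.compile(r"[^\w\s.,?!'$%-]")
_SPACE_RE = re.compile(r"\s+")


def clean_transcript(text: str) -> str:
    """Normalize a raw transcript before it is sent to the classifier."""
    # Unicode normalization folds compatibility characters into canonical forms.
    text = unicodedata.normalize("NFKC", text)
    # Map the curly apostrophe to ASCII so contractions survive the next steps.
    text = text.replace("\u2019", "'")
    # Drop standalone filler words along with trailing punctuation and spaces.
    text = _FILLER_RE.sub("", text)
    # Replace remaining special characters with a space.
    text = _SPECIAL_RE.sub(" ", text)
    # Collapse runs of whitespace.
    return _SPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_transcript("Um, I\u2019d like a quote ### for a walk-in closet"))
```

Keeping each rule as a separate, named step makes it easier to test rules individually and to adjust them when a new source of transcript noise shows up.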

This approach allowed us to process thousands of call transcripts weekly with minimal human intervention, achieving high classification performance while maintaining flexibility to adapt to changing business needs.

Outcomes

  • Email Engagement (projected increase): Personalized email campaigns based on call transcript insights, designed to drive higher open and click-through rates.
  • Conversion Optimization (targeted improvement): Enhanced follow-up processes and targeted content based on call intent classification, aimed at improving website-to-appointment conversion rates.
  • Opportunity Recovery (expected reduction): An automated follow-up system for high-intent calls that don't convert to appointments, designed to reduce missed sales opportunities.

Technical Challenges

Implementing this event-driven ML solution required overcoming several complex technical hurdles:

  • Transcript Quality Issues: Call transcripts often contained noise, brand mentions, and speech disfluencies that reduced classification accuracy. I developed specialized preprocessing techniques to clean and normalize the text before analysis.
  • Event-Driven Architecture: Ensuring reliable real-time processing with minimal latency required careful design of the Pub/Sub topic configuration, subscription handling, and error management to prevent message loss.
  • Authentication & Security: The solution required secure authentication between multiple GCP services. I implemented service account management with appropriate IAM permissions and Secret Manager integration for API credentials.
  • Pipeline Orchestration: Building a reliable Kubeflow pipeline required careful component design, artifact passing between steps, and consistent data formats throughout the workflow.
  • LLM Prompt Engineering: Developing effective prompts for the Gemini model required extensive testing and refinement to achieve high classification accuracy across diverse call scenarios and product types.
  • Validation & Error Handling: LLM outputs sometimes contained unexpected formats or classifications. I implemented a fuzzy matching system that could correct outputs to match predefined categories without human intervention.
  • Cost Optimization: LLM inference at this call volume could become expensive. I implemented batching and caching strategies to reduce API calls and lower the overall cost of the solution.
  • Data Privacy Compliance: Call transcripts contained personally identifiable information (PII). The solution needed robust security measures to ensure GDPR and CCPA compliance throughout the data processing pipeline.
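
To make the batching and caching idea from the cost optimization item concrete, here is a minimal sketch. It assumes a hypothetical `classify_batch` callable that stands in for the LLM call and returns one label per input text; the class name, the in-memory dictionary cache, and the batch size are illustrative choices, not details of the actual system.

```python
import hashlib
from typing import Callable, Dict, Iterator, List


def chunks(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


class CachedClassifier:
    """Wrap a batch classifier so repeated transcripts are classified only once."""

    def __init__(self, classify_batch: Callable[[List[str]], List[str]], batch_size: int = 8):
        self._classify_batch = classify_batch  # hypothetical stand-in for the LLM call
        self._batch_size = batch_size
        self._cache: Dict[str, str] = {}
        self.calls = 0  # number of batch calls made, useful for cost tracking

    @staticmethod
    def _key(text: str) -> str:
        # Hash the text so the cache key has a fixed size.
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def classify(self, transcripts: List[str]) -> List[str]:
        # Collect uncached transcripts once each, preserving first-seen order.
        pending: Dict[str, str] = {}
        for text in transcripts:
            key = self._key(text)
            if key not in self._cache and key not in pending:
                pending[key] = text
        # Send the uncached transcripts in batches.
        for chunk in chunks(list(pending), self._batch_size):
            labels = self._classify_batch([pending[key] for key in chunk])
            self.calls += 1
            if len(labels) != len(chunk):
                raise ValueError("classifier returned a different number of labels")
            self._cache.update(zip(chunk, labels))
        return [self._cache[self._key(text)] for text in transcripts]
```

In use, a second request for an already-seen transcript is served from the cache without another call. A production version would likely swap the dictionary for a shared store so the cache survives across function invocations.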

Design Tradeoffs & Decisions

Several key architectural and modeling decisions were made to ensure the solution was scalable, cost-effective, and optimized for real-world call data:

  • Event-Driven Architecture: Chose an event-driven design using Pub/Sub and Cloud Functions to enable low-latency, real-time processing and avoid unnecessary compute costs associated with batch jobs.
  • Model Selection: Selected Gemini 1.5 Pro for its strong zero-shot performance and contextual understanding of noisy, real-world transcripts, outperforming traditional classification models in preliminary evaluations.
  • Prompt Design Strategy: Used LangChain to modularize and chain prompts, enabling rapid iteration, reusable templates, and better debugging across multiple classification tasks.
  • Pipeline Modularity: Structured the Vertex AI Pipeline with clearly separated steps for ingestion, classification, and enrichment to improve observability and maintainability.
  • Fallback Handling: Anticipated classification edge cases by integrating fuzzy matching and validation layers, trading minor recall for significantly improved precision and reliability.
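
The fallback idea can be illustrated with a small fuzzy-matching validator. Note the swap: this sketch uses `difflib` from the Python standard library rather than FuzzyWuzzy so that it runs without extra dependencies. The category list, the function name `snap_to_category`, and the 0.8 cutoff are hypothetical values chosen for illustration.

```python
import difflib
from typing import Optional, Sequence

# Hypothetical category list; the real taxonomy is not given in this write-up.
PRODUCT_CATEGORIES = (
    "Walk-In Closet",
    "Reach-In Closet",
    "Garage Storage",
    "Home Office",
    "Pantry",
)


def snap_to_category(
    raw: str,
    categories: Sequence[str] = PRODUCT_CATEGORIES,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Map a raw model output onto a predefined category, or None if nothing is close."""
    normalized = {category.lower(): category for category in categories}
    # Strip whitespace, quotes, and trailing periods the model may have added.
    candidate = raw.strip().strip("\"'.").lower()
    if candidate in normalized:
        return normalized[candidate]
    # Fall back to the closest category whose similarity ratio meets the cutoff.
    matches = difflib.get_close_matches(candidate, list(normalized), n=1, cutoff=cutoff)
    return normalized[matches[0]] if matches else None


if __name__ == "__main__":
    for output in ("walk in closet", " 'Garage storage.' ", "I'm not sure"):
        print(repr(output), "->", snap_to_category(output))
```

The cutoff is where the precision and recall tradeoff lives: raising it rejects more near-misses (returned as `None` for review or a default label), while lowering it accepts more outputs at the risk of snapping to the wrong category.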

Technologies Used

Cloud Infrastructure

  • Google Cloud Platform
  • BigQuery
  • Cloud Pub/Sub
  • Cloud Functions

ML & AI

  • Vertex AI
  • Gemini 1.5 Pro
  • LangChain
  • Kubeflow Pipelines

Development

  • Python
  • Pandas
  • FuzzyWuzzy
  • Secret Manager

Integrations

  • Customer Data Platform
  • Call Tracking Platform
  • Email Marketing System
  • Cloud Storage

Why It Matters

Business Perspective

Businesses often overlook unstructured data in customer support. This project shows how conversation data can be mined in real time for revenue opportunities.

The most valuable insights often exist in places where companies aren't looking—the space between formal data points, in the conversations and interactions that happen every day.

ML/DS Perspective

LLMs aren't plug-and-play: they require thoughtful prompt engineering, validation systems, and downstream integration to work in real-world production.

The difference between a demo and production-ready ML is immense. Robust solutions require meticulous attention to edge cases, validation chains, and integration points that are invisible in controlled environments.

Conclusion

This project demonstrates the power of combining event-driven architecture with advanced machine learning to transform unstructured conversation data into actionable business intelligence. The solution successfully:

  • Automated the classification of thousands of weekly call transcripts with high performance
  • Enabled personalized marketing campaigns based on specific product interests expressed during calls
  • Improved sales follow-up by automatically identifying high-intent prospects
  • Created a unified view of customer interactions across channels
  • Provided retail locations with better insights into customer needs and preferences

Beyond the immediate business impact, this project showcases how modern cloud infrastructure, event-based architectures, and large language models can be combined to solve complex business problems in a scalable, cost-effective manner.