The Client Asked for a Chatbot. I Built Them an Employee.
A restoration client wanted a website chatbot. Their brief was simple: answer common questions about services, capture lead information, and route urgent inquiries to their dispatch team. The expectation was a /month SaaS widget with canned responses.
I built them something better. A custom chatbot running on Google Vertex AI via Cloud Run, grounded in their specific service pages, pricing guidelines, and service area boundaries. It handles natural language questions, qualifies leads by asking the right follow-up questions, and routes urgent water damage calls directly to dispatch with full context. Cost per conversation: $0.002. That is two-tenths of a penny.
At 500 conversations per month, the total AI cost is $1. Add Cloud Run hosting at roughly /month for the container, and the total infrastructure cost is under /month for a chatbot that replaces a /month SaaS product and performs significantly better because it actually understands the business.
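The monthly AI figure is just the per-conversation cost times the volume; a quick sketch of the arithmetic:

```python
# Back-of-envelope monthly AI cost at the per-conversation rate quoted above.
COST_PER_CONVERSATION = 0.002  # dollars per conversation
conversations_per_month = 500

monthly_ai_cost = COST_PER_CONVERSATION * conversations_per_month
print(f"${monthly_ai_cost:.2f}/month")  # → $1.00/month
```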
The Architecture
The chatbot runs on three components:
Vertex AI (Gemini model): Handles the conversational intelligence. The model receives a system prompt loaded with the client’s service information, pricing ranges, service area (Houston metro), and qualification criteria. It responds conversationally, asks clarifying questions when needed, and structures lead information for capture.
Cloud Run container: A lightweight Python FastAPI application that serves as the API endpoint. The WordPress site calls this endpoint via JavaScript when a visitor interacts with the chat widget. The container handles session management, conversation history, and the Vertex AI API calls. It scales to zero when not in use, so idle hours cost nothing.
WordPress integration: A simple JavaScript widget on the client site that renders the chat interface and communicates with the Cloud Run endpoint. No WordPress plugin required. The widget is 40 lines of JavaScript that creates a chat bubble, handles user input, and displays responses.
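The request flow across those three components can be sketched in a few lines of Python. This is an illustrative outline, not the production code: the Vertex AI call is stubbed out as `ask_model`, the session store is in-memory, and all names are hypothetical.

```python
# Sketch of the Cloud Run chat handler's request flow. The Vertex AI Gemini
# call is stubbed out as `ask_model`; names here are illustrative only.
from collections import defaultdict

SYSTEM_PROMPT = "You are the assistant for a Houston restoration company..."  # truncated

# In-memory session store. Because Cloud Run scales to zero, a real
# deployment would persist history externally or accept lossy sessions.
sessions: dict[str, list[dict]] = defaultdict(list)

def ask_model(messages: list[dict]) -> str:
    """Placeholder for the Vertex AI Gemini call."""
    return "Thanks for reaching out! Can you tell me more about the damage?"

def handle_chat(session_id: str, user_message: str) -> str:
    """Append the message to the session, query the model, return the reply."""
    history = sessions[session_id]
    history.append({"role": "user", "content": user_message})
    reply = ask_model([{"role": "system", "content": SYSTEM_PROMPT}, *history])
    history.append({"role": "assistant", "content": reply})
    return reply
```

In the real deployment, `handle_chat` sits behind a FastAPI route that the 40-line JavaScript widget calls from the WordPress site.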
Why Vertex AI Instead of OpenAI
Cost: Gemini 1.5 Flash on Vertex AI costs significantly less per token than GPT-4 or GPT-3.5. For a chatbot handling short conversational exchanges, the per-conversation cost difference is dramatic.
Data residency: Vertex AI runs on GCP infrastructure where I already have my project. Because no third-party API is involved, the conversation data, which includes lead contact information, stays within GCP project boundaries I control.
Scale-to-zero: Cloud Run only charges when processing requests. During overnight hours when nobody is chatting, the cost is literally zero. OpenAI’s API has the same pay-per-use model, but coupling it with Cloud Run for the hosting layer gives me full control over the deployment.
The System Prompt That Makes It Work
The chatbot’s intelligence comes entirely from its system prompt. No fine-tuning. No RAG pipeline. No vector database. Just a well-structured system prompt that contains the client’s service descriptions, pricing ranges (not exact quotes), service area zip codes, qualification questions, and escalation triggers.
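A sketch of how such a prompt might be organized. Every specific below is a placeholder, not the client's actual data:

```python
# Illustrative structure for the system prompt; all specifics are placeholders.
SYSTEM_PROMPT = """\
You are the website assistant for a Houston-area restoration company.

SERVICES: water damage restoration, fire damage restoration, mold remediation.
PRICING: give ranges only, never exact quotes.
SERVICE AREA: Houston metro; confirm the visitor's zip code is in range.

QUALIFICATION: for damage inquiries, ask when the damage occurred, whether
there is an active leak or standing water, the approximate affected area,
residential vs. commercial, and whether the visitor has insurance.

ESCALATION: if the answers indicate an emergency, give the dispatch phone
number prominently and offer to notify the team. Otherwise offer a callback.

Never fabricate pricing or service details; if a question is out of scope,
offer to connect the visitor with a human team member.
"""
```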
The prompt includes explicit instructions for lead qualification. When someone describes a water damage situation, the chatbot asks: When did the damage occur? Is it an active leak or standing water? What is the approximate affected area? Is this a residential or commercial property? Do you have insurance? These questions mirror what the dispatch team asks on phone calls.
When the qualification criteria indicate an emergency (active leak, less than 24 hours, standing water), the chatbot provides the dispatch phone number prominently and offers to notify the team. Non-emergency inquiries get scheduled callback options.
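The escalation rule is simple enough to express as code. In the real system this decision lives in the system prompt and the model makes it conversationally; the sketch below is a hypothetical equivalent, assuming any one of the stated triggers counts as an emergency:

```python
from dataclasses import dataclass

@dataclass
class Qualification:
    """Answers gathered by the chatbot's follow-up questions (illustrative)."""
    hours_since_damage: float
    active_leak: bool
    standing_water: bool

def is_emergency(q: Qualification) -> bool:
    # The stated triggers: active leak, damage under 24 hours old, or
    # standing water. Treated here as any-of.
    return q.active_leak or q.standing_water or q.hours_since_damage < 24

def route(q: Qualification) -> str:
    """Emergencies go straight to dispatch; everything else gets a callback."""
    return "dispatch" if is_emergency(q) else "scheduled_callback"
```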
Results After 90 Days
The chatbot handled 1,400 conversations in its first 90 days. Of those, 340 were qualified leads (24% conversion rate from chat to lead). Of the qualified leads, 89 became paying customers.
The previous chatbot solution (a SaaS widget with canned response trees) had a 6% chat-to-lead conversion rate. The AI chatbot quadrupled it because it can actually understand what someone is describing and respond helpfully rather than forcing them through a rigid decision tree.
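Those rates follow directly from the raw counts reported above:

```python
# Conversion figures from the first 90 days.
conversations = 1400
qualified_leads = 340
customers = 89

chat_to_lead = qualified_leads / conversations   # ≈ 0.243, the 24% figure
lead_to_customer = customers / qualified_leads   # ≈ 0.262
improvement = chat_to_lead / 0.06                # vs. the 6% SaaS-widget rate
print(f"{chat_to_lead:.0%} chat-to-lead, {improvement:.1f}x the old widget")
```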
Total infrastructure cost for 90 days: approximately . Total value of the 89 customers: several hundred thousand dollars in restoration work. The ROI is not a percentage – it is a category error to even calculate it.
Frequently Asked Questions
Can the chatbot handle multiple languages?
Yes. Gemini handles multilingual conversations natively. The Houston market has a significant Spanish-speaking population, and the chatbot responds in Spanish when addressed in Spanish without any additional configuration. This alone increased lead capture from a demographic the client was previously underserving.
What happens when the chatbot cannot answer a question?
The system prompt includes a graceful fallback: if the question is outside the defined scope, the chatbot acknowledges the limitation and offers to connect the visitor with a human team member via phone or scheduled callback. It never fabricates information about pricing or services.
How hard is this to set up for a new client?
About 3 hours. Create the Cloud Run service from the template, customize the system prompt with the client’s information, deploy, and add the JavaScript widget to their WordPress site. The infrastructure is templated – the customization is entirely in the system prompt content.
The Bigger Point
AI chatbots do not need to be expensive SaaS products with monthly subscriptions. The underlying technology – language models accessible via API – costs fractions of a penny per interaction. The value is in the deployment architecture and the domain-specific knowledge you embed in the system prompt. Own the infrastructure, own the intelligence, and the cost drops to near zero while the quality exceeds anything a canned-response widget can deliver.