Welcome to WAPTO!
1. Overview
The AI Call Assistant lets businesses deploy autonomous voice assistants that handle phone calls. By combining Speech-to-Text (STT), Large Language Models (LLMs), and Text-to-Speech (TTS), the agent can understand complex queries, fetch real-time data via APIs, and respond in natural-sounding speech around the clock.
2. Sequential Configuration Protocol
Configuring a voice assistant is a five-step process. Complete the steps in the order shown.
Step 1: Identification & Initialization
- Assistant Name: The internal identifier for your agent (e.g., "Front Desk Bot").
- Welcome Greeting: The mandatory initial audio prompt played to every caller.
- Live & Ready Toggle: The master switch to enable or disable the agent's connectivity.
Tip: Keep greetings concise and welcoming: "Hello, I'm your AI assistant. How can I help you today?"
Step 2: AI Intelligence (The Brain)
- Model Engine: Select the underlying LLM (e.g., Gemini-2.0-flash-lite) for processing.
- System Instructions: Define the agent's personality and rules (e.g., "You are a professional hotel receptionist").
- Knowledge Base URL: Link to external documentation to feed the agent real-time business data.
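One way to picture the settings from Steps 1 and 2 is as a single configuration record. The sketch below is illustrative only: the class name `AssistantConfig`, its field names, and the validation rules are assumptions, not WAPTO's actual data model. The one rule taken from this guide is that the Welcome Greeting is mandatory.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class AssistantConfig:
    """Hypothetical record of the Step 1 and Step 2 settings."""
    name: str                      # Assistant Name
    welcome_greeting: str          # Welcome Greeting (mandatory)
    live: bool = False             # Live & Ready Toggle
    model: str = "Gemini-2.0-flash-lite"  # Model Engine
    system_instructions: str = ""  # System Instructions
    knowledge_base_url: str = ""   # Knowledge Base URL (optional here)

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means none found."""
        issues = []
        if not self.name.strip():
            issues.append("Assistant Name is empty")
        if not self.welcome_greeting.strip():
            issues.append("Welcome Greeting is empty")
        if self.knowledge_base_url:
            parsed = urlparse(self.knowledge_base_url)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                issues.append("Knowledge Base URL is not a valid http(s) URL")
        return issues
```

Checking a draft this way before switching the Live & Ready Toggle on can catch an empty greeting or a mistyped URL early.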
Step 3: Functions & Tools (Action Hub)
This step lets the agent perform real-world actions through your APIs, such as checking an order status or making a booking.
- Click Deploy New Function Tool to connect your business APIs.
- Define JSON-based functions that the AI can call during a conversation.
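As an illustration of what a JSON-based function might look like, the sketch below describes a hypothetical `check_order_status` tool in the style of common LLM function-calling schemas. The field names (`name`, `description`, `parameters`) and the tool itself are assumptions; the exact format WAPTO expects may differ, so treat the Deploy New Function Tool form as the authority.

```python
import json

# Hypothetical tool definition. The layout follows the JSON Schema style
# used by many LLM function-calling APIs; it is not a confirmed WAPTO format.
CHECK_ORDER_STATUS = {
    "name": "check_order_status",
    "description": "Look up the current status of a customer's order.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "Order number as read out by the caller.",
            }
        },
        "required": ["order_id"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return the required parameter names that are absent from `arguments`."""
    required = tool["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


def to_json(tool: dict) -> str:
    """Serialize the definition to JSON text, e.g. for pasting into a form."""
    return json.dumps(tool, indent=2)
```

A clear `description` for the function and for each parameter matters, because it is typically what the model relies on to decide when to call the tool and what to pass.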
Step 4: Voice & STT (Sensory Layer)
- TTS Provider: The source of the agent's voice, with premium voices available from providers such as ElevenLabs.
- Voice Style: Specific vocal characteristics (e.g., "Crystal Clear", "Deep Analysis").
- STT Engine: The transcription engine provided for accurately understanding callers.
Step 5: Connectivity
This is the final step to make your AI Call Assistant live. Here you connect your telephony system so that real phone calls can be routed to your AI agent.
What you can do:
- Enable or disable call recording for agent and user voice
- Turn on AI transcription for automatic call logging
- Configure hangup behavior for call completion
- Add exit keywords (e.g., bye, goodbye) to end calls automatically
- Customize the final greeting message before call disconnect
- Control how conversations are recorded and stored
- Review settings and launch the AI Call Assistant
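To make the exit-keyword behavior concrete, here is a minimal sketch of how a transcript could be checked against a keyword list. It assumes whole-word, case-insensitive matching on single-word keywords; how WAPTO actually matches exit keywords is not documented here, and the function name `should_hang_up` is hypothetical.

```python
import re


def should_hang_up(transcript: str, exit_keywords: list[str]) -> bool:
    """Return True if any exit keyword appears as a whole word.

    Assumption: keywords are single words and matching ignores case
    and punctuation. Multi-word phrases are not handled by this sketch.
    """
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    return any(keyword.lower() in words for keyword in exit_keywords)
```

Whole-word matching avoids ending the call when a keyword is only part of another word, for example "bye" inside "standby".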
3. Operational Workflow
Logic Pipeline
- Capture: Incoming call triggers the Welcome Greeting.
- Transcribe: Caller's voice is converted to text (STT).
- Reason: AI processes text against instructions and tools.
- Execute: AI fires API tools if action is required.
- Articulate: Response is converted to natural audio (TTS).
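The per-turn part of this pipeline (everything after the Welcome Greeting) can be sketched as a single function. This is a simplified illustration, not WAPTO's implementation: the STT, LLM, and TTS stages are passed in as plain callables, the stand-in functions below fake their behavior, and the shape of the model's decision (`reply`, `tool`, `arguments`) is an assumption. Real systems often feed the tool result back to the model before speaking; this sketch simply appends it.

```python
from typing import Callable


def handle_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    reason: Callable[[str], dict],
    tools: dict[str, Callable[..., str]],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one caller turn: STT, then LLM, then an optional tool, then TTS."""
    text = transcribe(audio)                 # speech to text (STT)
    decision = reason(text)                  # LLM applies instructions and tools
    reply = decision.get("reply", "")
    tool_name = decision.get("tool")
    if tool_name:                            # fire the API tool if one was chosen
        result = tools[tool_name](**decision.get("arguments", {}))
        reply = f"{reply} {result}".strip()
    return synthesize(reply)                 # text to speech (TTS)


# Stand-ins so the sketch runs without any external service (all hypothetical).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> dict:
    if "order" in text.lower():
        return {
            "reply": "Let me check.",
            "tool": "check_order_status",
            "arguments": {"order_id": "A100"},
        }
    return {"reply": "How can I help you today?"}


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


TOOLS = {"check_order_status": lambda order_id: f"Order {order_id} has shipped."}
```

Keeping each stage behind a plain function boundary makes it possible to test the reasoning and tool logic without placing a real call.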
Strategic Use Cases
- Customer Support: Resolve common queries without queuing.
- Appointment Booking: Real-time calendar syncing via API.
- Sales Qualification: Qualify leads via voice before human hand-off.
4. Governance & Best Practices
- Latency Optimization: Use "Natural Conciseness" in Step 2 to keep responses brief and human-like.
- Instruction Clarity: Your System Instructions are the "law" for the agent. Be explicit about tone and boundaries.
- Tool Validation: Always test API tools separately before deploying to the voice assistant.
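Testing a tool separately can be as simple as calling its handler with a fake backend before any caller reaches it. The sketch below assumes a hypothetical `check_order_status` handler that receives its API client as a `fetch` argument, which makes it easy to substitute canned data; none of these names come from WAPTO.

```python
from typing import Callable, Optional


def check_order_status(
    order_id: str,
    fetch: Callable[[str], Optional[dict]],
) -> str:
    """Hypothetical tool handler; `fetch` stands in for the real API client."""
    if not order_id.strip():
        return "I need an order number to look that up."
    data = fetch(order_id)
    if data is None:
        return f"I couldn't find order {order_id}."
    return f"Order {order_id} is {data['status']}."


def fake_fetch(order_id: str) -> Optional[dict]:
    """Canned backend: only order A100 exists."""
    orders = {"A100": {"status": "shipped"}}
    return orders.get(order_id)


def run_checks() -> None:
    """Exercise the happy path, an unknown order, and empty input."""
    assert check_order_status("A100", fake_fetch) == "Order A100 is shipped."
    assert check_order_status("Z9", fake_fetch) == "I couldn't find order Z9."
    assert check_order_status(" ", fake_fetch) == "I need an order number to look that up."
```

Covering the failure paths matters for voice in particular, since whatever text the tool returns is what the caller will hear.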
Troubleshooting: If the agent isn't responding, first verify that your AI API Key (Step 2) and TTS API Key (Step 4) are both valid and have sufficient credits.