Building Effective Prompts

This guide covers 10 practical techniques for writing effective prompts that GPT-realtime models follow consistently. The techniques are based on extensive testing and real-world usage patterns.

Recommended Prompt Structure

Organize your prompts to help the model understand context and maintain consistency across turns. Use clear, labeled sections in your system prompt so the model can find and follow them. Keep each section focused on one thing.

# Role & Objective        — who you are and what "success" means
# Personality & Tone      — the voice and style to maintain
# Context                 — retrieved context, relevant info
# Reference Pronunciations — phonetic guides for tricky words
# Tools                   — names, usage rules, and preambles
# Instructions / Rules    — do's, don'ts, and approach
# Conversation Flow       — states, goals, and transitions
# Safety & Escalation     — fallback and handoff logic

This format also makes it easier to iterate on and modify problematic sections. To make this structure your own, add domain-specific sections (e.g., compliance, brand policies) and remove sections you don’t need. Within each section, provide the instructions and information the model needs to respond correctly.
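As a rough illustration of the sectioned structure, the sketch below assembles a system prompt from labeled sections and drops any section left empty. The `PromptSection` type, the `buildSystemPrompt` helper, and the section bodies are hypothetical and only meant to show the shape.

```typescript
// Hypothetical helper: assembles a system prompt from labeled sections.
// Section titles follow the recommended structure; the bodies are placeholders.
interface PromptSection {
  title: string; // e.g. "Role & Objective"
  body: string;  // instructions for this section only
}

function buildSystemPrompt(sections: PromptSection[]): string {
  return sections
    .filter((s) => s.body.trim().length > 0) // drop sections you don't need
    .map((s) => `# ${s.title}\n${s.body.trim()}`)
    .join("\n\n");
}

const systemPrompt = buildSystemPrompt([
  {
    title: "Role & Objective",
    body: "You are a support agent for ACME Internet. Success means the caller's issue is resolved or escalated.",
  },
  { title: "Personality & Tone", body: "Warm, concise, confident." },
  { title: "Context", body: "" }, // empty, so it is omitted from the prompt
  { title: "Safety & Escalation", body: "- Escalate when the user explicitly asks for a human." },
]);

console.log(systemPrompt);
```

Keeping sections as separate values makes it straightforward to swap or rewrite one problematic section without touching the rest.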

10 Essential Techniques

1. Be Precise - Eliminate Conflicts

New real-time models are excellent at following instructions. However, this also means that small wording changes or unclear instructions can meaningfully change behavior. Review and iterate on your system prompts: test different phrasings and fix conflicting instructions. Example: In one experiment, changing the word “inaudible” to “unintelligible” in instructions for handling noisy input significantly improved model performance. After your first attempt at a system prompt, have an LLM check it for ambiguity or conflicts.

2. Use Bullet Points Over Paragraphs

Real-time models follow short bullet points better than long paragraphs.

Before (Harder to Follow):

instructions: `When you can't clearly hear the user, don't proceed. If there's background noise or you only caught part of the sentence, pause and ask them politely to repeat themselves in their preferred language, and make sure you keep the conversation in the same language as the user.`

After (Easier to Follow):

instructions: `## Unclear Audio

- Only respond to clear audio or text.
- If audio is unclear/partial/noisy/silent, ask for clarification in {preferred_language}.
- Continue in the same language as the user if intelligible.`

3. Handle Unclear Audio

Real-time models excel at following instructions about handling unclear audio. Provide detailed instructions on what to do when audio is unclear, partial, or silent.

instructions: `## Unclear audio

- Always respond in the same language the user is speaking in, if intelligible.
- Default to English if the input language is unclear.
- Only respond to clear audio or text.
- If the user's audio is not clear (e.g., ambiguous input/background noise/silent/unintelligible) or if you did not fully hear or understand the user, ask for clarification using {preferred_language} phrases.

Sample clarification phrases (parameterize with {preferred_language}):

- "Sorry, I didn't catch that—could you say it again?"
- "There's some background noise. Please repeat the last part."
- "I only heard part of that. What did you say after ___?"`
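One way to parameterize a section like this is to fill `{preferred_language}` before sending the prompt. The `fillTemplate` helper below is a minimal sketch, not part of any SDK; it assumes placeholders are written as `{name}` and throws on a placeholder with no value so an unfilled prompt is caught early.

```typescript
// Hypothetical helper: replaces {name} placeholders in a prompt template.
// A placeholder without a value throws instead of silently staying in the prompt.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`Missing value for placeholder {${key}}`);
    }
    return value;
  });
}

// Shortened version of the section above, used only for illustration.
const unclearAudioTemplate = [
  "## Unclear audio",
  "- Only respond to clear audio or text.",
  "- If the user's audio is not clear, ask for clarification using {preferred_language} phrases.",
].join("\n");

const unclearAudioSection = fillTemplate(unclearAudioTemplate, {
  preferred_language: "Spanish",
});

console.log(unclearAudioSection);
```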

4. Limit Model to One Language

If you see the model switching languages in unhelpful ways, add a dedicated “Language” section to your prompt. Ensure it doesn’t conflict with other rules. By default, mirroring the user’s language works well. A simple approach that mirrors the user’s language:

instructions: `## Language

Language matching: Respond in the same language as the user unless directed otherwise.

For non-English, start with the same standard accent/dialect the user uses.`

English-only constraint example:

instructions: `## Language

- The conversation will be only in English.
- Do not respond in any other language, even if the user asks.
- If the user speaks another language, politely explain that support is limited to English.`

For language learning applications:

instructions: `## Language

### Explanations
Use English when explaining grammar, vocabulary, or cultural context.

### Conversation
Speak in French when conducting practice, giving examples, or engaging in dialogue.`

Dialect control:

instructions: `## Language

Respond only in Argentine Spanish.`

5. Provide Example Phrases and Flow Snippets

The model learns style from examples. Provide short, varied examples for common conversation moments. For example, you can provide the model with a high-level conversation flow shape:

Greeting → Discover → Verify → Diagnose → Resolve → Confirm/Close. Advance only when criteria in each phase are met.

Then provide prompt guidance for each part. For example, here’s how to guide the greeting section:

instructions: `## Conversation flow — Greeting

Goal: Set tone and invite the reason for calling.

How to respond:
- Identify as ACME Internet Support.
- Keep it brief; invite the caller's goal.

Sample phrases (vary, don't always reuse):
- "Thanks for calling ACME Internet—how can I help today?"
- "You've reached ACME Support. What's going on with your service?"
- "Hi there—tell me what you'd like help with."

Exit when: Caller states an initial goal or symptom.`
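If your application also tracks the conversation phase, the "advance only when criteria are met" rule can be sketched as a small state machine. The phase names come from the flow above; the `CallState` fields and the exit criteria for phases other than Greeting are illustrative assumptions.

```typescript
// Phases taken from the conversation flow described above.
const PHASES = ["Greeting", "Discover", "Verify", "Diagnose", "Resolve", "Confirm/Close"] as const;
type Phase = (typeof PHASES)[number];

// Hypothetical per-call state; the field names are illustrative assumptions.
interface CallState {
  goalStated: boolean;
  detailsCollected: boolean;
  identityVerified: boolean;
  causeFound: boolean;
  fixApplied: boolean;
}

// Exit criteria per phase. The final phase has none because nothing follows it.
const exitCriteria: Record<Phase, (s: CallState) => boolean> = {
  "Greeting": (s) => s.goalStated,
  "Discover": (s) => s.detailsCollected,
  "Verify": (s) => s.identityVerified,
  "Diagnose": (s) => s.causeFound,
  "Resolve": (s) => s.fixApplied,
  "Confirm/Close": () => false,
};

// Advance by at most one phase, and only when the current phase's criteria are met.
function nextPhase(current: Phase, state: CallState): Phase {
  const index = PHASES.indexOf(current);
  const isLast = index === PHASES.length - 1;
  if (isLast || !exitCriteria[current](state)) {
    return current;
  }
  return PHASES[index + 1];
}

const exampleState: CallState = {
  goalStated: true,
  detailsCollected: false,
  identityVerified: false,
  causeFound: false,
  fixApplied: false,
};

console.log(nextPhase("Greeting", exampleState)); // prints "Discover"
console.log(nextPhase("Discover", exampleState)); // prints "Discover" (criteria not met yet)
```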

6. Avoid Robotic Repetition

If responses sound repetitive or mechanical, include explicit variety instructions. This can sometimes happen when using example phrases.

instructions: `## Variety

- Do not repeat the same sentence twice. Vary your responses so it doesn't sound robotic.`

7. Use Capitalized Text to Emphasize Instructions

As with many LLMs, writing important rules in CAPITAL LETTERS helps the model understand and follow them. It’s also helpful to convert non-text rules (like numeric conditions) to text before capitalizing.

Instead of:

instructions: `## Rules

- If [func.return_value] > 0, respond 1 to the user.`

Use:

instructions: `## Rules

- IF [func.return_value] IS BIGGER THAN 0, RESPOND 1 TO THE USER.`
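The conversion above can be automated for rules kept in a plain list. The following is a minimal sketch under stated assumptions: `emphasizeRule` is a hypothetical helper, `[bracketed]` placeholders are left untouched, and only the "IS BIGGER THAN" wording comes from the example; the other operator wordings are made up for illustration.

```typescript
// Hypothetical mapping from comparison operators to words. "is bigger than"
// matches the example above; the other wordings are assumptions.
const OPERATOR_WORDS: Array<[RegExp, string]> = [
  [/>=/g, "is at least"],
  [/<=/g, "is at most"],
  [/>/g, "is bigger than"],
  [/</g, "is smaller than"],
];

function emphasizeRule(rule: string): string {
  // Split on [bracketed] placeholders so their original casing is preserved.
  return rule
    .split(/(\[[^\]]*\])/)
    .map((part) => {
      if (part.startsWith("[") && part.endsWith("]")) return part;
      let text = part;
      for (const [pattern, words] of OPERATOR_WORDS) {
        text = text.replace(pattern, words);
      }
      return text.toUpperCase();
    })
    .join("");
}

console.log(emphasizeRule("If [func.return_value] > 0, respond 1 to the user."));
// prints "IF [func.return_value] IS BIGGER THAN 0, RESPOND 1 TO THE USER."
```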

8. Help Models Use Tools

How models use tools can change the experience—how much they rely on user confirmation vs. taking action, what they say when making tool calls, what rules they follow for each specific tool, and more. One way to prompt tool usage is to use preambles. Good preambles instruct the model to provide some feedback about what it’s doing before making a tool call, so users always know what’s happening. Example:

instructions: `# Tools

- Before any tool call, say one short line like "I'm checking that now." Then call the tool immediately.`

You can add example phrases to preambles to increase variety and tailor them to your use case. Beyond preambles, the goal is a model that proactively calls the right tools, asks for confirmation before any important write, and keeps the user informed throughout the process.
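One possible way to keep per-tool rules consistent is to generate the Tools section from a small descriptor list. Everything in this sketch is hypothetical: the `ToolGuidance` shape, the tool names `lookup_account` and `update_plan`, and the exact wording of the generated lines. It only builds prompt text and does not define or call any real tool.

```typescript
// Hypothetical descriptor used only to generate prompt text for each tool.
interface ToolGuidance {
  name: string;
  preambles: string[]; // short lines to say before calling the tool
  isWrite: boolean;    // true if the call changes data and should be confirmed first
}

function buildToolsSection(tools: ToolGuidance[]): string {
  const lines: string[] = ["# Tools"];
  for (const tool of tools) {
    const phrases = tool.preambles.map((p) => `"${p}"`).join(" | ");
    lines.push(`## ${tool.name}`);
    lines.push(`- Preamble (say one short line, then call the tool immediately): ${phrases}`);
    lines.push(
      tool.isWrite
        ? "- ASK THE USER TO CONFIRM BEFORE CALLING THIS TOOL."
        : "- Call proactively; no confirmation is needed."
    );
  }
  return lines.join("\n");
}

const toolsSection = buildToolsSection([
  {
    name: "lookup_account",
    preambles: ["I'm checking that now.", "Let me look that up."],
    isWrite: false,
  },
  {
    name: "update_plan",
    preambles: ["I'll make that change now."],
    isWrite: true,
  },
]);

console.log(toolsSection);
```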

9. Use LLMs to Improve Your Prompts

LLMs are excellent at finding issues in prompts. Use ChatGPT or the API to have a model review your current realtime prompt and help improve it. Whether or not your prompt is working well, you can run the following prompt to get that review:

const reviewPrompt = `## Role & Objective

You are a **Prompt-Critique Expert**.

Examine a user-supplied LLM prompt and surface any weaknesses following the instructions below.

## Instructions

Review the prompt that is meant for an LLM to follow and identify the following issues:

- Ambiguity: Could any wording be interpreted in more than one way?
- Lacking Definitions: Are there any class labels, terms, or concepts that are not defined that might be misinterpreted by an LLM?
- Conflicting, missing, or vague instructions: Are directions incomplete or contradictory?
- Unstated assumptions: Does the prompt assume the model has to be able to do something that is not explicitly stated?

## Do **NOT** do the following:

- Invent new instructions, tool calls, or external information. You do not know what tools need to be added that are missing.
- List issues that you are not sure about.

## Output Format

# Issues

- Numbered list; include brief quote snippets.

# Improvements

- Numbered list; provide the revised lines you would change and how you would change them.

# Revised Prompt

- Revised prompt where you have applied all your improvements surgically with minimal edits to the original prompt`;

Use this template as a starting point for troubleshooting recurring issues:

const issueReviewPrompt = `Here's my current prompt to an LLM:

[BEGIN OF CURRENT PROMPT]
{CURRENT_PROMPT}
[END OF CURRENT PROMPT]

But I see this issue happening from the LLM:

[BEGIN OF ISSUE]
{ISSUE}
[END OF ISSUE]

Can you provide some variants of the prompt so that the model can better understand the constraints to alleviate the issue?`;

10. Help Users Faster

Two frustrating user experiences are slow, mechanical-sounding voice agents and the inability to escalate. Add speed and escalation instructions to your system prompt to help users faster. In the Personality & Tone section of your system prompt, add pacing instructions so the model speaks faster:

instructions: `# Personality & Tone

## Personality
Friendly, calm and approachable expert customer service assistant.

## Tone
Tone: Warm, concise, confident, never fawning.

## Length
2–3 sentences per turn.

## Pacing
Deliver your audio response fast, but do not sound rushed. Do not modify the content of your response, only increase speaking speed for the same response.`

Real-time voice agents often need a reliable way to escalate to a human. In the Safety & Escalation section, adapt the “when to escalate” conditions to your use case. Here’s an example:

instructions: `# Safety & Escalation

When to escalate (no extra troubleshooting):

- Safety risk (self-harm, threats, harassment)
- User explicitly asks for a human
- Severe dissatisfaction (e.g., "extremely frustrated," repeated complaints, profanity)
- **2** failed tool attempts on the same task **or** **3** consecutive no-match/no-input events
- Out-of-scope or restricted (e.g., real-time news, financial/legal/medical advice)

What to say when calling the escalate_to_human tool (MANDATORY):

- "Thanks for your patience—**I'm connecting you with a specialist now**."
- Then call the tool: escalate_to_human

Examples that would require escalation:

- "This is the third time the reset didn't work. Just get me a person."
- "I am extremely frustrated!"`
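The prompt above asks the model to decide when to escalate. If the application also counts failures, the same conditions can be mirrored in code as a safeguard. This is a sketch under assumptions: the `EscalationSignals` fields are hypothetical, and only the thresholds (2 failed tool attempts, 3 consecutive no-match/no-input events) come from the example.

```typescript
// Hypothetical signals an application might track per conversation.
interface EscalationSignals {
  safetyRisk: boolean;
  userAskedForHuman: boolean;
  severeDissatisfaction: boolean;
  outOfScope: boolean;
  failedToolAttemptsOnTask: number; // failures on the same task
  consecutiveNoMatchEvents: number; // consecutive no-match/no-input events
}

// Thresholds taken from the escalation rules in the prompt above.
const MAX_FAILED_TOOL_ATTEMPTS = 2;
const MAX_CONSECUTIVE_NO_MATCH = 3;

function shouldEscalate(s: EscalationSignals): boolean {
  return (
    s.safetyRisk ||
    s.userAskedForHuman ||
    s.severeDissatisfaction ||
    s.outOfScope ||
    s.failedToolAttemptsOnTask >= MAX_FAILED_TOOL_ATTEMPTS ||
    s.consecutiveNoMatchEvents >= MAX_CONSECUTIVE_NO_MATCH
  );
}

const calmCall: EscalationSignals = {
  safetyRisk: false,
  userAskedForHuman: false,
  severeDissatisfaction: false,
  outOfScope: false,
  failedToolAttemptsOnTask: 1,
  consecutiveNoMatchEvents: 0,
};

console.log(shouldEscalate(calmCall)); // prints false
console.log(shouldEscalate({ ...calmCall, failedToolAttemptsOnTask: 2 })); // prints true
```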

Complete Prompt Template

Here’s a complete template combining all these techniques:

instructions: `# Role & Objective
You are a [ROLE] helping users with [OBJECTIVE]. Success means [DEFINE SUCCESS].

# Personality & Tone
## Personality
[Personality traits]

## Tone
Tone: [Tone description]

## Length
2–3 sentences per turn.

## Pacing
Deliver your audio response fast, but do not sound rushed.

# Language
Language matching: Respond in the same language as the user unless directed otherwise.

# Unclear Audio
- Only respond to clear audio or text.
- If audio is unclear/partial/noisy/silent, ask for clarification in {preferred_language}.
- Continue in the same language as the user if intelligible.

# Tools
- Before any tool call, say one short line like "I'm checking that now." Then call the tool immediately.

# Instructions / Rules
- [Your specific rules here]
- Use CAPITALIZATION for critical rules: IF [condition], THEN [action].

# Conversation Flow
[Define your conversation flow phases and transitions]

# Safety & Escalation
When to escalate:
- Safety risk
- User explicitly asks for a human
- 2 failed tool attempts OR 3 consecutive no-match events
- Out-of-scope requests

What to say when escalating:
- "Thanks for your patience—I'm connecting you with a specialist now."
- Then call: escalate_to_human

# Variety
- Do not repeat the same sentence twice. Vary your responses so it doesn't sound robotic.`
