Fix console bug #477

Open · wants to merge 2 commits into base: main
2 changes: 1 addition & 1 deletion README.md
@@ -124,7 +124,7 @@ const client = new RealtimeClient({ apiKey: process.env.OPENAI_API_KEY });

// Can set parameters ahead of connecting
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'alloy' });
client.updateSession({ voice: 'onyx' });
client.updateSession({ turn_detection: 'server_vad' });
client.updateSession({ input_audio_transcription: { model: 'whisper-1' } });

Binary file added bun.lockb
Binary file not shown.
4 changes: 4 additions & 0 deletions relay-server/lib/relay.js
@@ -34,6 +34,10 @@ export class RealtimeRelay {
this.log(`Connecting with key "${this.apiKey.slice(0, 3)}..."`);
const client = new RealtimeClient({ apiKey: this.apiKey });

// Can set parameters ahead of connecting
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'onyx' });

// Relay: OpenAI Realtime API Event -> Browser Event
client.realtime.on('server.*', (event) => {
this.log(`Relaying "${event.type}" to Client`);
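The two `updateSession` calls added to the relay stage session parameters before the client connects, and each call merges its fields into the pending session config. A minimal sketch of that accumulation, using `FakeClient` as a hypothetical stand-in for `RealtimeClient` (the real library's internals may differ):

```javascript
// FakeClient is a hypothetical stand-in for RealtimeClient, used only to
// illustrate how successive updateSession calls accumulate into one config.
class FakeClient {
  constructor() {
    this.sessionConfig = {};
  }
  updateSession(partial) {
    // Each call shallow-merges new fields over the previous ones.
    Object.assign(this.sessionConfig, partial);
  }
}

const client = new FakeClient();
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'onyx' });
console.log(client.sessionConfig.voice); // → onyx
```

Because the calls merge rather than replace, instructions set in one call survive a later voice-only call.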
90 changes: 75 additions & 15 deletions src/pages/ConsolePage.tsx
@@ -16,7 +16,7 @@ import { useEffect, useRef, useCallback, useState } from 'react';
import { RealtimeClient } from '@openai/realtime-api-beta';
import { ItemType } from '@openai/realtime-api-beta/dist/lib/client.js';
import { WavRecorder, WavStreamPlayer } from '../lib/wavtools/index.js';
import { instructions } from '../utils/conversation_config.js';
import { instructions, userInfo } from '../utils/conversation_config.js';
import { WavRenderer } from '../utils/wav_renderer';

import { X, Edit, Zap, ArrowUp, ArrowDown } from 'react-feather';
@@ -44,6 +44,8 @@ interface Coordinates {
};
}

type Voice = 'coral' | 'alloy' | 'echo' | 'shimmer' | 'ash' | 'ballad' | 'sage' | 'verse';

/**
* Type for all event logs
*/
@@ -62,8 +64,8 @@ export function ConsolePage() {
const apiKey = LOCAL_RELAY_SERVER_URL
? ''
: localStorage.getItem('tmp::voice_api_key') ||
prompt('OpenAI API Key') ||
'';
prompt('OpenAI API Key') ||
'';
if (apiKey !== '') {
localStorage.setItem('tmp::voice_api_key', apiKey);
}
@@ -85,9 +87,9 @@ LOCAL_RELAY_SERVER_URL
LOCAL_RELAY_SERVER_URL
? { url: LOCAL_RELAY_SERVER_URL }
: {
apiKey: apiKey,
dangerouslyAllowAPIKeyInBrowser: true,
}
apiKey: apiKey,
dangerouslyAllowAPIKeyInBrowser: true,
}
)
);

@@ -124,6 +126,28 @@ export function ConsolePage() {
lng: -122.418137,
});
const [marker, setMarker] = useState<Coordinates | null>(null);
const [selectedVoice, setSelectedVoice] = useState<Voice>('ash');

const voices = ['coral', 'alloy', 'echo', 'shimmer', 'ash', 'ballad', 'sage', 'verse'];

const handleVoiceChange = async (e: React.ChangeEvent<HTMLSelectElement>) => {
const voice = e.target.value as Voice;
setSelectedVoice(voice);

// If connected, disconnect first
if (isConnected) {
await disconnectConversation();
}

// Update voice and reconnect
const client = clientRef.current;
try {
client?.updateSession({ voice });
await connectConversation();
} catch (error) {
console.error('Error updating voice:', error);
}
};

/**
* Utility for formatting the timing of logs
@@ -237,7 +261,17 @@ export function ConsolePage() {
const { trackId, offset } = trackSampleOffset;
await client.cancelResponse(trackId, offset);
}
await wavRecorder.record((data) => client.appendInputAudio(data.mono));

try {
// Check if already recording
if (wavRecorder.getStatus() === 'recording') {
await wavRecorder.pause();
}
// Now safe to start recording
await wavRecorder.record((data) => client.appendInputAudio(data.mono));
} catch (error) {
console.error('Recording error:', error);
}
};

/**
@@ -247,7 +281,15 @@ export function ConsolePage() {
setIsRecording(false);
const client = clientRef.current;
const wavRecorder = wavRecorderRef.current;
await wavRecorder.pause();
try {
// Check if actually recording before trying to pause
if (wavRecorder.getStatus() === 'recording') {
await wavRecorder.pause();
}
// If already paused, do nothing
} catch (error) {
console.error('Recording stop error:', error);
}
client.createResponse();
};

@@ -377,7 +419,8 @@ export function ConsolePage() {
const client = clientRef.current;

// Set instructions
client.updateSession({ instructions: instructions });
client.updateSession({ instructions: instructions + userInfo });
client.updateSession({ voice: selectedVoice });
// Set transcription, otherwise we don't get user transcriptions back
client.updateSession({ input_audio_transcription: { model: 'whisper-1' } });

@@ -457,6 +500,7 @@ export function ConsolePage() {

// handle realtime events from client + server for event logging
client.on('realtime.event', (realtimeEvent: RealtimeEvent) => {
console.log('realtimeEvent', realtimeEvent);
setRealtimeEvents((realtimeEvents) => {
const lastEvent = realtimeEvents[realtimeEvents.length - 1];
if (lastEvent?.event.type === realtimeEvent.event.type) {
@@ -565,11 +609,10 @@
}}
>
<div
className={`event-source ${
event.type === 'error'
? 'error'
: realtimeEvent.source
}`}
className={`event-source ${event.type === 'error'
? 'error'
: realtimeEvent.source
}`}
>
{realtimeEvent.source === 'client' ? (
<ArrowUp />
@@ -586,6 +629,11 @@
{event.type}
{count && ` (${count})`}
</div>
{event.type === 'error' && (
<div className="event-error">
{JSON.stringify(event.error.message)}
</div>
)}
</div>
{!!expandedEvents[event.event_id] && (
<div className="event-payload">
@@ -639,7 +687,7 @@
(conversationItem.formatted.audio?.length
? '(awaiting transcript)'
: conversationItem.formatted.text ||
'(item sent)')}
'(item sent)')}
</div>
)}
{!conversationItem.formatted.tool &&
@@ -724,6 +772,18 @@
{JSON.stringify(memoryKv, null, 2)}
</div>
</div>
<div className="content-block kv">
<div className="content-block-title">set_voice()</div>
<div className="content-block-body content-kv">
<select value={selectedVoice} onChange={handleVoiceChange}>
{voices.map(voice => (
<option key={voice} value={voice}>
{voice}
</option>
))}
</select>
</div>
</div>
</div>
</div>
</div>
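The `handleVoiceChange` handler added in this diff tears down any live session, stages the new voice via `updateSession`, and reconnects. A sketch of that disconnect → update → reconnect sequence; `switchVoice`, `state`, and the fakes below are illustrative stand-ins, not the console's real objects:

```javascript
// Illustrative stand-in for the handleVoiceChange flow in ConsolePage.tsx.
async function switchVoice(client, voice, state) {
  if (state.isConnected) {
    await state.disconnect(); // drop the live session first
    state.isConnected = false;
  }
  client.updateSession({ voice }); // stage the new voice
  await state.connect(); // reconnect with the updated session
  state.isConnected = true;
  return voice;
}

// Minimal fakes to exercise the ordering.
const calls = [];
const fakeClient = { updateSession: (p) => calls.push('update:' + p.voice) };
const state = {
  isConnected: true,
  connect: async () => calls.push('connect'),
  disconnect: async () => calls.push('disconnect'),
};

switchVoice(fakeClient, 'ash', state).then(() => {
  console.log(calls); // → [ 'disconnect', 'update:ash', 'connect' ]
});
```

The ordering matters: updating the voice while a session is live has no effect on the current response stream, so the handler reconnects to apply it.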
70 changes: 55 additions & 15 deletions src/utils/conversation_config.js
@@ -1,16 +1,56 @@
export const instructions = `System settings:
Tool use: enabled.

Instructions:
- You are an artificial intelligence agent responsible for helping test realtime voice capabilities
- Please make sure to respond with a helpful voice via audio
- Be kind, helpful, and curteous
- It is okay to ask the user questions
- Use tools and functions you have available liberally, it is part of the training apparatus
- Be open to exploration and conversation
- Remember: this is just for fun and testing!

Personality:
- Be upbeat and genuine
- Try speaking quickly as if excited
export const instructions = `
// Core Identity
- You are Boardy, an empathetic professional coach who guides conversations purposefully
- Lead with genuine curiosity and warmth, not artificial enthusiasm
- Never use formulaic responses like "That's great!" or "How wonderful!"

// Initial Setup
1. Gather essential info: [name] and [career]
2. If missing, ask directly: "What's your name?" / "What do you do?"
3. Open with: "Hi [name], I'd love to hear how you got to where you are in [career]. What's your story?"

// Conversation Flow
Focus on three key areas, steering naturally towards:
1. Journey ("How did you get here?")
2. Present ("What are you focused on now? Who would you like to meet?")
3. Dating Styles ("What are you looking for in a relationship?")

// Response Structure
Each response should:
1. Show understanding through brief, specific reflection
2. Acknowledge their perspective without judgment
3. Bridge naturally to your target topic
4. Ask an open question

// Response Style
- Keep responses grounded and authentic
- Avoid overly enthusiastic praise or reactions
- Focus on understanding vs evaluating
- Use natural transitions, not forced pivots

Examples:

Personal Story -> Business Focus:
User: "I spent 5 years in Tokyo before moving to NYC. The culture shock was intense but..."
Boardy: "Moving between Tokyo and NYC brings so many adjustments. Given those experiences with different business cultures, what's your main focus now?"

Life Update -> Professional Challenges:
User: "Just moved to a new house and trying to settle in..."
Boardy: "Moves take up a lot of mental space. How's this transition affecting your work priorities?"

Off-Topic -> Career Journey:
User: "Been really into photography lately..."
Boardy: "The perspective shift photography brings can be illuminating. What drew you to your current field?"

// Anti-patterns to avoid:
❌ "That's fantastic that you're into photography!"
❌ "How wonderful that you moved!"
❌ "It's so great that you're exploring new things!"

✅ Instead, acknowledge through specificity and curiosity
`;

export const userInfo = `
<userInfo>
</userInfo>
`
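The new `userInfo` export is an empty `<userInfo>` block, and the console concatenates it onto the base prompt (`instructions + userInfo`) when the session starts. A sketch of how an app might populate that block before connecting; the `fillUserInfo` helper and its fields are assumptions for illustration, not part of this PR:

```javascript
// fillUserInfo is a hypothetical helper: it rebuilds the <userInfo> block
// from whatever profile fields the app has gathered at runtime.
function fillUserInfo(fields) {
  const lines = Object.entries(fields)
    .map(([key, value]) => `${key}: ${value}`)
    .join('\n');
  return `\n<userInfo>\n${lines}\n</userInfo>\n`;
}

const instructions = '// Core Identity ...'; // stand-in for the exported prompt
const userInfo = fillUserInfo({ name: 'Ada', career: 'engineer' });

// Mirrors client.updateSession({ instructions: instructions + userInfo })
const combined = instructions + userInfo;
console.log(combined.includes('<userInfo>')); // → true
```

Keeping the user data in a delimited block like this lets the model distinguish per-user context from the fixed coaching instructions.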