diff --git a/gemini/multimodal-live-api/websocket-demo-app/README.md b/gemini/multimodal-live-api/websocket-demo-app/README.md
index 35cd87f4ede..8401e2b213f 100644
--- a/gemini/multimodal-live-api/websocket-demo-app/README.md
+++ b/gemini/multimodal-live-api/websocket-demo-app/README.md
@@ -11,14 +11,20 @@ The [Multimodal Live API](https://cloud.google.com/vertex-ai/generative-ai/docs/
 
 ## Pre-requisites
 
-Some web development experience is required to follow this tutorial, especially working with localhost, understanding port numbers, and the difference between websockets and http requests.
+Some web development experience, particularly with localhost, port numbers, and the difference between WebSockets and HTTP requests, is helpful but not required; this tutorial provides guidance along the way so you can follow each step successfully.
 
 ### File Structure
 
-- main.py: The Python backend code
-- index.html: The frontend HTML+JS+CSS app
-- pcm-processor.js: Script for processing audio
-- requirements.txt: Lists the required Python dependencies
+
+- backend/main.py: The Python backend code
+- backend/requirements.txt: Lists the required Python dependencies
+
+- frontend/index.html: The frontend HTML app
+- frontend/script.js: Main frontend JavaScript code
+- frontend/gemini-live-api.js: Script for interacting with the Gemini API
+- frontend/live-media-manager.js: Script for handling media input and output
+- frontend/pcm-processor.js: Script for processing PCM audio
+- frontend/cookieJar.js: Script for managing cookies
 
 ![Demo](https://storage.googleapis.com/cloud-samples-data/generative-ai/image/demo-UI.png)
 
@@ -32,38 +38,39 @@ You can set up this app locally or via Cloud Shell.
 
 ```sh
 git clone https://github.com/GoogleCloudPlatform/generative-ai.git
-cd gemini/multimodal-live-api/websocket-demo-app
+cd generative-ai/gemini/multimodal-live-api/websocket-demo-app
 ```
 
-1. Create a new virtual environment and activate it:
+2. Create a new virtual environment and activate it:
 
 ```sh
 python3 -m venv env
 source env/bin/activate
 ```
 
-1. Install dependencies:
+3. Install dependencies:
 
 ```sh
-pip3 install -r requirements.txt
+pip3 install -r backend/requirements.txt
 ```
 
-1. Start the Python WebSocket server:
+4. Start the Python WebSocket server:
 
 ```sh
-python3 main.py
+python3 backend/main.py
 ```
 
-1. Start the frontend:
+5. Start the frontend:
 
 Make sure to open a **new** terminal window to run this command. Keep the backend server running in the first terminal.
 
 ```sh
+cd frontend
 python3 -m http.server
 ```
 
-1. Point your browser to the demo app UI based on the output of the terminal. (E.g., it may be http://localhost:8000, or it may use a different port.)
+6. Point your browser to the demo app UI based on the output of the terminal. (E.g., it may be http://localhost:8000, or it may use a different port.)
 
-1. Get your Google Cloud access token:
+7. Get your Google Cloud access token:
 
 Run the following command in a terminal with gcloud installed to set your project, and to retrieve your access token.
 
@@ -71,16 +78,16 @@
 gcloud config set project YOUR-PROJECT-ID
 gcloud auth print-access-token
 ```
 
-1. Copy the access token from the previous step into the UI that you have open in your browser.
+8. Copy the access token from the previous step into the UI that you have open in your browser.
 
-1. Enter the model ID in the UI:
+9. Enter the model ID in the UI:
 
 Replace `YOUR-PROJECT-ID` in the input with your credentials
 
-1. Connect and interact with the demo:
+10. Connect and interact with the demo:
 
 - After entering your Access Token and Model ID, press the connect button to connect your web app. Now you should be able to interact with Gemini 2.0 with the Multimodal Live API.
 
-1. To interact with the app, you can do the following:
+11. To interact with the app, you can do the following:
 
 - Text input: You can write a text prompt to send to the model by entering your message in the box and pressing the send arrow. The model will then respond via audio (turn up your volume!).
 - Voice input: Press the pink microphone button and start speaking. The model will respond via audio. If you would like to mute your microphone, press the button with a slash through the microphone.
 
@@ -90,52 +97,54 @@ gcloud auth print-access-token
 
 1. Open [Cloud Shell](https://cloud.google.com/shell/docs/editor-overview)
 
-1. Upload `main.py`, `index.html`, `pcm-processor.js`, and `requirements.txt` to your Cloud Shell Editor project. Alternatively, you can clone the repository and cd into the correct directory:
+2. Upload the frontend and backend folders to your Cloud Shell Editor project. Alternatively, you can clone the repository and cd into the correct directory:
 
 ```sh
 git clone https://github.com/GoogleCloudPlatform/generative-ai.git
-cd gemini/multimodal-live-api/websocket-demo-app
+cd generative-ai/gemini/multimodal-live-api/websocket-demo-app
 ```
 
-1. Open two new terminal windows.
-1. Navigate to whichever folder in Cloud Shell you uploaded the code files to (i.e., using `cd your_folder_name`)
+3. Open two new terminal windows.
+4. Navigate to whichever folder in Cloud Shell you uploaded the code files to (i.e., using `cd your_folder_name`)
 
-1. Install dependencies: In one of the terminal windows run:
+5. Install dependencies: In one of the terminal windows run:
 
 ```sh
-pip3 install -r requirements.txt
+pip3 install -r backend/requirements.txt
 ```
 
-1. Start the Python WebSocket server in one terminal.
+6. Start the Python WebSocket server in one terminal.
 
 ```sh
-python3 main.py
+python3 backend/main.py
 ```
 
-1. In order for index.html to work properly, you will need to update the app URL inside index.html to point to the correct proxy server URL you just set up in the previous step. To do so:
+7. For the frontend to connect to your backend, you will need to update the proxy server URL inside script.js to point to the proxy server you just set up in the previous step. To do so:
 
 - Click on Web Preview (to the right of the Open Terminal button near the top)
 - Click "Preview on port 8080" (the port where you've setup the proxy server in the previous step)
 - Copy the URL, but make sure to discard everything at the end after "cloudshell.dev/"
-- Navigate to `const URL = "ws://localhost:8080";` in `index.html` on line 116
-- Replace `ws://localhost:8080` with `wss://[THE_URL_YOU_COPIED_WITHOUT_HTTP]`. For example, it should look like: `const URL = "wss://8080-cs-123456789-default.cs-us-central1-abcd.cloudshell.dev";`
-- save the changes you've made to index.html
+- Navigate to `const PROXY_URL = "wss://your websocket server";` in `script.js`
+- Replace `wss://your websocket server` with `wss://[THE_URL_YOU_COPIED_WITHOUT_HTTP]`. For example, it should look like: `const PROXY_URL = "wss://8080-cs-123456789-default.cs-us-central1-abcd.cloudshell.dev";`
+- Save the changes you've made to script.js
 
-1. Start the frontend:
+8. Start the frontend:
 
 In the second terminal window, run the command below. Keep the backend server running in the first terminal. (Make sure you have navigated to the folder containing the code files, i.e. using `cd your_folder_name`)
 
 ```sh
+cd frontend
 python3 -m http.server
 ```
 
-1. Test the demo app:
+9. Test the demo app:
 
 - Navigate to the Web Preview button again
 - Click on "Change port"
 - Change Preview Port to 8000, and then click on "Change and Preview". This should open up a new tab with the UI.
 
-1. Going back to the tab with the Cloud Shell Editor, connect to the application by running the following command in a new terminal window:
+10. Going back to the tab with the Cloud Shell Editor, connect to the application by running the following command in a new terminal window:
 
 ```sh
 gcloud config set project YOUR-PROJECT-ID
 gcloud auth print-access-token
 ```
 
@@ -147,7 +156,7 @@
 For example, it should look like: `projects/my-project-id/locations/us-central1/publishers/google/models/gemini-2.0-flash-exp`
 
 - Press the "Connect" button. Now you should be able to interact with Gemini 2.0 with the Multimodal Live API.
 
-1. To interact with the app, you can do the following:
+11. To interact with the app, you can do the following:
 
 - Text input: You can write a text prompt to send to the model by entering your message in the box and pressing the send arrow. The model will then respond via audio (turn up your volume!).
 - Voice input: Press the pink microphone button and start speaking. The model will respond via audio. If you would like to mute your microphone, press the button with a slash through the microphone.
 
diff --git a/gemini/multimodal-live-api/websocket-demo-app/main.py b/gemini/multimodal-live-api/websocket-demo-app/backend/main.py
similarity index 100%
rename from gemini/multimodal-live-api/websocket-demo-app/main.py
rename to gemini/multimodal-live-api/websocket-demo-app/backend/main.py
diff --git a/gemini/multimodal-live-api/websocket-demo-app/requirements.txt b/gemini/multimodal-live-api/websocket-demo-app/backend/requirements.txt
similarity index 100%
rename from gemini/multimodal-live-api/websocket-demo-app/requirements.txt
rename to gemini/multimodal-live-api/websocket-demo-app/backend/requirements.txt
diff --git a/gemini/multimodal-live-api/websocket-demo-app/frontend/cookieJar.js b/gemini/multimodal-live-api/websocket-demo-app/frontend/cookieJar.js
new file mode 100644
index 00000000000..9aacc8ab573
--- /dev/null
+++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/cookieJar.js
@@ -0,0 +1,65 @@
+class CookieJar {
+    /**
+     * @class CookieJar
+     * @classdesc A utility class for managing cookies
+     * associated with HTML input elements.
+     */
+
+    /**
+     * @static
+     * @method init
+     * @memberof CookieJar
+     * @description Initializes the CookieJar for a given element. Loads saved value from cookie and sets up auto-saving on value change.
+     * @param {string} elementId - The ID of the HTML input element.
+     */
+    static init(elementId) {
+        const element = document.getElementById(elementId);
+        if (!element) {
+            console.error(`❌ Element with ID '${elementId}' not found.`);
+            return;
+        }
+
+        const cookieName = `CookieJar_${elementId}`;
+
+        // Load existing value from cookie
+        const savedValue = CookieJar.getCookie(cookieName);
+        if (savedValue) {
+            console.log(`🍪 Found cookie for ${elementId}. Loading value: ${savedValue}`);
+            element.value = savedValue;
+        }
+
+        // Save on value change
+        element.addEventListener('input', () => {
+            console.log(`🍪 Saving value for ${elementId} to cookie...`);
+            CookieJar.setCookie(cookieName, element.value);
+        });
+    }
+
+    /**
+     * @static
+     * @method setCookie
+     * @memberof CookieJar
+     * @description Sets a cookie with the given name, value, and optional expiration days.
+     * @param {string} name - The name of the cookie.
+     * @param {string} value - The value to store in the cookie.
+     * @param {number} [days=365] - The number of days until the cookie expires. Defaults to 365.
+     */
+    static setCookie(name, value, days = 365) {
+        const expires = new Date();
+        expires.setTime(expires.getTime() + days * 24 * 60 * 60 * 1000);
+        document.cookie = `${name}=${encodeURIComponent(value)};expires=${expires.toUTCString()};path=/`;
+        console.log(`🍪 Cookie '${name}' set successfully!`);
+    }
+
+    /**
+     * @static
+     * @method getCookie
+     * @memberof CookieJar
+     * @description Retrieves the value of a cookie with the given name.
+     * @param {string} name - The name of the cookie to retrieve.
+     * @returns {string|null} The value of the cookie if found, otherwise null.
+     */
+    static getCookie(name) {
+        const cookieValue = document.cookie.match(`(^|;)\\s*${name}\\s*=\\s*([^;]+)`);
+        return cookieValue ? decodeURIComponent(cookieValue.pop()) : null;
+    }
+}
\ No newline at end of file
diff --git a/gemini/multimodal-live-api/websocket-demo-app/frontend/gemini-live-api.js b/gemini/multimodal-live-api/websocket-demo-app/frontend/gemini-live-api.js
new file mode 100644
index 00000000000..26e00afec83
--- /dev/null
+++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/gemini-live-api.js
@@ -0,0 +1,175 @@
+class GeminiLiveResponseMessage {
+    constructor(data) {
+        this.data = "";
+        this.type = "";
+        this.endOfTurn = data?.serverContent?.turnComplete;
+
+        // parts is undefined for setup-complete messages (no modelTurn),
+        // so guard every access with optional chaining.
+        const parts = data?.serverContent?.modelTurn?.parts;
+
+        if (parts?.length && parts[0].text) {
+            this.data = parts[0].text;
+            this.type = "TEXT";
+        } else if (parts?.length && parts[0].inlineData) {
+            this.data = parts[0].inlineData.data;
+            this.type = "AUDIO";
+        } else if (data?.setupComplete) {
+            this.type = "SETUP COMPLETE";
+        }
+    }
+}
+
+class GeminiLiveAPI {
+    constructor(proxyUrl, projectId, model, apiHost) {
+        this.proxyUrl = proxyUrl;
+
+        this.projectId = projectId;
+        this.model = model;
+        this.modelUri = `projects/${this.projectId}/locations/us-central1/publishers/google/models/${this.model}`;
+
+        this.responseModalities = ["AUDIO"];
+        this.systemInstructions = "";
+
+        this.apiHost = apiHost;
+        this.serviceUrl = `wss://${this.apiHost}/ws/google.cloud.aiplatform.v1beta1.LlmBidiService/BidiGenerateContent`;
+
+        this.onReceiveResponse = (message) => {
+            console.log("Default message received callback", message);
+        };
+
+        this.onConnectionStarted = () => {
+            console.log("Default onConnectionStarted");
+        };
+
+        this.onErrorMessage = (message) => {
+            alert(message);
+        };
+
+        this.accessToken = "";
+        this.webSocket = null;
+
+        console.log("Created Gemini Live API object: ", this);
+    }
+
+    setProjectId(projectId) {
+        this.projectId = projectId;
+        this.modelUri =
+            `projects/${this.projectId}/locations/us-central1/publishers/google/models/${this.model}`;
+    }
+
+    setAccessToken(newAccessToken) {
+        console.log("setting access token: ", newAccessToken);
+        this.accessToken = newAccessToken;
+    }
+
+    connect(accessToken) {
+        this.setAccessToken(accessToken);
+        this.setupWebSocketToService();
+    }
+
+    disconnect() {
+        this.webSocket.close();
+    }
+
+    sendMessage(message) {
+        this.webSocket.send(JSON.stringify(message));
+    }
+
+    onReceiveMessage(messageEvent) {
+        console.log("Message received: ", messageEvent);
+        const messageData = JSON.parse(messageEvent.data);
+        const message = new GeminiLiveResponseMessage(messageData);
+        console.log("onReceiveMessageCallBack this ", this);
+        this.onReceiveResponse(message);
+    }
+
+    setupWebSocketToService() {
+        console.log("connecting: ", this.proxyUrl);
+
+        this.webSocket = new WebSocket(this.proxyUrl);
+
+        this.webSocket.onclose = (event) => {
+            console.log("websocket closed: ", event);
+            this.onErrorMessage("Connection closed");
+        };
+
+        this.webSocket.onerror = (event) => {
+            console.log("websocket error: ", event);
+            this.onErrorMessage("Connection error");
+        };
+
+        this.webSocket.onopen = (event) => {
+            console.log("websocket open: ", event);
+            this.sendInitialSetupMessages();
+            this.onConnectionStarted();
+        };
+
+        this.webSocket.onmessage = this.onReceiveMessage.bind(this);
+    }
+
+    sendInitialSetupMessages() {
+        const serviceSetupMessage = {
+            bearer_token: this.accessToken,
+            service_url: this.serviceUrl,
+        };
+        this.sendMessage(serviceSetupMessage);
+
+        const sessionSetupMessage = {
+            setup: {
+                model: this.modelUri,
+                generation_config: { response_modalities: this.responseModalities },
+                system_instruction: { parts: [{ text: this.systemInstructions }] },
+            },
+        };
+        this.sendMessage(sessionSetupMessage);
+    }
+
+    sendTextMessage(text) {
+        const textMessage = {
+            client_content: {
+                turns: [
+                    {
+                        role: "user",
+                        parts: [{ text: text }],
+                    },
+                ],
+                turn_complete: true,
+            },
+        };
+        this.sendMessage(textMessage);
+    }
+
+    sendRealtimeInputMessage(data, mime_type) {
+        const message = {
+            realtime_input: {
+                media_chunks: [
+                    {
+                        mime_type: mime_type,
+                        data: data,
+                    },
+                ],
+            },
+        };
+        this.sendMessage(message);
+    }
+
+    sendAudioMessage(base64PCM) {
+        this.sendRealtimeInputMessage(base64PCM, "audio/pcm");
+    }
+
+    sendImageMessage(base64Image, mime_type = "image/jpeg") {
+        this.sendRealtimeInputMessage(base64Image, mime_type);
+    }
+}
+
+console.log("loaded gemini-live-api.js")
\ No newline at end of file
diff --git a/gemini/multimodal-live-api/websocket-demo-app/frontend/index.html b/gemini/multimodal-live-api/websocket-demo-app/frontend/index.html
new file mode 100644
index 00000000000..98149869608
--- /dev/null
+++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/index.html
@@ -0,0 +1,204 @@
+<head>
+    <link
+        href="https://fonts.googleapis.com/css2?family=Roboto:wght@400;500;700&display=swap"
+        rel="stylesheet"
+    />
+    <link
+        rel="stylesheet"
+        href="https://fonts.googleapis.com/css2?family=Material+Symbols+Outlined:opsz,wght,FILL,GRAD@24,400,0,0"
+    />
+    <script type="importmap">
+        {
+            "imports": {
+                "@material/web/": "https://esm.run/@material/web/"
+            }
+        }
+    </script>
+    <script type="module">
+        import "@material/web/all.js";
+        import { styles as typescaleStyles } from "@material/web/typography/md-typescale-styles.js";
+
+        document.adoptedStyleSheets.push(typescaleStyles.styleSheet);
+    </script>
+
+    <script src="gemini-live-api.js"></script>
+    <script src="live-media-manager.js"></script>
+    <script src="cookieJar.js"></script>
+
+    <link rel="stylesheet" href="styles.css" />
+    <script defer src="script.js"></script>
+</head>
+<body>
+    <h1 class="md-typescale-display-medium">Multimodal Live API</h1>
+    <p class="md-typescale-body-medium">
+        The Multimodal Live API enables low-latency, two-way interactions that use
+        text, audio, and video input, with audio and text output.
+    </p>
+    <br />
+
+    <div id="model-config-container">
+        <div>
+            <md-outlined-text-field
+                id="token"
+                label="Access Token"
+                value=""
+                type="password"
+            ></md-outlined-text-field>
+            <br />
+            <br />
+            <md-outlined-text-field
+                id="project"
+                label="Project ID"
+                value=""
+            ></md-outlined-text-field>
+        </div>
+
+        <div class="modality-container">
+            <p class="md-typescale-body-medium" style="margin-top: 0">
+                Model response type
+            </p>
+
+            <form>
+                <md-radio
+                    id="audio-radio"
+                    name="responseModality"
+                    value="AUDIO"
+                    checked="checked"
+                ></md-radio>
+                <label for="audio-radio">
+                    <span class="material-symbols-outlined"> volume_up </span>
+                    <span class="icon-text">Audio</span>
+                </label>
+
+                <br /><br />
+
+                <md-radio
+                    id="text-radio"
+                    name="responseModality"
+                    value="TEXT"
+                ></md-radio>
+                <label for="text-radio">
+                    <span class="material-symbols-outlined"> text_fields </span>
+                    <span class="icon-text">Text</span>
+                </label>
+            </form>
+        </div>
+
+        <md-outlined-text-field
+            id="systemInstructions"
+            type="textarea"
+            label="System Instructions"
+            rows="3"
+        >
+        </md-outlined-text-field>
+
+        <div>
+            <br />
+            <md-outlined-button onclick="connectBtnClick()"
+                >Connect</md-outlined-button
+            >
+            <br /><br />
+            <md-outlined-button onclick="disconnectBtnClick()"
+                >Disconnect</md-outlined-button
+            >
+        </div>
+    </div>
+    <br />
+
+    <div id="model-state">
+        <div id="disconnected" class="state">
+            <span class="material-symbols-outlined"> cloud_off </span>
+            <span class="icon-text">disconnected</span>
+        </div>
+        <div id="connecting" class="state" hidden>
+            <span class="material-symbols-outlined"> hourglass_empty </span>
+            <span class="icon-text">connecting...</span>
+        </div>
+        <div id="connected" class="state" hidden>
+            <span class="material-symbols-outlined"> cloud_done </span>
+            <span class="icon-text">connected</span>
+        </div>
+        <div id="speaking" class="state" hidden>
+            <span class="material-symbols-outlined"> graphic_eq </span>
+            <span class="icon-text">model speaking</span>
+        </div>
+    </div>
+
+    <br />
+
+    <div id="user-input-container">
+        <div>
+            <div>
+                <md-outlined-select
+                    id="cameraSource"
+                    label="Camera Input"
+                    onchange="newCameraSelected()"
+                >
+                </md-outlined-select>
+            </div>
+
+            <br />
+
+            <div>
+                <md-outlined-select
+                    id="audioSource"
+                    label="Microphone Input"
+                    onchange="newMicSelected()"
+                >
+                </md-outlined-select>
+            </div>
+
+            <br />
+            <div class="spread">
+                <span id="micBtn">
+                    <md-filled-icon-button onclick="micBtnClick()">
+                        <md-icon>mic</md-icon>
+                    </md-filled-icon-button>
+                </span>
+
+                <span id="micOffBtn" hidden>
+                    <md-filled-icon-button onclick="micOffBtnClick()">
+                        <md-icon>mic_off</md-icon>
+                    </md-filled-icon-button>
+                </span>
+
+                <span id="cameraBtn">
+                    <md-filled-icon-button onclick="cameraBtnClick()">
+                        <md-icon>videocam</md-icon>
+                    </md-filled-icon-button>
+                </span>
+
+                <span id="screenBtn">
+                    <md-filled-icon-button onclick="screenShareBtnClick()">
+                        <md-icon>present_to_all</md-icon>
+                    </md-filled-icon-button>
+                </span>
+            </div>
+        </div>
+
+        <div>
+            <div id="video-preview">
+                <video id="video" autoplay playsinline muted></video>
+                <canvas id="canvas"></canvas>
+            </div>
+        </div>
+        <div>
+            <md-outlined-text-field
+                id="text-message"
+                label="Text Message"
+                value=""
+            ></md-outlined-text-field>
+            <md-icon-button onclick="newUserMessage()">
+                <md-icon>send</md-icon>
+            </md-icon-button>
+            <br />
+            <div id="text-chat"></div>
+        </div>
+    </div>
+
+    <md-dialog id="dialog" close>
+        <div slot="content">
+            <span id="dialogMessage">A dialog that is opened by default.</span>
+        </div>
+    </md-dialog>
+</body>
diff --git a/gemini/multimodal-live-api/websocket-demo-app/frontend/live-media-manager.js b/gemini/multimodal-live-api/websocket-demo-app/frontend/live-media-manager.js
new file mode 100644
index 00000000000..beeab2653b4
--- /dev/null
+++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/live-media-manager.js
@@ -0,0 +1,301 @@
+class LiveAudioOutputManager {
+    constructor() {
+        this.audioInputContext = null;
+        this.workletNode = null;
+        this.initialized = false;
+
+        this.audioQueue = [];
+        this.isPlaying = false;
+
+        this.initializeAudioContext();
+    }
+
+    async playAudioChunk(base64AudioChunk) {
+        try {
+            if (!this.initialized) {
+                await this.initializeAudioContext();
+            }
+
+            if (this.audioInputContext.state === "suspended") {
+                await this.audioInputContext.resume();
+            }
+
+            const arrayBuffer = LiveAudioOutputManager.base64ToArrayBuffer(base64AudioChunk);
+            const float32Data = LiveAudioOutputManager.convertPCM16LEToFloat32(arrayBuffer);
+
+            this.workletNode.port.postMessage(float32Data);
+        } catch (error) {
+            console.error("Error processing audio chunk:", error);
+        }
+    }
+
+    async initializeAudioContext() {
+        if (this.initialized) return;
+
+        console.log("initializeAudioContext...");
+
+        this.audioInputContext = new (window.AudioContext ||
+            window.webkitAudioContext)({ sampleRate: 24000 });
+        await this.audioInputContext.audioWorklet.addModule("pcm-processor.js");
+        this.workletNode = new AudioWorkletNode(this.audioInputContext, "pcm-processor");
+        this.workletNode.connect(this.audioInputContext.destination);
+
+        this.initialized = true;
+        console.log("initializeAudioContext end");
+    }
+
+    static base64ToArrayBuffer(base64) {
+        const binaryString = window.atob(base64);
+        const bytes = new Uint8Array(binaryString.length);
+        for (let i = 0; i < binaryString.length; i++) {
+            bytes[i] = binaryString.charCodeAt(i);
+        }
+        return bytes.buffer;
+    }
+
+    static convertPCM16LEToFloat32(pcmData) {
+        const inputArray = new Int16Array(pcmData);
+        const float32Array = new Float32Array(inputArray.length);
+        for (let i = 0; i < inputArray.length; i++) {
+            float32Array[i] = inputArray[i] / 32768;
+        }
+        return float32Array;
+    }
+}
+
+class LiveAudioInputManager {
+    constructor() {
+        this.audioContext = null;
+        this.mediaRecorder = null;
+        this.processor = false;
+        this.pcmData = [];
+
+        this.deviceId = null;
+
+        this.interval = null;
+        this.stream = null;
+
+        this.onNewAudioRecordingChunk = (audioData) => {
+            console.log("New audio recording ");
+        };
+    }
+
+    async connectMicrophone() {
+        this.audioContext = new AudioContext({
+            sampleRate: 16000,
+        });
+
+        const constraints = {
+            audio: {
+                channelCount: 1,
+                sampleRate: 16000,
+            },
+        };
+
+        if (this.deviceId) {
+            constraints.audio.deviceId = { exact: this.deviceId };
+        }
+
+        this.stream = await navigator.mediaDevices.getUserMedia(constraints);
+
+        const source = this.audioContext.createMediaStreamSource(this.stream);
+        this.processor = this.audioContext.createScriptProcessor(4096, 1, 1);
+
+        this.processor.onaudioprocess = (e) => {
+            const inputData = e.inputBuffer.getChannelData(0);
+            // Convert float32 to int16
+            const pcm16 = new Int16Array(inputData.length);
+            for (let i = 0; i < inputData.length; i++) {
+                pcm16[i] = inputData[i] * 0x7fff;
+            }
+            this.pcmData.push(...pcm16);
+        };
+
+        source.connect(this.processor);
+        this.processor.connect(this.audioContext.destination);
+
+        this.interval = setInterval(this.recordChunk.bind(this), 1000);
+    }
+
+    newAudioRecording(b64AudioData) {
+        console.log("newAudioRecording ");
+        this.onNewAudioRecordingChunk(b64AudioData);
+    }
+
+    recordChunk() {
+        const buffer = new ArrayBuffer(this.pcmData.length * 2);
+        const view = new DataView(buffer);
+        this.pcmData.forEach((value, index) => {
+            view.setInt16(index * 2, value, true);
+        });
+
+        const base64 = btoa(
+            String.fromCharCode.apply(null, new Uint8Array(buffer))
+        );
+        this.newAudioRecording(base64);
+        this.pcmData = [];
+    }
+
+    disconnectMicrophone() {
+        try {
+            this.processor.disconnect();
+            this.audioContext.close();
+        } catch {
+            // Nothing to clean up if the microphone was never connected.
+        }
+
+        clearInterval(this.interval);
+    }
+
+    async updateMicrophoneDevice(deviceId) {
+        this.deviceId = deviceId;
+        this.disconnectMicrophone();
+        this.connectMicrophone();
+    }
+}
+
+class LiveVideoManager {
+    constructor(previewVideoElement, previewCanvasElement) {
+        this.previewVideoElement = previewVideoElement;
+        this.previewCanvasElement =
+            previewCanvasElement;
+        this.ctx = this.previewCanvasElement.getContext("2d");
+        this.stream = null;
+        this.interval = null;
+        this.onNewFrame = (newFrame) => {
+            console.log("Default new frame trigger.");
+        };
+    }
+
+    async startWebcam() {
+        try {
+            const constraints = {
+                video: true,
+                // video: {
+                //     width: { max: 640 },
+                //     height: { max: 480 },
+                // },
+            };
+            this.stream = await navigator.mediaDevices.getUserMedia(constraints);
+            this.previewVideoElement.srcObject = this.stream;
+        } catch (err) {
+            console.error("Error accessing the webcam: ", err);
+        }
+
+        // Keep the handle so stopWebcam() can cancel the frame timer.
+        this.interval = setInterval(this.newFrame.bind(this), 1000);
+    }
+
+    stopWebcam() {
+        clearInterval(this.interval);
+        this.stopStream();
+    }
+
+    stopStream() {
+        if (!this.stream) return;
+
+        const tracks = this.stream.getTracks();
+
+        tracks.forEach((track) => {
+            track.stop();
+        });
+    }
+
+    async updateWebcamDevice(deviceId) {
+        const constraints = {
+            video: { deviceId: { exact: deviceId } },
+        };
+        this.stream = await navigator.mediaDevices.getUserMedia(constraints);
+        this.previewVideoElement.srcObject = this.stream;
+    }
+
+    captureFrameB64() {
+        if (this.stream == null) return "";
+
+        this.previewCanvasElement.width = this.previewVideoElement.videoWidth;
+        this.previewCanvasElement.height = this.previewVideoElement.videoHeight;
+        this.ctx.drawImage(this.previewVideoElement, 0, 0, this.previewCanvasElement.width, this.previewCanvasElement.height);
+        const imageData = this.previewCanvasElement.toDataURL("image/jpeg").split(",")[1].trim();
+        return imageData;
+    }
+
+    newFrame() {
+        console.log("capturing new frame");
+        const frameData = this.captureFrameB64();
+        this.onNewFrame(frameData);
+    }
+}
+
+class LiveScreenManager {
+    constructor(previewVideoElement, previewCanvasElement) {
+        this.previewVideoElement = previewVideoElement;
+        this.previewCanvasElement = previewCanvasElement;
+        this.ctx = this.previewCanvasElement.getContext("2d");
+        this.stream = null;
+        this.interval = null;
+        this.onNewFrame = (newFrame) => {
+            console.log("Default new frame trigger: ", newFrame);
+        };
+    }
+
+    async startCapture() {
+        try {
+            this.stream = await navigator.mediaDevices.getDisplayMedia();
+            this.previewVideoElement.srcObject = this.stream;
+        } catch (err) {
+            console.error("Error accessing the screen: ", err);
+        }
+        this.interval = setInterval(this.newFrame.bind(this), 1000);
+    }
+
+    stopCapture() {
+        clearInterval(this.interval);
+
+        if (!this.stream) return;
+
+        const tracks = this.stream.getTracks();
+
+        tracks.forEach((track) => {
+            track.stop();
+        });
+    }
+
+    captureFrameB64() {
+        if (this.stream == null) return "";
+
+        this.previewCanvasElement.width = this.previewVideoElement.videoWidth;
+        this.previewCanvasElement.height = this.previewVideoElement.videoHeight;
+        this.ctx.drawImage(this.previewVideoElement, 0, 0, this.previewCanvasElement.width, this.previewCanvasElement.height);
+        const imageData = this.previewCanvasElement.toDataURL("image/jpeg").split(",")[1].trim();
+        return imageData;
+    }
+
+    newFrame() {
+        console.log("capturing new frame");
+        const frameData = this.captureFrameB64();
+        this.onNewFrame(frameData);
+    }
+}
+
+console.log("loaded live-media-manager.js")
\ No newline at end of file
diff --git a/gemini/multimodal-live-api/websocket-demo-app/pcm-processor.js b/gemini/multimodal-live-api/websocket-demo-app/frontend/pcm-processor.js
similarity index 69%
rename from gemini/multimodal-live-api/websocket-demo-app/pcm-processor.js
rename to gemini/multimodal-live-api/websocket-demo-app/frontend/pcm-processor.js
index ec62ccdad30..e5ff359a46c 100644
--- a/gemini/multimodal-live-api/websocket-demo-app/pcm-processor.js
+++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/pcm-processor.js
@@ -3,7 +3,11 @@ class PCMProcessor extends AudioWorkletProcessor {
         super();
         this.buffer = new Float32Array();
 
-        // Correct way to handle messages in AudioWorklet
+/**
+ * @class PCMProcessor
+ * @extends AudioWorkletProcessor
+ * @description Processes PCM audio data.
+ */
         this.port.onmessage = (e) => {
             const newData = e.data;
             const newBuffer = new Float32Array(this.buffer.length + newData.length);
@@ -11,6 +15,6 @@ class PCMProcessor extends AudioWorkletProcessor {
             newBuffer.set(newData, this.buffer.length);
             this.buffer = newBuffer;
         };
     }
 
     process(inputs, outputs, parameters) {
diff --git a/gemini/multimodal-live-api/websocket-demo-app/frontend/script.js b/gemini/multimodal-live-api/websocket-demo-app/frontend/script.js
new file mode 100644
index 00000000000..45a53eafe6a
--- /dev/null
+++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/script.js
@@ -0,0 +1,272 @@
+window.addEventListener("load", (event) => {
+    console.log("Hello Gemini Realtime Demo!");
+
+    setAvalibleCamerasOptions();
+    setAvalibleMicrophoneOptions();
+});
+
+const PROXY_URL = "wss://[THE_URL_YOU_COPIED_WITHOUT_HTTP]";
+const PROJECT_ID = "your project id";
+const MODEL = "gemini-2.0-flash-exp";
+const API_HOST = "us-central1-aiplatform.googleapis.com";
+
+const accessTokenInput = document.getElementById("token");
+const projectInput = document.getElementById("project");
+const systemInstructionsInput = document.getElementById("systemInstructions");
+
+CookieJar.init("token");
+CookieJar.init("project");
+CookieJar.init("systemInstructions");
+
+const disconnected = document.getElementById("disconnected");
+const connecting = document.getElementById("connecting");
+const connected = document.getElementById("connected");
+const speaking = document.getElementById("speaking");
+
+const micBtn = document.getElementById("micBtn");
+const micOffBtn = document.getElementById("micOffBtn");
+const cameraBtn = document.getElementById("cameraBtn");
+const screenBtn = document.getElementById("screenBtn");
+
+const cameraSelect =
+    document.getElementById("cameraSource");
+const micSelect = document.getElementById("audioSource");
+
+const geminiLiveApi = new GeminiLiveAPI(PROXY_URL, PROJECT_ID, MODEL, API_HOST);
+
+geminiLiveApi.onErrorMessage = (message) => {
+    showDialogWithMessage(message);
+    setAppSatus("disconnected");
+};
+
+function getSelectedResponseModality() {
+    const radioButtons = document.querySelectorAll(
+        'md-radio[name="responseModality"]'
+    );
+
+    let selectedValue;
+    for (const radioButton of radioButtons) {
+        if (radioButton.checked) {
+            selectedValue = radioButton.value;
+            break;
+        }
+    }
+    return selectedValue;
+}
+
+function getSystemInstructions() {
+    return systemInstructionsInput.value;
+}
+
+function connectBtnClick() {
+    setAppSatus("connecting");
+
+    geminiLiveApi.responseModalities = getSelectedResponseModality();
+    geminiLiveApi.systemInstructions = getSystemInstructions();
+
+    geminiLiveApi.onConnectionStarted = () => {
+        setAppSatus("connected");
+        startAudioInput();
+    };
+
+    geminiLiveApi.setProjectId(projectInput.value);
+    geminiLiveApi.connect(accessTokenInput.value);
+}
+
+const liveAudioOutputManager = new LiveAudioOutputManager();
+
+geminiLiveApi.onReceiveResponse = (messageResponse) => {
+    if (messageResponse.type == "AUDIO") {
+        liveAudioOutputManager.playAudioChunk(messageResponse.data);
+    } else if (messageResponse.type == "TEXT") {
+        console.log("Gemini said: ", messageResponse.data);
+        newModelMessage(messageResponse.data);
+    }
+};
+
+const liveAudioInputManager = new LiveAudioInputManager();
+
+liveAudioInputManager.onNewAudioRecordingChunk = (audioData) => {
+    geminiLiveApi.sendAudioMessage(audioData);
+};
+
+function addMessageToChat(message) {
+    const textChat = document.getElementById("text-chat");
+    const newParagraph = document.createElement("p");
+    newParagraph.textContent = message;
+    textChat.appendChild(newParagraph);
+}
+
+function newModelMessage(message) {
+    addMessageToChat(">> " + message);
+}
+
+function newUserMessage() {
+    const textMessage = document.getElementById("text-message");
+    addMessageToChat("User: " + textMessage.value);
+    geminiLiveApi.sendTextMessage(textMessage.value);
+
+    textMessage.value = "";
+}
+
+function startAudioInput() {
+    liveAudioInputManager.connectMicrophone();
+}
+
+function stopAudioInput() {
+    liveAudioInputManager.disconnectMicrophone();
+}
+
+function micBtnClick() {
+    console.log("micBtnClick");
+    stopAudioInput();
+    micBtn.hidden = true;
+    micOffBtn.hidden = false;
+}
+
+function micOffBtnClick() {
+    console.log("micOffBtnClick");
+    startAudioInput();
+
+    micBtn.hidden = false;
+    micOffBtn.hidden = true;
+}
+
+const videoElement = document.getElementById("video");
+const canvasElement = document.getElementById("canvas");
+
+const liveVideoManager = new LiveVideoManager(videoElement, canvasElement);
+
+const liveScreenManager = new LiveScreenManager(videoElement, canvasElement);
+
+liveVideoManager.onNewFrame = (b64Image) => {
+    geminiLiveApi.sendImageMessage(b64Image);
+};
+
+liveScreenManager.onNewFrame = (b64Image) => {
+    geminiLiveApi.sendImageMessage(b64Image);
+};
+
+function startCameraCapture() {
+    liveScreenManager.stopCapture();
+    liveVideoManager.startWebcam();
+}
+
+function startScreenCapture() {
+    liveVideoManager.stopWebcam();
+    liveScreenManager.startCapture();
+}
+
+function cameraBtnClick() {
+    startCameraCapture();
+    console.log("cameraBtnClick");
+}
+
+function screenShareBtnClick() {
+    startScreenCapture();
+    console.log("screenShareBtnClick");
+}
+
+function newCameraSelected() {
+    console.log("newCameraSelected ", cameraSelect.value);
+    liveVideoManager.updateWebcamDevice(cameraSelect.value);
+}
+
+function newMicSelected() {
+    console.log("newMicSelected", micSelect.value);
+    liveAudioInputManager.updateMicrophoneDevice(micSelect.value);
+}
+
+function disconnectBtnClick() {
+    setAppSatus("disconnected");
+    geminiLiveApi.disconnect();
+    stopAudioInput();
+}
+
+function
showDialogWithMessage(messageText) { + const dialog = document.getElementById("dialog"); + const dialogMessage = document.getElementById("dialogMessage"); + dialogMessage.innerHTML = messageText; + dialog.show(); +} + +async function getAvalibleDevices(deviceType) { + const allDevices = await navigator.mediaDevices.enumerateDevices(); + const devices = []; + allDevices.forEach((device) => { + if (device.kind === deviceType) { + devices.push({ + id: device.deviceId, + name: device.label || device.deviceId, + }); + } + }); + return devices; +} + +async function getAvalibleCameras() { + return await this.getAvalibleDevices("videoinput"); +} + +async function getAvalibleAudioInputs() { + return await this.getAvalibleDevices("audioinput"); +} + +function setMaterialSelect(allOptions, selectElement) { + allOptions.forEach((optionData) => { + const option = document.createElement("md-select-option"); + option.value = optionData.id; + + const slotDiv = document.createElement("div"); + slotDiv.slot = "headline"; + slotDiv.innerHTML = optionData.name; + option.appendChild(slotDiv); + + selectElement.appendChild(option); + }); +} + +async function setAvalibleCamerasOptions() { + const cameras = await getAvalibleCameras(); + const videoSelect = document.getElementById("cameraSource"); + setMaterialSelect(cameras, videoSelect); +} + +async function setAvalibleMicrophoneOptions() { + const mics = await getAvalibleAudioInputs(); + const audioSelect = document.getElementById("audioSource"); + setMaterialSelect(mics, audioSelect); +} + +function setAppSatus(status) { + disconnected.hidden = true; + connecting.hidden = true; + connected.hidden = true; + speaking.hidden = true; + + switch (status) { + case "disconnected": + disconnected.hidden = false; + break; + case "connecting": + connecting.hidden = false; + break; + case "connected": + connected.hidden = false; + break; + case "speaking": + speaking.hidden = false; + break; + default: + } +} \ No newline at end of file diff 
--git a/gemini/multimodal-live-api/websocket-demo-app/frontend/styles.css b/gemini/multimodal-live-api/websocket-demo-app/frontend/styles.css new file mode 100644 index 00000000000..def9db86cfa --- /dev/null +++ b/gemini/multimodal-live-api/websocket-demo-app/frontend/styles.css @@ -0,0 +1,127 @@ +:root { + --container-bg: rgb(224, 224, 224); +} + +body { + font-family: "Roboto"; + text-align: center; +} +h1 { + margin-bottom: 0; +} + +#video-preview { + height: 200px; + width: 250px; + border-radius: 30px; + background-color: var(--container-bg); + padding: 50px; + display: inline-block; + vertical-align: top; +} + +#model-input { + display: inline-block; +} + +.icon-text { + vertical-align: super; +} + +.state { + border-radius: 50px; + padding: 10px; + text-align: center; + background-color: rgb(203, 203, 203); +} + +#disconnected { + background-color: #ffebee; /* Light red background */ + color: #b71c1c; /* Dark red text */ +} + +#disconnected .material-symbols-outlined { + color: #b71c1c; +} + +#connecting { + background-color: #fffde7; /* Light yellow background */ + color: #f57f17; /* Dark yellow text */ + animation: throb 1s infinite ease-in-out; +} + +#connecting .material-symbols-outlined { + color: #f57f17; +} + +#connected { + background-color: #e8f5e9; /* Light green background */ + color: #2e7d32; /* Dark green text */ +} + +#connected .material-symbols-outlined { + color: #2e7d32; +} + +#speaking { + background-color: #e3f2fd; /* Light blue background */ + color: #1565c0; /* Dark blue text */ + animation: throb 1s infinite ease-in-out; +} + +#speaking .material-symbols-outlined { + color: #1565c0; +} + +@keyframes throb { + 0% { + opacity: 0.6; + } + 50% { + opacity: 1; + } + 100% { + opacity: 0.6; + } +} + +.spread { + display: flex; + justify-content: space-around; +} + +#model-config-container { + display: flex; + justify-content: space-around; + flex-direction: column; /* Stack elements vertically on mobile */ +} + +#user-input-container { + 
display: flex; + justify-content: space-around; + flex-direction: column; /* Stack elements vertically on mobile */ +} + +@media (min-width: 768px) { + /* Adjust breakpoint as needed */ + #model-config-container { + flex-direction: row; /* Revert to horizontal layout on larger screens */ + } + + #user-input-container { + flex-direction: row; /* Revert to horizontal layout on larger screens */ + } + + body { + text-align: left; + } +} + +#video { + width: 100%; + height: 100%; +} + +#canvas { + display: none; +} diff --git a/gemini/multimodal-live-api/websocket-demo-app/index.html b/gemini/multimodal-live-api/websocket-demo-app/index.html deleted file mode 100644 index 3da56783830..00000000000 --- a/gemini/multimodal-live-api/websocket-demo-app/index.html +++ /dev/null @@ -1,424 +0,0 @@ -<html> - <head> - <link - rel="stylesheet" - href="https://fonts.googleapis.com/icon?family=Material+Icons" - /> - <link - rel="stylesheet" - href="https://code.getmdl.io/1.3.0/material.indigo-pink.min.css" - /> - <script defer src="https://code.getmdl.io/1.3.0/material.min.js"></script> - - <style> - #videoElement { - width: 320px; - height: 240px; - border-radius: 20px; - } - - #canvasElement { - display: none; - } - - .demo-content { - padding: 20px; - display: flex; - flex-direction: column; - align-items: center; - } - - /* Styling for button group */ - .button-group { - margin-bottom: 20px; - } - </style> - </head> - - <body> - <div class="mdl-layout mdl-js-layout mdl-layout--fixed-header"> - <header class="mdl-layout__header"> - <div class="mdl-layout__header-row"> - <!-- Title --> - <span class="mdl-layout-title">Demo</span> - </div> - </header> - <main class="mdl-layout__content"> - <div class="page-content"> - <div class="demo-content"> - <!-- Connect Button --> - <div class="mdl-textfield mdl-js-textfield"> - <input class="mdl-textfield__input" type="text" id="token" /> - <label class="mdl-textfield__label" for="input" - >Access Token...</label - > - </div> - <div 
class="mdl-textfield mdl-js-textfield"> - <input class="mdl-textfield__input" type="text" id="model" value="projects/YOUR-PROJECT-ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-exp" /> - <label class="mdl-textfield__label" for="input" - >Model ID</label - > - </div> - <button - onclick="connect()" - class="mdl-button mdl-js-button mdl-button--raised mdl-button--colored" - > - Connect - </button> - - <!-- Button Group --> - <div class="button-group"> - <!-- Text Input Field --> - <div class="mdl-textfield mdl-js-textfield"> - <input class="mdl-textfield__input" type="text" id="input" /> - <label class="mdl-textfield__label" for="input" - >Enter text message...</label - > - </div> - - <!-- Send Message Button --> - <button - onclick="sendUserMessage()" - class="mdl-button mdl-js-button mdl-button--icon" - > - <i class="material-icons">send</i> - </button> - </div> - - <!-- Voice Control Buttons --> - <div class="button-group"> - <button - onclick="startAudioInput()" - class="mdl-button mdl-js-button mdl-button--fab mdl-button--mini-fab mdl-button--colored" - > - <i class="material-icons">mic</i> - </button> - <button - onclick="stopAudioInput()" - class="mdl-button mdl-js-button mdl-button--fab mdl-button--mini-fab" - > - <i class="material-icons">mic_off</i> - </button> - </div> - - <!-- Video Element --> - <video id="videoElement" autoplay></video> - - <!-- Hidden Canvas --> - <canvas id="canvasElement"></canvas> - <div id="chatLog"></div> - </div> - </div> - </main> - </div> - - <script defer> - const URL = "ws://localhost:8080"; - const video = document.getElementById("videoElement"); - const canvas = document.getElementById("canvasElement"); - const context = canvas.getContext("2d"); - const accessTokenInput = document.getElementById("token"); - const modelIdInput = document.getElementById("model"); - let stream = null; // Store the MediaStream object globally - - let currentFrameB64; - // Function to start the webcam - async function 
startWebcam() { - try { - const constraints = { - video: { - width: { max: 640 }, - height: { max: 480 }, - }, - }; - - stream = await navigator.mediaDevices.getUserMedia(constraints); - video.srcObject = stream; - } catch (err) { - console.error("Error accessing the webcam: ", err); - } - } - - // Function to capture an image and convert it to base64 - function captureImage() { - if (stream) { - // Check if the stream is available - canvas.width = video.videoWidth; - canvas.height = video.videoHeight; - context.drawImage(video, 0, 0, canvas.width, canvas.height); - const imageData = canvas.toDataURL("image/jpeg").split(",")[1].trim(); - currentFrameB64 = imageData; - } - } - - window.addEventListener("load", startWebcam); - setInterval(captureImage, 1000); - - let webSocket = null; - - function connect() { - console.log("connecting: ", URL); - - webSocket = new WebSocket(URL); - - webSocket.onclose = (event) => { - console.log("websocket closed: ", event); - alert("Connection closed"); - }; - - webSocket.onerror = (event) => { - console.log("websocket error: ", event); - }; - - webSocket.onopen = (event) => { - console.log("websocket open: ", event); - sendInitialSetupMessage(); - }; - - webSocket.onmessage = receiveMessage; - } - - function sendInitialSetupMessage() { - console.log("sending auth message"); - - const accessToken = accessTokenInput.value; - const modelId = modelIdInput.value; - - auth_message = { - bearer_token: accessToken, - }; - webSocket.send(JSON.stringify(auth_message)); - - console.log("sending setup message"); - setup_client_message = { - setup: { - model: modelId, - generation_config: { response_modalities: ["AUDIO"] }, - }, - }; - - webSocket.send(JSON.stringify(setup_client_message)); - } - - function sendMessage(message) { - if (webSocket == null) { - console.log("websocket not initilised"); - return; - } - - payload = { - client_content: { - turns: [ - { - role: "user", - parts: [{ text: message }], - }, - ], - turn_complete: true, - 
}, - }; - - webSocket.send(JSON.stringify(payload)); - console.log("sent: ", payload); - displayMessage("USER: " + message); - } - - function sendVoiceMessage(b64PCM) { - if (webSocket == null) { - console.log("websocket not initialized"); - return; - } - - payload = { - realtime_input: { - media_chunks: [ - { - mime_type: "audio/pcm", - data: b64PCM, - // will_continue: false, - }, - { - mime_type: "image/jpeg", - data: currentFrameB64, - // will_continue: false, - }, - ], - }, - }; - - webSocket.send(JSON.stringify(payload)); - console.log("sent: ", payload); - } - - function sendUserMessage() { - const messageText = document.getElementById("input").value; - console.log("user message: ", messageText); - sendMessage(messageText); - } - - let current_message = ""; - - function receiveMessage(event) { - const messageData = JSON.parse(event.data); - console.log("messageData: ", messageData); - const response = new Response(messageData); - console.log("receiveMessage ", response); - - current_message = current_message + response.data; - - injestAudioChuckToPlay(response.data); - - if (response.endOfTurn) { - // displayMessage("GEMINI: 🔊"); - // current_message = ""; - } - } - - let audioInputContext; - let workletNode; - let initialized = false; - - async function initializeAudioContext() { - if (initialized) return; - - audioInputContext = new (window.AudioContext || - window.webkitAudioContext)({ sampleRate: 24000 }); - await audioInputContext.audioWorklet.addModule("pcm-processor.js"); - workletNode = new AudioWorkletNode(audioInputContext, "pcm-processor"); - workletNode.connect(audioInputContext.destination); - initialized = true; - } - - function base64ToArrayBuffer(base64) { - const binaryString = window.atob(base64); - const bytes = new Uint8Array(binaryString.length); - for (let i = 0; i < binaryString.length; i++) { - bytes[i] = binaryString.charCodeAt(i); - } - return bytes.buffer; - } - - function convertPCM16LEToFloat32(pcmData) { - const inputArray = new 
Int16Array(pcmData); - const float32Array = new Float32Array(inputArray.length); - - for (let i = 0; i < inputArray.length; i++) { - float32Array[i] = inputArray[i] / 32768; - } - - return float32Array; - } - - async function injestAudioChuckToPlay(base64AudioChunk) { - try { - if (!initialized) { - await initializeAudioContext(); - } - - if (audioInputContext.state === "suspended") { - await audioInputContext.resume(); - } - - const arrayBuffer = base64ToArrayBuffer(base64AudioChunk); - const float32Data = convertPCM16LEToFloat32(arrayBuffer); - - workletNode.port.postMessage(float32Data); - } catch (error) { - console.error("Error processing audio chunk:", error); - } - } - - let audioContext; - let mediaRecorder; - let processor; - let pcmData = []; - - let interval = null; - - function recordChunk() { - // Convert to base64 - const buffer = new ArrayBuffer(pcmData.length * 2); - const view = new DataView(buffer); - pcmData.forEach((value, index) => { - view.setInt16(index * 2, value, true); - }); - - const base64 = btoa( - String.fromCharCode.apply(null, new Uint8Array(buffer)) - ); - - // document.getElementById("output").textContent = base64; - - sendVoiceMessage(base64); - - pcmData = []; - } - - async function startAudioInput() { - audioContext = new AudioContext({ - sampleRate: 16000, - }); - - const stream = await navigator.mediaDevices.getUserMedia({ - audio: { - channelCount: 1, - sampleRate: 16000, - }, - }); - - const source = audioContext.createMediaStreamSource(stream); - processor = audioContext.createScriptProcessor(4096, 1, 1); - - processor.onaudioprocess = (e) => { - const inputData = e.inputBuffer.getChannelData(0); - // Convert float32 to int16 - const pcm16 = new Int16Array(inputData.length); - for (let i = 0; i < inputData.length; i++) { - pcm16[i] = inputData[i] * 0x7fff; - } - pcmData.push(...pcm16); - }; - - source.connect(processor); - processor.connect(audioContext.destination); - - interval = setInterval(recordChunk, 1000); - } - - 
function stopAudioInput() { - processor.disconnect(); - audioContext.close(); - clearInterval(interval); - } - - function displayMessage(message) { - console.log(message); - // const newMessage = message.message - addParagraphToDiv("chatLog", message); - } - - function addParagraphToDiv(divId, text) { - const newParagraph = document.createElement("p"); - newParagraph.textContent = text; - const div = document.getElementById(divId); - div.appendChild(newParagraph); - } - - class Response { - constructor(data) { - this.data = ""; - this.endOfTurn = data?.serverContent?.turnComplete; - - if (data?.serverContent?.modelTurn?.parts) { - this.data = data?.serverContent?.modelTurn?.parts[0]?.text; - } - - if (data?.serverContent?.modelTurn?.parts[0]?.inlineData) { - this.data = - data?.serverContent?.modelTurn?.parts[0]?.inlineData.data; - } - } - } - </script> - </body> -</html>