Openai/6910a506-2ac0-8003-b4a7-5f5804de4257
This conversation was exported from ChatGPT[1]
gpt-5-thinking
User: I'm working with OpenAI Agent Builder and want to execute a ChatKit Client Tool that will left-click a UI element by its HTML ID. Provide me with simple versions of:
- A tool definition for Agent Builder
- A tool implementation for a Lovable project
Assistant: Here’s a minimal, copy-pasteable setup for a client tool that left-clicks an element by its HTML id.
Agent Builder – tool definition (JSON)
Use this as the tool schema in Agent Builder (mark it as a Client tool):
{
  "name": "click_element_by_id",
  "description": "Left-click a DOM element identified by its HTML id.",
  "strict": true,
  "parameters": {
    "type": "object",
    "properties": {
      "element_id": {
        "type": "string",
        "description": "The HTML id of the target element (without the leading #)."
      }
    },
    "required": ["element_id"],
    "additionalProperties": false
  }
}
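Because the schema is strict (one required string, no extra properties), the client can cheaply re-check incoming arguments before touching the DOM. Here's a minimal sketch of that check. `parseClickArgs`, `ClickArgs`, and `ParseResult` are hypothetical names for illustration, not part of ChatKit or Agent Builder:

```typescript
// Hypothetical helper (not part of ChatKit or Agent Builder): re-checks tool
// arguments against the click_element_by_id schema before they are used.
type ClickArgs = { element_id: string };

type ParseResult =
  | { ok: true; args: ClickArgs }
  | { ok: false; error: string };

function parseClickArgs(input: unknown): ParseResult {
  // "type": "object" -> reject null, arrays, and primitives
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, error: "arguments must be an object" };
  }
  const record = input as Record<string, unknown>;

  // "additionalProperties": false -> only element_id is allowed
  const extra = Object.keys(record).filter((key) => key !== "element_id");
  if (extra.length > 0) {
    return { ok: false, error: `unexpected properties: ${extra.join(", ")}` };
  }

  // "required": ["element_id"] with "type": "string"
  const id = record.element_id;
  if (typeof id !== "string") {
    return { ok: false, error: "element_id must be a string" };
  }
  return { ok: true, args: { element_id: id } };
}
```

Returning a result object instead of throwing keeps the failure path in the same shape as the `ok: false` responses a tool handler would send back to the agent.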
Lovable (React) – client tool implementation
Add this component anywhere in your Lovable project (e.g., src/components/ChatKitClient.tsx). It registers the client tool and dispatches a synthesized left-click sequence (pointer, mouse, and click events) on the target.
import { useEffect } from "react";
import { useChatKit } from "@openai/chatkit-react";
/**
 * Mount this component once (e.g., in your App layout) after you've wrapped your app
 * with <ChatKit .../> and supplied getClientSecret.
 */
export default function ChatKitClientTool() {
  const { control } = useChatKit();
  useEffect(() => {
    if (!control) return;
    // Register the client tool so Agent Builder can call it.
    // (Assumes control exposes registerClientTools(); confirm this against your
    // installed @openai/chatkit-react version.)
    const unregister = control.registerClientTools([
      {
        name: "click_element_by_id",
        description: "Left-click a DOM element identified by its HTML id.",
        // Keep the schema identical to the Agent Builder definition:
        parameters: {
          type: "object",
          properties: {
            element_id: {
              type: "string",
              description:
                "The HTML id of the target element (without the leading #)."
            }
          },
          required: ["element_id"],
          additionalProperties: false
        },
        handler: async ({ element_id }: { element_id: string }) => {
          try {
            // 1) Find the element
            const el = document.getElementById(element_id);
            if (!el) {
              return {
                ok: false,
                error: "not_found",
                message: `No element found with id="${element_id}".`
              };
            }
            // 2) Ensure it can be interacted with
            el.scrollIntoView({ behavior: "smooth", block: "center" });
            await new Promise((r) => setTimeout(r, 120)); // allow layout/scroll to settle
            // 3) Compute a safe click point (center of the element)
            const rect = el.getBoundingClientRect();
            const clientX = Math.max(0, Math.floor(rect.left + rect.width / 2));
            const clientY = Math.max(0, Math.floor(rect.top + rect.height / 2));
            // If something overlays the element, click the topmost node at that point,
            // but still report the original target.
            const topNode = document.elementFromPoint(clientX, clientY) as HTMLElement | null;
            const target = topNode ?? el;
            // 4) Synthesize a full left-click sequence
            const baseOpts: MouseEventInit = {
              bubbles: true,
              cancelable: true,
              view: window,
              clientX,
              clientY,
              button: 0 // left
            };
            // Some frameworks listen to pointer events first
            target.dispatchEvent(new PointerEvent("pointerdown", baseOpts));
            target.dispatchEvent(new MouseEvent("mousedown", baseOpts));
            target.dispatchEvent(new PointerEvent("pointerup", baseOpts));
            target.dispatchEvent(new MouseEvent("mouseup", baseOpts));
            target.dispatchEvent(new MouseEvent("click", baseOpts));
            return {
              ok: true,
              clicked: true,
              target_id: element_id,
              topmost_tag: target.tagName,
              note:
                target !== el
                  ? "Click dispatched on the topmost element at the same point."
                  : "Click dispatched on the element with the provided id."
            };
          } catch (err: unknown) {
            return {
              ok: false,
              error: "unhandled_exception",
              message: err instanceof Error ? err.message : String(err)
            };
          }
        }
      }
    ]);
    // Unregister on unmount or when control changes
    return () => unregister?.();
  }, [control]);
  return null;
}
Typical app wrapper (for context)
If you don’t already have it, your app shell should include <ChatKit> and mount the tool component once:
// e.g., src/App.tsx
import { ChatKit } from "@openai/chatkit-react";
import ChatKitClientTool from "./components/ChatKitClient";
export default function App() {
  return (
    <ChatKit
      api={{
        // Provide a function that returns a short-lived client secret from your backend.
        getClientSecret: async () => {
          // fetch from your server; return the secret string
          const res = await fetch("/api/chatkit/secret");
          const { secret } = await res.json();
          return secret;
        }
      }}
    >
      {/* Mount once so the tool is available page-wide */}
      <ChatKitClientTool />
      {/* ...your app UI... */}
    </ChatKit>
  );
}
How the agent calls it
From Agent Builder, the tool call input should look like:
{ "element_id": "my-button-id" }
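On the client side, a call like this has to be matched to a handler by its tool name. Here's a rough sketch of that routing step, kept free of ChatKit imports so it runs anywhere. `routeClientTool`, `ToolInvocation`, and `ToolHandler` are hypothetical names, and the `{ name, params }` call shape is an assumption: as far as I know, the React bindings accept an `onClientTool` callback in the `useChatKit` options that receives such an object, but check that against the version you have installed.

```typescript
// Shape of an incoming client tool call. The { name, params } layout is an
// assumption about what ChatKit hands to its client tool callback.
type ToolInvocation = { name: string; params: Record<string, unknown> };

// A handler may answer synchronously or with a Promise.
type ToolHandler = (params: Record<string, unknown>) => unknown;

// Hypothetical router: looks up the handler by tool name and runs it,
// returning a structured error for tools the client does not implement.
function routeClientTool(
  handlers: Record<string, ToolHandler>,
  invocation: ToolInvocation
): unknown {
  const known = Object.prototype.hasOwnProperty.call(handlers, invocation.name);
  if (!known) {
    return {
      ok: false,
      error: "unknown_tool",
      message: `No client tool named "${invocation.name}".`
    };
  }
  return handlers[invocation.name](invocation.params);
}

// Example wiring with a stand-in handler that only echoes the id; in the
// browser this would be the DOM-clicking handler instead.
const handlers: Record<string, ToolHandler> = {
  click_element_by_id: (params) => ({ ok: true, target_id: params.element_id })
};

const result = routeClientTool(handlers, {
  name: "click_element_by_id",
  params: { element_id: "my-button-id" }
});
console.log(result);
```

If the callback assumption holds, the router could be passed along as `onClientTool: async (invocation) => routeClientTool(handlers, invocation)`.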
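That’s it. The agent can now left-click any interactable element by its id. Because the events are synthesized, their isTrusted flag is false, so code that only reacts to trusted user input will ignore them.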
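If you want to unit-test the handler without a browser, the click-point arithmetic from step 3 can be pulled out into pure functions. A small sketch; `clickPoint`, `isInViewport`, and `RectLike` are hypothetical names, not ChatKit APIs:

```typescript
// Minimal rectangle shape; the DOMRect returned by getBoundingClientRect()
// has these fields.
type RectLike = { left: number; top: number; width: number; height: number };

type Point = { clientX: number; clientY: number };

// Center of the rectangle in viewport coordinates, floored to whole pixels
// and clamped so neither coordinate is negative (same math as the handler).
function clickPoint(rect: RectLike): Point {
  return {
    clientX: Math.max(0, Math.floor(rect.left + rect.width / 2)),
    clientY: Math.max(0, Math.floor(rect.top + rect.height / 2))
  };
}

// True when the point lies inside a viewport of the given size.
// document.elementFromPoint() returns null for points outside the viewport,
// so this is worth checking before relying on its result.
function isInViewport(point: Point, viewportWidth: number, viewportHeight: number): boolean {
  return (
    point.clientX >= 0 &&
    point.clientX < viewportWidth &&
    point.clientY >= 0 &&
    point.clientY < viewportHeight
  );
}

// Example: a 100x50 element whose top-left corner is at (10, 20)
const point = clickPoint({ left: 10, top: 20, width: 100, height: 50 });
console.log(point, isInViewport(point, 1280, 720));
```

In the handler, `clickPoint(el.getBoundingClientRect())` would stand in for the inline math, with `window.innerWidth` and `window.innerHeight` supplying the viewport size.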
References
- [1] "Tool definition and implementation". ChatGPT. Retrieved 2025-11-10.