@@ -63,6 +63,32 @@ they follow the same API shape.
 current image-capable Grok refs in the bundled catalog.
 </Tip>
 
+## OpenClaw feature coverage
+
+The bundled plugin maps xAI's current public API surface onto OpenClaw's shared
+provider and tool contracts where the behavior fits cleanly.
+
+| xAI capability             | OpenClaw surface                       | Status                                                              |
+| -------------------------- | -------------------------------------- | ------------------------------------------------------------------- |
+| Chat / Responses           | `xai/<model>` model provider           | Yes                                                                 |
+| Server-side web search     | `web_search` provider `grok`           | Yes                                                                 |
+| Server-side X search       | `x_search` tool                        | Yes                                                                 |
+| Server-side code execution | `code_execution` tool                  | Yes                                                                 |
+| Images                     | `image_generate`                       | Yes                                                                 |
+| Videos                     | `video_generate`                       | Yes                                                                 |
+| Batch text-to-speech       | `messages.tts.provider: "xai"` / `tts` | Yes                                                                 |
+| Streaming TTS              | —                                      | Not exposed; OpenClaw's TTS contract returns complete audio buffers |
+| Speech-to-text             | —                                      | Not exposed yet; needs a transcription provider surface             |
+| Realtime voice             | —                                      | Not exposed yet; different session/WebSocket contract               |
+| Files / batches            | Generic model API compatibility only   | Not a first-class OpenClaw tool                                     |
+
+<Note>
+OpenClaw uses xAI's REST image/video/TTS APIs for media generation and the
+Responses API for model, search, and code-execution tools. Features that need
+new OpenClaw contracts, such as streaming STT or Realtime voice sessions, are
+documented here as upstream capabilities rather than hidden plugin behavior.
+</Note>
+
 ### Fast-mode mappings
 
 `/fast on` or `agents.defaults.models["xai/<model>"].params.fastMode: true`
@@ -103,12 +129,17 @@ Legacy aliases still normalize to the canonical bundled ids:
 `video_generate` tool.
 
 - Default video model: `xai/grok-imagine-video`
-- Modes: text-to-video, image-to-video, and remote video edit/extend flows
-- Supports `aspectRatio` and `resolution`
+- Modes: text-to-video, image-to-video, remote video edit, and remote video
+  extension
+- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`
+- Resolutions: `480P`, `720P`
+- Duration: 1-15 seconds for generation/image-to-video, 2-10 seconds for
+  extension
 
 <Warning>
 Local video buffers are not accepted. Use remote `http(s)` URLs for
-video-reference and edit inputs.
+video edit/extend inputs. Image-to-video accepts local image buffers because
+OpenClaw can encode those as data URLs for xAI.
 </Warning>
 
 To use xAI as the default video provider:
@@ -132,6 +163,82 @@ Legacy aliases still normalize to the canonical bundled ids:
 
 </Accordion>
 
+<Accordion title="Image generation">
+The bundled `xai` plugin registers image generation through the shared
+`image_generate` tool.
+
+- Default image model: `xai/grok-imagine-image`
+- Additional model: `xai/grok-imagine-image-pro`
+- Modes: text-to-image and reference-image edit
+- Reference inputs: one `image` or up to five `images`
+- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2`
+- Resolutions: `1K`, `2K`
+- Count: up to 4 images
+
+OpenClaw asks xAI for `b64_json` image responses so generated media can be
+stored and delivered through the normal channel attachment path. Local
+reference images are converted to data URLs; remote `http(s)` references are
+passed through.
+
+To use xAI as the default image provider:
+
+```json5
+{
+  agents: {
+    defaults: {
+      imageGenerationModel: {
+        primary: "xai/grok-imagine-image",
+      },
+    },
+  },
+}
+```
+
+<Note>
+xAI also documents `quality`, `mask`, `user`, and additional native ratios
+such as `1:2`, `2:1`, `9:20`, and `20:9`. OpenClaw forwards only the
+shared cross-provider image controls today; unsupported native-only knobs
+are intentionally not exposed through `image_generate`.
+</Note>
+
+</Accordion>
+
+<Accordion title="Text-to-speech">
+The bundled `xai` plugin registers text-to-speech through the shared `tts`
+provider surface.
+
+- Voices: `eve`, `ara`, `rex`, `sal`, `leo`, `una`
+- Default voice: `eve`
+- Formats: `mp3`, `wav`, `pcm`, `mulaw`, `alaw`
+- Language: BCP-47 code or `auto`
+- Speed: provider-native speed override
+- Native Opus voice-note format is not supported
+
+To use xAI as the default TTS provider:
+
+```json5
+{
+  messages: {
+    tts: {
+      provider: "xai",
+      providers: {
+        xai: {
+          voiceId: "eve",
+        },
+      },
+    },
+  },
+}
+```
+
+<Note>
+OpenClaw uses xAI's batch `/v1/tts` endpoint. xAI also offers streaming TTS
+over WebSocket, but the OpenClaw speech provider contract currently expects
+a complete audio buffer before reply delivery.
+</Note>
+
+</Accordion>
+
 <Accordion title="x_search configuration">
 The bundled xAI plugin exposes `x_search` as an OpenClaw tool for searching
 X (formerly Twitter) content via Grok.
@@ -209,6 +316,12 @@ Legacy aliases still normalize to the canonical bundled ids:
 - `grok-4.20-multi-agent-experimental-beta-0304` is not supported on the
   normal xAI provider path because it requires a different upstream API
   surface than the standard OpenClaw xAI transport.
+- xAI STT and Realtime voice are not registered as OpenClaw providers yet.
+  They require transcription/session contracts rather than the existing
+  batch TTS provider shape.
+- xAI image `quality`, image `mask`, and extra native-only aspect ratios are
+  not exposed until the shared `image_generate` tool has corresponding
+  cross-provider controls.
 </Accordion>
 
 <Accordion title="Advanced notes">
@@ -229,6 +342,23 @@ Legacy aliases still normalize to the canonical bundled ids:
 </Accordion>
 </AccordionGroup>
 
+## Live testing
+
+The xAI media paths are covered by unit tests and opt-in live suites. The live
+commands load secrets from your login shell, including `~/.profile`, before
+probing `XAI_API_KEY`.
+
+```bash
+pnpm test extensions/xai
+OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_TEST_QUIET=1 pnpm test:live -- extensions/xai/xai.live.test.ts
+OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_TEST_QUIET=1 OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS=xai pnpm test:live -- test/image-generation.runtime.live.test.ts
+```
+
+The provider-specific live file synthesizes normal TTS, telephony-friendly PCM
+TTS, text-to-image generation, and reference-image editing. The shared image
+live file verifies the same xAI provider through OpenClaw's runtime selection,
+fallback, normalization, and media attachment path.
+
 ## Related
 
 <CardGroup cols={2}>
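
The video limits listed in the `video_generate` hunk (aspect ratios, resolutions, and per-mode duration ranges) can be read as a small validation table. The sketch below is only an illustration of those documented limits: `VideoRequest`, `VideoMode`, and `validateVideoRequest` are hypothetical names, not OpenClaw's or xAI's actual API, and the docs give no duration range for edit, so none is checked.

```typescript
// Limits copied from the documented bullet list; every identifier here is
// hypothetical and chosen for illustration only.
const ASPECT_RATIOS = new Set<string>(["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"]);
const RESOLUTIONS = new Set<string>(["480P", "720P"]);

type VideoMode = "generate" | "imageToVideo" | "edit" | "extend";

interface VideoRequest {
  mode: VideoMode;
  aspectRatio?: string;
  resolution?: string;
  durationSeconds?: number;
}

// Inclusive duration ranges in seconds. The docs state none for "edit",
// so that mode is left without a range.
const DURATION_RANGES: Partial<Record<VideoMode, [number, number]>> = {
  generate: [1, 15],
  imageToVideo: [1, 15],
  extend: [2, 10],
};

// Returns a list of human-readable problems; an empty list means the
// request fits the documented limits.
function validateVideoRequest(req: VideoRequest): string[] {
  const problems: string[] = [];
  if (req.aspectRatio !== undefined && !ASPECT_RATIOS.has(req.aspectRatio)) {
    problems.push(`unsupported aspect ratio: ${req.aspectRatio}`);
  }
  if (req.resolution !== undefined && !RESOLUTIONS.has(req.resolution)) {
    problems.push(`unsupported resolution: ${req.resolution}`);
  }
  const range = DURATION_RANGES[req.mode];
  if (range !== undefined && req.durationSeconds !== undefined) {
    const [min, max] = range;
    if (req.durationSeconds < min || req.durationSeconds > max) {
      problems.push(`duration must be ${min}-${max} seconds for ${req.mode}`);
    }
  }
  return problems;
}
```

For example, `validateVideoRequest({ mode: "extend", durationSeconds: 12 })` reports one problem, because 12 seconds falls outside the 2-10 second extension range.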
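
The image-generation notes say that local reference images are converted to data URLs while remote `http(s)` references are passed through. A minimal sketch of that rule follows, assuming a hypothetical `ReferenceImage` input shape and a hypothetical helper `toImageReference`; the plugin's real types and function names are not shown in the diff. It relies on the global `btoa`, which is available in browsers and recent Node.js versions.

```typescript
// Hypothetical input shape: either a remote URL or local bytes plus a MIME type.
type ReferenceImage =
  | { kind: "remote"; url: string }
  | { kind: "local"; bytes: Uint8Array; mimeType: string };

// Returns the string that would be sent as the image reference:
// remote http(s) URLs unchanged, local bytes as a base64 data URL.
function toImageReference(ref: ReferenceImage): string {
  if (ref.kind === "remote") {
    const parsed = new URL(ref.url);
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      throw new Error(`unsupported reference scheme: ${parsed.protocol}`);
    }
    return ref.url;
  }
  // Build a binary string byte by byte, then base64-encode it.
  let binary = "";
  for (let i = 0; i < ref.bytes.length; i++) {
    binary += String.fromCharCode(ref.bytes[i]);
  }
  return `data:${ref.mimeType};base64,${btoa(binary)}`;
}
```

The same split explains the video warning in the diff: an image fits comfortably in a data URL, whereas video edit/extend inputs are only accepted as remote URLs.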