58 changes: 29 additions & 29 deletions docs/features/fleet-mode.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,18 +8,18 @@ Fleet mode is useful when the work can be decomposed before execution and each u

Good fits include:

- Multi-file refactors where each worker owns a file, package, or language SDK.
- Batch reviews where each worker checks a separate diff, module, or alert group.
- Parallel research across independent repositories, services, or feature areas.
- Documentation refreshes where each worker owns a page or topic.
- Migration tasks where each worker can validate its own slice and report back.
* Multi-file refactors where each worker owns a file, package, or language SDK.
* Batch reviews where each worker checks a separate diff, module, or alert group.
* Parallel research across independent repositories, services, or feature areas.
* Documentation refreshes where each worker owns a page or topic.
* Migration tasks where each worker can validate its own slice and report back.

Avoid fleet mode for:

- Sequential tasks where step 2 needs the concrete output from step 1.
- Tightly coupled edits where workers would contend for the same files.
- Small tasks that one synchronous sub-agent or the parent agent can finish quickly.
- Tasks that require continuous shared reasoning rather than clear ownership.
* Sequential tasks where step 2 needs the concrete output from step 1.
* Tightly coupled edits where workers would contend for the same files.
* Small tasks that one synchronous sub-agent or the parent agent can finish quickly.
* Tasks that require continuous shared reasoning rather than clear ownership.

Fleet mode works best when the parent session can create clear units of work, assign one owner per unit, and define what each worker must return.

Expand Down Expand Up @@ -321,29 +321,29 @@ Keep plugin-provided sub-agent types narrow and descriptive so the orchestrator

## Best practices

- Decompose the work into independent units before starting fleet mode.
- Minimize dependencies between todos; dependencies reduce parallelism.
- Give each todo a durable ID, a clear title, and a complete description.
- Make each sub-agent own exactly one todo at a time.
- Use background sub-agents for truly parallel work.
- Use synchronous sub-agent calls for serialized steps or validation gates.
- Provide each sub-agent with complete context; sub-agents are stateless across calls.
- Include file paths, commands, expected outputs, and constraints in each worker prompt.
- Do not dispatch a single background sub-agent; prefer a synchronous call or batch multiple workers in parallel.
- Avoid assigning overlapping files to different workers unless the parent agent will reconcile conflicts explicitly.
- Require every worker to report what it changed, how it validated the change, and what remains blocked.
- Have the parent agent verify the combined result after workers finish.
* Decompose the work into independent units before starting fleet mode.
* Minimize dependencies between todos; dependencies reduce parallelism.
* Give each todo a durable ID, a clear title, and a complete description.
* Make each sub-agent own exactly one todo at a time.
* Use background sub-agents for truly parallel work.
* Use synchronous sub-agent calls for serialized steps or validation gates.
* Provide each sub-agent with complete context; sub-agents are stateless across calls.
* Include file paths, commands, expected outputs, and constraints in each worker prompt.
* Do not dispatch a single background sub-agent; prefer a synchronous call or batch multiple workers in parallel.
* Avoid assigning overlapping files to different workers unless the parent agent will reconcile conflicts explicitly.
* Require every worker to report what it changed, how it validated the change, and what remains blocked.
* Have the parent agent verify the combined result after workers finish.
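
The practices above can be sketched as a small planning helper. This is a minimal illustration, not part of the SDK or the runtime: the `Todo` shape and the `readyTodos`, `overlappingFiles`, and `dispatchMode` helpers are hypothetical names introduced here, and the actual SQL todos schema may differ.

```typescript
// Hypothetical todo record; field names are illustrative assumptions,
// not the runtime's actual SQL todos schema.
interface Todo {
  id: string;          // durable ID
  title: string;
  description: string; // complete, self-contained context for the worker
  files: string[];     // files this todo's worker owns
  dependsOn: string[]; // IDs of todos that must be done first
  status: "pending" | "in_progress" | "done" | "blocked";
}

// Todos that are still pending and whose dependencies are all done.
function readyTodos(todos: Todo[]): Todo[] {
  const done = new Set(
    todos.filter((t) => t.status === "done").map((t) => t.id),
  );
  return todos.filter(
    (t) => t.status === "pending" && t.dependsOn.every((d) => done.has(d)),
  );
}

// File paths claimed by more than one todo; the parent must reconcile these.
function overlappingFiles(todos: Todo[]): string[] {
  const owners = new Map<string, number>();
  for (const t of todos) {
    for (const f of Array.from(new Set(t.files))) {
      owners.set(f, (owners.get(f) ?? 0) + 1);
    }
  }
  return Array.from(owners.entries())
    .filter(([, count]) => count > 1)
    .map(([file]) => file)
    .sort();
}

// A single ready todo is better served by a synchronous call;
// several ready todos can be batched as background workers.
function dispatchMode(ready: Todo[]): "none" | "synchronous" | "background" {
  if (ready.length === 0) return "none";
  return ready.length === 1 ? "synchronous" : "background";
}
```

A parent agent could run `overlappingFiles` once before dispatch, then call `readyTodos` after each worker reports back to decide what to start next.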

## Limitations and open questions

- Fleet mode is exposed through generated session RPC bindings and is marked experimental in several SDKs.
- The SQL todos pattern is the canonical coordination model in the runtime guidance, but whether it is a stable extensibility contract for SDK consumers is still an open question.
- `subagentStart` and `subagentStop` are runtime hook names; this branch exposes sub-agent lifecycle to SDK consumers through the generic session event stream, not dedicated hook callbacks.
- Plugin sub-agent registration is configured at the runtime layer through `--plugin-dir`; no SDK-level plugin registration helper was verified on this branch.
- Java native typed bindings for `session.fleet.start` were not found in the Java SDK source on this branch.
- Fleet mode does not remove the need for parent-agent review. Parallel workers can produce inconsistent assumptions that the orchestrator must reconcile.
* Fleet mode is exposed through generated session RPC bindings and is marked experimental in several SDKs.
* The SQL todos pattern is the canonical coordination model in the runtime guidance, but whether it is a stable extensibility contract for SDK consumers is still an open question.
* `subagentStart` and `subagentStop` are runtime hook names; this branch exposes sub-agent lifecycle to SDK consumers through the generic session event stream, not dedicated hook callbacks.
* Plugin sub-agent registration is configured at the runtime layer through `--plugin-dir`; no SDK-level plugin registration helper was verified on this branch.
* Java native typed bindings for `session.fleet.start` were not found in the Java SDK source on this branch.
* Fleet mode does not remove the need for parent-agent review. Parallel workers can produce inconsistent assumptions that the orchestrator must reconcile.

## See also

- [Custom agents and sub-agent orchestration](custom-agents.md)
- [Hooks](hooks.md)
* [Custom agents and sub-agent orchestration](custom-agents.md)
* [Hooks](hooks.md)
20 changes: 10 additions & 10 deletions docs/features/hooks.md
Expand Up @@ -1051,16 +1051,16 @@ const session = await client.createSession({

For full type definitions, input/output field tables, and additional examples for every hook, see the API reference:

- [Hooks Overview](../hooks/hooks-overview.md)
- [Pre-Tool Use](../hooks/pre-tool-use.md)
- [Post-Tool Use](../hooks/post-tool-use.md)
- [User Prompt Submitted](../hooks/user-prompt-submitted.md)
- [Session Lifecycle](../hooks/session-lifecycle.md)
- [Error Handling](../hooks/error-handling.md)
* [Hooks Overview](../hooks/hooks-overview.md)
* [Pre-Tool Use](../hooks/pre-tool-use.md)
* [Post-Tool Use](../hooks/post-tool-use.md)
* [User Prompt Submitted](../hooks/user-prompt-submitted.md)
* [Session Lifecycle](../hooks/session-lifecycle.md)
* [Error Handling](../hooks/error-handling.md)

## See also

- [Getting Started](../getting-started.md)
- [Custom Agents & Sub-Agent Orchestration](./custom-agents.md)
- [Streaming Session Events](./streaming-events.md)
- [Debugging Guide](../troubleshooting/debugging.md)
* [Getting Started](../getting-started.md)
* [Custom Agents & Sub-Agent Orchestration](./custom-agents.md)
* [Streaming Session Events](./streaming-events.md)
* [Debugging Guide](../troubleshooting/debugging.md)
40 changes: 20 additions & 20 deletions docs/getting-started.md
Expand Up @@ -150,7 +150,7 @@ Create `index.ts`:
import { CopilotClient } from "@github/copilot-sdk";

const client = new CopilotClient();
const session = await client.createSession({ model: "gpt-4.1" });
const session = await client.createSession({ model: "auto" });

const response = await session.sendAndWait({ prompt: "What is 2 + 2?" });
console.log(response?.data.content);
Expand Down Expand Up @@ -181,7 +181,7 @@ async def main():
client = CopilotClient()
await client.start()

session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1")
session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto")
response = await session.send_and_wait("What is 2 + 2?")
print(response.data.content)

Expand Down Expand Up @@ -223,7 +223,7 @@ func main() {
}
defer client.Stop()

session, err := client.CreateSession(ctx, &copilot.SessionConfig{Model: "gpt-4.1"})
session, err := client.CreateSession(ctx, &copilot.SessionConfig{Model: "auto"})
if err != nil {
log.Fatal(err)
}
Expand Down Expand Up @@ -304,7 +304,7 @@ using GitHub.Copilot;
await using var client = new CopilotClient();
await using var session = await client.CreateSessionAsync(new SessionConfig
{
Model = "gpt-4.1",
Model = "auto",
OnPermissionRequest = PermissionHandler.ApproveAll
});

Expand Down Expand Up @@ -337,7 +337,7 @@ public class HelloCopilot {

var session = client.createSession(
new SessionConfig()
.setModel("gpt-4.1")
.setModel("auto")
.setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
).get();

Expand Down Expand Up @@ -383,7 +383,7 @@ import { CopilotClient } from "@github/copilot-sdk";

const client = new CopilotClient();
const session = await client.createSession({
model: "gpt-4.1",
model: "auto",
streaming: true,
});

Expand Down Expand Up @@ -419,7 +419,7 @@ async def main():
client = CopilotClient()
await client.start()

session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1", streaming=True)
session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto", streaming=True)

# Listen for response chunks
def handle_event(event):
Expand Down Expand Up @@ -466,7 +466,7 @@ func main() {
defer client.Stop()

session, err := client.CreateSession(ctx, &copilot.SessionConfig{
Model: "gpt-4.1",
Model: "auto",
Streaming: copilot.Bool(true),
})
if err != nil {
Expand Down Expand Up @@ -562,7 +562,7 @@ using GitHub.Copilot;
await using var client = new CopilotClient();
await using var session = await client.CreateSessionAsync(new SessionConfig
{
Model = "gpt-4.1",
Model = "auto",
OnPermissionRequest = PermissionHandler.ApproveAll,
Streaming = true,
});
Expand Down Expand Up @@ -602,7 +602,7 @@ public class HelloCopilot {

var session = client.createSession(
new SessionConfig()
.setModel("gpt-4.1")
.setModel("auto")
.setStreaming(true)
.setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
).get();
Expand Down Expand Up @@ -912,7 +912,7 @@ const getWeather = defineTool("get_weather", {

const client = new CopilotClient();
const session = await client.createSession({
model: "gpt-4.1",
model: "auto",
streaming: true,
tools: [getWeather],
});
Expand Down Expand Up @@ -968,7 +968,7 @@ async def main():
client = CopilotClient()
await client.start()

session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1", streaming=True, tools=[get_weather])
session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto", streaming=True, tools=[get_weather])

def handle_event(event):
if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
Expand Down Expand Up @@ -1045,7 +1045,7 @@ func main() {
defer client.Stop()

session, err := client.CreateSession(ctx, &copilot.SessionConfig{
Model: "gpt-4.1",
Model: "auto",
Streaming: copilot.Bool(true),
Tools: []copilot.Tool{getWeather},
})
Expand Down Expand Up @@ -1185,7 +1185,7 @@ var getWeather = CopilotTool.DefineTool(

await using var session = await client.CreateSessionAsync(new SessionConfig
{
Model = "gpt-4.1",
Model = "auto",
OnPermissionRequest = PermissionHandler.ApproveAll,
Streaming = true,
Tools = [getWeather],
Expand Down Expand Up @@ -1259,7 +1259,7 @@ public class HelloCopilot {

var session = client.createSession(
new SessionConfig()
.setModel("gpt-4.1")
.setModel("auto")
.setStreaming(true)
.setTools(List.of(getWeather))
.setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
Expand Down Expand Up @@ -1316,7 +1316,7 @@ const getWeather = defineTool("get_weather", {

const client = new CopilotClient();
const session = await client.createSession({
model: "gpt-4.1",
model: "auto",
streaming: true,
tools: [getWeather],
});
Expand Down Expand Up @@ -1389,7 +1389,7 @@ async def main():
client = CopilotClient()
await client.start()

session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1", streaming=True, tools=[get_weather])
session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto", streaming=True, tools=[get_weather])

def handle_event(event):
if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
Expand Down Expand Up @@ -1482,7 +1482,7 @@ func main() {
defer client.Stop()

session, err := client.CreateSession(ctx, &copilot.SessionConfig{
Model: "gpt-4.1",
Model: "auto",
Streaming: copilot.Bool(true),
Tools: []copilot.Tool{getWeather},
})
Expand Down Expand Up @@ -1671,7 +1671,7 @@ var getWeather = CopilotTool.DefineTool(
await using var client = new CopilotClient();
await using var session = await client.CreateSessionAsync(new SessionConfig
{
Model = "gpt-4.1",
Model = "auto",
OnPermissionRequest = PermissionHandler.ApproveAll,
Streaming = true,
Tools = [getWeather]
Expand Down Expand Up @@ -1765,7 +1765,7 @@ public class WeatherAssistant {

var session = client.createSession(
new SessionConfig()
.setModel("gpt-4.1")
.setModel("auto")
.setStreaming(true)
.setOnPermissionRequest(request ->
CompletableFuture.completedFuture(PermissionDecision.allow())
14 changes: 7 additions & 7 deletions docs/hooks/post-tool-use.md
Expand Up @@ -2,10 +2,10 @@

The `onPostToolUse` hook is called **after** a tool executes **successfully**. Use it to:

- Transform or filter tool results
- Log tool execution for auditing
- Add context based on results
- Suppress results from the conversation
* Transform or filter tool results
* Log tool execution for auditing
* Add context based on results
* Suppress results from the conversation

> **Failure variant** — `onPostToolUse` only fires for successful tool executions. To observe **failed** tool calls, register `onPostToolUseFailure` (`on_post_tool_use_failure` in Python, `OnPostToolUseFailure` in Go/.NET, `on_post_tool_use_failure` in Rust). The handler receives `{ sessionId, toolName, toolArgs, error, timestamp, workingDirectory }` — the `error` field is a string extracted from the tool's failure result — and may return `{ additionalContext: string }` to inject extra guidance for the model (e.g. retry hints). See the [hooks overview](./hooks-overview.md) for the full list.
> <a id="failure-variant"></a>
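
As a rough sketch of the failure variant described above, a handler might inspect the `error` string and return a retry hint. The interfaces below are local, hypothetical declarations that mirror the listed fields; the SDK's exported type names may differ, and the type of `timestamp` and the use of `undefined` to mean "no extra context" are assumptions.

```typescript
// Local stand-in types mirroring the fields listed in the note above.
// These are assumptions for illustration, not the SDK's exported types.
interface PostToolUseFailureInput {
  sessionId: string;
  toolName: string;
  toolArgs: unknown;
  error: string;            // string extracted from the tool's failure result
  timestamp: number;        // assumed numeric; the actual type is not shown here
  workingDirectory: string;
}

interface PostToolUseFailureOutput {
  additionalContext: string;
}

// Return a retry hint for timeouts; return undefined to add nothing.
function onPostToolUseFailure(
  input: PostToolUseFailureInput,
): PostToolUseFailureOutput | undefined {
  if (/timed out|ETIMEDOUT/i.test(input.error)) {
    return {
      additionalContext:
        `The ${input.toolName} call timed out. Retry once with a smaller input.`,
    };
  }
  return undefined;
}
```

Keeping the handler a pure function of its input makes it easy to unit test before registering it on a session.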
Expand Down Expand Up @@ -507,6 +507,6 @@ const session = await client.createSession({

## See also

- [Hooks Overview](./README.md)
- [Pre-Tool Use Hook](./pre-tool-use.md)
- [Error Handling Hook](./error-handling.md)
* [Hooks Overview](./README.md)
* [Pre-Tool Use Hook](./pre-tool-use.md)
* [Error Handling Hook](./error-handling.md)
2 changes: 1 addition & 1 deletion docs/setup/azure-managed-identity.md
Expand Up @@ -233,7 +233,7 @@ await client.stop();

| Variable | Description | Example |
|----------|-------------|---------|
| `AZURE_TOKEN_CREDENTIALS` | When running in **Azure**, set it to `ManagedIdentityCredential`. When running **locally**, set it to either `dev` or a developer tool credential name, such as `AzureCliCredential`. | |
| `AZURE_TOKEN_CREDENTIALS` | When running in **Azure**, set it to `ManagedIdentityCredential`. When running **locally**, set it to either `dev` or a developer tool credential name, such as `AzureCliCredential`. | `ManagedIdentityCredential` |
| `FOUNDRY_RESOURCE_URL` | Your Microsoft Foundry resource URL | `https://<my-resource>.openai.azure.com` |

No API key environment variable is needed—authentication is handled by `DefaultAzureCredential`, which automatically supports: