github · sunbrye · Jun 25, 2026 · Jul 2, 2026
@@ -8,18 +8,18 @@ Fleet mode is useful when the work can be decomposed before execution and each u
 
 Good fits include:
 
-- Multi-file refactors where each worker owns a file, package, or language SDK.
-- Batch reviews where each worker checks a separate diff, module, or alert group.
-- Parallel research across independent repositories, services, or feature areas.
-- Documentation refreshes where each worker owns a page or topic.
-- Migration tasks where each worker can validate its own slice and report back.
+* Multi-file refactors where each worker owns a file, package, or language SDK.
+* Batch reviews where each worker checks a separate diff, module, or alert group.
+* Parallel research across independent repositories, services, or feature areas.
+* Documentation refreshes where each worker owns a page or topic.
+* Migration tasks where each worker can validate its own slice and report back.
 
 Avoid fleet mode for:
 
-- Sequential tasks where step 2 needs the concrete output from step 1.
-- Tightly coupled edits where workers would contend for the same files.
-- Small tasks that one synchronous sub-agent or the parent agent can finish quickly.
-- Tasks that require continuous shared reasoning rather than clear ownership.
+* Sequential tasks where step 2 needs the concrete output from step 1.
+* Tightly coupled edits where workers would contend for the same files.
+* Small tasks that one synchronous sub-agent or the parent agent can finish quickly.
+* Tasks that require continuous shared reasoning rather than clear ownership.
 
 Fleet mode works best when the parent session can create clear units of work, assign one owner per unit, and define what each worker must return.
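
The fit criteria in this hunk (independent units, one owner per unit, no contended files, a defined return value) can be checked before any worker is dispatched. The following is a minimal sketch under stated assumptions: `WorkUnit`, `find_conflicts`, and `is_fleet_ready` are hypothetical names used for illustration and are not part of the SDK.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """Hypothetical description of one unit of fleet work (not an SDK type)."""
    unit_id: str
    owner: str                      # the single worker that owns this unit
    files: set[str] = field(default_factory=set)
    must_return: str = ""           # what the worker has to report back


def find_conflicts(units: list[WorkUnit]) -> dict[str, list[str]]:
    """Map each file claimed by more than one unit to the ids of those units."""
    claimed: dict[str, list[str]] = {}
    for unit in units:
        for path in sorted(unit.files):
            claimed.setdefault(path, []).append(unit.unit_id)
    return {path: ids for path, ids in claimed.items() if len(ids) > 1}


def is_fleet_ready(units: list[WorkUnit]) -> bool:
    """True when the plan has several owned, non-overlapping units."""
    if len(units) < 2:
        return False  # a single unit is better served by a synchronous call
    if any(not u.owner or not u.must_return for u in units):
        return False  # every unit needs an owner and a return contract
    return not find_conflicts(units)
```

If `find_conflicts` reports a shared file, the plan can either merge the affected units or leave that file to the parent agent to reconcile.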
 
@@ -321,29 +321,29 @@ Keep plugin-provided sub-agent types narrow and descriptive so the orchestrator
 
 ## Best practices
 
-- Decompose the work into independent units before starting fleet mode.
-- Minimize dependencies between todos; dependencies reduce parallelism.
-- Give each todo a durable ID, a clear title, and a complete description.
-- Make each sub-agent own exactly one todo at a time.
-- Use background sub-agents for truly parallel work.
-- Use synchronous sub-agent calls for serialized steps or validation gates.
-- Provide each sub-agent with complete context; sub-agents are stateless across calls.
-- Include file paths, commands, expected outputs, and constraints in each worker prompt.
-- Do not dispatch a single background sub-agent; prefer a synchronous call or batch multiple workers in parallel.
-- Avoid assigning overlapping files to different workers unless the parent agent will reconcile conflicts explicitly.
-- Require every worker to report what it changed, how it validated the change, and what remains blocked.
-- Have the parent agent verify the combined result after workers finish.
+* Decompose the work into independent units before starting fleet mode.
+* Minimize dependencies between todos; dependencies reduce parallelism.
+* Give each todo a durable ID, a clear title, and a complete description.
+* Make each sub-agent own exactly one todo at a time.
+* Use background sub-agents for truly parallel work.
+* Use synchronous sub-agent calls for serialized steps or validation gates.
+* Provide each sub-agent with complete context; sub-agents are stateless across calls.
+* Include file paths, commands, expected outputs, and constraints in each worker prompt.
+* Do not dispatch a single background sub-agent; prefer a synchronous call or batch multiple workers in parallel.
+* Avoid assigning overlapping files to different workers unless the parent agent will reconcile conflicts explicitly.
+* Require every worker to report what it changed, how it validated the change, and what remains blocked.
+* Have the parent agent verify the combined result after workers finish.
 
 ## Limitations and open questions
 
-- Fleet mode is exposed through generated session RPC bindings and is marked experimental in several SDKs.
-- The SQL todos pattern is the canonical coordination model in the runtime guidance, but whether it is a stable extensibility contract for SDK consumers is still an open question.
-- `subagentStart` and `subagentStop` are runtime hook names; this branch exposes sub-agent lifecycle to SDK consumers through the generic session event stream, not dedicated hook callbacks.
-- Plugin sub-agent registration is configured at the runtime layer through `--plugin-dir`; no SDK-level plugin registration helper was verified on this branch.
-- Java native typed bindings for `session.fleet.start` were not found in the Java SDK source on this branch.
-- Fleet mode does not remove the need for parent-agent review. Parallel workers can produce inconsistent assumptions that the orchestrator must reconcile.
+* Fleet mode is exposed through generated session RPC bindings and is marked experimental in several SDKs.
+* The SQL todos pattern is the canonical coordination model in the runtime guidance, but whether it is a stable extensibility contract for SDK consumers is still an open question.
+* `subagentStart` and `subagentStop` are runtime hook names; this branch exposes sub-agent lifecycle to SDK consumers through the generic session event stream, not dedicated hook callbacks.
+* Plugin sub-agent registration is configured at the runtime layer through `--plugin-dir`; no SDK-level plugin registration helper was verified on this branch.
+* Java native typed bindings for `session.fleet.start` were not found in the Java SDK source on this branch.
+* Fleet mode does not remove the need for parent-agent review. Parallel workers can produce inconsistent assumptions that the orchestrator must reconcile.
 
 ## See also
 
-- [Custom agents and sub-agent orchestration](custom-agents.md)
-- [Hooks](hooks.md)
+* [Custom agents and sub-agent orchestration](custom-agents.md)
+* [Hooks](hooks.md)
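
The "Best practices" list in this hunk asks for complete worker prompts (file paths, commands, expected outputs, constraints) and for a report from every worker covering what changed, how it was validated, and what remains blocked. One way to enforce both is sketched below. The `Todo` fields, the `REPORT_SECTIONS` labels, and both helper functions are assumptions made for this example and do not reflect the runtime's todo schema.

```python
from dataclasses import dataclass


@dataclass
class Todo:
    """Hypothetical todo record; the field names are illustrative."""
    todo_id: str
    title: str
    description: str
    files: list[str]
    commands: list[str]
    expected_output: str
    constraints: list[str]


# Section labels every worker report must contain (assumed convention).
REPORT_SECTIONS = ("Changed", "Validated", "Blocked")


def build_worker_prompt(todo: Todo) -> str:
    """Render a self-contained prompt, since workers keep no state across calls."""
    lines = [
        f"Todo {todo.todo_id}: {todo.title}",
        todo.description,
        "Files you own: " + ", ".join(todo.files),
        "Commands to run: " + "; ".join(todo.commands),
        "Expected output: " + todo.expected_output,
        "Constraints: " + "; ".join(todo.constraints),
        "Report back with these sections: " + ", ".join(REPORT_SECTIONS),
    ]
    return "\n".join(lines)


def missing_report_sections(report: str) -> list[str]:
    """List required sections that the worker's report does not contain."""
    return [s for s in REPORT_SECTIONS if f"{s}:" not in report]
```

The parent agent could call `missing_report_sections` on each worker's reply and re-prompt when the list is non-empty, before it verifies the combined result.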
@@ -1051,16 +1051,16 @@ const session = await client.createSession({
 
 For full type definitions, input/output field tables, and additional examples for every hook, see the API reference:
 
-- [Hooks Overview](../hooks/hooks-overview.md)
-- [Pre-Tool Use](../hooks/pre-tool-use.md)
-- [Post-Tool Use](../hooks/post-tool-use.md)
-- [User Prompt Submitted](../hooks/user-prompt-submitted.md)
-- [Session Lifecycle](../hooks/session-lifecycle.md)
-- [Error Handling](../hooks/error-handling.md)
+* [Hooks Overview](../hooks/hooks-overview.md)
+* [Pre-Tool Use](../hooks/pre-tool-use.md)
+* [Post-Tool Use](../hooks/post-tool-use.md)
+* [User Prompt Submitted](../hooks/user-prompt-submitted.md)
+* [Session Lifecycle](../hooks/session-lifecycle.md)
+* [Error Handling](../hooks/error-handling.md)
 
 ## See also
 
-- [Getting Started](../getting-started.md)
-- [Custom Agents & Sub-Agent Orchestration](./custom-agents.md)
-- [Streaming Session Events](./streaming-events.md)
-- [Debugging Guide](../troubleshooting/debugging.md)
+* [Getting Started](../getting-started.md)
+* [Custom Agents & Sub-Agent Orchestration](./custom-agents.md)
+* [Streaming Session Events](./streaming-events.md)
+* [Debugging Guide](../troubleshooting/debugging.md)
@@ -150,7 +150,7 @@ Create `index.ts`:
 import { CopilotClient } from "@github/copilot-sdk";
 
 const client = new CopilotClient();
-const session = await client.createSession({ model: "gpt-4.1" });
+const session = await client.createSession({ model: "auto" });
 
 const response = await session.sendAndWait({ prompt: "What is 2 + 2?" });
 console.log(response?.data.content);
@@ -181,7 +181,7 @@ async def main():
     client = CopilotClient()
     await client.start()
 
-    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1")
+    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto")
     response = await session.send_and_wait("What is 2 + 2?")
     print(response.data.content)
 
@@ -223,7 +223,7 @@ func main() {
 	}
 	defer client.Stop()
 
-	session, err := client.CreateSession(ctx, &copilot.SessionConfig{Model: "gpt-4.1"})
+	session, err := client.CreateSession(ctx, &copilot.SessionConfig{Model: "auto"})
 	if err != nil {
 		log.Fatal(err)
 	}
@@ -304,7 +304,7 @@ using GitHub.Copilot;
 await using var client = new CopilotClient();
 await using var session = await client.CreateSessionAsync(new SessionConfig
 {
-    Model = "gpt-4.1",
+    Model = "auto",
     OnPermissionRequest = PermissionHandler.ApproveAll
 });
 
@@ -337,7 +337,7 @@ public class HelloCopilot {
 
             var session = client.createSession(
                 new SessionConfig()
-                    .setModel("gpt-4.1")
+                    .setModel("auto")
                     .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
             ).get();
 
@@ -383,7 +383,7 @@ import { CopilotClient } from "@github/copilot-sdk";
 
 const client = new CopilotClient();
 const session = await client.createSession({
-    model: "gpt-4.1",
+    model: "auto",
     streaming: true,
 });
 
@@ -419,7 +419,7 @@ async def main():
     client = CopilotClient()
     await client.start()
 
-    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1", streaming=True)
+    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto", streaming=True)
 
     # Listen for response chunks
     def handle_event(event):
@@ -466,7 +466,7 @@ func main() {
 	defer client.Stop()
 
 	session, err := client.CreateSession(ctx, &copilot.SessionConfig{
-		Model:     "gpt-4.1",
+		Model:     "auto",
 		Streaming: copilot.Bool(true),
 	})
 	if err != nil {
@@ -562,7 +562,7 @@ using GitHub.Copilot;
 await using var client = new CopilotClient();
 await using var session = await client.CreateSessionAsync(new SessionConfig
 {
-    Model = "gpt-4.1",
+    Model = "auto",
     OnPermissionRequest = PermissionHandler.ApproveAll,
     Streaming = true,
 });
@@ -602,7 +602,7 @@ public class HelloCopilot {
 
             var session = client.createSession(
                 new SessionConfig()
-                    .setModel("gpt-4.1")
+                    .setModel("auto")
                     .setStreaming(true)
                     .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
             ).get();
@@ -912,7 +912,7 @@ const getWeather = defineTool("get_weather", {
 
 const client = new CopilotClient();
 const session = await client.createSession({
-    model: "gpt-4.1",
+    model: "auto",
     streaming: true,
     tools: [getWeather],
 });
@@ -968,7 +968,7 @@ async def main():
     client = CopilotClient()
     await client.start()
 
-    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1", streaming=True, tools=[get_weather])
+    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto", streaming=True, tools=[get_weather])
 
     def handle_event(event):
         if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
@@ -1045,7 +1045,7 @@ func main() {
 	defer client.Stop()
 
 	session, err := client.CreateSession(ctx, &copilot.SessionConfig{
-		Model:     "gpt-4.1",
+		Model:     "auto",
 		Streaming: copilot.Bool(true),
 		Tools:     []copilot.Tool{getWeather},
 	})
@@ -1185,7 +1185,7 @@ var getWeather = CopilotTool.DefineTool(
 
 await using var session = await client.CreateSessionAsync(new SessionConfig
 {
-    Model = "gpt-4.1",
+    Model = "auto",
     OnPermissionRequest = PermissionHandler.ApproveAll,
     Streaming = true,
     Tools = [getWeather],
@@ -1259,7 +1259,7 @@ public class HelloCopilot {
 
             var session = client.createSession(
                 new SessionConfig()
-                    .setModel("gpt-4.1")
+                    .setModel("auto")
                     .setStreaming(true)
                     .setTools(List.of(getWeather))
                     .setOnPermissionRequest(PermissionHandler.APPROVE_ALL)
@@ -1316,7 +1316,7 @@ const getWeather = defineTool("get_weather", {
 
 const client = new CopilotClient();
 const session = await client.createSession({
-    model: "gpt-4.1",
+    model: "auto",
     streaming: true,
     tools: [getWeather],
 });
@@ -1389,7 +1389,7 @@ async def main():
     client = CopilotClient()
     await client.start()
 
-    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="gpt-4.1", streaming=True, tools=[get_weather])
+    session = await client.create_session(on_permission_request=PermissionHandler.approve_all, model="auto", streaming=True, tools=[get_weather])
 
     def handle_event(event):
         if event.type == SessionEventType.ASSISTANT_MESSAGE_DELTA:
@@ -1482,7 +1482,7 @@ func main() {
 	defer client.Stop()
 
 	session, err := client.CreateSession(ctx, &copilot.SessionConfig{
-		Model:     "gpt-4.1",
+		Model:     "auto",
 		Streaming: copilot.Bool(true),
 		Tools:     []copilot.Tool{getWeather},
 	})
@@ -1671,7 +1671,7 @@ var getWeather = CopilotTool.DefineTool(
 await using var client = new CopilotClient();
 await using var session = await client.CreateSessionAsync(new SessionConfig
 {
-    Model = "gpt-4.1",
+    Model = "auto",
     OnPermissionRequest = PermissionHandler.ApproveAll,
     Streaming = true,
     Tools = [getWeather]
@@ -1765,7 +1765,7 @@ public class WeatherAssistant {
 
             var session = client.createSession(
                 new SessionConfig()
-                    .setModel("gpt-4.1")
+                    .setModel("auto")
                     .setStreaming(true)
                     .setOnPermissionRequest(request ->
                         CompletableFuture.completedFuture(PermissionDecision.allow())

@@ -2,10 +2,10 @@
 
 The `onPostToolUse` hook is called **after** a tool executes **successfully**. Use it to:
 
-- Transform or filter tool results
-- Log tool execution for auditing
-- Add context based on results
-- Suppress results from the conversation
+* Transform or filter tool results
+* Log tool execution for auditing
+* Add context based on results
+* Suppress results from the conversation
 
 > **Failure variant** — `onPostToolUse` only fires for successful tool executions. To observe **failed** tool calls, register `onPostToolUseFailure` (`on_post_tool_use_failure` in Python, `OnPostToolUseFailure` in Go/.NET, `on_post_tool_use_failure` in Rust). The handler receives `{ sessionId, toolName, toolArgs, error, timestamp, workingDirectory }` — the `error` field is a string extracted from the tool's failure result — and may return `{ additionalContext: string }` to inject extra guidance for the model (e.g. retry hints). See the [hooks overview](./hooks-overview.md) for the full list.
 > <a id="failure-variant"></a>
@@ -507,6 +507,6 @@ const session = await client.createSession({
 
 ## See also
 
-- [Hooks Overview](./README.md)
-- [Pre-Tool Use Hook](./pre-tool-use.md)
-- [Error Handling Hook](./error-handling.md)
+* [Hooks Overview](./README.md)
+* [Pre-Tool Use Hook](./pre-tool-use.md)
+* [Error Handling Hook](./error-handling.md)
@@ -233,7 +233,7 @@ await client.stop();
 
 | Variable | Description | Example |
 |----------|-------------|---------|
-| `AZURE_TOKEN_CREDENTIALS` | When running in **Azure**, set it to `ManagedIdentityCredential`. When running **locally**, set it to either `dev` or a developer tool credential name, such as `AzureCliCredential`. | |
+| `AZURE_TOKEN_CREDENTIALS` | When running in **Azure**, set it to `ManagedIdentityCredential`. When running **locally**, set it to either `dev` or a developer tool credential name, such as `AzureCliCredential`. | `ManagedIdentityCredential` |
 | `FOUNDRY_RESOURCE_URL` | Your Microsoft Foundry resource URL | `https://<my-resource>.openai.azure.com` |
 
 No API key environment variable is needed—authentication is handled by `DefaultAzureCredential`, which automatically supports: