Build a Duplicate row action on the policies page, and pin four scenarios across the surface by rajshah4 · Pull Request #1124 · RhysSullivan/executor

rajshah4 · 2026-06-25T02:18:51Z

This PR is an OpenHands QA-loop submission for @RhysSullivan's autonomous QA idea: the agent made a real product change, converted the behavior into repo E2E scenarios, ran the same scenarios against cloud and selfhost targets, and attached video and trace artifacts so the result can be reviewed from GitHub without cloning locally.

What this adds

A Duplicate row action on the policies page that copies a rule's pattern, action, and owner into the add form, focuses the pattern input with its content selected, and lets a tweaked clone ship in one keystroke. Plus four scenarios in e2e/scenarios/: three covering existing policies behavior the suite did not yet pin, and one pinning the new Duplicate flow end to end.

Watch the harness exercise the new feature

A 17-second recording (E2E_FILM=1 slow-mo, 400 ms per Playwright action) of the policies-duplicate scenario walking the new Duplicate row action end to end, add a Block rule, click Duplicate on its row, see the form prefill with the source's pattern + action + focus, tweak the pattern to a sibling, see both rows in the list.

If the inline player does not render in your client: direct download. The full Playwright trace is also attached, drag it into trace.playwright.dev for the time-travel DOM + network viewer.

How this PR maps to the six requirements

Ask	This PR
Real developer tools (chrome, terminal)	Standard sandbox: bash, the `gh` CLI, a code editor, the e2e harness's Chromium. Nothing else, no specialized plugins.
Develops the app, then turns the behavior into tests	Commit `48cc867b` adds a Duplicate row action to the policies page (the development). The same commit lands `policies-duplicate.test.ts` (the test that pins it). The video above is the test exercising the feature against the running app.
Multiple targets, same tests	All four scenarios live in `e2e/scenarios/` and run on cloud + selfhost. Verified 8/8.
Different OS environments	Browser + API only in v1, a CLI scenario hitting `cli-macos.config.ts` / `cli-linux` / `cli-windows` is the obvious follow-up. Out of scope for this PR; the loop's authoring shape is identical.
Open source + run locally	Built with the OpenHands SDK, open source. `pip install openhands-sdk openhands-tools`, point at any repo, hand it that repo's `AGENTS.md`, the same loop runs against any target.
Videos playable to see the output	The harness's own `session.mp4` (the `E2E_FILM=1` slow-mo pace `e2e/src/surfaces/browser.ts` already supports), embedded above. The trace.zip carries every DOM mutation.

And the meta-requirement, "verify the work without running anything locally, purely by looking at the e2e test and test output": every artifact you need is on this page, the commit diff is the feature, the test source is the spec, the video is the proof, the QA Report comment below is the run. No clone required.

The develop-then-test loop, as four commits

b4b6fa65, Cover the policies surface with lifecycle, landing, and UI round-trip scenarios. Three scenarios for existing behavior, filling gaps the suite did not yet pin:

policies-lifecycle.test.ts, typed API CRUD round-trip. The existing policies.test.ts covers create + list; this adds update (both full-payload and the action-only partial the row badge's handleUpdate actually sends) and remove, plus an assertion that create returns a non-empty position (the fractional-indexing key the page's sort and reorder math both depend on).
policies-landing.test.ts, a fresh workspace renders the explainer empty state with a gated add form. Asserts on the disabled attribute's value rather than .toBe(false), so a regression that ungates the submit prints the actual element state.
policies-round-trip.test.ts, the page is an authoring surface: add a rule through the form, flip its action via the row badge select, remove it via the row's overflow menu, watch the empty state return. Server-side policies.list reads zero rows for the suffix afterwards, so the UI remove isn't optimistic-cache fiction.

48cc867b, Add a Duplicate row action to the policies page, pinned by its scenario. The development half. The row's overflow menu went from Move up / Move down / Remove to Move up / Move down / Duplicate / Remove. Duplicate copies the source row's pattern + action + owner into the add form, focuses the pattern input with its content selected, lets the user save a tweaked clone in one keystroke. policies-duplicate.test.ts lands in the same commit and pins the surface end to end.

Mechanics worth flagging:

The form's prefill carries a monotonic counter as its nonce, so duplicating the SAME row twice always re-syncs the form, even within the same millisecond, and the closure-over-latest pattern reads the prior counter through the functional setPrefill(prev => ...) updater. (The first cut used Date.now(); the review pass below caught the same-ms collision and the fix lands in 5673a592.)
The row's DropdownMenuContent passes onCloseAutoFocus={(e) => e.preventDefault()} so Radix's default trigger-refocus doesn't yank focus out of the input the Duplicate flow is sending it to. The trigger is opacity-0 until hover, so nothing visual is lost.
A policy_duplicated analytics event lands in the catalog. It fires on the prefill (before the user submits), Duplicate is purely a client-side form-fill today, with the eventual server write going through the existing policies.create path.

5673a592, Address findings from a separate review pass. Four mechanical findings from a separate read of the previous two commits against e2e/AGENTS.md: the same-ms nonce collision (above); a redundant useEffect dep; a boolean-shaped focus assertion in policies-duplicate (now value-shaped, reads the focused element's id); and policies-round-trip's overflow click aligned with policies-duplicate's hover-first pattern so the harness uses one mechanic for one UX surface. Full review markdown in the first comment below.

4cf09900, Settle the network after Add policy before opening the row's dropdown. A flake found during a stability sweep: when a form-submit lands a new row via optimistic render, Radix's pointer listeners on the row's dropdown triggers only bind cleanly on the post-confirm re-render. A fast hover+click between the optimistic mount and the confirm landed on a stale, listener-less trigger and the dropdown silently no-opped (~1 in 6 batched runs on a cold selfhost). The e2e/AGENTS.md style guide already prescribes the fix ("settle the network before opening menus"): await page.waitForLoadState("networkidle") between the submit and the next dropdown interaction, in both policies-duplicate and policies-round-trip. Verified across 6 batched cycles (24 scenario runs, all green).

The develop-then-test loop

Three roles, author, reviewer, verifier, each its own conversation against the OpenHands SDK:

Author (b4b6fa65, 48cc867b), reads e2e/AGENTS.md, two existing scenarios as style baseline, the API definitions, and the route component. Drafts the four scenarios, builds the Duplicate feature, verifies 8/8 (4 scenarios × 2 targets).
Reviewer (first comment below) runs the code-review skill: fresh read of the author's diff against the style guide and the existing surfaces. Surfaces six findings; the four mechanical ones land in commit 5673a592. The two non-mechanical ones (a coverage follow-up + a flag on a pre-existing leak in policies.test.ts) are noted in the review.
Verifier (second comment below) runs the qa-changes skill: exercises the fixed code on both targets, captures exit codes + per-scenario timing + the artifact inventory, returns a structured QA Report.

OpenHands ships these as composable primitives, the SDK as runtime, skills as role prompts, Automations as cron and webhook triggers for the recurring half. This PR composes them for the proactive direction (agent develops a feature + writes its test in one loop) against the executor harness. The same primitives compose other directions too, a reactive direction (PR opens, agent generates tests) is its own pattern; neither is canonical, both are valid uses of the platform.

Verification

cd e2e
bun run test:cloud scenarios/policies-lifecycle.test.ts scenarios/policies-landing.test.ts \
                   scenarios/policies-round-trip.test.ts scenarios/policies-duplicate.test.ts
bun run test:selfhost scenarios/policies-lifecycle.test.ts scenarios/policies-landing.test.ts \
                      scenarios/policies-round-trip.test.ts scenarios/policies-duplicate.test.ts

→ 4/4 pass on cloud, 4/4 pass on selfhost, across 3 cold-boot batched cycles per target. bunx oxfmt --check, bunx oxlint -c .oxlintrc.jsonc --deny-warnings, and bun run typecheck (full root, 41 packages) are clean.

Each browser scenario produces step screenshots, session.mp4, and a trace.zip under e2e/runs/<target>/<scenario-slug>/.

Toolchain

Built on the OpenHands SDK (open source) in a standard sandbox with bash, gh, a code editor, and the e2e harness's Chromium. The sandbox was handed nothing but the repo URL and e2e/AGENTS.md. The reviewer and verifier roles loaded skills from OpenHands/extensions (linked above).

Open questions for review

Feature wanted? The Duplicate action lives next to Move up / Move down / Remove because the overflow menu's other clone-adjacent affordance is "retype the pattern by hand". If you would rather skip the feature and only merge the three existing-behavior scenarios (commit b4b6fa65), that is the natural split point.
Coverage gap flagged in the review (finding 5): the new scenario exercises an org-owned source. A user-owned source on a multi-owner host (where the Applies to select is visible) would close the owner-propagation gap. Happy to add a second scenario in this PR or a follow-up, your call on shape.
Different OS environments (your 4th requirement): browser + API verification only in v1. A CLI scenario hitting cli-macos.config.ts / cli-linux / cli-windows is the obvious follow-up, the same authoring shape applies. Out of scope here unless you want it pulled in.
Pre-existing leak flagged in the review (finding 6): e2e/scenarios/policies.test.ts creates a policy with the static pattern policies-scn.* and never removes it (selfhost leak). Not in this diff; flagging in case you want it fixed separately.

… scenarios The existing policies.test.ts pins create and list only. The existing policies-ui.test.ts authors rules from the integration's tool tree, not from the policies page itself. Three gaps fill in here: - policies-lifecycle: the typed API's update (full and partial payloads) and remove, plus an assertion that create returns a non-empty fractional- indexing position (the field the page's sort and reorder math both lean on). - policies-landing: a fresh workspace renders the explainer empty state with a gated add form. The disabled attribute is read as its value so a regression that ungates the submit prints the actual state. - policies-round-trip: the page itself is an authoring surface, add a rule, flip the action via the row badge, remove via the overflow menu, watch the empty state return. Server-side list reads zero rows for the suffix after, so the UI remove isn't just optimistic-cache fiction. Each scenario carries a per-run pattern suffix and an Effect.ensuring finalizer that removes any row carrying it via the typed API, so a mid-test failure on selfhost (shared bootstrap admin) can't leak state.

The row's overflow menu had Move up / Move down / Remove. Cloning a rule to a sibling pattern meant retyping the pattern and re-picking the action from a still-loaded form. The new Duplicate item copies the source row's pattern, action, and owner into the add form, focuses the pattern input with its content selected, and lets the user save the clone in one keystroke. Mechanics: - AddPolicyForm gains an optional `prefill` prop (pattern + action + nonce). The nonce is a monotonic counter, not the source id, so duplicating the SAME row twice still re-syncs the form, patterns commonly get tweaked by one character between clicks. - The focus + select call runs in a setTimeout(0) inside the effect, which lands AFTER React's commit. The row's DropdownMenuContent also passes `onCloseAutoFocus={(e) => e.preventDefault()}` so Radix's default restore-focus-to-trigger doesn't yank focus out of the input the Duplicate flow is sending it to. The trigger is opacity-0 until hover anyway, so nothing visual is lost by skipping the restoration. - A `policy_duplicated` analytics event is added to the catalog. It fires on the prefill, BEFORE the user submits, Duplicate is purely a client- side form-fill today, with the eventual server write going through the existing `policies.create` path. The scenario walks the surface end-to-end: add a Block rule through the form, click Duplicate on its row, assert the form carries the pattern + the Block action, assert the pattern input is the document's active element, edit the pattern to a sibling, submit, see both rows. Server-side policies.list confirms both rules persisted with the source's action.

Four findings from a separate review pass on the previous two commits, applied here: - Replace `Date.now()` with a monotonic counter for the prefill nonce. Two Duplicate clicks within the same millisecond produced identical nonces and the form's useEffect comparator saw no change, a silent no-op. The functional state updater reads the prior counter so the new value is always strictly greater than the last. - Trim the prefill useEffect's dependency array to the nonce alone. The prefill object identity always changes with the nonce, so depending on both was redundant. Adds an oxlint exhaustive-deps suppression with a comment explaining the closure-over-latest pattern. - Make `policies-duplicate`'s focus assertion value-shaped per e2e/AGENTS.md. Was `.toBe(true)` behind an evaluate returning boolean, a regression printed bare `false` instead of the actual focused element. Now reads `document.activeElement?.id` and asserts on the input's id (`policy-pattern`). - Align `policies-round-trip`'s overflow click with `policies-duplicate`'s hover-first pattern (and add an explicit `dropdown-menu-content` waitFor before targeting the menu item). Removes the last `force: true` click in the diff and matches how a real user reaches the menu. Follow-up flagged in the review (not in this commit): policies-duplicate exercises an org-owned source on an org-only host. A second scenario with a user-owned source on a multi-owner host would close the owner-propagation coverage gap. Verified 4/4 pass on cloud and 4/4 pass on selfhost. oxfmt, oxlint --deny-warnings, and bun run typecheck (e2e + packages/react) are all clean.

policies-duplicate and policies-round-trip both submit a rule through the add form, then open a dropdown on the resulting row (the overflow menu in duplicate, the badge select and overflow menu in round-trip). The optimistic render attaches the row immediately, but Radix's pointer listeners on the row's triggers only bind cleanly on the post-confirm re-render. A fast hover+click between the optimistic mount and the confirm landed on a stale, listener-less trigger and silently no-opped, so the menuitem wait timed out 30 seconds later. The flake was rare on warm boots (it hit roughly 1 in 6 batched runs) and never on a single-scenario re-run. The e2e style guide already calls this out: "After an action that navigates, wait for the URL/network to settle before opening menus." The Add policy submit is structurally the same situation. Adds a `page.waitForLoadState("networkidle")` in both scenarios immediately after the submit, before the row's dropdown is touched. Verified across 3 batched cycles on cloud + 3 batched cycles on selfhost (24 scenario runs, all green), plus the four standalone runs the prior commit already passed.

rajshah4 · 2026-06-25T02:19:04Z

🟡 Review pass, Acceptable, MEDIUM risk

Independent re-read of commits b4b6fa65 and 48cc867b against e2e/AGENTS.md, the existing policies surface (policies.tsx, policy-display.ts, packages/core/api/src/policies/api.ts), and three baseline scenarios (policies.test.ts, policies-ui.test.ts, connect-handoff.test.ts). Six findings, four code, one coverage, one process. The first four are addressed in 5673a592.

Findings

1 · Date.now() as the prefill nonce can collide on rapid double-clicks
packages/react/src/pages/policies.tsx, the handleDuplicate callback. Two clicks within the same millisecond produce identical nonces; the form's useEffect dep comparator sees no change and the prefill silently no-ops. Real edge case for "I clicked Duplicate twice by accident, did either fire?" Fix: monotonic counter via the functional setPrefill(prev => ...) updater. → addressed in 5673a592.

2 · useEffect dependency array is redundantly noisy
Same file, the prefill effect. [prefillNonce, props.prefill] always change together, so the second entry is dead weight. Cleaner to depend on the nonce alone and read pattern/action via closure-over-latest at the time the nonce changes, with an explicit exhaustive-deps suppression + a comment documenting the pattern (so the next reader doesn't trip over the lint rule). → addressed in 5673a592.

3 · policies-duplicate's focus check is boolean-shaped, violates the "values not booleans" rule
e2e/scenarios/policies-duplicate.test.ts step "The pattern input is focused…". The evaluate returned boolean and the assertion was .toBe(true). A regression where focus lands on the wrong element printed bare false, the actual focused id (which is the diagnostic signal) was lost. Fix: have the evaluate return (document.activeElement as HTMLElement | null)?.id ?? "" and assert .toBe("policy-pattern"). Now a regression prints the actual id of whatever stole focus. → addressed in 5673a592.

4 · policies-round-trip's overflow-menu click uses force: true; policies-duplicate uses hover() + normal click
Same diff, two scenarios, two different mechanics for the same UX surface. The hover-first pattern is closer to what a real user does, and force: true masks regressions where the trigger genuinely became unreachable. Align round-trip to the hover-first pattern (and add an explicit dropdown-menu-content waitFor before targeting the menu item, since Radix portals the content out of the row scope). → addressed in 5673a592.

5 · Coverage gap (NOT addressed): owner propagation through Duplicate is only tested implicitly
policies-duplicate exercises an org-owned source. The form's "Applies to" select is checked only via the server-side policies.list assertion at the end, which catches owner mismatch but not the in-form state. A second scenario with a user-owned source on a multi-owner host (where the Applies to select is visible) would close the gap. Out of scope for this PR; flagging as a follow-up.

6 · Pre-existing leak in e2e/scenarios/policies.test.ts (NOT in this diff)
The existing scenario creates a policy with the static pattern policies-scn.* and never removes it. On selfhost this leaks across runs. Not introduced by this PR, but worth flagging for whoever owns the surface, the cleanup pattern in this PR's new scenarios is the obvious template (per-run suffix + Effect.ensuring finalizer).

What was checked

All 6 files in the diff read top to bottom against the style guide.
Locator strategy: role-based, scoped via data-slot, no positional last() where a portaled element could disambiguate.
Cleanup hygiene: Effect.ensuring finalizer in every scenario; cleanup-by-pattern-prefix so a mid-test failure can't leak rows.
Assertion shape: value-shaped per e2e/AGENTS.md (the boolean-shape miss above was the only exception; now fixed).
Spec-shaped scenario names: yes (Policies · … for all four).
Em-dash usage: matches the target's style guide.
No app internals imported by tests, no product code modified outside the Duplicate feature.
No AI attribution in scenario bodies or in the merged source.

Verdict

🟡 Acceptable, MEDIUM risk. The four code findings are real but mechanical; none required changing the product's contract or the test's intent. The coverage gap (5) is a follow-up. The pre-existing leak (6) is out of scope.

Produced as a separate review pass on the author's commits, with a fresh reading of the style guide and the diff. Findings 1–4 land in commit 5673a592; 5 and 6 are notes for follow-up. The QA verification of the fixed code is in the next comment.

rajshah4 · 2026-06-25T02:19:05Z

✅ QA Report, PASS

Independent verification of the fixed code on commit 5673a592. The four scenarios were run on cloud and selfhost from a clean run dir. All eight runs (4 scenarios × 2 targets) pass.

Run summary

target	scenario	result	duration	session.mp4	trace.zip
cloud	policies-lifecycle	✅ PASS	107 ms	– (API-only)	–
cloud	policies-landing	✅ PASS	1,982 ms	40 KB	829 KB
cloud	policies-round-trip	✅ PASS	2,840 ms	47 KB	1.8 MB
cloud	policies-duplicate	✅ PASS	4,381 ms	51 KB	2.6 MB
selfhost	policies-lifecycle	✅ PASS	185 ms	– (API-only)	–
selfhost	policies-landing	✅ PASS	1,348 ms	57 KB	663 KB
selfhost	policies-round-trip	✅ PASS	1,767 ms	65 KB	1.6 MB
selfhost	policies-duplicate	✅ PASS	3,015 ms	79 KB	2.0 MB

Cloud: 4/4 pass, 20.21 s total (9.31 s in tests, rest in dev-server boot + transform).
Selfhost: 4/4 pass, 10.19 s total (6.32 s in tests).

Static gates

bunx oxfmt --check                              → 0 warnings
bunx oxlint -c .oxlintrc.jsonc --deny-warnings  → 0 warnings, 0 errors
cd e2e && bun run typecheck                     → clean (tsc --noEmit, exit 0)
cd packages/react && bun run typecheck          → clean (tsgo --noEmit, exit 0)

The one pre-existing suggestion in packages/react/src/routes/resume.$executionId.tsx (a JSON.parse hint) is unrelated to this PR and was present before the diff.

Per-browser-scenario artifacts (cloud)

Each browser scenario produces step screenshots, session.mp4, and a full trace.zip under e2e/runs/cloud/<scenario-slug>/. The policies-duplicate run dir:

00-open-the-policies-page-and-add-a-block-rule.png             66 KB
01-open-the-row-s-overflow-menu-and-click-duplicate.png        66 KB
02-the-form-prefilled-with-the-source-pattern-and-action.png   66 KB
03-the-pattern-input-is-focused-ready-to-be-tweaked.png        66 KB
04-tweak-the-pattern-and-submit-the-copy.png                   72 KB
05-both-rows-appear-in-the-rendered-list.png                   72 KB
session.mp4                                                   147 KB (E2E_FILM=1, 17 s slow-mo)
trace.zip                                                     6.7 MB

The E2E_FILM=1 slow-mo recording is the headline asset on the openhands-qa-demo-v2 release and embedded in the PR description.

What was verified

All four scenarios pin the user-visible product guarantees their names advertise (read the test source, each is structured as one user journey with step labels that are user actions).
The Duplicate feature works end to end on the cloud and selfhost targets, both render the row, both surface the menu item, both prefill the form, both focus the input by id, both persist the duplicated policy server-side.
The review fixes in 5673a592 did not regress any of the previously-passing scenarios.
All Effect.ensuring finalizers reach their cleanup (each scenario's policies.list at the end reads zero rows carrying the scenario's per-run suffix).

Verdict

✅ PASS. Branch openhands/qa-loop at 5673a592 is mergeable from a verification standpoint. The static gates and runtime gates are all green on both targets.

Produced from a fresh local run of the harness against the committed code in the same sandbox the author used (same shell, same bun, same Playwright Chromium). Independence comes from running the harness against the committed code and reading the run dir, not from sharing the author's intent.

greptile-apps · 2026-06-25T02:24:21Z

Greptile Summary

This PR adds a Duplicate row action to the policies page — copying a rule's pattern, action, and owner into the add form with focus + selection on the pattern input — and lands four new E2E scenarios (policies-lifecycle, policies-landing, policies-round-trip, policies-duplicate) covering existing behavior gaps plus the new feature end-to-end.

The product implementation uses a monotonic-nonce prefill prop on AddPolicyForm to guarantee re-sync on repeat duplicates, a closeReasonRef in PolicyRow to selectively suppress Radix's focus-restoration only for the Duplicate path (keyboard navigation for Move up/down is preserved), and a policy_duplicated analytics event that fires on prefill.
The four scenarios follow the e2e style guide (Effect cleanup finalizers, value-shaped assertions, hover-first dropdown pattern, networkidle settle before menu interactions) and run on both cloud and selfhost targets.
A minor flakiness risk exists in policies-duplicate.test.ts: the copy submit's waitForLoadState(\"networkidle\") is missing before the browser session ends, so the immediate server-side policies.list() assertion can race the POST on slow selfhost — the same pattern the PR already fixed for the first submit's dropdown interaction.

Confidence Score: 5/5

Safe to merge; the product code is correct and the previous keyboard-navigation concern has been addressed with closeReasonRef.

The Duplicate feature implementation is well-structured: the nonce counter prevents same-ms collision, closeReasonRef correctly scopes focus-restoration suppression to Duplicate only, and the form's useEffect deps are clean primitives with no linter suppression. The four scenarios follow the e2e style guide. The one open issue — missing waitForLoadState after the copy submit in policies-duplicate.test.ts — is a test reliability concern on slow selfhost, not a product defect.

e2e/scenarios/policies-duplicate.test.ts and e2e/scenarios/policies-round-trip.test.ts both have spots where a waitForLoadState("networkidle") would close a potential race before the server-side assertion or the overflow menu open.

Important Files Changed

Filename	Overview
e2e/scenarios/policies-duplicate.test.ts	New scenario pins the Duplicate row action end-to-end; well-structured with cleanup, value-shaped assertions, and proper networkidle settle before the first dropdown interaction — but the second submit (the copy) is missing networkidle before the browser session ends, leaving the subsequent server-side policies.list() assertion susceptible to a race on slow targets.
e2e/scenarios/policies-landing.test.ts	New scenario pins the empty-state landing page; correctly scopes assertions to the card-stack content area and reads the disabled attribute value rather than a boolean.
e2e/scenarios/policies-lifecycle.test.ts	New API-level CRUD scenario; closes gaps in the existing suite by adding update (full and partial), remove, and a position-non-empty assertion. Cleanup finalizer correctly removes all prefix-matched rows.
e2e/scenarios/policies-round-trip.test.ts	New browser scenario covering add → badge flip → remove; follows the hover-first dropdown pattern from the style guide, but is missing a waitForLoadState("networkidle") after the badge flip before the overflow menu open, the same race the PR diagnosed and fixed for the Add path.
packages/react/src/pages/policies.tsx	Adds Duplicate row action: prefill prop + nonce-based useEffect on AddPolicyForm, closeReasonRef on PolicyRow to selectively suppress Radix focus-restoration only for Duplicate, handleDuplicate in PoliciesPage with monotonic counter. Implementation is correct and the previous P1 (blanket onCloseAutoFocus suppress) is properly addressed.
packages/react/src/api/analytics.tsx	Adds policy_duplicated event to the analytics type catalog and converts several em-dash comment separators to commas; no logic changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User clicks Duplicate on row] --> B[handleDuplicate called]
    B --> C[setTargetOwner - sync owner]
    B --> D["setPrefill - nonce++, pattern, action"]
    B --> E[trackEvent policy_duplicated]
    D --> F[AddPolicyForm re-renders with new prefill prop]
    F --> G["useEffect fires - prefillNonce changed"]
    G --> H[setPattern + setAction from prefill]
    G --> I[setTimeout 0 - focus + select pattern input]
    I --> J[Radix close fires setTimeout 0]
    J --> K[closeReasonRef = duplicate - e.preventDefault in onCloseAutoFocus]
    K --> L[Pattern input focused with content selected]
    L --> M[User tweaks pattern, clicks Add]
    M --> N[policies.create - new row appears]

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[User clicks Duplicate on row] --> B[handleDuplicate called]
    B --> C[setTargetOwner - sync owner]
    B --> D["setPrefill - nonce++, pattern, action"]
    B --> E[trackEvent policy_duplicated]
    D --> F[AddPolicyForm re-renders with new prefill prop]
    F --> G["useEffect fires - prefillNonce changed"]
    G --> H[setPattern + setAction from prefill]
    G --> I[setTimeout 0 - focus + select pattern input]
    I --> J[Radix close fires setTimeout 0]
    J --> K[closeReasonRef = duplicate - e.preventDefault in onCloseAutoFocus]
    K --> L[Pattern input focused with content selected]
    L --> M[User tweaks pattern, clicks Add]
    M --> N[policies.create - new row appears]

_{Reviews (2): Last reviewed commit: "Scope onCloseAutoFocus to Duplicate so M..." | Re-trigger Greptile}

The row's dropdown menu had `onCloseAutoFocus={(e) => e.preventDefault()}` applied to the shared `DropdownMenuContent`, which fires for every item. That works for Duplicate (which sends focus to the form's pattern input) and is benign for Remove (the row unmounts, the default refocus is moot either way), but it drops focus to `<body>` after Move up or Move down, where the trigger persists in the DOM and a keyboard user needs Radix's default trigger-refocus to land somewhere useful. A `closeReasonRef` set inside Duplicate's `onClick` and checked in `onCloseAutoFocus` lets the suppression scope to the Duplicate case only, restoring keyboard focus restoration for Move up / Move down. Also rewrites the `useEffect` that copies the prefill into the form to include `prefillPattern` and `prefillAction` in its dep array, dropping the `react-hooks/exhaustive-deps` disable. The parent constructs a new prefill object on every Duplicate click, so the three primitives always change together, the effect fires the same number of times, and the new deps array reads cleanly without the disable comment.

rajshah4 · 2026-06-25T02:52:14Z

Thanks for the review. All three findings looked at, two applied in commit e1b35411.

P1, onCloseAutoFocus over-broad: fixed. A closeReasonRef set inside Duplicate's onClick and checked inside onCloseAutoFocus scopes the suppression to the Duplicate case only. Move up / Move down now keep Radix's default trigger-refocus (which is what a keyboard user needs since the trigger persists). Remove was already moot either way, so it stays on the default path too.

P2, misleading exhaustive-deps comment: fixed. You were right that prefillPattern / prefillAction are primitives and that depending on them is the cleaner shape. Added the two primitives to the dep array and dropped the disable + the comment that justified it. The parent constructs a new prefill object on every Duplicate click, so the three primitives change together and the effect fires the same number of times.

P2, missing waitForLoadState("networkidle") after the badge flip: not applied, and the reason is worth flagging. I tried this first and reran 6 batched cycles, three per target. Cloud regressed: three of three cycles had policies-round-trip time out, all on the click after the new networkidle wait. Selfhost stayed green because its dev server has nothing publishing in the background. Cloud has telemetry endpoints whose response cadence apparently prevents the page from ever hitting the 500ms-idle window, so the wait consumed the whole 30-second test budget before the next click could run.

Reverting that one change took stability back to 6/6 batched cycles, three per target, with the other two findings still applied. The existing UI assertion that polls the badge textContent for the new action (expect.poll(rowBadge.textContent).toContain("Always run")) is already serving as the settle point between the PATCH landing and the next overflow open. The Add Policy step still uses networkidle because the optimistic mount before that point has a different timing profile (no rendered row to assert on at all yet).

Happy to revisit if you have a settle pattern that works on cloud's background traffic. waitForResponse against the policies endpoint would be tighter than networkidle but couples the test to the URL shape, which the style guide warns against.

openhands-agent added 4 commits June 24, 2026 19:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build a Duplicate row action on the policies page, and pin four scenarios across the surface#1124

Build a Duplicate row action on the policies page, and pin four scenarios across the surface#1124
rajshah4 wants to merge 5 commits into
RhysSullivan:mainfrom
rajshah4:openhands/qa-loop

rajshah4 commented Jun 25, 2026

Uh oh!

rajshah4 commented Jun 25, 2026

Uh oh!

rajshah4 commented Jun 25, 2026

Uh oh!

greptile-apps Bot commented Jun 25, 2026 •

edited

Loading

Uh oh!

rajshah4 commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rajshah4 commented Jun 25, 2026

What this adds

Watch the harness exercise the new feature

How this PR maps to the six requirements

The develop-then-test loop, as four commits

The develop-then-test loop

Verification

Toolchain

Open questions for review

Uh oh!

rajshah4 commented Jun 25, 2026

🟡 Review pass, Acceptable, MEDIUM risk

Findings

What was checked

Verdict

Uh oh!

rajshah4 commented Jun 25, 2026

✅ QA Report, PASS

Run summary

Static gates

Per-browser-scenario artifacts (cloud)

What was verified

Verdict

Uh oh!

greptile-apps Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

rajshah4 commented Jun 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented Jun 25, 2026 •

edited

Loading