feat: improve OpenAI error handling and surfacing by HardeepAsrani · Pull Request #178 · Codeinwp/hyve-lite

HardeepAsrani · 2026-06-30T04:44:54Z

Summary

Improves how the plugin handles and surfaces OpenAI errors, addressing three support-driven issues:

Codeinwp/hyve#149 — a free/no-credit key no longer validates as "good" then fails silently later.
Codeinwp/hyve#199 — knowledge-base indexing failures are now visible instead of failing silently.
Codeinwp/hyve#200 — admins get a clear dashboard notice when something requiring their action breaks the bot.

What changed

Key validation (#149). Validation now hits the embeddings endpoint (the capability the plugin actually uses), so a key is accepted only when it's genuinely usable — valid auth and available credits. Real-world finding: a brand-new unfunded account returns 429 even on the free moderation endpoint (with a null error code), so the old moderation-based check was both wrong and unhelpful. Embeddings returns a clean insufficient_quota. Invalid keys are blocked; account-level problems (no credits, rate limit) save the key and warn rather than blocking, so the user can proceed once they add credits.
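The accept / warn / block decision described above could be sketched roughly as follows. This is an illustration, not the plugin's code: `decideKeyValidation`, the `ProbeResult` shape and the contents of the auth-code list are assumptions, and only `insufficient_quota` is named in this PR. Treating every non-auth failure as "save and warn" is likewise an assumption.

```typescript
// Illustrative sketch only. All names here are hypothetical, not the plugin's API.

// Result of a test call to the embeddings endpoint.
type ProbeResult =
  | { ok: true }
  | { ok: false; code: string | null }; // the error code can be null

type KeyDecision = {
  saveKey: boolean;
  outcome: "accepted" | "warned" | "blocked";
};

// Assumed stand-in for the list of auth-failure codes.
const AUTH_ERROR_CODES: string[] = ["invalid_api_key"];

function decideKeyValidation(probe: ProbeResult): KeyDecision {
  if (probe.ok === true) {
    return { saveKey: true, outcome: "accepted" };
  }
  if (probe.code !== null && AUTH_ERROR_CODES.indexOf(probe.code) !== -1) {
    // Invalid key: reject it and do not store it.
    return { saveKey: false, outcome: "blocked" };
  }
  // Account-level problem (no credits, rate limit): keep the key, warn the admin.
  return { saveKey: true, outcome: "warned" };
}
```

With this shape, a no-credit key yields `{ saveKey: true, outcome: "warned" }`, which matches the "key saves, warning appears" case in the test plan.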

Actionable messages (#149/#199). A single source of truth maps OpenAI error codes to actionable, translatable copy (e.g. "…no available credits… add billing or upgrade to a paid plan"), consumed by both REST responses and the dashboard notice. Error-code lists are centralised into OpenAI::AUTH_ERROR_CODES / OpenAI::PERSISTED_ERROR_CODES constants so they can't drift.
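A single-source-of-truth message map might look something like the sketch below. The function names, the fallback text and all message copy are placeholders rather than the plugin's actual strings, and `t` is an identity stand-in for a real translation call.

```typescript
// Illustrative sketch only. Names and message copy are placeholders.

// Identity stand-in for a translation function; a real plugin would call
// its i18n helper on each literal here.
function t(text: string): string {
  return text;
}

const ERROR_MESSAGES: { [code: string]: string } = {
  invalid_api_key: t("The API key was rejected. Check the key and save it again."),
  insufficient_quota: t("The account has no available credits."),
  rate_limit_exceeded: t("Requests are being rate limited. Try again shortly."),
};

const FALLBACK_MESSAGE = t("The AI service returned an unexpected error.");

// The one place that turns an error code into user-facing copy.
function messageForCode(code: string | null): string {
  const message = code === null ? undefined : ERROR_MESSAGES[code];
  // The typeof check also rejects inherited keys such as "constructor".
  return typeof message === "string" ? message : FALLBACK_MESSAGE;
}

// Both consumers go through messageForCode, so their copy cannot diverge.
function restErrorBody(code: string | null): { code: string | null; message: string } {
  return { code: code, message: messageForCode(code) };
}

function dashboardNoticeText(code: string | null): string {
  return messageForCode(code);
}
```

Because the REST response and the dashboard notice both resolve their text through the same lookup, adding or rewording a message is a one-line change.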

KB indexing failures (#199). process_post() previously swallowed embedding/Qdrant failures, retried forever, and add_post() still reported success. Failures are now classified: fatal errors (bad key, no credits, billing) stop immediately and mark the entry failed; transient errors (rate limit, network) retry with backoff up to a cap (5 attempts) then give up. The reason is surfaced both as an immediate warning toast at add time and as a status badge in the data list -- "…retried automatically" vs "…fix the problem and re-add" -- and clears on a successful attempt.
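The fatal-versus-transient split with a retry cap can be sketched as below. The cap of 5 attempts comes from this PR; the exponential schedule, the 60-second base delay, the contents of the fatal-code list and every name are assumptions made for illustration, since the PR does not state the backoff formula.

```typescript
// Illustrative sketch only. Names, delay values and code lists are assumptions.

const MAX_ATTEMPTS = 5; // retry cap stated in this PR
const BASE_DELAY_SECONDS = 60; // assumed
const FATAL_CODES: string[] = ["invalid_api_key", "insufficient_quota"]; // assumed

type NextStep =
  | { retry: true; delaySeconds: number }
  | { retry: false; reason: "fatal" | "gave_up" };

function isFatal(code: string | null): boolean {
  return code !== null && FATAL_CODES.indexOf(code) !== -1;
}

// attemptsMade counts attempts so far, including the one that just failed (>= 1).
function planNextStep(code: string | null, attemptsMade: number): NextStep {
  if (isFatal(code)) {
    // Retrying cannot help until the admin fixes the account or key.
    return { retry: false, reason: "fatal" };
  }
  if (attemptsMade >= MAX_ATTEMPTS) {
    return { retry: false, reason: "gave_up" };
  }
  // Assumed exponential backoff: 60s, 120s, 240s, 480s.
  const delaySeconds = BASE_DELAY_SECONDS * Math.pow(2, attemptsMade - 1);
  return { retry: true, delaySeconds: delaySeconds };
}
```

The `reason` field is what would let the UI pick between the "retried automatically" and "fix the problem and re-add" badges.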

Dashboard notice (#200). The service-error notice is gated to the last 24 hours, cleared on the next successful request, excludes transient rate-limits, and now updates without a page reload (store-backed ErrorSection + an apiFetch middleware that syncs from any /settings response). The notice is reconciled with the saved key only after the save lands, so it can never reflect a key that wasn't stored.
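The gating rules for the notice, and the sync step a fetch middleware could run after each response, might be sketched like this. The `service_error` field, the store shape, the transient-code list and all function names are hypothetical; the PR does not describe the `/settings` payload, and treating a missing field as "no current error" is an assumption.

```typescript
// Illustrative sketch only. Field names and shapes are assumptions.

interface ServiceError {
  code: string;
  occurredAtMs: number;
}

const NOTICE_WINDOW_MS = 24 * 60 * 60 * 1000; // 24 hours, per this PR
const TRANSIENT_CODES: string[] = ["rate_limit_exceeded"]; // assumed

function shouldShowNotice(error: ServiceError | null, nowMs: number): boolean {
  if (error === null) {
    return false; // cleared by a successful request
  }
  if (TRANSIENT_CODES.indexOf(error.code) !== -1) {
    return false; // transient rate limits are excluded
  }
  return nowMs - error.occurredAtMs < NOTICE_WINDOW_MS;
}

// Minimal stand-in for the client-side store backing the notice.
interface NoticeStore {
  error: ServiceError | null;
}

// What a fetch middleware could call once a response body is available.
function syncFromResponse(
  store: NoticeStore,
  path: string,
  body: { service_error?: ServiceError | null }
): void {
  if (path.indexOf("/settings") === -1) {
    return; // only /settings responses carry the notice state
  }
  store.error = body.service_error === undefined ? null : body.service_error;
}
```

Keeping `shouldShowNotice` pure means the same check can run on every store update, so the notice appears or disappears without a reload.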

Notes

Companion pro PR (Advanced-panel warning handling): Codeinwp/hyve#230

Test plan

Save a valid funded key → "Settings saved", no warning/notice.
Save a no-credit key → key saves, amber warning + dashboard notice appear immediately (no refresh).
Save an invalid key → blocked, no notice, key not stored.
With a no-credit key, add KB content → "Indexing failed… will retry" badge; add credits + reprocess → badge clears.
Notice clears on next successful save; disappears after 24h.

⚠️ Ingestion pipeline refactor — behavior change across ALL data sources

The latest commit reworks how content is added to the Knowledge Base, not just error handling. Previously each of the four sources — Posts, Custom Data, Site Crawl, Sitemap — re-implemented the same tokenize → moderate → insert → embed flow with subtle differences. They now all funnel through a single method, DB_Table::ingest_document(). (Pro callers move over in Codeinwp/hyve#230.)
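The idea of funnelling every source through one path can be sketched as below. This is not `DB_Table::ingest_document()` itself: the step interface, result shape and wrapper names are hypothetical, and the steps are injected only to keep the example self-contained.

```typescript
// Illustrative sketch only. Interfaces and names are assumptions.

interface IngestSteps {
  tokenize(content: string): string[];
  passesModeration(chunks: string[]): boolean;
  insertChunks(chunks: string[]): number[]; // returns stored row ids
  embed(rowIds: number[]): boolean; // true when embedding succeeded
}

type IngestResult =
  | { status: "indexed"; rowIds: number[] }
  | { status: "flagged" }
  | { status: "embedding_failed"; rowIds: number[] };

// The single shared path: tokenize, moderate, insert, embed.
function ingestDocument(content: string, steps: IngestSteps): IngestResult {
  const chunks = steps.tokenize(content);
  if (!steps.passesModeration(chunks)) {
    return { status: "flagged" }; // nothing is inserted for flagged content
  }
  const rowIds = steps.insertChunks(chunks);
  if (!steps.embed(rowIds)) {
    return { status: "embedding_failed", rowIds: rowIds };
  }
  return { status: "indexed", rowIds: rowIds };
}

// Each data source becomes a thin wrapper instead of its own copy of the flow.
function addPost(content: string, steps: IngestSteps): IngestResult {
  return ingestDocument(content, steps);
}

function addCustomData(content: string, steps: IngestSteps): IngestResult {
  return ingestDocument(content, steps);
}
```

With one shared function, a fix to ordering or failure handling applies to all four sources at once, which is also why a regression here would affect all of them.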

Because this touches every ingestion path, QA must re-test all four data sources end to end — this is a regression check on the whole "add to KB" surface, not only the OpenAI-error cases in the test plan above.

Also note one intentional behavior change: Posts "Add" is now fully synchronous — a failed embedding is surfaced immediately (warning toast) and is no longer retried in the background. Posts "Update" still retries via cron.

QA — confirm each source works as before (no regression)

For each of Posts, Custom Data, Site Crawl, Sitemap:

Add content → it appears in the Knowledge Base, status reaches indexed, and the chatbot can answer from it.
Update / edit → the change is re-indexed (Posts: on save via cron; Custom Data & Site Crawl: immediately; Sitemap: delete + re-add).
Delete → it's removed from the list and from the vector store.
Multi-chunk content (paste/import something longer than roughly 1000 tokens) → adds, updates and deletes as a single document (not split into several entries), and all chunks are indexed.
Moderation-flagged content → still blocked / shows the review as before.
With a no-credit or invalid key → the failure surfaces (toast + badge) and nothing is half-saved.

Run the above with both storage backends: default WordPress storage and Qdrant.

If anything regresses, this is isolated in its own commit on top of the error-handling work, so it can be reverted independently.

github-actions · 2026-06-30T04:46:31Z

Plugin build for 6005ae5 is ready 🛎️!

Download Plugin - Download

Validate API keys against the embeddings endpoint so a key is accepted only when it is genuinely usable (valid auth and available credits). Invalid keys are blocked, but account-level problems (no credits, rate limits) save the key and warn instead of blocking — new, unfunded accounts return a 429 even on the free moderation endpoint, so blocking on validation was wrong. Map OpenAI error codes to actionable, translatable messages from a single source of truth (centralised code-list constants), surface knowledge base indexing failures in the data UI with automatic retry, and show a dashboard service-error notice that is gated to the last 24 hours, cleared on the next successful request, and updates without a page reload via an apiFetch middleware. Refs Codeinwp/hyve#149, Codeinwp/hyve#199, Codeinwp/hyve#200

Add DB_Table::ingest_document() as the shared tokenize -> moderate -> resolve post -> insert chunks -> embed path used by every data source; add_post() becomes a thin wrapper over it. process_post() now returns its result and accepts an $allow_retry flag so a caller can opt out of the background hyve_process_post retry. Posts add runs synchronously (retry_async=false): a failed embedding is surfaced immediately instead of being retried in the background.

Qdrant only reacted to 403, so a deleted or paused cluster (404) failed silently on every chat while the Integrations page still showed it as connected. Route all Qdrant exceptions through a single handler that persists the error so it surfaces in the dashboard notice, and mark the connection inactive on 401/403/404 so the UI no longer reports a connection that no longer works. Add per-code messages so admins see the actual cause.
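A single handler of the kind described in this commit could be sketched as follows. The status codes 401/403/404 come from the commit message; the state shape, function names and message copy are placeholders, not the plugin's.

```typescript
// Illustrative sketch only. State shape, names and copy are assumptions.

const DISCONNECT_STATUSES: number[] = [401, 403, 404]; // from the commit message

interface QdrantState {
  connected: boolean;
  lastError: { status: number; message: string } | null;
}

// Placeholder per-code copy so admins can see the likely cause.
function describeStatus(status: number): string {
  switch (status) {
    case 401:
    case 403:
      return "Qdrant rejected the credentials.";
    case 404:
      return "The Qdrant cluster was not found (it may be deleted or paused).";
    default:
      return "Qdrant returned an unexpected error.";
  }
}

// Every Qdrant failure goes through here: always persist, sometimes disconnect.
function handleQdrantError(state: QdrantState, status: number): QdrantState {
  const shouldDisconnect = DISCONNECT_STATUSES.indexOf(status) !== -1;
  return {
    connected: state.connected && !shouldDisconnect,
    lastError: { status: status, message: describeStatus(status) },
  };
}
```

Returning a new state object rather than mutating the old one keeps the handler easy to test; whether the plugin does the same is not stated in the PR.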

HardeepAsrani force-pushed the feat/openai-error-handling branch from d28aec0 to 5fcb65e · June 30, 2026 05:21

HardeepAsrani requested a review from Soare-Robert-Daniel June 30, 2026 05:22

Soare-Robert-Daniel approved these changes Jun 30, 2026

HardeepAsrani wants to merge 3 commits into development from feat/openai-error-handling
