
[Feature]: Allow Users to Select MinerU Parsing Mode (Pipeline / VLM / Hybrid) #16182

@kennyzheng-builds


Background

When parsing documents with MinerU for knowledge bases or in-conversation file uploads, different document types and accuracy needs call for different parsing strategies. Today the parsing strategy is fixed, so users cannot trade off between speed and extraction quality, and scanned/image-heavy or complex-layout documents may be parsed sub-optimally.

Goal

Let users choose the MinerU parsing mode (Pipeline / VLM / Hybrid) so document parsing can be matched to the document type and the desired accuracy.

Spec

  • In the MinerU document-parsing settings, expose a parsing-mode selector with at least: Pipeline, VLM, and Hybrid.
  • Each mode shows a short, plain-language description of when to use it (e.g. fast/general vs. higher accuracy for complex or image-based documents).
  • The selected mode is persisted and applied to subsequent MinerU parsing tasks (knowledge base preprocessing and in-conversation file parsing) until the user changes it.
  • A sensible default mode is preselected, so users who never touch the setting still get reasonable results without any configuration.
  • The mode choice is respected for both single-file and directory/batch uploads.
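The Spec items above could be modelled roughly as follows. This is a minimal sketch, not the app's actual code: every name here (`MinerUParsingMode`, `MODE_OPTIONS`, `KeyValueStore`, `STORAGE_KEY`, `buildParseTasks`) is hypothetical, the description strings are placeholder wording, and `pipeline` as the default is an assumption, since the issue does not say which mode should be preselected.

```typescript
// Hypothetical names throughout; nothing here is taken from MinerU or the host app.
type MinerUParsingMode = "pipeline" | "vlm" | "hybrid";

interface ModeOption {
  mode: MinerUParsingMode;
  label: string;
  description: string; // short, plain-language hint shown next to the selector
}

// Placeholder descriptions; the real wording is up to the implementer.
const MODE_OPTIONS: readonly ModeOption[] = [
  { mode: "pipeline", label: "Pipeline", description: "Fast, general-purpose parsing." },
  { mode: "vlm", label: "VLM", description: "Higher accuracy for scanned or image-based documents." },
  { mode: "hybrid", label: "Hybrid", description: "Combines both approaches for complex layouts." },
];

// Assumed default; the issue only asks for "a sensible default".
const DEFAULT_MODE: MinerUParsingMode = "pipeline";

// Minimal storage abstraction so the sketch does not depend on a specific settings backend.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

const STORAGE_KEY = "mineru.parsingMode";

function isParsingMode(value: unknown): value is MinerUParsingMode {
  return MODE_OPTIONS.some((option) => option.mode === value);
}

// Falls back to the default when nothing is stored or the stored value is unknown.
function loadParsingMode(store: KeyValueStore): MinerUParsingMode {
  const saved = store.get(STORAGE_KEY);
  return isParsingMode(saved) ? saved : DEFAULT_MODE;
}

function saveParsingMode(store: KeyValueStore, mode: MinerUParsingMode): void {
  store.set(STORAGE_KEY, mode);
}

interface ParseTask {
  filePath: string;
  mode: MinerUParsingMode;
}

// One task per file, so single-file and batch uploads go through the same path
// and always carry the currently selected mode.
function buildParseTasks(store: KeyValueStore, filePaths: string[]): ParseTask[] {
  const mode = loadParsingMode(store);
  return filePaths.map((filePath) => ({ filePath, mode }));
}
```

Reading the mode once per batch, rather than once per file, keeps a directory upload consistent even if the user changes the setting while the batch is being queued.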

Verification

  • Opening MinerU parsing settings shows a mode selector offering Pipeline, VLM, and Hybrid.
  • Selecting a non-default mode and parsing a document causes that mode to be used; reopening the app preserves the selection.
  • Parsing an image-based / scanned PDF in VLM (or Hybrid) yields more complete and more accurate extracted text than Pipeline on the same file.
  • Switching modes does not break existing knowledge bases already parsed under a previous mode.
  • Default mode is applied when the user has never changed the setting.
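One way to satisfy the "does not break existing knowledge bases" check is to record the mode on each parsed document and treat a missing value as legacy data. The sketch below assumes a hypothetical record shape (`ParsedDocumentRecord`, `parsedWithMode`) and a hypothetical policy that documents are only re-parsed when the user explicitly asks; neither is specified in this issue.

```typescript
// Hypothetical record shape; the app's real knowledge base schema may differ.
type MinerUParsingMode = "pipeline" | "vlm" | "hybrid";

interface ParsedDocumentRecord {
  id: string;
  content: string;
  // Optional on purpose: records created before the selector existed have no mode.
  parsedWithMode?: MinerUParsingMode;
}

// Human-readable label for the mode a record was parsed with.
function describeParsedMode(record: ParsedDocumentRecord): string {
  return record.parsedWithMode ?? "legacy (mode not recorded)";
}

// Assumed policy: changing the setting never re-parses on its own.
// A record is re-parsed only on explicit request, and only if its mode differs.
function needsReparse(
  record: ParsedDocumentRecord,
  currentMode: MinerUParsingMode,
  userRequestedReparse: boolean,
): boolean {
  return userRequestedReparse && record.parsedWithMode !== currentMode;
}
```

Keeping the field optional means existing records load unchanged, and a mode switch affects only documents parsed afterwards.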
