add metadata and coco to asset store generation by BryonLewis · Pull Request #1707 · Kitware/dive

BryonLewis · 2026-06-19T18:08:30Z

Updates the generateSampleData.py script to include COCO JSON and metadata information when creating data for assetstore import.

This script creates a folder structure with random video and image-sequence datasets. After running it, you can run `uv run --script minIOConfig.py` to create a MinIO bucket with some auth credentials for importing into Girder.

This can be used to simulate Google Cloud buckets or AWS S3 buckets and to import data laid out in this format. You can then check that the relevant data is converted and imported properly.

The system can now create viame-csv, DIVE JSON, or COCO JSON annotation files. I've defaulted to using viame-csv and COCO JSON files with embedded random dataset metadata.
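As a rough illustration of what a COCO JSON file with embedded random dataset metadata might look like, here is a minimal sketch. It is not the code in generateSampleData.py: the function name `build_sample_coco`, the metadata fields (`station`, `depth_m`), and the single `fish` category are all hypothetical. The `dive_dataset_info` key under `info` is taken from the commit message in this PR, which also says values are treated as opaque strings and the key is omitted when empty.

```python
import json
import random


def build_sample_coco(image_names, seed=0, with_metadata=True):
    """Build a tiny COCO-style dict with optional random dataset metadata.

    Hypothetical helper for illustration only; field names are assumptions.
    """
    rng = random.Random(seed)

    # Random per-dataset metadata. Values are kept as strings because the
    # commit message describes them as opaque strings.
    dataset_info = {}
    if with_metadata:
        dataset_info = {
            "station": f"station-{rng.randint(1, 99):02d}",
            "depth_m": str(rng.randint(5, 500)),
        }

    images = [
        {"id": idx + 1, "file_name": name}
        for idx, name in enumerate(image_names)
    ]
    # One random box per image, in COCO [x, y, width, height] order.
    annotations = [
        {
            "id": idx + 1,
            "image_id": image["id"],
            "category_id": 1,
            "bbox": [rng.randint(0, 100), rng.randint(0, 100), 50, 50],
        }
        for idx, image in enumerate(images)
    ]

    info = {}
    if dataset_info:
        # Only written when non-empty, so files without metadata
        # carry no extra key.
        info["dive_dataset_info"] = dataset_info

    return {
        "info": info,
        "images": images,
        "annotations": annotations,
        "categories": [{"id": 1, "name": "fish"}],
    }


if __name__ == "__main__":
    print(json.dumps(build_sample_coco(["frame_0001.png", "frame_0002.png"]), indent=2))
```

Seeding the generator keeps the sample output reproducible, which makes it easier to compare what was written with what shows up on the folder after import.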

I ran this and confirmed that ingestion of the data also includes the metadata properly.

* Round-trip datasetInfo through KWCOCO export and import

  Carry the dataset's datasetInfo metadata through the KWCOCO/coco_json path on both the Girder server and the Electron desktop client, mirroring the existing VIAME CSV passthrough. datasetInfo travels under a single `datasetInfo` key in the COCO `info` block, advertised in `info.dive_extensions`, and is omitted entirely when empty so exports stay byte-unchanged for datasets without it.

  Export: export_dive_as_coco / coco.ts serializeFile write info.datasetInfo when non-empty; the server export endpoint reads it from folder metadata and the desktop export already loads it onto JsonMeta.

  Import: load_coco_as_tracks_and_attributes returns info.datasetInfo (4th tuple element, like the VIAME loader's fps) and process_items merges it per-key into the folder's datasetInfo (imported values win, existing-only keys preserved, other metadata untouched); coco.ts parseFile surfaces it so the desktop import plumbing merges it onto the dataset JsonMeta. Values are treated as opaque strings.

  Tests: extend the server kwcoco tests and desktop coco.spec.ts for export-writes, empty-omits (byte-unchanged), and import-restores on both platforms.

* Extract and test datasetInfo import merge; type the export param

* Unify single-dataset and multicam COCO export paths

  Route the single-dataset coco_json export through _coco_json_export_text instead of duplicating the track-filtering, image_filenames, and export_dive_as_coco logic inline. This fixes the multicam export dropping datasetInfo: the merge now lives in one place feeding both paths.

* Round-trip datasetInfo through VIAME CSV; snake_case the serialized keys

  Serialized keys: rename the per-dataset station metadata key to snake_case to match each export format's conventions -- `dive_dataset_info` in the COCO `info` block (alongside `dive_notes`, `dive_detection_attributes`) and `dataset_info` on the VIAME CSV `# metadata` line. Internal DIVE meta/model/client keys stay camelCase (`datasetInfo`); the serializers translate at the boundary.

  VIAME CSV import: add a symmetric `# metadata` parser (mirror of writeHeader) that restores datasetInfo on both the server and desktop paths, so VIAME CSV round-trips it like COCO. This also fixes fps import -- read `fps:` case-insensitively; the importer only matched a capitalized `Fps:`, though DIVE and native VIAME both write lowercase `fps:`. load_csv_as_tracks_and_attributes now returns datasetInfo as a 5th tuple element.

  Import merge semantics: datasetInfo now follows the import "Overwrite" checkbox on both COCO and VIAME, mirroring annotations -- Overwrite (default) replaces the block, additive merges per-key (imported values win). A file that carries no datasetInfo never touches existing metadata, in either mode.

  Tests and docs updated to match; flip the prior "ignored on parse" VIAME test to assert restore and add an fps case-insensitivity regression.

* Fix desktop Overwrite datasetInfo replace and a broken fps test assertion

  Desktop import persists meta via lodash `merge`, which deep-merges datasetInfo and keeps stale keys, so the "Overwrite" path never actually replaced the block the way the server (and the docs) describe. Assign the imported block wholesale in dataFileImport when not additive; the additive path keeps its per-key pre-merge. Correct the now-inaccurate comment that credited saveMetadata.

  Drop a stray `assert warnings == []` that the new fps case-insensitivity test carried over from a sibling test -- `warnings` is unbound there, so the test raised NameError instead of running.

* Simplify datasetInfo import: centralize merge, drop redundant guards

  - common.ts: assign datasetInfo explicitly in dataFileImport for both the Overwrite and additive cases, removing the second merge block (and its redundant metadata re-read) from _ingestFilePath. Mirrors the server.
  - coco.ts: drop the redundant truthiness clause; the typeof+isEmpty guard already covers it.
  - crud_rpc.py: return the meta dict directly instead of collapsing an empty dict back to None (the sole caller treats both as falsy).

* Centralize datasetInfo import resolution; share serializer type aliases

  - Replace merge_imported_dataset_info with resolve_imported_dataset_info, which owns the full overwrite/additive/absent decision as a pure function so the import call site drops to two lines.
  - Add Attributes, Warnings, and DatasetInfo aliases to dive_utils.types and point the KWCOCO and VIAME serializers, the CocoMetadata/JsonMeta models, and the resolver at them.
  - Expand the resolver tests to cover the overwrite-replaces and absent-block branches the helper now owns.

* Harden VIAME dataset_info parsing and tidy serializer comments

  Reject JSON arrays/null (not just non-objects) when reading the dataset_info comment field, report the actual kind in the warning, and cover the malformed/number/array/null cases with tests. Trim the now over-explained comments and docstrings in the VIAME/COCO serializers.

* Cover dataset info config imports

* add metadata and coco to asset store generation (#1707)

---------

Co-authored-by: Bryon Lewis <61746913+BryonLewis@users.noreply.github.com>
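The overwrite/additive/absent import semantics described in the commit message can be sketched as a small pure function. This is an illustration of the stated rules only, not the actual resolve_imported_dataset_info from the DIVE server: the name `resolve_dataset_info_sketch`, its signature, and the choice to treat an empty block the same as an absent one are assumptions.

```python
from typing import Dict, Optional

# Values are treated as opaque strings, per the commit message.
DatasetInfo = Dict[str, str]


def resolve_dataset_info_sketch(
    existing: Optional[DatasetInfo],
    imported: Optional[DatasetInfo],
    additive: bool = False,
) -> Optional[DatasetInfo]:
    """Return the datasetInfo block a folder should carry after an import.

    Hypothetical sketch of the three documented branches:
    - absent: the file carries no datasetInfo, so existing is left untouched
    - additive: merge per key, imported values win, existing-only keys kept
    - overwrite (default): the imported block replaces the existing one
    """
    if not imported:
        # Assumption: an empty dict counts as "absent", matching the
        # exporter omitting the key entirely when there is nothing to write.
        return existing

    if additive:
        merged = dict(existing or {})
        merged.update(imported)  # imported values win on key collisions
        return merged

    # Overwrite: replace wholesale so stale keys do not survive.
    return dict(imported)


if __name__ == "__main__":
    current = {"station": "A", "depth_m": "10"}
    incoming = {"station": "B"}
    print(resolve_dataset_info_sketch(current, incoming))
    print(resolve_dataset_info_sketch(current, incoming, additive=True))
    print(resolve_dataset_info_sketch(current, None))
```

Returning a new dict rather than mutating `existing` keeps the function pure, which is what lets both the overwrite and additive call sites share it and makes each branch easy to test in isolation.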

add metadata and coco to asset store generation

dd3eba6

BryonLewis merged commit e1c49e6 into coco-dataset-info Jun 22, 2026
3 checks passed

BryonLewis deleted the assetstore-metadata-import branch June 22, 2026 12:24
