Summary
On Windows, several skill-creator scripts read and write files using Python's default text encoding (the locale encoding, typically cp1252) instead of UTF-8. Any skill whose SKILL.md, or any eval/report/JSON file the scripts touch, contains non-ASCII characters such as arrows (→), Unicode dashes and symbols, or emoji can crash the script: reads fail when the UTF-8 bytes include a byte that cp1252 leaves undefined, and writes fail on any character cp1252 cannot encode.
Repro
On Windows (Python 3.13), with a UTF-8 SKILL.md whose body contains a non-ASCII character whose encoding includes a byte undefined in cp1252 (in the repro file, byte 0x9d at position 4389):
python -m scripts.quick_validate path\to\skill
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 4389: character maps to <undefined>
The crash originates here in scripts/quick_validate.py:
content = skill_md.read_text() # no encoding -> cp1252 on Windows
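The failure mode can be reproduced on any OS by decoding explicitly as cp1252. This is a minimal sketch, not code from the repository; the sample string is made up for illustration. It also shows that some characters (the arrow, for instance) do not crash on read but silently decode to mojibake, while they do crash on write.

```python
# Illustrative sample text; not taken from the actual repro file.
text = "Use \u201csmart quotes\u201d and arrows \u2192"
data = text.encode("utf-8")

# U+201D (right double quotation mark) is E2 80 9D in UTF-8,
# and 0x9D is one of the bytes cp1252 leaves undefined.
try:
    data.decode("cp1252")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# The arrow's UTF-8 bytes (E2 86 92) are all defined in cp1252,
# so decoding succeeds but yields three wrong characters.
arrow_roundtrip = "\u2192".encode("utf-8").decode("cp1252")

# Writing fails for any character cp1252 cannot represent.
try:
    "\u2192".encode("cp1252")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

print(decode_error)
print(repr(arrow_roundtrip))
print(encode_error)
```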
Scope
This is not isolated to quick_validate. The same unencoded read_text() / write_text() / open() pattern appears at ~30 sites across scripts/ and eval-viewer/, several of which break core flows:
- scripts/utils.py: parse_skill_md calls read_text() with no encoding. It underlies the whole eval/optimize loop, so run_eval and run_loop can crash on any skill whose SKILL.md contains non-ASCII UTF-8 text.
- scripts/run_eval.py: command_file.write_text(command_content) raises UnicodeEncodeError when the skill description contains a character cp1252 cannot encode (for example → or an emoji).
- scripts/run_loop.py, scripts/generate_report.py, scripts/aggregate_benchmark.py, scripts/improve_description.py, eval-viewer/generate_review.py: HTML/JSON readers and writers with the same issue.
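One way to enumerate such call sites is a small AST scan. The sketch below is a heuristic, not the method used for the count above: the function names find_unencoded_calls and _open_mode are hypothetical, it only recognises an encoding passed as a keyword argument, and it does not cover Path.open().

```python
import ast

TEXT_METHODS = {"read_text", "write_text"}


def _open_mode(call: ast.Call) -> str:
    """Return the literal mode passed to open(), or "r" if none is visible."""
    for kw in call.keywords:
        if kw.arg == "mode" and isinstance(kw.value, ast.Constant):
            return str(kw.value.value)
    if len(call.args) >= 2 and isinstance(call.args[1], ast.Constant):
        return str(call.args[1].value)
    return "r"


def find_unencoded_calls(source: str) -> list[tuple[int, str]]:
    """List (line, name) for text-mode calls that pass no encoding= keyword."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute) and func.attr in TEXT_METHODS:
            name = func.attr
        elif isinstance(func, ast.Name) and func.id == "open":
            name = "open"
        else:
            continue
        if any(kw.arg == "encoding" for kw in node.keywords):
            continue
        # Binary-mode open() takes no encoding, so it is not a hit.
        if name == "open" and "b" in _open_mode(node):
            continue
        hits.append((node.lineno, name))
    return sorted(hits)


example = (
    "from pathlib import Path\n"
    "p = Path('SKILL.md')\n"
    "a = p.read_text()\n"
    "b = p.read_text(encoding='utf-8')\n"
    "with open('x.json') as f:\n"
    "    pass\n"
    "with open('x.bin', 'rb') as f:\n"
    "    pass\n"
    "p.write_text(a)\n"
)
found = find_unencoded_calls(example)
print(found)
```

Running it over each file's source text would give a starting list to review by hand; it can miss or over-report sites, so it is no substitute for reading the code.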
Fix
Add encoding="utf-8" to every text-mode read/write:
content = skill_md.read_text(encoding="utf-8")
path.write_text(html, encoding="utf-8")
with open(metadata_path, encoding="utf-8") as f: ...
I applied exactly this to a local copy — all ~30 sites; py_compile is clean and quick_validate then passes on Windows with no environment workaround. (Interim workaround for other users: set PYTHONUTF8=1.)
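A quick round-trip check of the explicit-encoding pattern, as a sketch: the sample text and temporary file are illustrative only and do not come from the repository. With encoding="utf-8" on both sides the result should not depend on the platform's locale encoding.

```python
import sys
import tempfile
from pathlib import Path

# Illustrative content with characters inside and outside cp1252.
sample = "description: Convert A \u2192 B, \u201cquoted\u201d \U0001F680\n"

with tempfile.TemporaryDirectory() as tmp:
    skill_md = Path(tmp) / "SKILL.md"

    # Explicit encoding on write and on both styles of read.
    skill_md.write_text(sample, encoding="utf-8")
    restored = skill_md.read_text(encoding="utf-8")
    with open(skill_md, encoding="utf-8") as f:
        via_open = f.read()

print(restored == sample, via_open == sample)

# 1 when Python's UTF-8 mode is active (for example via PYTHONUTF8=1), else 0.
print(sys.flags.utf8_mode)
```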
Happy to open a PR with the change if that's useful.
Environment
- Windows 10, Python 3.13.2