A project that helps distill models into a dataset using the Batch API (50% discount).
- GitHub: Claude Opus 4.6 • GPT-5.4
- HuggingFace: Claude Opus 4.6 • GPT-5.4
- Easy to use commands
- JS/TS Stack
- Hugging Face compatible
- Bun installed
- AI Provider API Key
- Funds in your AI provider balance
- Source dataset
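
For orientation, here is a minimal sketch of how rows from a source dataset might be turned into batch requests. It is not taken from this project: the `SourceRow` shape, the helper names, and the `teacher-model-name` placeholder are assumptions, and the request layout follows the JSONL shape that OpenAI documents for its Batch API (`custom_id`, `method`, `url`, `body`). Other providers use a different layout.

```typescript
// Hypothetical shapes: the project's real request format is not shown in this README.
interface SourceRow {
  id: string;
  prompt: string;
}

interface BatchRequestLine {
  custom_id: string;
  method: "POST";
  url: string;
  body: {
    model: string;
    messages: { role: "system" | "user"; content: string }[];
  };
}

// Turn one source row into one batch request object.
function toBatchLine(row: SourceRow, model: string, systemPrompt: string): BatchRequestLine {
  return {
    custom_id: row.id, // lets each result be matched back to its source row
    method: "POST",
    url: "/v1/chat/completions",
    body: {
      model,
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: row.prompt },
      ],
    },
  };
}

// Serialize rows as JSONL: one JSON object per line.
function toJsonl(rows: SourceRow[], model: string, systemPrompt: string): string {
  return rows.map((row) => JSON.stringify(toBatchLine(row, model, systemPrompt))).join("\n");
}

const sample: SourceRow[] = [
  { id: "row-1", prompt: "Explain ownership in Rust." },
  { id: "row-2", prompt: "Write a Go HTTP handler." },
];
const jsonl = toJsonl(sample, "teacher-model-name", "Answer concisely.");
console.log(jsonl);
```

Keeping a stable `custom_id` per row matters because batch results are not guaranteed to come back in submission order.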
bun install
Development:
bun run dev
Production:
bun run start
bun run export
A file should be ready at dataset/
bun run stat
PROMPT:
Generate a dataset for different/mixed purposes with focus on programming language.
Most of dataset should be:
- Rust
- Golang
- TypeScript/JavaScript/Node.js
- Also other programming languages
should be included.
Most output should be simpler, not verbose or expanded.
Only last 15% rows should be verbose with expanded details.
Output dataset entries: ~2000 rows
Apache-2.0
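
The row budget in the prompt above can be made concrete: with roughly 2000 rows and the last 15% verbose, about 1700 rows are concise and about 300 are verbose. The sketch below shows that arithmetic; `splitRows` and `styleForRow` are hypothetical helpers for illustration, not part of this project.

```typescript
// Split a row budget into concise and verbose counts.
function splitRows(
  totalRows: number,
  verboseFraction: number,
): { concise: number; verbose: number } {
  const verbose = Math.round(totalRows * verboseFraction);
  return { concise: totalRows - verbose, verbose };
}

// Pick the style for a row by its zero-based position: only the trailing rows are verbose.
function styleForRow(
  index: number,
  totalRows: number,
  verboseFraction: number,
): "concise" | "verbose" {
  const { concise } = splitRows(totalRows, verboseFraction);
  return index >= concise ? "verbose" : "concise";
}

const plan = splitRows(2000, 0.15);
console.log(plan); // { concise: 1700, verbose: 300 }
console.log(styleForRow(1699, 2000, 0.15)); // concise
console.log(styleForRow(1700, 2000, 0.15)); // verbose
```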