WikiText syntax dataset for auto UI generation in TiddlyWiki.
- Read the folders listed in the config. This creates a snapshot of the TW core and plugins, and skips a tiddler as a duplicate if its text is the same as the first item's input (see Data for details about the snapshot).
- Generate more QA pairs with templates
- Generate the missing Q or A using an LLM
- Generate review API calls and upload them to the review platform
- Ask the community for help with reviewing
- Import updated WikiText when the TiddlyWiki version is bumped, and rerun the above pipeline
- Export the dataset for LLM fine-tuning
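The duplicate check in the first step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `SnapshotItem` shape and the `isDuplicate` helper are hypothetical names, and the only behavior taken from the description above is "skip when the text equals the first item's input".

```typescript
// Hypothetical shape of one snapshot item; only `input` is implied by the README.
interface SnapshotItem {
  input: string;
}

// Returns true when the freshly read text matches the first item's input,
// meaning the tiddler can be skipped as a duplicate.
function isDuplicate(items: SnapshotItem[] | undefined, text: string): boolean {
  if (items === undefined) return false; // nothing recorded yet for this tiddler
  return items[0]?.input === text; // an empty list never counts as a duplicate
}

// Usage: keep only the texts that are new or changed.
const recorded: SnapshotItem[] = [{ input: "<$button>Save</$button>" }];
const freshlyRead = ["<$button>Save</$button>", "<$button>Save all</$button>"];
const toProcess = freshlyRead.filter((text) => !isDuplicate(recorded, text));
console.log(toProcess.length);
```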
We generate the material so that humans don't need to create it from scratch.
Prompts are WikiText tiddlers with variable transclusion, stored in the wiki's prompts folder.
Each tiddler's wikified body will be put on the review platform. Prompt tiddlers can use the following variables from the pipeline.
- InputTiddler: .tid file content including the metadata part, read from TW core or other plugins.
- inputWikiText: Text part of the tiddler, extracted by the pipeline.
- aIOutput: GPT-generated output, generated by the pipeline.
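To illustrate how the text part can be separated from the metadata part of a .tid file, here is a minimal sketch. It relies on the standard .tid layout (header lines of the form `field: value`, then a blank line, then the text). The `parseTid` function and `ParsedTid` type are hypothetical names, not the pipeline's real API, and the sketch ignores edge cases such as files without a header.

```typescript
interface ParsedTid {
  fields: Record<string, string>; // metadata part
  text: string; // what the README calls inputWikiText
}

function parseTid(inputTiddler: string): ParsedTid {
  const normalized = inputTiddler.replace(/\r\n/g, "\n");
  // The first blank line separates the header fields from the text.
  const separator = normalized.indexOf("\n\n");
  const header = separator === -1 ? normalized : normalized.slice(0, separator);
  const text = separator === -1 ? "" : normalized.slice(separator + 2);

  const fields: Record<string, string> = {};
  for (const line of header.split("\n")) {
    const colon = line.indexOf(":"); // split on the first colon only
    if (colon === -1) continue;
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { fields, text };
}

// Usage with a made-up tiddler (the title is illustrative only).
const exampleTid = "title: $:/example/Button\ntags: example\n\n<$button>Click</$button>\n";
const exampleParsed = parseTid(exampleTid);
console.log(exampleParsed.fields["title"], exampleParsed.text);
```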
- Use pure JS to get InputTiddler and inputWikiText
- These variables will be available in tiddlers in the pipeline's prompts folder.
- We process the members of the "input" X "Data prompt tiddler" matrix one by one. For each member, we get the prompt field of the data tiddler as the DataPrompt variable
- Compose the variables using WikiText, and use the wikified tiddler text as AI input to get the content for the aIOutput variable.
graph TD
A[TW core or other plugins] --> B[InputTiddler and inputWikiText]
B --> D[Pipeline Template]
C[Data Prompt Templates' Metadata] --> D
D --> E[aIOutput]
E --> F[Data Prompt Templates' Body]
B --> F
F --> G[Text to Review]
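The flow in the graph above can be sketched as follows. This is a simplified illustration under several assumptions: `renderTemplate` replaces `<<name>>` placeholders and is only a stand-in for real wikification, the `InputItem` and `DataPromptTiddler` shapes are hypothetical, and the `generate` callback is synchronous for brevity, whereas a real LLM call would be asynchronous.

```typescript
type Variables = Record<string, string>;

interface InputItem {
  inputTiddler: string; // full .tid content
  inputWikiText: string; // text part only
}

interface DataPromptTiddler {
  prompt: string; // the "prompt" field (metadata)
  body: string; // the tiddler body, rendered into the text to review
}

// Stand-in for wikification: substitutes <<name>> with the variable's value,
// leaving unknown placeholders untouched.
function renderTemplate(template: string, variables: Variables): string {
  return template.replace(/<<([A-Za-z]+)>>/g, (match: string, name: string) => {
    const known = Object.prototype.hasOwnProperty.call(variables, name);
    const value = known ? variables[name] : undefined;
    return value === undefined ? match : value;
  });
}

// Walks the "input" X "data prompt tiddler" matrix one member at a time.
function processMatrix(
  inputs: InputItem[],
  dataPrompts: DataPromptTiddler[],
  pipelineTemplate: string,
  generate: (aiInput: string) => string,
): string[] {
  const textsToReview: string[] = [];
  for (const input of inputs) {
    for (const dataPrompt of dataPrompts) {
      const variables: Variables = {
        InputTiddler: input.inputTiddler,
        inputWikiText: input.inputWikiText,
        DataPrompt: dataPrompt.prompt,
      };
      // Pipeline template + variables -> AI input -> aIOutput
      const aIOutput = generate(renderTemplate(pipelineTemplate, variables));
      // Data prompt body + all variables -> text to review
      textsToReview.push(renderTemplate(dataPrompt.body, { ...variables, aIOutput }));
    }
  }
  return textsToReview;
}
```

Passing `generate` in as a parameter keeps the sketch independent of any particular LLM provider and makes it easy to substitute a fake generator when testing.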
The review platform has "original language" and "translation" areas, because we are using a translation review platform.
The original WikiText plus the AI prompt that generates the material will be the "original language", and the AI-generated chat material will be put in the "translation" area, open for humans to review.
You can join the project on Paratranz and review each item, editing it until it is correct (the AI will make mistakes; it doesn't understand WikiText well, because good learning material like what we are creating has never existed before).
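The mapping onto the two areas can be pictured as a small record per item. The sketch below is an assumption for illustration only: the `ReviewEntry` field names (`key`, `original`, `translation`) and the way the WikiText and prompt are joined are hypothetical, and are not confirmed as the payload this project sends to the review platform.

```typescript
// Hypothetical record for one review item.
interface ReviewEntry {
  key: string; // identifier for the item, e.g. derived from the tiddler title
  original: string; // original WikiText + the AI prompt
  translation: string; // AI-generated material, to be corrected by reviewers
}

function toReviewEntry(
  key: string,
  inputWikiText: string,
  prompt: string,
  aIOutput: string,
): ReviewEntry {
  return {
    key,
    // Joining with a blank line is an arbitrary choice made for this sketch.
    original: `${inputWikiText}\n\n${prompt}`,
    translation: aIOutput,
  };
}

// Usage with made-up values.
const exampleEntry = toReviewEntry(
  "example-key",
  "<$list filter='[tag[task]]'/>",
  "Explain what this WikiText does.",
  "It lists every tiddler tagged task.",
);
console.log(JSON.stringify([exampleEntry], null, 2));
```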
- Have Deno installed
- Clone this project and the TiddlyWiki5 repo side by side in a folder.
- "cd" into this project's folder
- Get a DeepSeek API key or OpenAI API key, and put it in a .env file copied from the .env.template file.
- Run deno task dev