[kosha] Simplify and optimize create_kosha
This commit aims to improve the ergonomics, build time, and disk usage
of `Kosha` and its associated classes. It is likely the first of several
such commits, but it is an important checkpoint as we continue to lean
more and more on data from `vidyut-prakriya`.
akprasad committed Dec 25, 2024
1 parent 5d6cd17 commit 863c667
Showing 53 changed files with 1,719 additions and 2,279 deletions.
44 changes: 23 additions & 21 deletions Cargo.lock

Some generated files are not rendered by default.

8 changes: 5 additions & 3 deletions Makefile
@@ -34,12 +34,13 @@ create_sandhi_rules:
 RUST_LOG=info cargo run --release --bin create_sandhi_rules -- \
 --data-dir data/build/vidyut-latest

-# Creates a koshas and write it to disk.
+# Creates a kosha and write it to disk.
 create_kosha:
 RUST_LOG=info cargo run --release --bin create_kosha -- \
 --input-dir data/raw/lex \
 --dhatupatha vidyut-prakriya/data/dhatupatha.tsv \
---output-dir data/build/vidyut-latest
+--output-dir data/build/vidyut-latest/kosha
+

 # Trains a padaccheda model and saves important features to disk.
 # NOTE: when training, exclude the file paths used in `make eval`.
@@ -57,7 +58,8 @@ train_cheda:

 # Runs basic end-to-end tests against the given kosha.
 test_kosha:
-RUST_LOG=info cargo run --release --bin test_kosha -- --data-dir data/build/vidyut-latest/kosha
+RUST_LOG=info cargo run --release --bin test_kosha -- \
+--data-dir data/build/vidyut-latest/kosha


 # Evaluate our parsing quality on a large sample of text.

2 changes: 1 addition & 1 deletion scripts/create_all_data.sh
@@ -43,7 +43,7 @@ echo "========================="
 echo "vidyut-chandas"
 echo "========================="
 mkdir -p "${OUTPUT_DIR}/chandas"
-cp -r vidyut-chandas/data "${OUTPUT_DIR}/chandas"
+cp -r vidyut-chandas/data/* "${OUTPUT_DIR}/chandas"
 echo "Copied files to output dir."
 echo
 echo "========================="