Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update code to support TA - Tamil language #33

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

arcturusannamalai
Copy link

  • update code to support TA - Tamil language
  • add README to show how to add new language
  • add option to download wordlist from different repository
  • add assertions for incorrectly downloaded files etc.

@hisbaan
Copy link
Owner

hisbaan commented Dec 21, 2023

Hi there @arcturusannamalai!

First off, I'd like to say I sincerely appreciate you putting in the work to add a language and making didyoumean more accessible. There are a few comments that I have regarding your PR. These all need to be addressed before a merge. I'll add them as a list so it's easier to keep track of them.

  1. Please, in the future, have your commits be more atomic. This means that each commit should try and do one thing. It greatly helps with reviewing and making comments like this :) In thise case, all of the changes are all in one commit so it makes it hard for me to see which code changes correspond to which features. For this time, it's okay since you've already done it and messing with git can be a pain, just please keep this in mind if you decide to contribute in the future.
  2. Please run cargo fmt before committing and pushing code. A lot of your changes have a change in indenting probably inserted by your editor that do not match the style of the repo. I use cargo fmt to ensure that it is standard and we do not end up with unnecessarily large and hard-to-read diffs. As it is right now, it's hard to tell what has actually changed and what was just and indenting change.
  3. Regarding the README.md, there is already a readme located at docs/README.md. In general, please make edits to that file instead of creating another file. More specific to this case, when demonstrating binary features, I typically like to use the actual binary instead of cargo run -- as this will confuse non-developer users. I also do not include the output of the commands as that clutters what is supposed to be a general overview document. Additionally, there isn't a need to demonstrate the features you have shown here as that is already all covered by the --help documentation. For this PR, I request that you remove this README.md file altogether.

Regarding the code changes, I'll take a look once you push the changes that I've requested so far because that will make it much easier to review. Just to recap that is:

  1. Run cargo fmt, commit and push the results.
  2. Remove the README.md file.

arcturusannamalai and others added 2 commits December 27, 2023 03:50
- add README to show how to add new language
- add option to download wordlist from different repository
- add assertions for incorrectly downloaded files etc.
- run rustfmt and update README.md
Bumps [openssl](https://github.com/sfackler/rust-openssl) from 0.10.55 to 0.10.60.
- [Release notes](https://github.com/sfackler/rust-openssl/releases)
- [Commits](sfackler/rust-openssl@openssl-v0.10.55...openssl-v0.10.60)

---
updated-dependencies:
- dependency-name: openssl
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
@arcturusannamalai
Copy link
Author

@hisbaan - thanks for feedback; I updated. Can you take a look ?

Copy link
Owner

@hisbaan hisbaan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking so long to get back to you, had a busy busy few weeks

Comment on lines +69 to +169
===================
Update the language code in the list at


Update the language wordlist
============================
```
$ cargo run -- --lang ta --update-langs --wordlist-url https://raw.githubusercontent.com/arcturusannamalai/wordlists/main/
Finished dev [unoptimized + debuginfo] target(s) in 0.17s
Running `target/debug/dym --lang ta --update-langs --wordlist-url 'https://raw.githubusercontent.com/arcturusannamalai/wordlists/main/'`
Downloading English word list...
Accessing URL: https://raw.githubusercontent.com/arcturusannamalai/wordlists/main//en
[00:00:00] [############################################################################################################################] 4.12MiB/4.12MiB (0s)
Downloading Tamil word list...
Accessing URL: https://raw.githubusercontent.com/arcturusannamalai/wordlists/main//ta
[00:00:00] [############################################################################################################################] 1.73MiB/1.73MiB (0s)
```

Run did you mean:
================
e.g.
```
cargo run -- --lang ta கலஅ
Compiling didyoumean v1.1.4 (/Users/user/devel/rust-in-action/didyoumean)
Finished dev [unoptimized + debuginfo] target(s) in 1.18s
Running `target/debug/dym --lang ta 'கலஅ'`
Did you mean?
1. கலி
2. கலை
3. கல்
4. அ
5. அகல்

```


For more info see the help text and options,
```
$ cargo run -- --help

didyoumean user$ cargo run -- --help
Compiling didyoumean v1.1.4 (/Users/user/devel/rust-in-action/didyoumean)
Finished dev [unoptimized + debuginfo] target(s) in 2.24s
Running `target/debug/dym --help`
didyoumean 1.1.4
Hisbaan Noorani
Did You Mean: A cli spelling corrector

USAGE:
dym [OPTIONS] [SEARCH_TERM]

ARGS:
<SEARCH_TERM>


OPTIONS:
-c, --clean-output
Print a clean version of the output without the title, numbers or colour.

-h, --help
Print help information

-l, --lang <LANG>
Select the desired language using its locale code. For example, English would have the
locale code en and French would have the locale code fr. See --print-langs for a list of
locale codes and the corresponding languages.

[default: en]

-n, --number <NUMBER>
Change the number of words the program will print. The default value is five.

[default: 5]

--print-langs
Display a list of supported languages and their respective locale codes.

--update-langs
Update all language files from the repository specified by CLI @wordlist-url@.

-v, --verbose
Print verbose output including the edit distance of the found word to the queried word.

-V, --version
Print version information

-w, --wordlist-url <WORDLIST_URL>
Wordlist repository URL. The default value is
'https://raw.githubusercontent.com/hisbaan/wordlists/main'

[default: https://raw.githubusercontent.com/hisbaan/wordlists/main]

-y, --yank
Yank (copy) the selected word to the system clipboard. If no word is selected, the
clipboard will not be altered.

```



Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove these changes to the README.

  1. The information is not needed in the README. Anyone can access it via the help command or the man page where it is better formatted and more general.
  2. This information is not well formatted. You use specific to Tamil commands to showcase functionality, these (if it was to be included) should be general commands with placeholders for specific values. Also, when showing these off they should generally be using the binary to run these commands and

short = 'w',
long = "wordlist-url",
help = "Wordlist repository URL",
long_help = "Wordlist repository URL. The default value is 'https://raw.githubusercontent.com/hisbaan/wordlists/main'",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't list the default value here, it's already shown below and output by the help menu

For example:

image

Also relating to this, I found a place where I had made this same mistake, so if possible could you please also correct this? (line 13 of cli.rs please remove "The default value is 5")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust has a sensible affordance here for CLI help to also print the default; this is not the case in C++, Python @ argparse etc. nice to know!

// Get word list. The program will only get here if/when this is a valid word list.
let word_list = read_to_string(dirs::data_dir().unwrap().join("didyoumean").join(args.lang))
.expect("Error reading file");

// Get dictionary of words from words.txt.
let dictionary = word_list.split('\n');

//assert!(dictionary.clone().count()>20,"Size of wordlist > 20 words");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was used in debugging? Please remove this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still new to Rust as you can see from the clone-ing. Sorry.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! I really appreciate the effort and desire to make software better :)

@@ -17,6 +17,7 @@ pub fn yank(string: &str) {
"freebsd",
"netbsd",
"dragonfly",
"mac osx",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason "mac osx" was not on this list before is that (as far as I know) it handles clipboard functionality just fine without this hack. This is not a "supported operating systems" list but instead a list of operating systems for which a certain workaround is required. Please remove it from this list.

Comment on lines +61 to +64
assert!(
!args.wordlist_url.ends_with("/"),
"URL should end with branch name in github without trailing /"
);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, but please use the proper error handling instead of just asserting. You can see an example a little below in the code or right here:

            let error = cmd.error(
                clap::ErrorKind::MissingRequiredArgument,
                format!(
                    "The {} argument was not provided.\n\n\tEither provide it as an argument or pass it in from standard input.",
                    "<SEARCH_TERM>".green()
                )
            );
            clap::Error::exit(&error);

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should also note that the repository doesn't necessarily have to be on GitHub so the error message should be something like

The provided wordlist URL must not end with a trailing '/'

@@ -224,7 +227,7 @@ fn run_app() -> std::result::Result<(), Error> {
///
/// * `lang` - A locale code string to define the word list file to fetch.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the comment string here with the new parameter.

);

let url = format!("{}/{}", wordlist_url, &lang);
println!("Accessing URL: {}", url);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra print is not required as it adds extra clutter and doesn't give any new information to the user since they know the URL they added.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants