Add !!word functionality #1083

Merged
merged 17 commits into from
Jan 21, 2020
1 change: 1 addition & 0 deletions ChangeLog
@@ -9,6 +9,7 @@ Version 5.8.0 (XXX 2020)
* English dict: support for archaic/poetic abbreviations
* English dict: introduce OH link for vocatives/invocations.
* English dict: improved parsing of imperatives.
* Add !!word/ link-parser command for displaying extended word dict info.

Version 5.7.0 (13 Sept 2019)
* Minor efficiency improvements to the SQL-backed dictionary.
58 changes: 58 additions & 0 deletions data/command-help-en.txt
@@ -305,3 +305,61 @@ Examples:
!dialect=irish
!dialect=irish,headline
!dialect=instructions,bad-spelling:2.2

[!]
This command is for debugging the dictionary or the library.
It takes a word as its argument, optionally followed by a regex and flags.
It splits the given word into tokens according to the current language,
and for each token it prints the matching dictionary words along with their
expressions or disjunct lists. The word may include a wildcard * to find
multiple matches, and a subscript can be used to limit the matches to that
subscript only.

Examples ("test.n" is an example word):

Show the expression:
!!test.n

Show the expression using macro tags:
!!test.n/m
Each macro tag is followed by its content on the same line.
The other lines are direct expression components (before and after a macro).

Show also low-level memory details of the expression:
!!test.n/l

Show the disjuncts (without duplicates):
!!test.n//

Show selected disjuncts according to the supplied regex:
!!test.n/ Wd .*<-->.*@M\b/

Display all the words that start with "test":
!!test*

Display all the words that start with "test" and have subscript ".q":
!!test*.q

A sample output of a disjunct-list display:
Token "test.n" matches:
test.n 8509 disjuncts <en/words/words.n.1-const>

Token "test.n" disjuncts:
test.n 4273/4501 disjuncts

...
test.n: [4070]1.500= Wd @hCO Ds**c <--> Ss*s @M NM
...

In this sample output:
8509 Number of disjuncts in the dictionary expression.
4501 Number of disjuncts after applying cost-max.
4273 Number of disjuncts w/o duplicates.
4070 Disjunct ordinal number.
1.500 Disjunct cost.
= A separator to enable regex anchoring.
<--> A separator of the "-" (LHS) and "+" (RHS) connector lists.

These variables affect the output:
Disjuncts, expressions: !dialect
Disjuncts only: !cost-max
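
The legend above describes the layout of one displayed disjunct line: an ordinal in brackets, a cost, an "=" separator, then the "-" (LHS) and "+" (RHS) connector lists divided by "<-->". As a rough illustration of that layout only, the sketch below splits such a line into its fields. It is not code from this PR or from the library; the names `disjunct_line`, `copy_trimmed` and `parse_disjunct_line` and the fixed buffer sizes are assumptions made for the example, and it assumes the first "[" on the line opens the ordinal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the fields of one displayed disjunct line. */
typedef struct {
    int ordinal;      /* e.g. 4070 */
    double cost;      /* e.g. 1.500 */
    char lhs[128];    /* "-" connectors, text before "<-->" */
    char rhs[128];    /* "+" connectors, text after "<-->" */
} disjunct_line;

/* Copy [begin, end) into dst without surrounding blanks, truncating if needed. */
void copy_trimmed(char *dst, size_t dstsz, const char *begin, const char *end)
{
    while (begin < end && (*begin == ' ' || *begin == '\n')) begin++;
    while (end > begin && (end[-1] == ' ' || end[-1] == '\n')) end--;
    size_t len = (size_t)(end - begin);
    if (len >= dstsz) len = dstsz - 1;
    memcpy(dst, begin, len);
    dst[len] = '\0';
}

/* Return 1 if the line has the "[ordinal]cost= LHS <--> RHS" layout, else 0. */
int parse_disjunct_line(const char *line, disjunct_line *out)
{
    const char *p = strchr(line, '[');
    if (p == NULL) return 0;

    /* %n records how many characters were consumed, including the "=". */
    int consumed = 0;
    if (sscanf(p, "[%d]%lf=%n", &out->ordinal, &out->cost, &consumed) != 2 ||
        consumed == 0)
        return 0;

    const char *rest = p + consumed;
    const char *sep = strstr(rest, "<-->");
    if (sep == NULL) return 0;

    copy_trimmed(out->lhs, sizeof out->lhs, rest, sep);
    copy_trimmed(out->rhs, sizeof out->rhs, sep + 4, rest + strlen(rest));
    return 1;
}
```

Called on the sample line `test.n: [4070]1.500= Wd @hCO Ds**c <--> Ss*s @M NM`, this fills in ordinal 4070, a cost of 1.5, the LHS list `Wd @hCO Ds**c` and the RHS list `Ss*s @M NM`. The fixed "=" and "<-->" separators are what make this kind of splitting (and regex anchoring) straightforward.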
2 changes: 2 additions & 0 deletions link-grammar/dict-common/dict-common.c
@@ -303,6 +303,8 @@ void dictionary_delete(Dictionary dict)
free_dialect(dict->dialect);
free(dict->dialect_tag.name);
string_id_delete(dict->dialect_tag.set);
if (dict->macro_tag != NULL) free(dict->macro_tag->name);
free(dict->macro_tag);

free((void *)dict->suppress_warning);
free_regexs(dict->regex_root);
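
A note on the two lines added to `dictionary_delete()`: only the first needs the NULL check, because `dict->macro_tag->name` dereferences the pointer, whereas `free(NULL)` is defined by the C standard to do nothing. The sketch below reproduces that pattern with stand-in types. `tag_sketch`, `dict_sketch` and `dict_sketch_free_macro_tag` are hypothetical names for this illustration, not the library's definitions, and resetting the pointer afterwards is an extra step that is not in the diff.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for expression_tag and Dictionary_s. */
typedef struct { char *name; } tag_sketch;
typedef struct { tag_sketch *macro_tag; } dict_sketch;

void dict_sketch_free_macro_tag(dict_sketch *dict)
{
    /* Guard needed: reading ->name through a NULL pointer is undefined. */
    if (dict->macro_tag != NULL) free(dict->macro_tag->name);

    /* No guard needed: free(NULL) is a no-op. */
    free(dict->macro_tag);

    /* Not in the diff: clear the pointer so a second call stays harmless. */
    dict->macro_tag = NULL;
}
```

The function is safe to call both when the tag structure was never allocated and when it was allocated together with its name.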
1 change: 1 addition & 0 deletions link-grammar/dict-common/dict-common.h
@@ -94,6 +94,7 @@ struct Dictionary_s

Dialect *dialect; /* "4.0.dialect" info */
expression_tag dialect_tag; /* Expression dialect tag info */
expression_tag *macro_tag; /* Macro tags for expression debug */

/* Affixes are used during the tokenization stage. */
Dictionary affix_table;
2 changes: 1 addition & 1 deletion link-grammar/dict-common/dict-structures.h
@@ -38,7 +38,7 @@ static const int cost_max_dec_places = 3;
static const double cost_epsilon = 1E-5;

#define EXPTAG_SZ 100 /* Initial size for the Exptag array. */
-typedef enum { Exptag_none=0, Exptag_dialect } Exptag_type;
+typedef enum { Exptag_none=0, Exptag_dialect, Exptag_macro } Exptag_type;

/**
* The Exp structure defined below comprises the expression trees that are
5 changes: 5 additions & 0 deletions link-grammar/dict-common/dict-utils.h
@@ -15,12 +15,17 @@
#define _DICT_UTILS_H_

#include "dict-common.h"
#include "utilities.h" // dyn_str

/* Exp utilities ... */
void free_Exp(Exp *);
int size_of_expression(Exp *);
Exp * copy_Exp(Exp *, Pool_desc *, Parse_Options);
bool is_exp_like_empty_word(Dictionary dict, Exp *);
void prt_exp_all(dyn_str *,Exp *, int, Dictionary);
#ifdef DEBUG
void prt_exp(Exp *, int);
#endif /* DEBUG */

/* X_node utilities ... */
X_node * catenate_X_nodes(X_node *, X_node *);
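
The new `prt_exp_all()` takes a `dyn_str *` as its first argument, while the DEBUG-only `prt_exp()` does not. A plausible reading is that the former appends its output to a caller-owned growable string instead of printing directly, so the caller decides where the text goes. The sketch below shows that general idea with a tiny hypothetical buffer type. `sbuf`, `sbuf_new`, `sbuf_printf` and `sbuf_delete` are invented for this example and are not the library's `dyn_str` API, whose functions and signatures are not shown in this diff.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable string; NOT link-grammar's dyn_str. */
typedef struct { char *str; size_t len; size_t cap; } sbuf;

sbuf *sbuf_new(void)
{
    sbuf *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->cap = 16;
    b->len = 0;
    b->str = malloc(b->cap);
    if (b->str == NULL) { free(b); return NULL; }
    b->str[0] = '\0';
    return b;
}

/* Append formatted text; return the number of characters added, or -1. */
int sbuf_printf(sbuf *b, const char *fmt, ...)
{
    va_list ap;

    /* First pass: measure how much room the formatted text needs. */
    va_start(ap, fmt);
    int need = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (need < 0) return -1;

    /* Grow by doubling until text plus terminator fits. */
    if (b->len + (size_t)need + 1 > b->cap) {
        size_t ncap = b->cap;
        while (b->len + (size_t)need + 1 > ncap) ncap *= 2;
        char *n = realloc(b->str, ncap);
        if (n == NULL) return -1;
        b->str = n;
        b->cap = ncap;
    }

    /* Second pass: write at the current end of the string. */
    va_start(ap, fmt);
    vsnprintf(b->str + b->len, (size_t)need + 1, fmt, ap);
    va_end(ap);
    b->len += (size_t)need;
    return need;
}

void sbuf_delete(sbuf *b)
{
    if (b != NULL) { free(b->str); free(b); }
}
```

A printer written against such a buffer can be called repeatedly, once per token or expression component, and the accumulated text can then be sent to a terminal, a pager, or a test harness in one piece.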