Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add record checking for allennlp and spacy #5

Closed
wants to merge 9 commits into from

Conversation

jennyzhang-petuum
Copy link
Collaborator

@jennyzhang-petuum jennyzhang-petuum commented Apr 2, 2021

This PR fixes #6.

Description of changes

Added record writing for SpaCy processor to note the output contains the Sentence type which is required by other processors for example AllenNlp processor. Added record checking for AllenNlp processor which means it requires Sentence type in the input datapack. The record checking could be enabled with pipeline.enforce_consistency(enforce=True).

Possible influences of this PR.

None

Test Conducted

Not yet.

Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there is no expected_types_and_attributes function for spacy and no record for allenlp?

the pipeline.
"""
expectation_dict: Dict[str, Set[str]] = dict()
expectation_dict["ft.onto.base_ontology.Sentence"] = set()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a demonstration, probably we can make this function simpler by using the literal implementation:

Suggested change
expectation_dict["ft.onto.base_ontology.Sentence"] = set()
expectation_dict: Dict[str, Set[str]] = {
"ft.onto.base_ontology.Sentence": set()
}

Copy link
Member

@hunterhector hunterhector left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few comments

if "pos" in self.configs.processors:
record_meta["ft.onto.base_ontology.Token"].add("pos")
if "depparse" in self.configs.processors:
record_meta["ft.onto.base_ontology.Dependency"] = set()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think SRL is missing?

forte_wrapper/spacy/spacy_processors.py Show resolved Hide resolved
forte_wrapper/allennlp/allennlp_processors.py Outdated Show resolved Hide resolved
@hunterhector hunterhector mentioned this pull request Apr 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add record checking and writing for AllenNLP and spaCy processor
2 participants