Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate NCES in ontolearn-web service and fix dllearner binding script in examples #450

Merged
merged 13 commits into from
Oct 25, 2024

Conversation

Jean-KOUAGOU
Copy link
Collaborator

This pull request addresses the issue#408: #408, and updates the dllearner binder script such that positive and negative examples initially given here: https://github.com/dice-group/Ontolearn/blob/develop/examples/dl_learner.py#L39 are first sent to the PosNegLPStandard class as required by the binder and most class expression learners, inc. Drill, EvoLearner, CELOE, CLIP, etc

Please review and merge

@Jean-KOUAGOU Jean-KOUAGOU requested a review from Demirrr October 15, 2024 14:24
@github-actions github-actions bot temporarily deployed to pull request October 15, 2024 14:30 Inactive
@@ -1572,7 +1572,7 @@ def best_hypotheses(self, n=1) -> Union[OWLClassExpression, Iterable[OWLClassExp
elif len(self.best_predictions) == 1 or n == 1:
return self.best_predictions[0].concept
else:
return self.best_predictions[:n]
return [best.concept for best in self.best_predictions[:n]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this converation ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, other concept learners, including Drill output a list of concepts, not nodes. This is why I am converting. Moreover, NCES uses its own node class for the moment. In the future, if necessary, we might investigate the use of OENode which is currently tailored towards search-based approaches.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot return OENode. Since we learn OWL Class expressions, return results must fulfill Union[OWLClassExpression, Iterable[OWLClassExpression]].

ontolearn/scripts/run.py Show resolved Hide resolved
examples/dl_learner.py Show resolved Hide resolved
@github-actions github-actions bot temporarily deployed to pull request October 16, 2024 16:27 Inactive
@Demirrr
Copy link
Member

Demirrr commented Oct 17, 2024

Could you please ensure that no fixed paths are being used?
The following lines must not contain only fixed paths

  1. parser.add_argument("--path_of_nces_embeddings", type=str, default="./NCESData/")

  2. if not os.path.exists(f"./NCESData/{kb_name}/embeddings/ConEx_entity_embeddings.csv"):

  3. embeddings_path += f"{kb_name}/embeddings/ConEx_entity_embeddings.csv"
    There is no reason to force a user to use conex embeddings

  4. path_of_embeddings=get_embedding_path("https://files.dice-research.org/projects/NCES/NCES_Ontolearn_Data/NCESData.zip", args.path_knowledge_base),

owl_learner must follow the same interface.

if isinstance(owl_learner, NCES):

@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 09:31 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 09:50 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 10:30 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 10:38 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 10:47 Inactive
@Jean-KOUAGOU
Copy link
Collaborator Author

Jean-KOUAGOU commented Oct 17, 2024

  1. , 2., 3., 4.

parser.add_argument("--path_of_nces_embeddings", type=str, default="./NCESData/") was removed. It now takes input from the user (same as Drill data["path_embeddings"])

if not os.path.exists(f"./NCESData/{kb_name}/embeddings/ConEx_entity_embeddings.csv"). Same as above. The function has been modified, and takes input from the user. It is only when the path provided by the user does not exist that it looks for embeddings online. This is because NCES needs embeddings to run, not like Drill which can run without embeddings.

owl_learner must follow the same interface. This has been fixed. No check whether it is an NCES instance. NCES takes PosNegLPStandard as the other learners now.

Note:

If I completely remove the possibility to download embeddings for NCES, the user will have to download them manually in case he does not have any emeddings and still needs to test NCES. As it currently is, the user can provide any embeddings and they will be used. But feel free to request that I remove any possibility to download data.

@Jean-KOUAGOU
Copy link
Collaborator Author

Another Note

NCES is trained from scratch if the provided path for pretained models does not contain any files

@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 11:11 Inactive
@Demirrr
Copy link
Member

Demirrr commented Oct 17, 2024

Can you please follow the enumeration ? and fix the each point individually.

  1. , 2., 3., 4.

What does this enumeration mean ? Can you please nextime explicitly state whether the required is done or not?

This function is still in the script. This needs to be removed.

def get_embedding_path(ftp_link: str, kb_path_arg: str)->str:
        return f"./NCESData/{kb_name}/embeddings/ConEx_entity_embeddings.csv"

Please follow the same structure of preparting an owl reasoner as we have done for DRILL and TDL

@Demirrr
Copy link
Member

Demirrr commented Oct 17, 2024

  1. Is still not being fixed
    parser.add_argument("--path_of_nces_embeddings", type=str, default="./NCESData/")

@Jean-KOUAGOU
Copy link
Collaborator Author

  1. Is still not being fixed
    parser.add_argument("--path_of_nces_embeddings", type=str, default="./NCESData/")

You are probably looking at an old commit. New link here: https://github.com/dice-group/Ontolearn/blob/nces_web_and_fix_dllearner_binder/ontolearn/scripts/run.py#L55

@Jean-KOUAGOU
Copy link
Collaborator Author

Jean-KOUAGOU commented Oct 17, 2024

I replied for 1., 2., 3., 4.

They have been resolved but with notes awaiting for your confirmation. I will write notes down here again.

parser.add_argument("--path_of_nces_embeddings", type=str, default="./NCESData/") was removed. It now takes input from the user (same as Drill data["path_embeddings"])

if not os.path.exists(f"./NCESData/{kb_name}/embeddings/ConEx_entity_embeddings.csv"). Same as above. The function has been modified, and takes input from the user. It is only when the path provided by the user does not exist that it looks for embeddings online. This is because NCES needs embeddings to run, not like Drill which can run without embeddings.

owl_learner must follow the same interface. This has been fixed. No check whether it is an NCES instance. NCES takes PosNegLPStandard as the other learners now.

Note 1:

If I completely remove the possibility to download embeddings for NCES, the user will have to download them manually in case he does not have any emeddings and still needs to test NCES. As it currently is, the user can provide any embeddings and they will be used. But feel free to request that I remove any possibility to download data.

Note 2:

NCES is trained from scratch if the provided path for pretained models does not contain any files

@Jean-KOUAGOU Jean-KOUAGOU requested a review from Demirrr October 17, 2024 13:34
@Demirrr
Copy link
Member

Demirrr commented Oct 17, 2024

They have been resolved but with notes awaiting for your confirmation

If they have been resolved, please push them.

examples/train_nces.py Outdated Show resolved Hide resolved
ontolearn/scripts/run.py Outdated Show resolved Hide resolved
ontolearn/scripts/run.py Outdated Show resolved Hide resolved
num_predictions=64
)
# (2) Either load the weights of NCES or train it.
if data.get("path_to_pretrained_nces", None) and os.path.isdir(data["path_to_pretrained_nces"]) and glob.glob("*.pt"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not use glob.plot with ** * **. Input items must be fixed

@Demirrr
Copy link
Member

Demirrr commented Oct 17, 2024

@Jean-KOUAGOU if it makes the integration easier, we can ask a user to provide the paths of csv file containing embeddings and pretrained NCES weights. Without these two arguments at the start, NCES would return Top concept for any learning problem

…rovided path does not contain any embeddings. ontolearn-webservice will now require path to existing embeddings for NCES to run
@github-actions github-actions bot temporarily deployed to pull request October 17, 2024 15:29 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 18, 2024 14:06 Inactive
@Jean-KOUAGOU
Copy link
Collaborator Author

Jean-KOUAGOU commented Oct 23, 2024 via email

@github-actions github-actions bot temporarily deployed to pull request October 24, 2024 13:35 Inactive
@github-actions github-actions bot temporarily deployed to pull request October 24, 2024 14:29 Inactive
@Demirrr Demirrr merged commit 4f0b2a5 into develop Oct 25, 2024
4 checks passed
@Demirrr Demirrr deleted the nces_web_and_fix_dllearner_binder branch October 25, 2024 08:57
@github-actions github-actions bot temporarily deployed to pull request October 25, 2024 08:58 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants