Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mimic-iii pipeline #1

Closed
tanyuqian opened this issue Feb 4, 2022 · 6 comments · Fixed by #2 or #4
Closed

mimic-iii pipeline #1

tanyuqian opened this issue Feb 4, 2022 · 6 comments · Fixed by #2 or #4
Assignees

Comments

@tanyuqian
Copy link
Member

Is your feature request related to a problem? Please describe.
Adapting the clinical pipeline from the forte example into this pacakge.

Describe the solution you'd like
place the components (ontology, dataset reader, and processor, etc.) into corresponding folders in forte_medical/, and provide the pipeline example into examples/mimic-iii/.

Describe alternatives you've considered
Given it's the first example in this package, so feel free to change the code framework if necessary.

@Piyush13y
Copy link
Collaborator

@hunterhector Are we planning to have independent ontology, dataset readers and processors for this repo or will we be reusing most of these components from forte itself?

@hunterhector
Copy link
Member

I don't think the main Forte repo itself provides enough components. It should be a case-by-case decided by @tanyuqian

@tanyuqian
Copy link
Member Author

After reviewing the code of PR #2 detailedly, I realize some points to polish in our code:

  • removing redundant files, e.g., __pycache__ , .DS_Store etc.
  • building ontologies in the root folder of this repo instead of in the ontologie_specs/. (the folder can be named as "ftm/" and work similarly as ft/) in forte.

@hunterhector Are we planning to have independent ontology, dataset readers and processors for this repo or will we be reusing most of these components from forte itself?

  • we don't expect the users of this package to learn the use of forte/fortex, so yes, we want to have an independent ontology (except concepts in the base_ontology of forte), dataset readers and processors. Some of them may just be trivial wrappers of existing components in forte/fortex.

@tanyuqian tanyuqian reopened this Feb 18, 2022
@hunterhector
Copy link
Member

Thanks to Bowen for the review. To help clarify one of Bowen's points, we will need to put the ontology at the root folder with a good ontology prefix. "demo.clinical" is not really a permanent choice. Also, since the generated content is python source, so we don't want to put them in another directory like "ontologies_specs", we simply put them at the top level, so that the package prefix actually matches the package path.

Also note that we do not want to mix generated sources with your own code (this didn't happen, just a heads up in advance).

Btw, there might be reasons that we want to stick to "ftx". One reason is that annotators in forte-wrappers are already using the classes under "ftx". Another reason is that we keep inventing new prefixes for every project we will have a lot by the end of the day :p. So, to do this, we should use name space packageing. This will ensure even we have "ftx" package in different places they can still be imported.

We are developing the new namespace packaging feature (asyml/forte#577) in the generator, but we can also do it manually by removing an __init__ file.

So the directory structure would look like this:

- forte_medical (source code our main content)
- ftx (generated source code)
  |_ med
      |_ __init__.py (note that __init__ is only at this level, not in the ftx folder)
      |_ clinical
      |_ biological

Note that we can even modify the "forte-wrapper" repo to make both sides to be consistent if we want.

@Piyush13y
Copy link
Collaborator

Piyush13y commented Feb 19, 2022

Thank you for the review.
I have created a PR for the immediate changes.

Also, creating issues in forte and forte-wrappers, to make the ftx structure consistent since its a small task.
@hunterhector I wasn't sure if 'biological' would be the right tag for our existing medical ontology classes, i.e. UMLSConceptLink and MedicalEntityMention. Just wanted to double check this once.

@hunterhector
Copy link
Member

hunterhector commented Feb 19, 2022

Also, creating issues in forte and forte-wrappers, to make the ftx structure consistent since its a small task. @hunterhector I wasn't sure if 'biological' would be the right tag for our existing medical ontology classes, i.e. UMLSConceptLink and MedicalEntityMention. Just wanted to double check this once.

You are right, I used biological just as an example so you know there can be multiple files there, we probably will stick with clinical for a while.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants