This repository has been archived by the owner on Nov 23, 2024. It is now read-only.

Extraction from various docstrings that contain Boundary Type #48

Closed
mmdoja opened this issue Feb 6, 2022 · 1 comment · Fixed by #111
Labels
@boundary Related to the @boundary annotation enhancement 💡 New feature or request

Comments

@mmdoja
Contributor

mmdoja commented Feb 6, 2022

Is your feature request related to a problem? Please describe

Extraction of boundary types from docstrings written in formats such as the following:

Must be between 0 and 1 
tuple (q_min, q_max), 0.0 < q_min < q_max < 100.0
The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1
float > 0 and <= 1
non-negative float
Must be strictly positive
must be a positive float

Desired solution

Some of these formats can be handled with regular expressions; the rest can be handled with spaCy's rule-based Matcher. spaCy's PhraseMatcher and DependencyMatcher are also worth investigating for this problem. The reason for using spaCy here is that the boundaries are described in natural language, which the machine has to process.

For example, in "non-negative float", "non-negative" means that 0 is included in the range.
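
As a rough illustration of the regex part only, the formats listed above could be covered by a handful of patterns. This is a minimal sketch, not the implementation that was eventually merged: the `Boundary` class, the `extract_boundary` function, and the pattern names are hypothetical, and treating "between A and B" as inclusive on both ends is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

NUM = r"[-+]?\d+(?:\.\d+)?"   # integer or decimal literal
NAME = r"[A-Za-z_]\w*"        # parameter name such as l1_ratio or q_min


@dataclass
class Boundary:
    """Hypothetical container for an extracted interval."""
    lower: Optional[float]
    lower_inclusive: bool
    upper: Optional[float]
    upper_inclusive: bool


# "0 <= l1_ratio <= 1" or "0.0 < q_min < q_max < 100.0"
CHAINED = re.compile(
    rf"({NUM})\s*(<=|<)\s*{NAME}(?:\s*(?:<=|<)\s*{NAME})*\s*(<=|<)\s*({NUM})"
)
# "float > 0 and <= 1"
PAIR = re.compile(rf"(>=|>)\s*({NUM})\s+and\s+(<=|<)\s*({NUM})", re.IGNORECASE)
# "Must be between 0 and 1" (assumed inclusive on both ends)
BETWEEN = re.compile(rf"between\s+({NUM})\s+and\s+({NUM})", re.IGNORECASE)


def extract_boundary(description: str) -> Optional[Boundary]:
    """Return the first boundary found in a docstring fragment, or None."""
    m = CHAINED.search(description)
    if m:
        return Boundary(float(m[1]), m[2] == "<=", float(m[4]), m[3] == "<=")
    m = PAIR.search(description)
    if m:
        return Boundary(float(m[2]), m[1] == ">=", float(m[4]), m[3] == "<=")
    m = BETWEEN.search(description)
    if m:
        return Boundary(float(m[1]), True, float(m[2]), True)
    # Keyword forms: check "non-negative" before the more general "positive".
    if re.search(r"non-negative", description, re.IGNORECASE):
        return Boundary(0.0, True, None, False)
    if re.search(r"\bpositive\b", description, re.IGNORECASE):
        return Boundary(0.0, False, None, False)
    return None
```

Fixed patterns like these break as soon as the wording changes (for example "greater than zero"), which is where the spaCy matchers mentioned above would take over.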


@mmdoja mmdoja added the enhancement 💡 New feature or request label Feb 6, 2022
@duklin duklin changed the title Extraction of boundaries of other categories. Extraction from various docstrings that contain Boundary Type Feb 7, 2022
@duklin
Contributor

duklin commented Feb 7, 2022

Ideally, all the different docstrings would first be transformed using regex in a pre-processing stage, before using spaCy to find the relations between the boundary and the parameter itself.
This issue seems very related to lars-reimann/api-editor#404
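
One possible reading of the pre-processing stage is to rewrite comparison symbols into plain words, so that the later NLP step sees uniform natural-language text. The sketch below assumes that interpretation; the `normalize` function and the `OPERATOR_WORDS` mapping are hypothetical names, and the chosen wording is only one option.

```python
import re

# Hypothetical mapping from comparison symbols to plain-English phrases.
OPERATOR_WORDS = {
    "<=": "less than or equal to",
    ">=": "greater than or equal to",
    "<": "less than",
    ">": "greater than",
}

# Two-character operators come first so "<=" is not matched as "<".
OPERATOR = re.compile(r"\s*(<=|>=|<|>)\s*")


def normalize(description: str) -> str:
    """Replace comparison symbols with words and collapse whitespace."""
    worded = OPERATOR.sub(lambda m: f" {OPERATOR_WORDS[m.group(1)]} ", description)
    return " ".join(worded.split())
```

With this, `normalize("float > 0 and <= 1")` yields `"float greater than 0 and less than or equal to 1"`, a form that token-based rules can match without special-casing symbols.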

@lars-reimann lars-reimann transferred this issue from lars-reimann/sem21 Feb 10, 2022
@lars-reimann lars-reimann added the @boundary Related to the @boundary annotation label Jun 25, 2022
@lars-reimann lars-reimann transferred this issue from Safe-DS/API-Editor Mar 19, 2023
@nvollroth nvollroth linked a pull request Apr 26, 2023 that will close this issue
lars-reimann added a commit that referenced this issue May 5, 2023
Closes #48, closes #36, closes #35, closes #32, closes #31, closes #30,
closes #27, closes #8.
 

### Summary of Changes

SpaCy rules were generated to recognize named examples and extract the
resulting boundaries.

### Instructions for Manual Testing (if required)

1. Run `pytest` for `test_extract_boundary_values.py`.
2. Check the results of `pytest`.

---------

Co-authored-by: megalinter-bot <[email protected]>
Co-authored-by: Lars Reimann <[email protected]>
@github-project-automation github-project-automation bot moved this from Backlog to ✔️ Done in Library Analysis May 5, 2023