Remove redundant columns after doi_retrieval #35

SagevdBrand · 2021-12-15T14:11:06Z

This PR patches the deduplication script, specifically the conservative deduplication part.
When the results from the doi_retrieval part were imported, some columns turned out to have been created by R, which broke the conservative_deduplication script.

Specifically, this PR makes the following changes:

Remove the redundant columns described above
Store the output of doi_deduplication in the R environment as a variable called df_doi_deduplicated (this is a temporarily saved object, so you do not have to repeat the DOI deduplication of more than 20,300 duplicated sets)
Fix the text printed to the console after deduplicating via the conservative strategy
Add descriptives that are printed to the console for the quality check
Add a manual check, as part of the quality check, for duplicates identified based on title
Add records to the TEST file that trigger the conservative deduplication strategy

This PR fixes issue #36 and issue #37.
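
For illustration, removing such columns could be sketched as follows. This is a minimal Python sketch, not the R code in this PR. The PR does not list which columns were redundant, so the names in `REDUNDANT_COLUMNS` are assumptions (an unnamed or `X` index column is a common result of writing a data frame with row names), and the function names are hypothetical.

```python
import csv
import io

# Assumed names of redundant columns; the PR does not list them.
# An empty header or "X" typically appears when row names are exported.
REDUNDANT_COLUMNS = {"", "X", "Unnamed: 0"}


def drop_redundant_columns(rows, redundant=REDUNDANT_COLUMNS):
    """Return copies of the rows without the redundant columns."""
    return [
        {key: value for key, value in row.items() if key not in redundant}
        for row in rows
    ]


def read_without_redundant_columns(handle, redundant=REDUNDANT_COLUMNS):
    """Read CSV rows from an open file handle and drop redundant columns."""
    return drop_redundant_columns(csv.DictReader(handle), redundant)


# Example with an in-memory file; "X" mimics an index column added on export.
sample = io.StringIO("X,title,doi\n1,Some title,10.1000/example\n")
cleaned = read_without_redundant_columns(sample)
```

Dropping columns by an explicit name set keeps the step predictable: a column is only removed when it is known to be an export artifact.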

SagevdBrand · 2021-12-15T14:52:36Z

When running the quality check script, you will now be prompted to manually check whether the title, abstract, authors, year, and journal are identical for duplicates identified only on title:

Possibly duplicated titles:
 Outcome of panic disorder with or without concomitant depression: A 2-year prospective follow-up study
Outcome of panic disorder with or without concomitant depression: A 2-year prospective follow-up study 
 
 Abstract:
 In a prospective 2-year follow-up study, 32 patients with panic disorder alone and 20 with panic disorder and concomitant depression were investigated. After controlled treatment with either imipramine or doxepin, patients received naturalistic treatment with antidepressants, benzodiazepines, and supportive psychotherapy. They were evaluated for anxiety, depression, and social disability at least every 3 months during the follow-up period. The data showed fluctuation of symptoms in both groups and a less favorable outcome for the patients with comorbid conditions. However, the overall outcome was better than that reported in other studies and indicates that panic disorder is quite responsive to appropriate treatment.
 
In a prospective 2-year follow-up study, 32 patients with panic disorder alone and 20 with panic disorder and concomitant depression were investigated. After controlled treatment with either imipramine or doxepin, patients received naturalistic treatment with antidepressants, benzodiazepines, and supportive psychotherapy. They were evaluated for anxiety, depression, and social disability at least every 3 months during the follow-up period. The data showed fluctuation of symptoms in both groups and a less favorable outcome for the patients with comorbid conditions. However, the overall outcome was better than that reported in other studies and indicates that panic disorder is quite responsive to appropriate treatment. 
 
 Authors:
 Albus, M., Scheibe, G.
 
Albus, M., Scheibe, G. 
 
 Year:
 1993
1993 
 
 Journal:
 The American Journal of Psychiatry
The American Journal of Psychiatry
Is this an actual duplicate? Y or N?

You can answer with Y or N. Depending on the answer, the records are either deduplicated or kept as they are.
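
The manual check can be sketched roughly as below. This is a Python illustration under stated assumptions, not the script's actual implementation: the names `is_confirmed`, `format_pair`, and `resolve_pair` are hypothetical, and keeping only the first record on a "Y" answer is an assumption (the real script may merge the records instead). Accepting both `y` and `Y` follows the commit "Allow for both y and Y as input from user".

```python
# Fields shown to the user, as listed in the comment above.
FIELDS = ("title", "abstract", "authors", "year", "journal")


def is_confirmed(answer):
    """Interpret a console answer; 'Y' or 'y' confirms a duplicate."""
    return answer.strip().lower() == "y"


def format_pair(first, second, fields=FIELDS):
    """Build the text that shows both records field by field."""
    lines = []
    for field in fields:
        lines.append(f"{field.capitalize()}:")
        lines.append(str(first.get(field, "")))
        lines.append(str(second.get(field, "")))
        lines.append("")
    return "\n".join(lines)


def resolve_pair(first, second, ask=input):
    """Show both records, ask the user, and return the records to keep.

    `ask` is injectable so the function can be tested without a console.
    """
    prompt = format_pair(first, second) + "\nIs this an actual duplicate? Y or N? "
    if is_confirmed(ask(prompt)):
        # Assumption: a confirmed duplicate keeps only the first record.
        return [first]
    return [first, second]
```

Any answer other than `y` or `Y` leaves both records untouched, which matches the conservative intent: records are only removed after an explicit confirmation.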

SagevdBrand · 2021-12-16T10:03:01Z

The last commits achieve the following:
The TEST files are slightly adapted to:

trigger the conservative_deduplication strategy
trigger the extra title-based deduplication strategy within the quality check

The merging function in the conservative strategy is updated, which increases the number of duplicates found!

SagevdBrand added 3 commits December 15, 2021 12:38

Remove redundant columns after doi_retrieval

a72e263

Fix conservative deduplication function

e021b3a

Create counter for quality check deduplication and patch

f03cc6f

This was linked to issues Dec 15, 2021

Conservative deduplication does not run #36

Closed

Deduplication during quality check is not conservative #37

Closed

SagevdBrand added 4 commits December 16, 2021 08:23

Adapt TEST files for conservative dedup test

ac4a993

Fix rounding when printing info

d52f852

Update merge function for cons_deduplication

75c83e8

Add test for quality_check_dedup and update README

0b87511

SagevdBrand and others added 12 commits December 16, 2021 11:42

Print more information to console

2cec372

correct file name in readme

a9a0a4d

add instruction to install package tqdm

44aa454

add link to installatIon instructions jupiter notebook

98645c7

Add source function for printing info

473f0da

Merge branch 'conservative-deduplication-patch' of https://github.com…

60fee85

…/asreview/paper-megameta-postprocessing-screeningresults into conservative-deduplication-patch

clarify results test-data

38ad730

Merge remote-tracking branch 'origin/conservative-deduplication-patch…

68c7956

…' into conservative-deduplication-patch

update output test data for q2

0aed29b

Revert "update output test data for q2"

a4a8b3c

This reverts commit 0aed29b.

Allow for both y and Y as input from user

8c14862

Merge branch 'conservative-deduplication-patch' of https://github.com…

0612166

…/asreview/paper-megameta-postprocessing-screeningresults into conservative-deduplication-patch

Rensvandeschoot merged commit de9d7a8 into main Dec 16, 2021

Rensvandeschoot deleted the conservative-deduplication-patch branch December 16, 2021 13:17
