import csv
import multiprocessing

import pubmed_parser as pp


def write_to_file(f, pmid, result):
    try:
        if isinstance(result, dict) and "pmid_cited" in result:
            f.write(f'## PMID : {pmid}\n')
            f.write(f'PMID CITED : {result["pmid_cited"]}\n')
            # You can add more information from `result` here if needed
        else:
            f.write(f'Error processing PMID {pmid}: Invalid result format\n')
    except Exception as e:
        f.write(f'Error processing PMID {pmid}: {str(e)}\n')


def process_pmid(pmid):
    try:
        return pp.parse_outgoing_citation_web(pmid, id_type='PMID')
    except Exception as e:
        return f'Error processing PMID {pmid}: {str(e)}'


if __name__ == '__main__':
    # Output Markdown file
    output_file = 'out1.md'

    # Open the output file for writing
    with open(output_file, 'w') as f:
        # Write Markdown headers or other content here if needed
        f.write("# Outgoing Citations\n")

        # Open and read the CSV file with PMID values
        with open('pmidfinal.csv', 'r') as csvfile:
            csvreader = csv.reader(csvfile)

            # Skip the first 16021 rows
            for i in range(16021):
                next(csvreader, None)

            # Create a multiprocessing pool
            pool = multiprocessing.Pool()

            for row in csvreader:
                if row:
                    pmid = str(row[0])  # Assuming the 'PMID' column is the first (index 0) column
                    # Bind pmid as a default argument so each callback reports its own PMID
                    # rather than whatever value the loop variable holds when the callback fires
                    pool.apply_async(process_pmid, args=(pmid,),
                                     callback=lambda result, pmid=pmid: write_to_file(f, pmid, result))

            # Wait for all workers while the output file is still open
            pool.close()
            pool.join()

    print("Process Complete")
This is my code for the parser (it skips the first 16021 rows because I had already gotten information for those).
I have a CSV file containing only PMIDs.
This is how it looks. All PMIDs were taken from PubMed's OA subset.
Thanks @chungimungi! I do not have time to take a look at the code. However, it seems like we need to check parse_outgoing_citation_web to see what goes wrong. The XML format may have changed quite a bit since I last wrote this code.
Error in Parse Outgoing XML citations from website
This error is shown for a lot of the PMIDs.