Bug: Issue when searching MURs by citation 30123 in keyword search #5208

Closed
2 tasks
Tracked by #159
JonellaCulmer opened this issue May 9, 2022 · 3 comments

Comments

@JonellaCulmer
Contributor

JonellaCulmer commented May 9, 2022

What we're after:
Correct the front-end display issue that occurs after searching for 30123 in the MUR keyword search.

Search query: https://www.fec.gov/data/legal/search/murs/?search_type=murs&search=30123&case_no=&case_respondents=&case_min_open_date=&case_max_open_date=&case_min_close_date=&case_max_close_date=

In this similar issue, the offending characters were in the OCR'd content:
Similar issue: #1861

What the page looks like:

Before:
Screen Shot 2022-05-09 at 10.35.32 AM.png

After:
Screen Shot 2022-05-09 at 10.35.51 AM.png

Completion criteria:

JonellaCulmer added this to the Sprint 18.2 milestone on May 9, 2022
patphongs changed the title from "Bug: Issue when searching by 30123 in keyword search" to "Bug: Issue when searching MURs by 30123 in keyword search" on May 11, 2022
patphongs changed the title from "Bug: Issue when searching MURs by 30123 in keyword search" to "Bug: Issue when searching MURs by citation 30123 in keyword search" on May 11, 2022
@patphongs
Member

patphongs commented May 18, 2022

There seems to be an issue with the data for these particular MUR documents. The two documents are identical except for the MUR document header number at the top of every PDF page. The special characters in that header are causing malformed HTML in the search results.

https://www.fec.gov/files/legal/murs/6917/6917_33.pdf
https://www.fec.gov/files/legal/murs/6929/6929_28.pdf

The highlight returned for these documents is incoherent, in particular when searching for the keyword "30123". The returned text is:

5*<em>30123</em>!20<*,!6@/<!1360#3/*6! #-02=*0<*,N!9**'!*N=N'!GHIL(((L(((((7!`*5#1-!5*66#=*!?,25!GHI!
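For illustration, here is a minimal sketch of one way a front end could guard against stray `<` characters like the ones in this highlight. It is not the fix applied in this issue, and it assumes the highlight is inserted into the page as raw HTML and that `<em>`/`</em>` are the only tags the search backend adds. The function name `sanitize_highlight` is hypothetical.

```python
import html
import re

# Assumption: <em> and </em> are the only markers the search backend adds.
EM_TAG = re.compile(r"(</?em>)")


def sanitize_highlight(raw: str) -> str:
    """Escape everything in a highlight except the <em>/</em> markers."""
    # Splitting on a capturing group keeps the matched tags in the result list.
    parts = EM_TAG.split(raw)
    return "".join(
        part if EM_TAG.fullmatch(part) else html.escape(part, quote=True)
        for part in parts
    )


# Excerpt of the garbled highlight quoted above.
sample = "5*<em>30123</em>!20<*,!6@/<!1360#3/*6!"
cleaned = sanitize_highlight(sample)
print(cleaned)
```

With this approach the stray `<` characters become `&lt;`, so the browser no longer treats them as the start of a tag. It would not help if the OCR text itself happened to contain a literal `<em>`, and it does nothing about the underlying scan quality.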

Sample API calls:

@patphongs
Member

Reached out to the internal team to ask about the scans of these two PDF files. Will update once I receive a response.

@patphongs
Member

These two MUR documents have been rescanned correctly. The issue no longer appears in the search results for 30123.

https://www.fec.gov/files/legal/murs/6917/6917_33.pdf
https://www.fec.gov/files/legal/murs/6929/6929_28.pdf

This issue is now resolved.
