-
-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[proofing] search and replace matching strings across all pages in a project #476
[proofing] search and replace matching strings across all pages in a project #476
Conversation
@shreevatsa or @suhasm , can you try search+replace interface and give me feedback? I think to try locally this PR you can I deployed it on my staging env: http://147.182.220.33:5001/proofing/satya/replace |
…da into issue-470-search-replace
* search and replace matching strings across all pages in a project * added regex query support * fixed lint issue
* search and replace matching strings across all pages in a project * added regex query support * fixed lint issue * updated Search & Replace comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a first pass it looks good to me. I'd like to experiment with it in staging -- can you help me use your staging deploy? Otherwise I'll try it on dev this weekend.
"line_num": line_num, | ||
} | ||
) | ||
except TimeoutError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the use case for regex search? If it has a big risk surface we can disable it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@suhasm asked for it. The use case is real where Vision api might have incorrectly extracted a pattern of words and the proofer wants to correct them across the book.
It increases the risk surface, but I also acknowledge the fact that these proofers are registered users who are committed to proofreading and helping out. Also, the implementation limits the surface to one line at a time.
replace_form_key = f"match{page_.slug}-{line_num}-replace" | ||
|
||
if selected_keys.get(form_key) == "selected": | ||
LOG.debug(f"{__name__}: {form_key}: {selected_keys.get(form_key)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than explicit __name__
in these logs, we should try to either configure the logger or use a helper function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I punted that work and just forgot about it. I find logging extremely helpful but I didn't invest time to fix it systematically.
@@ -105,6 +105,71 @@ def test_search__bad_project(rama_client): | |||
assert resp.status_code == 404 | |||
|
|||
|
|||
def test_replace(moderator_client): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the tests!
lambda m: Markup(f"<mark>{escape(m.group(0))}</mark>"), line | ||
) | ||
marked_replace = query_pattern.sub( | ||
Markup(f"<mark>{escape(replace)}</mark>"), line |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than build HTML directly, let's move all of this logic into the template. Then marked_replace
can be a simple list of strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than build HTML directly, let's move all of this logic into the template. Then
marked_replace
can be a simple list of strings.
HTML and js take time for me to get right. I'll try to find a way to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, no problem -- I can clean it up later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that it's desirable to move HTML markup into Jinja, but it seems to complicate the solution somewhat.
When using query_patten.sub(), the matches are marked in-place with as they are found. If an unmarked line is passed to Jinja, we need to either send a list of matches in the line or re-run the regex in Jinja code. Additionally, to mark the replaced substrings accurately, they should be passed as a list. Therefore, marking these substrings in _replace_text() seems to strike a good balance between clean language separation and the simplicity of the solution.
for key, value in request.form.items() | ||
if key.startswith("match") and not key.endswith("replace") | ||
} | ||
render = _select_changes(project_, selected_keys, query=query, replace=replace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very thoughtful! I was imaging a simple batch search/replace, but being able to review all changes makes this really special.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still more work to be done. Ideally, pagination should be added to the review page. We also need to catch runaway searches such as 'ॐ' or 'the' to ensure memory consumption is kept under control. However, I am relying on users to be responsible in their searches.
Please above link with "http://" instead of "https://" as some browsers suggest it to be. |
Feature
Search and replace a string works. May need stylesheet changes to pretty it up. Completes #455.
Closed #407. Created a new clean branch and as the base branch for #470 pulled an existing commit.
Tests