Add risky word alerts to cases uploaded by recap.email #49

Open
mlissner opened this issue May 5, 2023 · 4 comments
Comments

@mlissner
Member

mlissner commented May 5, 2023

I haven't thought through this completely yet, but it's an idea that I think people might find very useful.

Once we have alerts for RECAP content, as part of freelawproject/courtlistener#612 and freelawproject/courtlistener#1234, we should think about a new kind of per-case alert that we can suggest to people using @recap.email.

Imagine the following:

  1. You're an attorney who is using @recap.email to file cases into CourtListener.
  2. You're working on a case about a minor child, whose name should never appear in public filings.
  3. The opposing counsel kind of sucks (or maybe they're fine, but, anyway, you don't trust them).
  4. You want to react as soon as possible if the child's name appears in a document.
  5. So, you log into CourtListener, find the case, and there's an option to create "Redacted Word Alerts". You input the child's name into a query box as well as a few other tokens that shouldn't show up in the case.
  6. You sit back and hope you never get an alert.
  7. BUT one day, the opposing counsel files something in PACER, which we pick up via @recap.email.
  8. The document has the child's name, so our redacted word alert triggers.
  9. You get a loud email that says something like, "A word you wished to keep out of public filings may have appeared in a recently filed document." The email has some info about the document, like its link, a snippet with the word, etc. Maybe it even has a button to flag it for CourtListener or to auto-generate a sealing demand that you can file with the court (woah).
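
A minimal sketch of steps 8 and 9, assuming the document text has already been extracted. The names `RedactedWordHit`, `find_hits`, and `build_alert_body` are hypothetical and are not part of CourtListener; the real feature would more likely run as a search query, as discussed further down.

```python
import re
from dataclasses import dataclass


@dataclass
class RedactedWordHit:
    word: str     # the watched word that appeared
    snippet: str  # surrounding text to show in the email


def find_hits(text: str, watched_words: list[str], context: int = 40) -> list[RedactedWordHit]:
    """Return one hit per occurrence of any watched word, with a snippet."""
    hits = []
    for word in watched_words:
        # Whole-word, case-insensitive match; re.escape keeps names literal.
        # Assumes the watched word starts and ends with a letter or digit.
        pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            start = max(0, m.start() - context)
            end = min(len(text), m.end() + context)
            hits.append(RedactedWordHit(word=word, snippet=text[start:end].strip()))
    return hits


def build_alert_body(document_url: str, hits: list[RedactedWordHit]) -> str:
    """Assemble a plain-text email body from the hits (hypothetical layout)."""
    lines = [
        "A word you wished to keep out of public filings may have appeared "
        "in a recently filed document.",
        f"Document: {document_url}",
    ]
    lines += [f'- "{h.word}": ...{h.snippet}...' for h in hits]
    return "\n".join(lines)
```

An alert would only be sent when `find_hits` returns a non-empty list, so the "sit back and hope you never get an alert" case costs nothing beyond the match itself.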

The idea here is to build on #62, so that when you start using @recap.email you get some immediate benefits and audits on the cases you are working on.

Of course, this could probably also have a webhook, but maybe that's unnecessary? Webhook all the things?

@mlissner
Member Author

mlissner commented May 5, 2023

From an architecture perspective, I think this is very similar to freelawproject/courtlistener#612, but it triggers different email content.

@troglodite2

I think there is a set of operations that should be performed as we ingest a PDF. We already convert it to text, and we will shortly be able to run x-ray to look for bad redactions. Besides that, we should be running eyecite to find and create citation links. A word/phrase search seems to be another option.

I'd suggest that this be created as another microservice provided by doctor.

The service would consist of sending a document (text) and a list of phrases with a token associated with each. The return would be a list of tokens and positions in the document where that phrase was found.

The user of the microservices can then use that token to find who gets the alert.
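
To make the proposed interface concrete, here is a rough sketch of the matching step only (no HTTP layer). `PhraseMatch` and `match_phrases` are made-up names, and the choices to ignore case and to tolerate line breaks inside a phrase are assumptions, not something decided in this thread.

```python
import re
from typing import NamedTuple


class PhraseMatch(NamedTuple):
    token: str  # caller-supplied identifier for the phrase
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


def match_phrases(text: str, phrases: dict[str, str]) -> list[PhraseMatch]:
    """Find every phrase in text; `phrases` maps token -> phrase."""
    matches = []
    for token, phrase in phrases.items():
        words = [re.escape(w) for w in phrase.split()]
        if not words:
            continue  # skip empty phrases rather than matching everywhere
        # Allow any run of whitespace between words, since extracted
        # PDF text often wraps a phrase across lines.
        pattern = re.compile(r"\s+".join(words), re.IGNORECASE)
        matches.extend(
            PhraseMatch(token, m.start(), m.end()) for m in pattern.finditer(text)
        )
    # Return matches in document order.
    return sorted(matches, key=lambda m: m.start)
```

The caller keeps its own mapping from token to alert owner, so the matcher never needs to know who is subscribed to what.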

@mlissner
Member Author

Doctor is really about converting documents, so I don't think this CL-specific thing would necessarily go in there. I'd think this would work better as a search query, and we'll have the architecture for that fairly soon.

One thing that does merit cleanup is consolidating all these tasks into a better pipeline, though I think even that is tricky, since some of the things can be done before you extract text (like, well, extracting text), while others need the text (like x-ray, citation extraction, alerts). That's only relevant because text extraction is async, so it's kind of a pain to get it all laced together nicely, but it's not a huge thing.
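
As a rough illustration of that ordering constraint, the sketch below awaits text extraction first and then fans out the steps that need the text. Everything here is a placeholder: `asyncio` stands in for whatever task queue is actually used, and `extract_text`, `find_citations`, and `check_alerts` are stubs rather than real CourtListener functions.

```python
import asyncio


async def extract_text(pdf_bytes: bytes) -> str:
    # Placeholder for the real asynchronous extraction step.
    await asyncio.sleep(0)
    return pdf_bytes.decode("utf-8", errors="ignore")


async def find_citations(text: str) -> list[str]:
    # Placeholder: a real version would run a citation extractor on the text.
    await asyncio.sleep(0)
    return []


async def check_alerts(text: str, watched: list[str]) -> list[str]:
    # Placeholder: naive case-insensitive substring check.
    await asyncio.sleep(0)
    lowered = text.lower()
    return [w for w in watched if w.lower() in lowered]


async def process_document(pdf_bytes: bytes, watched: list[str]) -> dict:
    # Stage 1: nothing downstream can start until the text exists.
    text = await extract_text(pdf_bytes)
    # Stage 2: text-dependent steps are independent, so run them together.
    citations, alerts = await asyncio.gather(
        find_citations(text),
        check_alerts(text, watched),
    )
    return {"text": text, "citations": citations, "alerts": alerts}
```

Calling `asyncio.run(process_document(data, words))` runs both stages in order; adding another text-dependent step means adding one more coroutine to the `gather` call.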

@mlissner
Member Author

mlissner commented Jun 5, 2023

See also #45, which allows for customized include/exclude words for docket alerts.
