Changes to Policy and Guidance search before transitioning to sitemaps for global search #5595

Closed
5 tasks done
Tracked by #139
johnnyporkchops opened this issue Feb 2, 2023 · 6 comments
Labels
Guidance Search (work associated with Executive Order requiring Policy and Guidance search page), Pairing opportunity, Work: Content, Work: Front-end

Comments

@johnnyporkchops
Contributor

johnnyporkchops commented Feb 2, 2023

Summary

What we're after:
The way we currently limit the Policy and Guidance search to specific files and webpages will stop working once we also implement sitemaps for the FEC.gov global site search. To keep results limited to the policy and guidance HTML and PDF content, we will need to restrict the policy and guidance search site (https://search.usa.gov/sites/8042/domains) to specific folders.

Policy and guidance search site: https://search.usa.gov/sites/8042
Global search site: https://search.usa.gov/sites/6738
Sandbox: https://search.usa.gov/sites/8940

Related issues/PR:

Issues:

PR:

Completion criteria

Policy and guidance search continues to limit search results to the PDFs and webpages listed in the sitemaps once we implement sitemaps for global search.

Tech steps or considerations (optional)


  • For policy-guidance PDFs on S3: Create a subfolder on S3 for the files, something like /guidance or /policy-guidance
    • Should this be at /resources/cms-content/documents/<new folder> or just /<new folder>?
    • Move the current policy-guidance PDFs into the folder
    • Document the process for coordinating between the content team and the frontend team when a new doc is added or updated
      • Consider the viability of using some S3-command wizardry to automate (or partially automate) moving files
  • For policy-guidance HTML pages: Any policy-guidance HTML pages need to be aliased under a new parent page (for example: /guidance in Wagtail).
    • Create the parent page
    • Alias existing pages under the new parent
    • Document the process for creating new Wagtail pages, or adding existing ones, to the list of HTML pages searchable by policy-guidance, including the alias process.
  • Add both domains (paths) to https://search.usa.gov/sites/8042/domains
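
The "S3-command wizardry" idea in the steps above could start as a dry run. The following is a minimal sketch, not a decided process: it assumes the current PDFs sit directly under /resources/cms-content/documents/ and that the new folder is named policy-guidance (both are only options raised in this issue). The bucket name `example-bucket` and the file names are hypothetical. It never touches S3; it only builds a list of `aws s3 mv` commands that someone can review before running.

```python
import posixpath

# Assumptions for illustration only: the bucket name and both prefixes are placeholders.
BUCKET = "example-bucket"
OLD_PREFIX = "resources/cms-content/documents/"
NEW_PREFIX = "resources/cms-content/documents/policy-guidance/"


def plan_moves(keys, guidance_filenames):
    """Return (old_key, new_key) pairs for guidance PDFs sitting directly under OLD_PREFIX."""
    moves = []
    for key in keys:
        # Skip anything not directly in the documents folder
        # (this also skips files that are already in the new subfolder).
        if posixpath.dirname(key) + "/" != OLD_PREFIX:
            continue
        name = posixpath.basename(key)
        if name in guidance_filenames:
            moves.append((key, NEW_PREFIX + name))
    return moves


def as_cli_commands(moves, bucket=BUCKET):
    """Render each planned move as an `aws s3 mv` command string for manual review."""
    return [f"aws s3 mv s3://{bucket}/{old} s3://{bucket}/{new}" for old, new in moves]


if __name__ == "__main__":
    # Hypothetical object keys and guidance file names.
    existing_keys = [
        "resources/cms-content/documents/example-guidance.pdf",
        "resources/cms-content/documents/unrelated-report.pdf",
    ]
    guidance = {"example-guidance.pdf"}
    for command in as_cli_commands(plan_moves(existing_keys, guidance)):
        print(command)
```

Keeping the plan separate from the execution means the content team could supply the list of guidance file names while the frontend team reviews and runs the resulting commands.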

Before doing the above steps for the policy-guidance search.gov site, we need to test them on the search.gov sandbox using dev.fec.gov:

  • Test dev.fec.gov against the search.gov sandbox by adding dev URLs to its domains section (https://search.usa.gov/sites/8940/domains), for example: dev.fec.gov/resources/cms-content/documents/policy-guidance/ and dev.fec.gov/updates/guidance-search
  • If the above test is not definitive, set up a PR to dev in which the whole search config is set up like prod, including pointing to the sandbox endpoint.
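
When reviewing sandbox results, a small helper can flag any returned URL that falls outside the two dev paths. This is a minimal sketch that assumes the search.gov domains list behaves like a host-plus-path prefix filter; that behavior is what this issue hopes to confirm, not something established here. The function name `in_scope` is hypothetical.

```python
from urllib.parse import urlsplit

# Dev paths taken from the example in this issue. Treating them as prefix
# filters is an assumption about how the search.gov domains list scopes results.
ALLOWED_PREFIXES = (
    "dev.fec.gov/resources/cms-content/documents/policy-guidance/",
    "dev.fec.gov/updates/guidance-search",
)


def in_scope(url, prefixes=ALLOWED_PREFIXES):
    """Return True if the URL's host + path starts with one of the allowed prefixes."""
    parts = urlsplit(url)
    host_and_path = (parts.hostname or "") + parts.path
    return any(host_and_path.startswith(prefix) for prefix in prefixes)


def out_of_scope(result_urls):
    """Return the result URLs that should not appear in policy-guidance search."""
    return [url for url in result_urls if not in_scope(url)]
```

Pasting the URLs from a sandbox results page into `out_of_scope` should return an empty list if the domain restriction works as intended.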

Future work

Once this is set up and tested, we can implement sitemaps with global search. #5579

@dorothyyeager dorothyyeager added the Guidance Search Work associated with Executive Order requiring Policy and Guidance search page label Feb 2, 2023
@johnnyporkchops
Contributor Author

johnnyporkchops commented Feb 3, 2023

@patphongs I think it would be better to copy the pages under the new parent page and set each copy as a page alias, so that any update to the original is also reflected in the alias. Then we do not have to worry about redirects or losing track of pages in /updates.
https://guide.wagtail.org/en-latest/how-to-guides/copying-and-aliasing-pages/#alias-pages

@patphongs
Member

new parent page and set it as a page alias

Excellent find @johnnyporkchops! I just tested this in the dev environment for one of the pages and it works great! One question I do have, though, is whether search.gov will be able to see that there's an update on the alias page. Do they just crawl through our guidance search directory and check for changes between the pages?

@johnnyporkchops
Contributor Author

johnnyporkchops commented Feb 7, 2023

Do they just crawl through our guidance search directory and check for changes between the pages?

@patphongs Yes, I think that is how it works. I'll add that to a second round of questions for search.gov support (Amani). I assume the content team currently updates the lastmod date in the sitemap-html.xml file when a page is changed. Is that their current practice? (cc @kathycarothers)

@kathycarothers
Contributor

@johnnyporkchops Here is what we do:

Updating the Site Maps
Our sitemaps are XML files: one for the Guidance search documents that are HTML and one for those that are PDF. The files themselves are XML, NOT HTML. Both are uploaded to GitHub and to Wagtail, so you’ll have to update both copies. The Wagtail copy updates the Guidance Search immediately; the GitHub copy just keeps the version stored in our repo current.

When we upload a new PDF or publish a replacement PDF, you will need to update the PDF sitemap in both places. When we add a new HTML page or update an existing one, you will need to update the HTML sitemap in both places.
For replacement files or updated HTML pages, you’ll be updating the revision date.

Updating the XML file that is uploaded to Wagtail:
Go into the shared folder https://drive.google.com/drive/folders/19D5ElBRi0lPt8yJSBawHNGCN4Lr3TK6J?usp=sharing
Update the code using Notepad++ or some other editor. MAKE SURE IT IS SAVED AS XML and with the same name as before.
Upload the revised code to Wagtail into the XML collection (delete the old file first).
Upload the revised file to the Content team folder.
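
The revision-date step above could also be scripted rather than edited by hand. This is a minimal sketch, assuming the sitemap follows the standard sitemaps.org format with `<url>`, `<loc>`, and `<lastmod>` elements; the page URL and dates are hypothetical, and `set_lastmod` is an illustrative name, not an existing tool.

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace. Registering it with an empty prefix keeps the
# output free of "ns0:" prefixes.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)


def set_lastmod(sitemap_xml, page_url, new_date):
    """Return the sitemap XML with <lastmod> for page_url set to new_date (YYYY-MM-DD)."""
    root = ET.fromstring(sitemap_xml)
    ns = {"sm": SITEMAP_NS}
    for url in root.findall("sm:url", ns):
        loc = url.find("sm:loc", ns)
        if loc is not None and (loc.text or "").strip() == page_url:
            lastmod = url.find("sm:lastmod", ns)
            if lastmod is None:
                # The entry had no revision date yet, so add one.
                lastmod = ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod")
            lastmod.text = new_date
            return ET.tostring(root, encoding="unicode")
    raise ValueError(f"{page_url} not found in sitemap")


if __name__ == "__main__":
    # Hypothetical one-entry sitemap.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://www.fec.gov/example-page/</loc>"
        "<lastmod>2023-01-01</lastmod></url></urlset>"
    )
    print(set_lastmod(sample, "https://www.fec.gov/example-page/", "2023-02-08"))
```

The returned string has no XML declaration line, and it would still need to be saved as XML under the same file name and uploaded to both Wagtail and GitHub as described above.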

@johnnyporkchops johnnyporkchops mentioned this issue Feb 8, 2023
1 task
@johnnyporkchops
Contributor Author

johnnyporkchops commented Feb 22, 2023

New folder structure on S3 for guidance search is:

/resources/cms-content/documents/policy-guidance/

@pkfec
Contributor

pkfec commented Jun 15, 2023

Research done! Closing this issue

6 participants