SEO 2024 queries #3791

Merged
merged 8 commits into from
Nov 10, 2024

Conversation

henryp25
Contributor

@henryp25 henryp25 commented Oct 14, 2024

Makes progress on #3600

This PR adds the finalized SQL files, which now include an is_root_page element that distinguishes homepages from secondary pages. All SQL files use the June dataset, since the queries were originally built against it.

Context:
These changes finalize the SQL queries for the 2024 SEO analysis. The new is_root_page element separates homepage data from data for other pages, so the two groups can be analyzed independently. The 2022 SQL queries also received minor updates to align with the new dataset structure, and Common Table Expressions (CTEs) were introduced to improve efficiency and readability.

Changes Made:

  • Introduced an is_root_page element to separate homepage and secondary page data.
  • Updated all queries to use the June dataset, consistent with the original development.
  • Made slight modifications to the 2022 SQL files for better compatibility with the new dataset, and added CTEs to improve efficiency.
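
As a rough illustration of the pattern described above (a CTE feeding an aggregate that is split by is_root_page), here is a minimal sketch run against an in-memory SQLite table. It is not one of the queries in this PR: the pages table, the has_lazy_img column, and the sample rows are hypothetical stand-ins, and the real queries target the HTTP Archive dataset rather than SQLite.

```python
import sqlite3

# Hypothetical toy rows: (url, is_root_page, has_lazy_img).
# The real dataset schema is not shown in this PR; these columns are assumptions.
ROWS = [
    ("https://a.example/", 1, 1),
    ("https://a.example/about", 0, 0),
    ("https://b.example/", 1, 0),
    ("https://b.example/blog", 0, 1),
    ("https://c.example/", 1, 1),
]

QUERY = """
WITH page_flags AS (  -- CTE: pick the needed fields once, reuse below
  SELECT is_root_page, has_lazy_img
  FROM pages
)
SELECT
  is_root_page,
  COUNT(*) AS pages,
  SUM(has_lazy_img) AS pages_with_lazy_img
FROM page_flags
GROUP BY is_root_page
ORDER BY is_root_page DESC
"""


def summarize(rows):
    """Load rows into an in-memory table and aggregate by is_root_page."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (url TEXT, is_root_page INTEGER, has_lazy_img INTEGER)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for is_root, pages, lazy in summarize(ROWS):
        label = "homepage" if is_root else "secondary"
        print(f"{label}: {pages} pages, {lazy} with lazy-loaded images")
```

Grouping on is_root_page yields one result row for homepages and one for secondary pages, which is the separation the first bullet describes.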

@tunetheweb tunetheweb changed the title Add finalized SQL files with is_root_page element for improved efficiency SEO 2024 queries Oct 14, 2024
@tunetheweb tunetheweb added the analysis Querying the dataset label Oct 14, 2024
Member

@tunetheweb tunetheweb left a comment

Overall LGTM with a couple of small comments.

Let me know when good to merge.

sql/2024/seo/image-loading-property-usage-2024.sql (outdated, resolved)
sql/2024/seo/robots-text-size-2024.sql (outdated, resolved)
@tunetheweb tunetheweb merged commit 083de67 into HTTPArchive:main Nov 10, 2024
1 check passed
Labels
analysis Querying the dataset
2 participants