This repository contains a set of SQL queries for analyzing a LEGO Set dataset. The dataset includes information about various LEGO sets, such as set ID, name, year of release, theme, category, and more.
The dataset is structured with the following columns:
- Set_ID
- Name
- Year
- Theme
- Theme_Group
- Subtheme
- Category
- Packaging
- Num_Instructions
- Availability
- Pieces
- Minifigures
- Owned
- Rating
- USD_MSRP
- Total_Quantity
- Current_Price
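
As a rough sketch, the columns above could map to a table definition like the one below. The table name `Lego_set_data` (borrowed from the CSV file name) and every column type are assumptions, since the README only lists column names; adjust them to match how your SQL environment imports the file.

```sql
-- Hypothetical schema: the table name and all column types are assumptions,
-- not taken from the dataset itself.
CREATE TABLE Lego_set_data (
    Set_ID           VARCHAR(20),    -- kept as text in case IDs contain non-digit characters
    Name             VARCHAR(255),
    Year             INTEGER,        -- year of release
    Theme            VARCHAR(100),
    Theme_Group      VARCHAR(100),
    Subtheme         VARCHAR(100),
    Category         VARCHAR(50),
    Packaging        VARCHAR(50),
    Num_Instructions INTEGER,
    Availability     VARCHAR(50),
    Pieces           INTEGER,
    Minifigures      INTEGER,
    Owned            INTEGER,
    Rating           DECIMAL(3, 1),
    USD_MSRP         DECIMAL(10, 2),
    Total_Quantity   INTEGER,
    Current_Price    DECIMAL(10, 2)
);
```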

The queries cover the following tasks:
- Select all columns from the dataset:
- Filter data for a specific year:
- Count the number of rows in the dataset:
- Find the unique themes in the dataset:
- Calculate the average rating:
- Find the highest USD_MSRP:
- Show the top 10 highest-rated sets:
- Count the number of sets in each theme:
- Find the average USD_MSRP for each year:
- List the sets with more than 500 pieces:
- Calculate the total number of pieces for each theme:
- Find the sets with the highest number of minifigures:
- Count the number of sets in each category for a specific year:
- Calculate the average number of minifigures for each theme group:
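
The tasks listed above could be written roughly as follows. This is a sketch rather than the repository's exact queries: it assumes the CSV has been loaded into a table named `Lego_set_data`, uses `2020` as an arbitrary example year, and relies on `LIMIT`, which MySQL, SQLite, and PostgreSQL all support. The column aliases are illustrative.

```sql
-- Assumption: the data lives in a table named Lego_set_data.

-- Select all columns from the dataset
SELECT * FROM Lego_set_data;

-- Filter data for a specific year (2020 is an example value)
SELECT * FROM Lego_set_data WHERE Year = 2020;

-- Count the number of rows in the dataset
SELECT COUNT(*) AS total_rows FROM Lego_set_data;

-- Find the unique themes in the dataset
SELECT DISTINCT Theme FROM Lego_set_data;

-- Calculate the average rating
SELECT AVG(Rating) AS avg_rating FROM Lego_set_data;

-- Find the highest USD_MSRP
SELECT MAX(USD_MSRP) AS max_msrp FROM Lego_set_data;

-- Show the top 10 highest-rated sets
SELECT Set_ID, Name, Rating
FROM Lego_set_data
WHERE Rating IS NOT NULL          -- NULL sort order differs between engines
ORDER BY Rating DESC
LIMIT 10;

-- Count the number of sets in each theme
SELECT Theme, COUNT(*) AS set_count
FROM Lego_set_data
GROUP BY Theme
ORDER BY set_count DESC;

-- Find the average USD_MSRP for each year
SELECT Year, AVG(USD_MSRP) AS avg_msrp
FROM Lego_set_data
GROUP BY Year
ORDER BY Year;

-- List the sets with more than 500 pieces
SELECT Set_ID, Name, Pieces
FROM Lego_set_data
WHERE Pieces > 500;

-- Calculate the total number of pieces for each theme
SELECT Theme, SUM(Pieces) AS total_pieces
FROM Lego_set_data
GROUP BY Theme;

-- Find the sets with the highest number of minifigures (ties included)
SELECT Set_ID, Name, Minifigures
FROM Lego_set_data
WHERE Minifigures = (SELECT MAX(Minifigures) FROM Lego_set_data);

-- Count the number of sets in each category for a specific year
SELECT Category, COUNT(*) AS set_count
FROM Lego_set_data
WHERE Year = 2020
GROUP BY Category;

-- Calculate the average number of minifigures for each theme group
SELECT Theme_Group, AVG(Minifigures) AS avg_minifigures
FROM Lego_set_data
GROUP BY Theme_Group;
```

Aggregates such as `AVG`, `MAX`, and `SUM` skip NULL values, so rows with a missing rating or price do not affect those results.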

Ensure that you have the dataset file (Lego_set_data.csv) available. Modify the file name in the queries to match the actual file name.

Executing Queries:
Copy and paste the SQL queries into your preferred SQL environment (e.g., MySQL, SQLite, PostgreSQL). Execute the queries to retrieve specific information or insights from the dataset.

Analysis and Interpretation:
Use the results of the queries to analyze various aspects of the LEGO Set dataset. Explore trends, distributions, and relationships between different columns.
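
As an illustration of that kind of exploration, the two queries below look at a trend over time and at a relationship between two columns. They are a sketch only: the table name `Lego_set_data`, the aliases, and the minimum of 5 sets per theme are assumptions chosen for the example.

```sql
-- Trend: how the number of sets and the average set size change by year.
SELECT Year,
       COUNT(*)    AS set_count,
       AVG(Pieces) AS avg_pieces
FROM Lego_set_data
GROUP BY Year
ORDER BY Year;

-- Relationship: average price per piece for each theme.
-- Multiplying by 1.0 avoids integer division if USD_MSRP was imported as an integer.
SELECT Theme,
       COUNT(*)                      AS set_count,
       AVG(USD_MSRP * 1.0 / Pieces)  AS avg_price_per_piece
FROM Lego_set_data
WHERE Pieces > 0                     -- skip sets with no piece count to avoid dividing by zero
  AND USD_MSRP IS NOT NULL
GROUP BY Theme
HAVING COUNT(*) >= 5                 -- arbitrary cutoff so tiny themes do not dominate
ORDER BY avg_price_per_piece DESC;
```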