LEGO Database: A Pandas Joining and Merging Project

codevalidated

List Spare Parts in Inventory

Create a list of unique part names that are spare in the inventory. Store this list, sorted in ascending lexicographical order, in the variable spare_part_names.

Your list should look like this:

['1/4 CIRCLE TILE 1X1',
 '1/4 CIRCLE TILE 1X1 with Lattice Pie Print',
 '1/4 CIRCLE TILE 1X1 with Pizza Print',
 '1/4 CIRCLE TILE 1X1 with Watermelon Print',
 '10240stk01',
 '10240stk02',
 '10240stk03',
 '3.2 shaft w/ knob',
 '3D Glasses Atlantis',
 'Adventurers Mini Comic Book 1',
 'Adventurers Mini Comic Book 2',
 ...
]

codevalidated

Count Parts with Quantity Over 5 for Each Transparent Color

Calculate, for each transparent color, the number of parts in the inventory with a quantity greater than 5. Store the resulting series of counts in the variable trans_colors.

Your series should look like this:

Activity 2

codevalidated

Identify Theme(s) with the Highest Number of Sets (1970-2000)

Determine which theme or themes had the most sets released between the years 1970 and 2000. Store the list of themes with the highest set counts in the variable themes_with_max_count_names.

codevalidated

Count the Number of Parts in Each Category

Calculate the number of parts belonging to each category and store the resulting series of counts in the variable categories.

codevalidated

Count the Total Parts by Color

Utilizing the colors_df and inventory_parts_df dataframes, calculate the total number of parts in the inventory for each color_id. Store the resulting series of counts in the variable colors.

codevalidated

Calculate Average Parts per Set for Each Theme and Find the Theme(s) with the Highest Average

Compute the average number of parts per set for each theme and identify the theme or themes with the highest average. Store the resulting list of theme names in the variable highest_average_themes.

codevalidated

Identify Inventory with the Most Sets and Highest Version

Determine the inventory with the maximum number of sets and the highest version simultaneously. Store the resulting inventory name in the variable inventory_name and the version in the variable version. The set_num column in the inventories_df dataframe represents the name of the inventory.

codevalidated

Identify Inventory with the Highest Total Quantity Across Part Categories

Determine the inventory or inventories with the highest total quantity of parts across all part categories. Store the resulting list of inventory names in the variable inventory_names. The set_num column in the inventories_df dataframe represents the name of the inventory.

Hint: Use thegroupby() operation to group the data by inventory_id and part_cat_id. Calculate the total quantity of parts for each inventory and part category.

codevalidated

Determine the parent theme with the highest number of child themes.

Store the resulting list in the variable parent_themes_with_highest_count.

Hint: Think of merging themes_df with itself using suitable suffixes. Count the number of unique child themes for each parent theme. Finally, find the parent theme with the maximum number of child themes.

codevalidated

Identify the set numbers of the inventory with the highest total quantity of spare parts

Store the resulting list of unique set numbers in the variable set_nums_inventory_with_highest_spare_parts.

codevalidated

Determine the inventory that contains sets with the highest total number of unique colors

Store the resulting list in the variable inventory_names.

codevalidated

Calculate the total number of sets per theme category and year.

Store the resulting series in the variable total_sets_per_theme_year.

Anurag Verma

Project Activities

List Spare Parts in Inventory

Count Parts with Quantity Over 5 for Each Transparent Color

Identify Theme(s) with the Highest Number of Sets (1970-2000)

Count the Number of Parts in Each Category

Count the Total Parts by Color

Calculate Average Parts per Set for Each Theme and Find the Theme(s) with the Highest Average

Identify Inventory with the Most Sets and Highest Version

Identify Inventory with the Highest Total Quantity Across Part Categories

Determine the parent theme with the highest number of child themes.

Identify the set numbers of the inventory with the highest total quantity of spare parts

Determine the inventory that contains sets with the highest total number of unique colors

Calculate the total number of sets per theme category and year.

Anurag Verma

Data Wrangling with Pandas

Set Operations using Sakila

LIKE Operator using World

Membership and Range Operators with World Database