All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
Create a list of the unique names of parts that are marked as spare in the inventory. Store this list, sorted in ascending lexicographical order, in the variable spare_part_names.
Your list should look like this:
['1/4 CIRCLE TILE 1X1',
'1/4 CIRCLE TILE 1X1 with Lattice Pie Print',
'1/4 CIRCLE TILE 1X1 with Pizza Print',
'1/4 CIRCLE TILE 1X1 with Watermelon Print',
'10240stk01',
'10240stk02',
'10240stk03',
'3.2 shaft w/ knob',
'3D Glasses Atlantis',
'Adventurers Mini Comic Book 1',
'Adventurers Mini Comic Book 2',
...
]
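A minimal sketch of one way to approach this task, using toy data. The dataframe name parts_df and the columns part_num, name, and is_spare are assumptions for illustration; the real dataset may use different names, or may store the spare flag as strings rather than booleans, in which case the filter has to be adapted.

```python
import pandas as pd

# Toy stand-ins for the real dataframes; all column names are assumptions.
parts_df = pd.DataFrame({
    "part_num": ["p1", "p2", "p3"],
    "name": ["Brick 2x4", "3D Glasses Atlantis", "Axle 4"],
})
inventory_parts_df = pd.DataFrame({
    "inventory_id": [1, 1, 2, 2],
    "part_num": ["p1", "p2", "p2", "p3"],
    "is_spare": [False, True, True, True],
})

# Keep only the rows flagged as spare, then attach the part names.
spare_rows = inventory_parts_df[inventory_parts_df["is_spare"]]
spare_named = spare_rows.merge(parts_df, on="part_num", how="inner")

# De-duplicate, then sort; sorted() on strings is lexicographic.
spare_part_names = sorted(spare_named["name"].unique())
print(spare_part_names)
```

Note that lexicographic sorting places digits and uppercase letters before lowercase ones, which is why names starting with numbers appear first in the expected list.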
Calculate, for each transparent color, the number of parts in the inventory with a quantity greater than 5. Store the resulting series of counts in the variable trans_colors.
Your series should look like this:
Determine which theme or themes had the most sets released between the years 1970 and 2000. Store the list of themes with the highest set counts in the variable themes_with_max_count_names.
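A possible sketch for this task, on toy data. The dataframe sets_df and the columns year, theme_id, id, and name are assumptions, and so is the choice to treat both 1970 and 2000 as inclusive bounds.

```python
import pandas as pd

# Toy data; dataframe and column names are assumptions.
sets_df = pd.DataFrame({
    "set_num": ["a", "b", "c", "d", "e", "f", "g"],
    "year": [1969, 1975, 1980, 1999, 2000, 2005, 1990],
    "theme_id": [1, 1, 2, 2, 3, 3, 1],
})
themes_df = pd.DataFrame({"id": [1, 2, 3], "name": ["Town", "Space", "Castle"]})

# Restrict to the year range (between() is inclusive on both ends by default).
in_range = sets_df[sets_df["year"].between(1970, 2000)]

# Rename theme columns before merging so they cannot clash with set columns.
theme_names = themes_df.rename(columns={"id": "theme_id", "name": "theme_name"})
counts = in_range.merge(theme_names, on="theme_id")["theme_name"].value_counts()

# Keep every theme tied for the maximum, not just the first one.
themes_with_max_count_names = sorted(counts[counts == counts.max()].index)
print(themes_with_max_count_names)
```

Comparing against counts.max() rather than calling idxmax() is what lets the result contain several themes when there is a tie.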
Calculate the number of parts belonging to each category and store the resulting series of counts in the variable categories.
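As a rough illustration, counting rows per category could look like the following. The dataframe parts_df and the column part_num are assumed names; part_cat_id is taken from the hint text further down.

```python
import pandas as pd

# Toy data; parts_df and part_num are assumed names.
parts_df = pd.DataFrame({
    "part_num": ["p1", "p2", "p3", "p4", "p5", "p6"],
    "part_cat_id": [1, 1, 2, 3, 3, 3],
})

# One row per part, so counting rows per category gives parts per category.
categories = parts_df.groupby("part_cat_id")["part_num"].count()
print(categories)
```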
Using the colors_df and inventory_parts_df dataframes, calculate the total number of parts in the inventory for each color_id. Store the resulting series of counts in the variable colors.
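One way this might be sketched, on toy data. The columns id and quantity are assumptions, and so is the reading of "total number of parts" as the sum of quantity rather than a count of rows.

```python
import pandas as pd

# Toy data; the id and quantity columns are assumptions.
colors_df = pd.DataFrame({"id": [0, 1, 2], "name": ["Black", "Blue", "Red"]})
inventory_parts_df = pd.DataFrame({
    "inventory_id": [1, 1, 2, 2],
    "color_id": [0, 1, 0, 0],
    "quantity": [4, 2, 1, 3],
})

# An inner join keeps only inventory rows whose color exists in colors_df.
merged = inventory_parts_df.merge(
    colors_df, left_on="color_id", right_on="id", how="inner"
)

# Sum the quantities per color.
colors = merged.groupby("color_id")["quantity"].sum()
print(colors)
```

With an inner join, a color that never appears in the inventory (Red in this toy data) is absent from the result rather than listed with a total of zero.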
Compute the average number of parts per set for each theme and identify the theme or themes with the highest average. Store the resulting list of theme names in the variable highest_average_themes.
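A hedged sketch of the averaging step, on toy data. The dataframe sets_df and the columns num_parts, theme_id, id, and name are assumptions about the dataset layout.

```python
import pandas as pd

# Toy data; dataframe and column names are assumptions.
sets_df = pd.DataFrame({
    "set_num": ["a", "b", "c", "d", "e"],
    "theme_id": [1, 1, 2, 2, 3],
    "num_parts": [100, 300, 200, 200, 50],
})
themes_df = pd.DataFrame({"id": [1, 2, 3], "name": ["Town", "Space", "Castle"]})

# Average number of parts per set within each theme.
avg_parts = sets_df.groupby("theme_id")["num_parts"].mean()

# Collect every theme id tied for the highest average.
top_ids = avg_parts[avg_parts == avg_parts.max()].index

# Translate the ids back to theme names.
highest_average_themes = themes_df.loc[
    themes_df["id"].isin(top_ids), "name"
].tolist()
print(highest_average_themes)
```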
Determine the inventory that has both the maximum number of sets and the highest version. Store the resulting inventory name in the variable inventory_name and the version in the variable version. The set_num column in the inventories_df dataframe represents the name of the inventory.
Determine the inventory or inventories with the highest total quantity of parts across all part categories. Store the resulting list of inventory names in the variable inventory_names. The set_num column in the inventories_df dataframe represents the name of the inventory.
Hint: Use the groupby() operation to group the data by inventory_id and part_cat_id. Calculate the total quantity of parts for each inventory and part category.
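The hint above can be sketched roughly as follows, on toy data. The dataframe parts_df, the columns part_num, quantity, and id, and the set numbers themselves are made up for illustration.

```python
import pandas as pd

# Toy data; parts_df, part_num, quantity, id and the set numbers are assumptions.
inventory_parts_df = pd.DataFrame({
    "inventory_id": [1, 1, 1, 2, 2],
    "part_num": ["p1", "p2", "p3", "p1", "p3"],
    "quantity": [2, 3, 1, 4, 2],
})
parts_df = pd.DataFrame({
    "part_num": ["p1", "p2", "p3"],
    "part_cat_id": [10, 10, 20],
})
inventories_df = pd.DataFrame({
    "id": [1, 2],
    "version": [1, 1],
    "set_num": ["set-A", "set-B"],
})

# Attach each part's category, then total the quantity per (inventory, category).
merged = inventory_parts_df.merge(parts_df, on="part_num")
per_category = merged.groupby(["inventory_id", "part_cat_id"])["quantity"].sum()

# Collapse the category level to get one total per inventory.
per_inventory = per_category.groupby(level="inventory_id").sum()

# Keep every inventory tied for the highest total and look up its name.
top_ids = per_inventory[per_inventory == per_inventory.max()].index
inventory_names = inventories_df.loc[
    inventories_df["id"].isin(top_ids), "set_num"
].tolist()
print(inventory_names)
```

Grouping by both keys first keeps the intermediate per-category totals available for inspection before they are summed per inventory.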
Store the resulting list in the variable parent_themes_with_highest_count.
Hint: Think of merging themes_df with itself using suitable suffixes. Count the number of unique child themes for each parent theme. Finally, find the parent theme with the maximum number of child themes.
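The self-merge described in the hint might look like the sketch below. The columns id, name, and parent_id are assumptions about themes_df, and the theme hierarchy is toy data.

```python
import pandas as pd

# Toy hierarchy; the columns id, name and parent_id are assumptions.
nan = float("nan")
themes_df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "name": ["Technic", "Arctic", "Town", "Airport", "Harbor", "Space"],
    "parent_id": [nan, 1, nan, 3, 3, nan],
})

# Top-level themes have no parent; drop them and restore an integer key.
children = themes_df.dropna(subset=["parent_id"]).astype({"parent_id": "int64"})

# Self-merge: each child row is paired with the row of its parent theme.
pairs = children.merge(
    themes_df,
    left_on="parent_id",
    right_on="id",
    suffixes=("_child", "_parent"),
)

# Count distinct children per parent, keyed by id since names may repeat.
child_counts = pairs.groupby("id_parent")["id_child"].nunique()
top_ids = child_counts[child_counts == child_counts.max()].index

parent_themes_with_highest_count = themes_df.loc[
    themes_df["id"].isin(top_ids), "name"
].tolist()
print(parent_themes_with_highest_count)
```

The suffixes keep the overlapping id and name columns apart, so id_child and id_parent can be referenced unambiguously after the merge.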
Store the resulting list of unique set numbers in the variable set_nums_inventory_with_highest_spare_parts.
Store the resulting list in the variable inventory_names.
Store the resulting series in the variable total_sets_per_theme_year.