All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
Determine and input the total number of games that Nintendo has published, as an integer.
Count the games by their unique name rather than by their rank.
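A minimal sketch of counting by unique name, assuming games is a list of dictionaries with "Rank", "Name" and "Publisher" keys. The rows below are made-up placeholders, not values from the real dataset, and the variable name nintendo_game_count is hypothetical:

```python
# Hypothetical sample rows; the real dataset's layout may differ.
games = [
    {"Rank": 1, "Name": "Game A", "Publisher": "Nintendo"},
    {"Rank": 2, "Name": "Game B", "Publisher": "Nintendo"},
    {"Rank": 3, "Name": "Game A", "Publisher": "Nintendo"},  # same name, different rank
    {"Rank": 4, "Name": "Game C", "Publisher": "Sega"},
]

# A set keeps each name once, so a title listed under several ranks
# is not double-counted.
nintendo_names = {g["Name"] for g in games if g["Publisher"] == "Nintendo"}
nintendo_game_count = len(nintendo_names)
print(nintendo_game_count)
```

Counting rows instead of names would give 3 here, which is the mistake the note about rank warns against.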
Enter the name of a game with the highest global sales as a string.
Enter the total number of games released in the year 2012 as an integer.
Count unique games by name and not by rank.
Create a dictionary top_10_games with the name of the game as key and the global sales as value.
If multiple games share the same name, use the one with the highest global sales.
Example expected output:
{'Wii Sports': 82.74,
'Super Mario Bros.': 40.24,
'Mario Kart Wii': 35.82,
'Wii Sports Resort': 33.0,
...
}
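One possible way to build such a dictionary is sketched below. It assumes games is a list of dictionaries with "Name" and "Global_Sales" keys, and that "top 10" means the ten highest global sales values. The sample rows are made-up placeholders:

```python
# Hypothetical sample rows; names and numbers are placeholders.
games = [
    {"Name": "Game A", "Global_Sales": 5.0},
    {"Name": "Game B", "Global_Sales": 9.5},
    {"Name": "Game A", "Global_Sales": 7.25},  # duplicate name, higher sales
    {"Name": "Game C", "Global_Sales": 1.0},
]

# Step 1: keep only the highest global sales seen for each name.
best_sales = {}
for g in games:
    name, sales = g["Name"], g["Global_Sales"]
    if name not in best_sales or sales > best_sales[name]:
        best_sales[name] = sales

# Step 2: sort by sales, highest first, and keep at most ten entries.
ranked = sorted(best_sales.items(), key=lambda item: item[1], reverse=True)
top_10_games = dict(ranked[:10])
print(top_10_games)
```

Resolving duplicates before slicing matters: slicing first could let two rows with the same name occupy two of the ten slots.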
Your input should be formatted as platform: global_sales, e.g. Duck Hunt: 28.31.
There is a space after the colon.
Create a dictionary called platform_sales, which will consist of nested dictionaries. Each nested dictionary will have two key-value pairs: one with the key total_sales and its corresponding value, and another with the key game_count and its corresponding value.
{'2600': {'total_sales': 97.08000000000003, 'game_count': 133},
'PSP': {'total_sales': 296.2799999999948, 'game_count': 1213},
'XOne': {'total_sales': 141.05999999999995, 'game_count': 213},
'GC': {'total_sales': 199.3600000000007, 'game_count': 556},
'WiiU': {'total_sales': 81.86000000000006, 'game_count': 143},
'GEN': {'total_sales': 28.360000000000003, 'game_count': 27},
...
}
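The nested structure above could be accumulated in a single pass, as in this sketch. It assumes games is a list of dictionaries with "Platform" and "Global_Sales" keys, and the rows are made-up placeholders:

```python
# Hypothetical sample rows; platform codes and numbers are placeholders.
games = [
    {"Name": "Game A", "Platform": "P1", "Global_Sales": 2.0},
    {"Name": "Game B", "Platform": "P1", "Global_Sales": 1.5},
    {"Name": "Game C", "Platform": "P2", "Global_Sales": 0.5},
]

platform_sales = {}
for g in games:
    # setdefault creates the inner dictionary the first time a platform appears.
    stats = platform_sales.setdefault(
        g["Platform"], {"total_sales": 0.0, "game_count": 0}
    )
    stats["total_sales"] += g["Global_Sales"]
    stats["game_count"] += 1

print(platform_sales)
```

Long decimal tails such as 97.08000000000003 in the expected output are ordinary floating-point accumulation, so they do not need to be rounded away.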
Create a list distribution of tuples, each holding the name of a platform and its percentage of global sales.
The percentage is the platform's Global_Sales / total Global_Sales, multiplied by 100 as the example values show.
If one platform has multiple games, the platform's global sales are the sum of all those games' global sales.
Example expected output:
[('Wii', 10.38861311773706),
('NES', 2.8145472644843057),
('GB', 2.8636479814892892),
('DS', 9.220285098043025),
('X360', 10.985556766256583),
('PS3', 10.737586935172041),...]
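A sketch of the percentage calculation, assuming games is a list of dictionaries with "Platform" and "Global_Sales" keys and that the result is scaled to 0-100 as the example values suggest. The rows are made-up placeholders:

```python
# Hypothetical sample rows; platform codes and numbers are placeholders.
games = [
    {"Platform": "P1", "Global_Sales": 2.0},
    {"Platform": "P1", "Global_Sales": 1.0},
    {"Platform": "P2", "Global_Sales": 1.0},
]

# Sum global sales per platform first.
totals = {}
for g in games:
    totals[g["Platform"]] = totals.get(g["Platform"], 0.0) + g["Global_Sales"]

# Then divide each platform total by the grand total and scale to a percentage.
grand_total = sum(totals.values())
distribution = [(platform, total / grand_total * 100)
                for platform, total in totals.items()]
print(distribution)
```

The percentages should add up to roughly 100, which is a quick sanity check on the real data.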
Enter the name of the genre with the highest total sales in North America as a string.
Create a dictionary publisher_game_count where the keys are the publishers and the values are the counts of games whose global_sales is greater than 1.0 (million).
Example expected output:
{ 'Take-Two Interactive': 91,
'Sony Computer Entertainment': 147,
'Activision': 161,
'Ubisoft': 114,
'Bethesda Softworks': 22,
'Electronic Arts': 338,
'Sega': 74,
'SquareSoft': 19,
'Atari': 42,
'505 Games': 7,...
}
There can be multiple games with the same publisher; you have to take the sum of all the games' global sales for that publisher.
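The sketch below takes one reading of this activity: the value for each publisher is a count of games above the threshold, which is what the integer values in the example output suggest. It assumes games is a list of dictionaries with "Publisher" and "Global_Sales" keys, and the rows are made-up placeholders:

```python
# Hypothetical sample rows; publisher names and numbers are placeholders.
games = [
    {"Name": "Game A", "Publisher": "Pub X", "Global_Sales": 3.2},
    {"Name": "Game B", "Publisher": "Pub X", "Global_Sales": 0.4},
    {"Name": "Game C", "Publisher": "Pub X", "Global_Sales": 1.1},
    {"Name": "Game D", "Publisher": "Pub Y", "Global_Sales": 1.0},  # not strictly greater
    {"Name": "Game E", "Publisher": "Pub Y", "Global_Sales": 2.0},
]

publisher_game_count = {}
for g in games:
    # Only games with more than 1.0 (million) in global sales are counted.
    if g["Global_Sales"] > 1.0:
        publisher = g["Publisher"]
        publisher_game_count[publisher] = publisher_game_count.get(publisher, 0) + 1

print(publisher_game_count)
```

A publisher with no game above the threshold never gets a key under this approach; whether it should appear with a count of 0 is not stated in the activity.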
Create a list game_year of tuples, each with the name of the game and its release year.
Example expected output:
[('Wii Sports', 2006),
('Super Mario Bros.', 1985),
('Mario Kart Wii', 2008),
('Wii Sports Resort', 2009),
('Pokemon Red/Pokemon Blue', 1996),...]
Create a new variable named total_sales_by_platform that contains the platform names as keys and their corresponding total sales as values.
Example of expected output:
{'Wii': 0.01,
'NES': 0.12,
'GB': 0.12,
'DS': 0.01,
'X360': 0.01,
'PS3': 0.02,
'PS2': 0.01,
'SNES': 0.02,
'GBA': 0.02,
'3DS': 0.02,..}
Create a dictionary year_game where the keys are the years and the values are the lists of games released in each year.
Example of expected output:
{2006: ['Wii Sports',
'Gears of War',
'Pokemon Diamond/Pokemon Pearl',
'New Super Mario Bros.',
'Wii Play',
'Final Fantasy XII',
'Brain Age: Train Your Brain in Minutes a Day',
...],
2008: ['Mario Kart Wii',
'Grand Theft Auto IV',
'Wii Fit',
...],
...
}
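Grouping names by year, as described above, can be sketched with collections.defaultdict. This assumes games is a list of dictionaries with "Name" and "Year" keys, and the rows are made-up placeholders:

```python
from collections import defaultdict

# Hypothetical sample rows; names and years are placeholders.
games = [
    {"Name": "Game A", "Year": 2006},
    {"Name": "Game B", "Year": 2008},
    {"Name": "Game C", "Year": 2006},
]

# defaultdict(list) starts every new year with an empty list.
grouped = defaultdict(list)
for g in games:
    grouped[g["Year"]].append(g["Name"])

# Convert back to a plain dict so the result matches the expected type.
year_game = dict(grouped)
print(year_game)
```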
Create a dictionary publisher_platform where the keys are the publishers and the values are the lists of platforms associated with each publisher.
Example of expected output:
{ 'Nintendo': ['Wii', 'NES', 'GB', 'DS', 'SNES', ... ],
'Microsoft Game Studios': ['X360', 'PC', 'XOne', 'XB', 'XBL', ... ],
'Take-Two Interactive': ['PS2', 'PS3', 'X360', 'PS4', 'PC', ... ],
'Sony Computer Entertainment': ['PS', 'PS2', 'PS3', 'PS4', 'PSP', ... ],
'Activision': ['PS2', 'PS3', 'X360', 'PS', 'Wii', ... ],
...
}
If a publisher has multiple games on the same platform, then the platform should only be listed once.
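A sketch of collecting each platform once per publisher while keeping first-seen order, assuming games is a list of dictionaries with "Publisher" and "Platform" keys. The rows are made-up placeholders:

```python
# Hypothetical sample rows; publisher and platform names are placeholders.
games = [
    {"Publisher": "Pub X", "Platform": "P1"},
    {"Publisher": "Pub X", "Platform": "P2"},
    {"Publisher": "Pub X", "Platform": "P1"},  # repeated platform
    {"Publisher": "Pub Y", "Platform": "P2"},
]

publisher_platform = {}
for g in games:
    platforms = publisher_platform.setdefault(g["Publisher"], [])
    # Append only if this platform has not been recorded for the publisher yet.
    if g["Platform"] not in platforms:
        platforms.append(g["Platform"])

print(publisher_platform)
```

Using a list with a membership check, rather than a set, keeps the platforms in the order they first appear, which is what the example output seems to show.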
We have a variable called games that contains details of various video games, such as their names, platforms, release years, genres, publishers, and sales figures in different regions. However, we noticed that some games share the same year, platform, genre, and publisher, leading to duplicate entries in the dataset.
To address this issue, you will create a new variable called normalized_games that contains the details of all the unique games, without duplication, grouped by the combination of year, platform, genre, and publisher.
Expected structure of the normalized_games variable:
normalized_games = {
    year: {
        platform: {
            genre: {
                publisher: {
                    games: [
                        {
                            'name': name,
                            'na_sales': na_sales,
                            'eu_sales': eu_sales,
                            'jp_sales': jp_sales,
                            'other_sales': other_sales,
                            'global_sales': global_sales
                        },
                        ...
                    ]
                }
            }
        }
    }
}
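The nesting above can be built by chaining setdefault calls, one per grouping level. This is a sketch under assumptions: games is taken to be a list of dictionaries with the column-style keys shown below, the innermost key is the string "games", and the rows are made-up placeholders:

```python
# Hypothetical rows; key names, titles, and numbers are placeholders.
games = [
    {"Name": "Game A", "Platform": "P1", "Year": 2006, "Genre": "Sports",
     "Publisher": "Pub X", "NA_Sales": 1.0, "EU_Sales": 0.5, "JP_Sales": 0.25,
     "Other_Sales": 0.25, "Global_Sales": 2.0},
    {"Name": "Game B", "Platform": "P1", "Year": 2006, "Genre": "Sports",
     "Publisher": "Pub X", "NA_Sales": 0.5, "EU_Sales": 0.25, "JP_Sales": 0.0,
     "Other_Sales": 0.25, "Global_Sales": 1.0},
    {"Name": "Game C", "Platform": "P2", "Year": 2008, "Genre": "Racing",
     "Publisher": "Pub Y", "NA_Sales": 0.5, "EU_Sales": 0.5, "JP_Sales": 0.0,
     "Other_Sales": 0.0, "Global_Sales": 1.0},
]

normalized_games = {}
for g in games:
    # Walk down (or create) one level per grouping field.
    publisher_node = (
        normalized_games
        .setdefault(g["Year"], {})
        .setdefault(g["Platform"], {})
        .setdefault(g["Genre"], {})
        .setdefault(g["Publisher"], {"games": []})
    )
    # Only per-game fields are stored at the leaf; the grouping fields
    # live in the keys above it and are not repeated.
    publisher_node["games"].append({
        "name": g["Name"],
        "na_sales": g["NA_Sales"],
        "eu_sales": g["EU_Sales"],
        "jp_sales": g["JP_Sales"],
        "other_sales": g["Other_Sales"],
        "global_sales": g["Global_Sales"],
    })

print(normalized_games[2006]["P1"]["Sports"]["Pub X"]["games"])
```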
Create a sorted list named top_3_platforms containing the 3 platforms with the highest average sales per game, considering only platforms that have released at least 50 games.
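A sketch of the filter-then-rank logic, assuming games is a list of dictionaries with "Platform" and "Global_Sales" keys and that the list should hold platform names only. The helper top_platforms and its parameters are hypothetical, and the toy call uses min_games=2 only because the placeholder data is tiny; the activity itself calls for 50:

```python
def top_platforms(games, min_games=50, top_n=3):
    """Return the top_n platform names by average global sales per game."""
    totals, counts = {}, {}
    for g in games:
        platform = g["Platform"]
        totals[platform] = totals.get(platform, 0.0) + g["Global_Sales"]
        counts[platform] = counts.get(platform, 0) + 1

    # Drop platforms with too few games before computing averages.
    averages = {p: totals[p] / counts[p] for p in totals if counts[p] >= min_games}

    # Sort platform names by their average, highest first.
    return sorted(averages, key=averages.get, reverse=True)[:top_n]


# Hypothetical sample rows; platform codes and numbers are placeholders.
sample = [
    {"Platform": "P1", "Global_Sales": 4.0},
    {"Platform": "P1", "Global_Sales": 2.0},
    {"Platform": "P2", "Global_Sales": 1.0},
    {"Platform": "P2", "Global_Sales": 1.0},
    {"Platform": "P3", "Global_Sales": 9.0},  # single game, filtered out below
    {"Platform": "P4", "Global_Sales": 2.0},
    {"Platform": "P4", "Global_Sales": 2.0},
    {"Platform": "P5", "Global_Sales": 0.5},
    {"Platform": "P5", "Global_Sales": 0.5},
]

top_3_platforms = top_platforms(sample, min_games=2)
print(top_3_platforms)
```

The minimum-count filter is what keeps a platform with one best-selling title from dominating the ranking.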
Enter the name of the genre with the highest total sales in North America and Europe as a string.
Create a dictionary publisher_avg_sales where the keys are the publishers and the values are the average sales for each publisher. Average sales are the publisher's Global_Sales / total Global_Sales.
Example of expected output:
{'Nintendo': 0.196,
'Microsoft Game Studios': 0.104,
'Take-Two Interactive': 0.055,
'Sony Computer Entertainment': 0.081,
...
}
Create a list top_ranked of tuples, each containing the rank, name, and genre of one of the selected games.
Example of expected output:
[(1, 'Wii Sports', 'Sports'),
(2, 'Super Mario Bros.', 'Platform'),
(3, 'Mario Kart Wii', 'Racing'),...]
Construct a sorted list named top_genre, consisting of genres ordered by highest average sales per game. Only consider games that have been released in the last 10 years.
Note: 'last 10 years' corresponds to the last decade based on the latest year present in the dataset.
Expected output:
['Shooter',
'Platform',
...
]
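A sketch of the ranking above. It assumes games is a list of dictionaries with "Genre", "Year" and "Global_Sales" keys, that every row has a valid year, and that "last 10 years" means the ten calendar years ending at the latest year, inclusive. That window boundary is an interpretation, not something the activity states. The rows are made-up placeholders:

```python
# Hypothetical sample rows; genres are real labels, numbers are placeholders.
games = [
    {"Genre": "Shooter", "Year": 2020, "Global_Sales": 4.0},
    {"Genre": "Shooter", "Year": 2015, "Global_Sales": 2.0},
    {"Genre": "Puzzle", "Year": 2018, "Global_Sales": 1.0},
    {"Genre": "Puzzle", "Year": 2005, "Global_Sales": 50.0},  # outside the window
    {"Genre": "Racing", "Year": 2011, "Global_Sales": 2.0},
]

# Assumed window: the 10 calendar years ending at the latest year, inclusive.
latest_year = max(g["Year"] for g in games)
first_year = latest_year - 9
recent = [g for g in games if g["Year"] >= first_year]

# Average global sales per game for each genre within the window.
totals, counts = {}, {}
for g in recent:
    totals[g["Genre"]] = totals.get(g["Genre"], 0.0) + g["Global_Sales"]
    counts[g["Genre"]] = counts.get(g["Genre"], 0) + 1
averages = {genre: totals[genre] / counts[genre] for genre in totals}

# Genre names sorted by average, highest first.
top_genre = sorted(averages, key=averages.get, reverse=True)
print(top_genre)
```

If the real data has missing years, those rows would need to be skipped before calling max.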
Create a dictionary genre_percentage_sales which contains the genre names as keys and their percentage contribution to the global sales as values.
Example of expected output:
{'Platform': 9.319831757177944,
'Racing': 8.206321661263361,
'Role-Playing': 10.39601185591744,
'Puzzle': 2.7459407831900946,
'Misc': 9.07982117474026,...}
In this activity, compute the number of unique publishers that have had games ranked in the top 100. Return the count as an integer. Use the games' rank to make the selection.
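A minimal sketch of this count, assuming games is a list of dictionaries with "Rank" and "Publisher" keys where rank 1 is the best-selling game. The rows are made-up placeholders and the variable name unique_publisher_count is hypothetical:

```python
# Hypothetical sample rows; ranks and publisher names are placeholders.
games = [
    {"Rank": 1, "Publisher": "Pub X"},
    {"Rank": 2, "Publisher": "Pub Y"},
    {"Rank": 3, "Publisher": "Pub X"},    # same publisher again
    {"Rank": 150, "Publisher": "Pub Z"},  # outside the top 100
]

# Filter on rank first, then let a set remove repeated publishers.
top_100_publishers = {g["Publisher"] for g in games if g["Rank"] <= 100}
unique_publisher_count = len(top_100_publishers)
print(unique_publisher_count)
```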
Create a list publisher_game_count of tuples, each with the name of a publisher and the number of games published by that publisher.
Example expected output:
[('Nintendo', 703),
('Microsoft Game Studios', 189),
('Take-Two Interactive', 413),
('Sony Computer Entertainment', 409),
...
]