Find the player with the highest number of assists who has played AT LEAST 1000 minutes.
The assists value is computed "per minute". Provide the full name of the player, followed by their assists number.
Example of the expected input: Harry Kane, 0.33
Important: the whitespace between the comma and the number is important.
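The filter-then-select steps above can be sketched as follows. This is a minimal illustration, assuming the data is a list of dictionaries with the keys player, minutes, and assists (hypothetical names), and that the assists field already holds the "per minute" value. The sample rows are made up.

```python
# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "minutes": 1800, "assists": 0.31},
    {"player": "Player B", "minutes": 950, "assists": 0.40},  # under 1000 minutes
    {"player": "Player C", "minutes": 1200, "assists": 0.12},
]

# Keep only players with at least 1000 minutes played.
eligible = [p for p in players if p["minutes"] >= 1000]

# Select the eligible player with the highest assists value.
best = max(eligible, key=lambda p: p["assists"])

# Build the answer string with exactly one space after the comma.
answer = f"{best['player']}, {best['assists']}"
print(answer)
```

Player B has the highest assists value in the sample but is filtered out before the maximum is taken, which is why the order of the two steps matters.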
Only consider players who:
* have taken at least 10 shots
* have played in the Premier League
Hint: The goals-to-shots ratio is computed by dividing the number of goals by the number of shots.
Example of expected input: Robert Lewandowski, 0.56
Important: The whitespace between the comma and the number is important.
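One possible way to compute the goals-to-shots ratio with both filters applied is sketched below. The key names (player, competition, goals, shots) and the sample rows are assumptions for illustration, and the two-decimal formatting is inferred from the example answer.

```python
# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "competition": "Premier League", "goals": 12, "shots": 40},
    {"player": "Player B", "competition": "Premier League", "goals": 3, "shots": 5},
    {"player": "Player C", "competition": "La Liga", "goals": 10, "shots": 20},
    {"player": "Player D", "competition": "Premier League", "goals": 9, "shots": 20},
]

# Apply both conditions: at least 10 shots, and Premier League only.
eligible = [
    p for p in players
    if p["shots"] >= 10 and p["competition"] == "Premier League"
]

# The ratio is goals divided by shots; shots >= 10 rules out division by zero.
best = max(eligible, key=lambda p: p["goals"] / p["shots"])
ratio = best["goals"] / best["shots"]

# Two decimals and a single space after the comma, matching the example format.
answer = f"{best['player']}, {ratio:.2f}"
print(answer)
```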
Create a dictionary named average_age_per_squad where the keys are the squad names, and the values are the average ages rounded to the nearest whole number.
Example of expected output:
{'Liverpool': 26, 'Manchester City': 27, 'Chelsea': 25, ...}
Hint: To calculate the average age, sum up the ages of all players in each squad and divide by the number of players in that squad.
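The hint can be turned into code roughly as follows. This is a sketch, assuming a list of dictionaries with the keys squad and age (hypothetical names) and made-up sample rows.

```python
from collections import defaultdict

# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "squad": "Squad X", "age": 24},
    {"player": "Player B", "squad": "Squad X", "age": 27},
    {"player": "Player C", "squad": "Squad X", "age": 28},
    {"player": "Player D", "squad": "Squad Y", "age": 30},
]

# Collect the ages of all players in each squad.
ages_by_squad = defaultdict(list)
for p in players:
    ages_by_squad[p["squad"]].append(p["age"])

# Sum the ages, divide by the number of players, and round to a whole number.
average_age_per_squad = {
    squad: round(sum(ages) / len(ages))
    for squad, ages in ages_by_squad.items()
}
print(average_age_per_squad)
```

Note that Python's built-in round sends exact halves to the nearest even integer (round(26.5) is 26), so an expected answer that rounds halves up could differ by one in those cases.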
Find the player who has played the most minutes in the Premier League.
Example of input: Milan
Find the maximum minutes played in La Liga. Here, La Liga is the league/competition name.
Consider only teams that:
Average Goals per Game is defined as: the total goals scored divided by the number of games played by the team.
Find the team with the highest average goals per game, along with that average, and select the correct answer from the options below.
Create a new variable named players_dict that contains ALL the players, with each player represented as a dictionary containing only the keys:
player
nation
position
squad
competition
age
The resulting variable should have this structure:
[{'player': 'Brenden Aaronson',
'nation': 'USA',
'position': 'MFFW',
'squad': 'Leeds United',
'competition': 'Premier League',
'age': 22},
{'player': 'Yunis Abdelhamid',
'nation': 'MAR',
'position': 'DF',
'squad': 'Reims',
'competition': 'Ligue 1',
'age': 35}]
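A minimal sketch of this projection, assuming the raw data is a list of dictionaries that carry extra fields beyond the six wanted keys. The goals and minutes fields and their values below are placeholders for illustration, not real data.

```python
# The six keys to keep, in the order listed above.
KEYS = ["player", "nation", "position", "squad", "competition", "age"]

# Rows reuse the example above; 'goals' and 'minutes' are placeholder extras.
players = [
    {"player": "Brenden Aaronson", "nation": "USA", "position": "MFFW",
     "squad": "Leeds United", "competition": "Premier League", "age": 22,
     "goals": 0, "minutes": 0},
    {"player": "Yunis Abdelhamid", "nation": "MAR", "position": "DF",
     "squad": "Reims", "competition": "Ligue 1", "age": 35,
     "goals": 0, "minutes": 0},
]

# Build one trimmed dictionary per player, keeping only the wanted keys.
players_dict = [{key: p[key] for key in KEYS} for p in players]
print(players_dict)
```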
The variable should be a dictionary where the keys are the nation names, and the values are lists of players belonging to that nation.
Each player should be represented as a dictionary containing the keys:
player
position
squad
competition
age
Example of expected output:
{'USA': [{'player': 'Brenden Aaronson',
'position': 'MFFW',
'squad': 'Leeds United',
'competition': 'Premier League',
'age': 22},
{'player': 'Tyler Adams',
'position': 'MF',
'squad': 'Leeds United',
'competition': 'Premier League',
'age': 23},
...
],
...
}
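Grouping by nation can be sketched with a defaultdict, as below. The variable name players_by_nation is hypothetical (the task text does not name the variable), and the sample rows are made up.

```python
from collections import defaultdict

# Keys kept for each player inside a nation's list.
KEYS = ["player", "position", "squad", "competition", "age"]

# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "nation": "USA", "position": "MF",
     "squad": "Squad X", "competition": "Competition A", "age": 22},
    {"player": "Player B", "nation": "MAR", "position": "DF",
     "squad": "Squad Y", "competition": "Competition B", "age": 35},
    {"player": "Player C", "nation": "USA", "position": "FW",
     "squad": "Squad X", "competition": "Competition A", "age": 23},
]

# Append each trimmed player dictionary to the list for its nation.
grouped = defaultdict(list)
for p in players:
    grouped[p["nation"]].append({key: p[key] for key in KEYS})

# Convert back to a plain dict so the result matches the expected structure.
players_by_nation = dict(grouped)
print(players_by_nation)
```

The same pattern applies to any other grouping key: only the field used in the lookup changes.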
The variable should be a dictionary where the keys are the squad names, and the values are lists of players belonging to that squad.
Each player should be represented as a dictionary containing the keys:
player
position
squad
competition
age
Example of expected output:
{'Leeds United': [{'player': 'Brenden Aaronson',
'position': 'MFFW',
'squad': 'Leeds United',
'competition': 'Premier League',
'age': 22},
{'player': 'Tyler Adams',
'position': 'MF',
'squad': 'Leeds United',
'competition': 'Premier League',
'age': 23},
...
],
...
}
Create a new variable named average_age_by_competition that contains the average age for each competition. The competitions should be the keys, and the values should be the average ages rounded to one decimal place.
Example of expected output:
{
'Premier League': 27.5,
'La Liga': 26.8,
'Bundesliga': 25.9,
...
}
Note that the value for age is rounded to one decimal place.
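A possible sketch of the per-competition average, assuming a list of dictionaries with the keys competition and age (hypothetical names) and made-up sample rows:

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "competition": "Competition A", "age": 22},
    {"player": "Player B", "competition": "Competition A", "age": 25},
    {"player": "Player C", "competition": "Competition A", "age": 30},
    {"player": "Player D", "competition": "Competition B", "age": 28},
    {"player": "Player E", "competition": "Competition B", "age": 31},
]

# Collect ages per competition.
ages_by_comp = defaultdict(list)
for p in players:
    ages_by_comp[p["competition"]].append(p["age"])

# Average each list and round to one decimal place.
average_age_by_competition = {
    comp: round(mean(ages), 1) for comp, ages in ages_by_comp.items()
}
print(average_age_by_competition)
```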
Transform the dataset to create a new variable named average_stats_by_position. This variable should contain the average values of goals, assists, and shots on target (SoT) for each position across all players. The positions should be the keys, and the values should be dictionaries with the keys goals, assists, and sot, representing the average values for each statistic.
Example of expected output:
{
'FW': {'goals': 12.5, 'assists': 5.3, 'sot': 8.1},
'MF': {'goals': 6.2, 'assists': 8.7, 'sot': 3.9},
'DF': {'goals': 2.8, 'assists': 3.1, 'sot': 1.6},
...
}
Note that the values for goals, assists, and sot are rounded to one decimal place.
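The nested structure can be built in two passes, as sketched below. The input key names (position, goals, assists, sot) are assumed to match the output keys, which may not hold for the real dataset; the sample rows are made up.

```python
from collections import defaultdict

# Statistics to average; assumed to be the field names in the input as well.
STATS = ("goals", "assists", "sot")

# Made-up sample rows for illustration only.
players = [
    {"player": "Player A", "position": "FW", "goals": 10, "assists": 4, "sot": 20},
    {"player": "Player B", "position": "FW", "goals": 5, "assists": 1, "sot": 11},
    {"player": "Player C", "position": "DF", "goals": 1, "assists": 2, "sot": 3},
]

# First pass: group the rows by position.
rows_by_position = defaultdict(list)
for p in players:
    rows_by_position[p["position"]].append(p)

# Second pass: average each statistic within a position, rounded to one decimal.
average_stats_by_position = {
    position: {
        stat: round(sum(row[stat] for row in rows) / len(rows), 1)
        for stat in STATS
    }
    for position, rows in rows_by_position.items()
}
print(average_stats_by_position)
```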
Create a dictionary containing each competition as key, and the sum of all the goals scored as a value. Store the result in the variable goals_per_comp. It should look something like:
{
'Premier League': ...,
'Serie A': ...,
...
'Bundesliga': ...
}
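A running total per key is enough here. The sketch below assumes a list of dictionaries with the keys competition and goals (hypothetical names) and uses made-up sample rows.

```python
# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "competition": "Competition A", "goals": 7},
    {"player": "Player B", "competition": "Competition B", "goals": 3},
    {"player": "Player C", "competition": "Competition A", "goals": 5},
]

# Accumulate goals per competition; .get supplies 0 the first time a key is seen.
goals_per_comp = {}
for p in players:
    comp = p["competition"]
    goals_per_comp[comp] = goals_per_comp.get(comp, 0) + p["goals"]

print(goals_per_comp)
```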
Create a dictionary named total_minutes_by_squad where the keys are the squad names, and the values are the total minutes played by that squad.
Example of expected output:
{
'Manchester United': 4578,
'Real Madrid': 5123,
'Bayern Munich': 3984,
...
}
Create a dictionary named average_starts_per_comp where the keys are the competition names, and the values are the average number of starts rounded to the nearest whole number.
Example of expected output:
{
'Premier League': 23,
'La Liga': 19,
'Bundesliga': 21,
...
}
Find the top scorers of each competition (maximum number of goals scored). Store your results in the variable top_scorers_per_comp. Attention! There might be more than one top scorer in a league, so your result should be a dictionary that maps each competition to a list of tuples, each containing a player and their goals. Example:
# this is not real data or real result
# just to demonstrate the structure
{
"Ligue 1": [
("Lionel Messi", 14),
("Kylian Mbappe", 14),
],
'La Liga': [
('Robert Lewandowski', 18)
]
}
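One way to handle ties is to find the maximum per competition first and then collect every player who matches it. This is a sketch under the assumption that rows have the keys player, competition, and goals (hypothetical names); the sample rows are made up.

```python
from collections import defaultdict

# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "competition": "Competition A", "goals": 14},
    {"player": "Player B", "competition": "Competition A", "goals": 14},
    {"player": "Player C", "competition": "Competition A", "goals": 9},
    {"player": "Player D", "competition": "Competition B", "goals": 18},
]

# Pass 1: highest goal count seen in each competition.
max_goals = {}
for p in players:
    comp = p["competition"]
    max_goals[comp] = max(max_goals.get(comp, 0), p["goals"])

# Pass 2: keep every player whose goals equal their competition's maximum.
top = defaultdict(list)
for p in players:
    if p["goals"] == max_goals[p["competition"]]:
        top[p["competition"]].append((p["player"], p["goals"]))

top_scorers_per_comp = dict(top)
print(top_scorers_per_comp)
```

A single call to max per competition would return only one player, so tied scorers would be lost; the second pass is what keeps them.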
Create a list named goals_and_assists_by_player which is a list of dictionaries. Each dictionary should contain the keys player, competition, goals, and assists.
Example of expected output:
[
{'player': 'Harry Kane', 'competition': 'Premier League', 'goals': 25, 'assists': 12},
{'player': 'Lionel Messi', 'competition': 'La Liga', 'goals': 30, 'assists': 18},
{'player': 'Cristiano Ronaldo', 'competition': 'Serie A', 'goals': 27, 'assists': 10},
...
]
The expected output is a dictionary with the player name as the key and, as the value, a dictionary with the keys goals and assists and their respective values.
{
'Harry Kane': {'goals': 25, 'assists': 12},
'Lionel Messi': {'goals': 30, 'assists': 18},
'Cristiano Ronaldo': {'goals': 27, 'assists': 10},
...
}
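Both shapes, the list of dictionaries and the dictionary keyed by player name, can be sketched as below. The name goals_and_assists_dict is hypothetical because the task text does not name the second variable. The input key names and sample rows are also assumptions.

```python
# Made-up sample rows; key names are assumptions, not the real dataset schema.
players = [
    {"player": "Player A", "competition": "Competition A",
     "goals": 25, "assists": 12, "age": 29},
    {"player": "Player B", "competition": "Competition B",
     "goals": 30, "assists": 18, "age": 35},
]

# List form: one dictionary per player with four keys.
goals_and_assists_by_player = [
    {key: p[key] for key in ("player", "competition", "goals", "assists")}
    for p in players
]

# Dictionary form: player name -> {'goals': ..., 'assists': ...}.
# If a name appears in several rows, the last row silently wins.
goals_and_assists_dict = {
    p["player"]: {"goals": p["goals"], "assists": p["assists"]}
    for p in players
}

print(goals_and_assists_by_player)
print(goals_and_assists_dict)
```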
Create a new variable named players_by_age_group that groups players into different age groups. The variable should be a dictionary where the keys are the age group names (e.g., 'Under 20', '20-25', '26-30', 'Over 30'), and the values are lists of players belonging to that age group. Each player should be represented as a dictionary containing the keys player, nation, position, squad, competition, and age.
The age groups are defined as:
Under 20: age < 20
20-25: 20 <= age <= 25
26-30: 26 <= age <= 30
Over 30: age > 30
Example of expected output:
{
'Under 20': [
{'player': 'Player A', 'nation': 'Country A', 'position': 'Position A', 'squad': 'Squad A', 'competition': 'Competition A', 'age': 19},
{'player': 'Player B', 'nation': 'Country B', 'position': 'Position B', 'squad': 'Squad B', 'competition': 'Competition B', 'age': 18},
...
],
'20-25': [
{'player': 'Player C', 'nation': 'Country C', 'position': 'Position C', 'squad': 'Squad C', 'competition': 'Competition C', 'age': 22},
{'player': 'Player D', 'nation': 'Country D', 'position': 'Position D', 'squad': 'Squad D', 'competition': 'Competition D', 'age': 25},
...
],
...
}
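The age boundaries can be captured in a small helper function, as in the sketch below. The helper name age_group is hypothetical, ages are assumed to be whole numbers, and the sample rows are made up.

```python
# Keys kept for each player inside an age group's list.
KEYS = ["player", "nation", "position", "squad", "competition", "age"]

def age_group(age):
    """Map an integer age to its group label (hypothetical helper)."""
    if age < 20:
        return "Under 20"
    if age <= 25:
        return "20-25"
    if age <= 30:
        return "26-30"
    return "Over 30"

# Made-up sample rows for illustration only.
players = [
    {"player": "Player A", "nation": "Country A", "position": "FW",
     "squad": "Squad A", "competition": "Competition A", "age": 19},
    {"player": "Player B", "nation": "Country B", "position": "MF",
     "squad": "Squad B", "competition": "Competition B", "age": 25},
    {"player": "Player C", "nation": "Country C", "position": "DF",
     "squad": "Squad C", "competition": "Competition C", "age": 26},
    {"player": "Player D", "nation": "Country D", "position": "GK",
     "squad": "Squad D", "competition": "Competition D", "age": 31},
]

# Start with every group present so empty groups still appear in the result.
players_by_age_group = {"Under 20": [], "20-25": [], "26-30": [], "Over 30": []}
for p in players:
    players_by_age_group[age_group(p["age"])].append({key: p[key] for key in KEYS})

print(players_by_age_group)
```

Pre-creating the four keys is a design choice; whether empty groups should appear in the expected answer is not stated in the task.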