All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
In this activity, you'll create a histogram to analyze the distribution of athlete ages from the dataset. Follow the specifications below to structure your plot:
Figure size: 10 by 6
Color: green
Edgecolor: black
Bins: 15
Title: Distribution of Athlete Ages
Vertical line for average age: color black, linestyle dashed, linewidth 2
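A minimal sketch of this histogram, assuming the ages live in a column called age (a hypothetical name) and using made-up sample values in place of the project's dataset:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data; the real activity uses the project's athlete dataset.
df = pd.DataFrame({"age": [17, 19, 21, 22, 22, 24, 25, 27, 28, 30, 31, 35, 41]})

fig, ax = plt.subplots(figsize=(10, 6))
counts, bin_edges, patches = ax.hist(
    df["age"], bins=15, color="green", edgecolor="black"
)

# Dashed vertical line at the average age.
mean_age = df["age"].mean()
ax.axvline(mean_age, color="black", linestyle="dashed", linewidth=2)

ax.set_title("Distribution of Athlete Ages")
```

In a notebook you would drop the backend line and finish with plt.show().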
In this activity, you will use the medal_counts DataFrame to create a stacked bar chart (with the .plot method) that visually represents the number of medals won in each sport, broken down by medal type.
Set the figure size to 12 x 8. For colors, assign gold to Gold medals, silver to Silver medals, and brown to Bronze medals for clear differentiation. Label the x-axis as Sport and the y-axis as Number of Medals, and add the title: Number of Medals by Sport and Medal Type. Include a legend for the medal colors, and rotate the x-axis labels by 45 degrees with the horizontal alignment set to right to prevent overlap and make the labels easier to read. Finally, apply .tight_layout() to ensure a clean, organized appearance without overlapping or clipping elements.
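One way the stacked chart could be put together is sketched below. The sample medal_counts values and the column names Gold, Silver, and Bronze are assumptions. The columns are selected explicitly because the color list is matched to column order.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the notebook's medal_counts DataFrame.
medal_counts = pd.DataFrame(
    {"Gold": [3, 1, 2], "Silver": [2, 2, 1], "Bronze": [1, 3, 0]},
    index=pd.Index(["Biathlon", "Curling", "Snowboarding"], name="Sport"),
)

# Fix the column order so each color lines up with its medal type.
ax = medal_counts[["Gold", "Silver", "Bronze"]].plot(
    kind="bar",
    stacked=True,
    figsize=(12, 8),
    color=["gold", "silver", "brown"],
)
ax.set_xlabel("Sport")
ax.set_ylabel("Number of Medals")
ax.set_title("Number of Medals by Sport and Medal Type")
ax.legend(title="Medal")

plt.xticks(rotation=45, ha="right")
plt.tight_layout()
```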
Using the created DataFrame gender_counts, create a grouped bar chart with the .plot method where each sport has two bars, one for each gender, using the following information:
Figure size: 12 by 8
Color: blue for male and pink for female
Title: Number of Athletes by Gender in Each Sport
Include a legend to distinguish between the genders, rotate the x-axis labels by 45 degrees for better readability, and set the horizontal alignment to right. Apply tight_layout() to prevent any overlapping or clipping of plot elements, ensuring a clean and organized appearance. Finally, display the chart to see how gender participation varies across different sports.
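The grouped version differs from a stacked chart only in leaving stacked at its default. The sketch below assumes gender_counts has one row per sport and columns named Male and Female. Both the names and the numbers are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the notebook's gender_counts DataFrame.
gender_counts = pd.DataFrame(
    {"Male": [40, 25, 30], "Female": [35, 28, 22]},
    index=pd.Index(["Biathlon", "Curling", "Snowboarding"], name="Sport"),
)

# Without stacked=True, pandas draws the columns side by side for each sport.
ax = gender_counts[["Male", "Female"]].plot(
    kind="bar", figsize=(12, 8), color=["blue", "pink"]
)
ax.set_title("Number of Athletes by Gender in Each Sport")
ax.legend(title="Gender")

plt.xticks(rotation=45, ha="right")
plt.tight_layout()
```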
Using the created medal_counts_by_country series, create a bar chart with the .plot method to represent the total number of medals won by each country, where each bar corresponds to a country and its height indicates the total number of medals won. Use gold-colored bars to signify the medals, and add black edges to make the bars stand out.
Figure size: 12 by 8
x-axis label: Country
y-axis label: Total Number of Medals
Title: Total Number of Medals Won by Each Country
Next, rotate the x-axis labels by 90 degrees and set the horizontal alignment to right. Finally, apply .tight_layout() to ensure that all plot elements fit neatly without overlap or clipping.
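A short sketch of the country chart follows. The country codes and totals are placeholder values, not results from the dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the notebook's medal_counts_by_country series.
medal_counts_by_country = pd.Series({"AAA": 39, "BBB": 31, "CCC": 29, "DDD": 23})

ax = medal_counts_by_country.plot(
    kind="bar", figsize=(12, 8), color="gold", edgecolor="black"
)
ax.set_xlabel("Country")
ax.set_ylabel("Total Number of Medals")
ax.set_title("Total Number of Medals Won by Each Country")

plt.xticks(rotation=90, ha="right")
plt.tight_layout()
```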
With the series medal_counts, create a pie chart to represent the distribution of medals in Snowboarding.
Use a figure size of 8 by 8, and autopct='%1.1f%%' to display the percentage of each medal type.
For clear differentiation, assign these colors to each medal type:
gold: #FFD700
silver: #C0C0C0
bronze: #CD7F32
Set the chart to have equal aspect ratios so the pie appears as a perfect circle. Finally, add a title (Distribution of Medal Ranks in Snowboarding) to provide context to your visualization.
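As a rough illustration, the pie chart might look like the sketch below. It assumes medal_counts is indexed in Gold, Silver, Bronze order so the color list lines up, and the counts are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the Snowboarding medal_counts series.
medal_counts = pd.Series({"Gold": 4, "Silver": 3, "Bronze": 3})

fig, ax = plt.subplots(figsize=(8, 8))
ax.pie(
    medal_counts,
    labels=medal_counts.index,
    autopct="%1.1f%%",  # one decimal place followed by a literal percent sign
    colors=["#FFD700", "#C0C0C0", "#CD7F32"],
)
ax.axis("equal")  # equal aspect ratio keeps the pie circular
ax.set_title("Distribution of Medal Ranks in Snowboarding")
```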
In this activity, you will create a line plot using the medals_per_year series to visualize the trend in the number of medals awarded over various years. Use the following parameters for the plot:
Figure size: Set the plot size to 10x6 inches.
Line color: Use blue to plot the line.
Marker: Use a circle marker (o) to represent each data point.
Line style: Use a solid line (-) to connect the points.
Title: Add a title to the plot: "Number of Medals Awarded Over Different Years".
Finally, to emphasize recent trends, use a gray shaded area to highlight the years after 2000. Apply an alpha value of 0.5 for transparency and add the label "Post-2000" to indicate this section of the graph.
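The sketch below shows one plausible way to build this plot. Using axvspan for the shaded region is an assumption (the brief does not name a function), and the years and counts are sample values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the notebook's medals_per_year series.
medals_per_year = pd.Series(
    [120, 135, 150, 160, 172], index=[1992, 1996, 2000, 2004, 2008]
)

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(
    medals_per_year.index,
    medals_per_year.values,
    color="blue",
    marker="o",
    linestyle="-",
)

# Shade everything from 2000 to the last year in the data.
last_year = int(medals_per_year.index.max())
ax.axvspan(2000, last_year, color="gray", alpha=0.5, label="Post-2000")

ax.set_title("Number of Medals Awarded Over Different Years")
ax.legend()
```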
In this activity, you'll use the top_athletes DataFrame to create a horizontal bar chart showcasing the top 10 athletes based on their medal counts.
Figure size: 12 by 8
Title: Top 10 Athletes by Medal Count
y-axis: Athlete Name
x-axis: Number of Medals
Finally, adjust the layout to make sure everything fits nicely, and display the plot.
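A possible shape for this chart is sketched below. The brief does not name the columns of top_athletes, so name and medals are hypothetical, as are the placeholder athletes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for top_athletes; column names are assumptions.
top_athletes = pd.DataFrame(
    {
        "name": [f"Athlete {letter}" for letter in "ABCDEFGHIJ"],
        "medals": [8, 7, 7, 6, 6, 5, 5, 5, 4, 4],
    }
)

fig, ax = plt.subplots(figsize=(12, 8))
ax.barh(top_athletes["name"], top_athletes["medals"])
ax.set_title("Top 10 Athletes by Medal Count")
ax.set_xlabel("Number of Medals")
ax.set_ylabel("Athlete Name")

plt.tight_layout()
```

Note that barh draws the first row at the bottom. Calling ax.invert_yaxis() would put the top athlete first if that ordering is preferred.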
Begin by categorizing athletes into age groups using the following bins: [15, 20, 25, 30, 35, 40, 45, 50, 55, 60], and corresponding labels: ['15-19', '20-24', '25-29', '30-34', '35-39', '40-44', '45-49', '50-54', '55+']. Then, use the created cross-tabulation DataFrame sport_age_counts to create a heatmap, where each cell shows the number of athletes for each sport and age group combination.
Figure size: 12 by 8
Cmap: Blues
fmt: d
Annotation: True
Title: Number of Athletes by Sport and Age Group
Rotate the x-axis labels by 45 degrees for better readability, and adjust the layout for clarity. Finally, display the plot to illustrate the distribution of athletes across various sports and age groups.
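The binning and heatmap steps could be sketched as follows. Several things here are assumptions: the column names sport and age, the sample rows, the use of seaborn (suggested by the annot and fmt parameters), and right=False in pd.cut. That last choice makes each bin include its lower edge, so age 20 lands in '20-24' as the labels imply. It also leaves an age of exactly 60 unbinned.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

bins = [15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
labels = ["15-19", "20-24", "25-29", "30-34", "35-39",
          "40-44", "45-49", "50-54", "55+"]

# Hypothetical sample rows; column names are assumptions.
df = pd.DataFrame(
    {
        "sport": ["Curling", "Curling", "Biathlon", "Biathlon", "Biathlon"],
        "age": [31, 33, 19, 22, 24],
    }
)

# right=False (an assumption) makes the intervals [15, 20), [20, 25), ...
df["age_group"] = pd.cut(df["age"], bins=bins, labels=labels, right=False)
sport_age_counts = pd.crosstab(df["sport"], df["age_group"])

fig, ax = plt.subplots(figsize=(12, 8))
sns.heatmap(sport_age_counts, cmap="Blues", annot=True, fmt="d", ax=ax)
ax.set_title("Number of Athletes by Sport and Age Group")

plt.xticks(rotation=45)
plt.tight_layout()
```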
In this activity, you'll create a bar plot using the sports_counts series to highlight the sports in which athletes under 18 are excelling. Use the following parameters:
Figure size: 12 by 8
Color: palette (defined in the notebook)
Edgecolor: black
Grid: linestyle='--', alpha=0.7
xlabel: Sport, fontsize: 12
ylabel: Number of Medals, fontsize: 12
Title: Sports Where Athletes Under 18 are Thriving, fontsize: 16, fontweight: bold
For data labels and annotations, use the following settings to display the value on top of each bar:
Text format: Defined in the notebook for consistent styling.
Position (xy): Center the text horizontally by setting (bar.get_x() + bar.get_width() / 2, bar.get_height()).
Horizontal alignment (ha): center.
Vertical alignment (va): center.
Text offset: Use (0, 8) for positioning above each bar.
Font size: Set to 10, with bold text for emphasis.
Font color: Black for readability.
To make the chart more insightful, highlight sports with multiple medals by setting their bars to red with a linewidth of 3. Rotate the x-axis labels by 45 degrees to enhance readability, allowing viewers to easily identify the sports where young athletes are performing exceptionally well.
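The annotation loop is the least obvious part of this activity, so here is a hedged sketch. The palette, the label text format, and the sample counts are placeholders for what the notebook defines. Using textcoords="offset points" for the (0, 8) offset and reading "red with a linewidth of 3" as a red bar outline are both assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for objects the notebook defines.
sports_counts = pd.Series({"Figure Skating": 3, "Snowboarding": 2, "Freestyle Skiing": 1})
palette = ["#4c72b0", "#55a868", "#c44e52"]

fig, ax = plt.subplots(figsize=(12, 8))
bars = ax.bar(sports_counts.index, sports_counts.values,
              color=palette, edgecolor="black")
ax.grid(linestyle="--", alpha=0.7)
ax.set_xlabel("Sport", fontsize=12)
ax.set_ylabel("Number of Medals", fontsize=12)
ax.set_title("Sports Where Athletes Under 18 are Thriving",
             fontsize=16, fontweight="bold")

for bar in bars:
    height = bar.get_height()
    # Label anchored at the top center of the bar, then nudged up 8 points.
    ax.annotate(
        f"{int(height)}",  # text format is an assumption
        xy=(bar.get_x() + bar.get_width() / 2, height),
        xytext=(0, 8),
        textcoords="offset points",
        ha="center", va="center",
        fontsize=10, fontweight="bold", color="black",
    )
    # Highlight sports with more than one medal (assumed: red outline).
    if height > 1:
        bar.set_edgecolor("red")
        bar.set_linewidth(3)

plt.xticks(rotation=45)
plt.tight_layout()
```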
In this activity, you will create a horizontal bar chart to visualize the countries that have won the most gold medals, using the created gold_medals_per_country series. Follow these customization guidelines:
Figure size: Set to 12x8 inches for a clear, large visualization.
Bar color: Use gold to represent gold medals, with black edges for contrast.
Title: "Countries with the Most Gold Medals" (fontsize: 16, fontweight: bold).
x-axis label: Set as "Number of Gold Medals" (fontsize: 14).
y-axis label: Set as "Country" (fontsize: 14).
Next, enhance the chart by adding data labels to the right of each bar, showing the exact count of gold medals. Use the following settings for data labels:
xytext: (5, 0) to position the text slightly to the right of the bar.
textcoords: 'offset points'.
Horizontal alignment (ha): left and Vertical alignment (va): center for proper alignment.
Font size: 10, with black text for clarity.
Finally, improve readability by applying a grid along the x-axis with a dashed line (linestyle='--', alpha=0.7). This chart will effectively showcase the countries excelling in gold medal achievements.
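For horizontal bars, the label anchor moves to the right end of each bar (its width) at the bar's vertical midpoint. The sketch below illustrates this with invented country codes and counts. The label text format is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the notebook's gold_medals_per_country series.
gold_medals_per_country = pd.Series({"AAA": 9, "BBB": 11, "CCC": 14})

fig, ax = plt.subplots(figsize=(12, 8))
bars = ax.barh(
    gold_medals_per_country.index,
    gold_medals_per_country.values,
    color="gold",
    edgecolor="black",
)
ax.set_title("Countries with the Most Gold Medals", fontsize=16, fontweight="bold")
ax.set_xlabel("Number of Gold Medals", fontsize=14)
ax.set_ylabel("Country", fontsize=14)

for bar in bars:
    width = bar.get_width()
    # Anchor at the bar's right end, then shift 5 points to the right.
    ax.annotate(
        f"{int(width)}",
        xy=(width, bar.get_y() + bar.get_height() / 2),
        xytext=(5, 0),
        textcoords="offset points",
        ha="left", va="center",
        fontsize=10, color="black",
    )

ax.grid(axis="x", linestyle="--", alpha=0.7)
```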
In this activity, you will visualize the distribution of medals by rank across all sports using a bar chart.
Utilize the medal_rank_counts series, where the x-values correspond to the medal ranks (Gold, Silver, Bronze) and the y-values represent the number of medals awarded for each rank.
Set the figure size to 10 by 6 inches, and customize the bar colors to match the medal types: gold for Gold, silver for Silver, and brown for Bronze. Add a chart title: Medal Rank Distribution, and label the x-axis (Medal Rank) and y-axis (Number of Medals) accordingly.
For clarity, adjust the x-tick labels to display 1 (Gold), 2 (Silver), and 3 (Bronze). Finally, apply .tight_layout() to ensure the chart is neatly arranged without layout issues.
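A sketch of the rank chart follows, assuming medal_rank_counts is indexed by the numeric ranks 1, 2, and 3 (the tick labels in the brief suggest this, but it is not stated). The counts are sample values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in; index assumed to hold numeric ranks 1-3.
medal_rank_counts = pd.Series({1: 50, 2: 48, 3: 52})

fig, ax = plt.subplots(figsize=(10, 6))
ax.bar(
    medal_rank_counts.index,
    medal_rank_counts.values,
    color=["gold", "silver", "brown"],
)
ax.set_title("Medal Rank Distribution")
ax.set_xlabel("Medal Rank")
ax.set_ylabel("Number of Medals")

# Pin the ticks to the three ranks before relabeling them.
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(["1 (Gold)", "2 (Silver)", "3 (Bronze)"])

plt.tight_layout()
```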