All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
The acousticness
column in the DataFrame represents the acoustic level of each song. To make the column name more descriptive and readable, use the rename()
function to change the column name from acousticness
to acoustic_level
. Set the inplace
parameter to True
to modify the DataFrame directly without creating a new copy. This renaming operation will update the column name in the original DataFrame df
.
Rename multiple columns in the DataFrame df
to make them more descriptive, concise, and easily understandable. Change 'danceability'
to 'dance_score'
, 'duration_ms'
to 'duration_milliseconds'
, 'instrumentalness'
to 'instrumental'
, 'liveness'
to 'live_performance'
, and 'speechiness'
to 'speech_presence'
. Assign the resulting DataFrame with the renamed columns back to the variable df
to update the original DataFrame.
Convert the duration_milliseconds
column values to seconds and store the result in a new column named duration_seconds
.
Note : New column added at the end of the
df
Rescale the values in the popularity
column by multiplying them with 0.01 and store the rescaled values in a new column named popularity_score
.
Note : New column added at the end of the
df
Create a new column is_popular
that contains 1 for rows where the popularity
value is greater than 70, and 0 otherwise. Convert the boolean result to integer values, where True becomes 1 and False becomes 0. This new column will indicate whether a song is popular or not, with 1 representing popular songs and 0 representing non-popular songs.
Note : New column added at the end of the
df
Calculate the number of artists for each row by counting the number of commas in the artists
column and adding 1, then store the result in a new column named artist_count
.
Note : New column added at the end of the
df
Convert the duration_seconds
column from seconds to minutes and store the result in a new column named duration_minutes
.
Note : New column added at the end of the
df
Increase the values in the popularity
column by adding 10 to each value.
Reduce the values in the speech_presence
column by multiplying them with 0.8.
Note : Use
df.head().T
for viewing yourdf
.df.head().T
provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
Decrease the values in the dance_score
column by subtracting 0.1 from each value.
Replace the numerical values in the mode
column with textual representations, where 0 is replaced with 'Minor' and 1 is replaced with 'Major'.
Limit the maximum value in the tempo
column to 150 by clipping any values above 150 to 150.
Note : Use
df.head().T
for viewing yourdf
.df.head().T
provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
Replace the numerical values in the key
column with their corresponding note names, the mappings are:
0 → 'C'
, 1 → 'C#'
, 2 → 'D'
, 3 → 'D#'
, 4 → 'E'
, 5 → 'F'
, 6 → 'F#'
, 7 → 'G'
, 8 → 'G#'
, 9 → 'A'
, 10 → 'A#'
, 11 → 'B'
Replace the numerical values in the explicit
column with textual representations, where 0 is replaced with 'Not Explicit' and 1 is replaced with 'Explicit'.
Note : Use
df.head().T
for viewing yourdf
.df.head().T
provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.
For rows where the year
value is less than 1950, replace the year
value with 1950.
Limit the tempo
column values between 50 and 150. For values exceeding 150, replace them with 150, and for values below 50, replace them with 50.
Note : Use
df.head().T
for viewing yourdf
.df.head().T
provides a compact way to view the initial rows as columns, making it easier to scan the data horizontally.