All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
If you explore the DataFrame, you'll see that the column first name is inconsistent in its capitalization. Some names are capitalized (Alexis, Jodi), but others are not (misty, matrick).
Create a new series capital_first_name that contains the values of the column first name correctly capitalized.
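One possible approach, sketched on made-up sample data (the real DataFrame comes from the project, and the name df is an assumption):

```python
import pandas as pd

# Hypothetical sample data; only the column name "first name" comes from the activity
df = pd.DataFrame({"first name": ["Alexis", "misty", "Jodi", "matrick"]})

# str.capitalize() upper-cases the first character and lower-cases the rest
capital_first_name = df["first name"].str.capitalize()
print(capital_first_name)
```

Note that str.capitalize() also lower-cases every character after the first, which is usually what you want for single-word names.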
The column last name has a very messy format too. Some of the middle letters are capitalized, as in HarRISs and DaniEEIs, and some of the first letters are not capitalized. Convert all of the values to lowercase and store the result in the variable lower_last_name.
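A minimal sketch of the lowercase conversion, assuming a DataFrame named df (hypothetical) with made-up rows:

```python
import pandas as pd

# Hypothetical sample data; only the column name "last name" comes from the activity
df = pd.DataFrame({"last name": ["HarRISs", "DaniEEIs", "smith"]})

# str.lower() converts every character of each value to lowercase
lower_last_name = df["last name"].str.lower()
print(lower_last_name)
```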
Let's count all the Customers in the column usertype and sum them up. Store your sum in the customer_counts variable.
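The count-then-sum idea could look like the sketch below. The sample rows and the name df are assumptions; the activity may expect a different but equivalent expression.

```python
import pandas as pd

# Hypothetical sample data; only the column name "usertype" comes from the activity
df = pd.DataFrame(
    {"usertype": ["Customer", "Subscriber", "Subscriber", "Customer", "Subscriber"]}
)

# str.count() returns, per row, how many times the pattern occurs (1 or 0 here);
# summing those per-row counts gives the number of Customers
customer_counts = df["usertype"].str.count("Customer").sum()
print(customer_counts)
```

Comparing with equality, (df["usertype"] == "Customer").sum(), gives the same number when each row holds exactly one user type.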
Now that you have the total number of Customers in usertype from the previous question, find how many Subscribers there are in total.
You can subtract the number of Customers from the total length of the DataFrame to find the remainder, which are the Subscribers.
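The subtraction can be sketched as follows, assuming usertype holds only the two values Customer and Subscriber. The variable name subscriber_counts is hypothetical, since the activity does not name it.

```python
import pandas as pd

# Hypothetical sample data; only the column name "usertype" comes from the activity
df = pd.DataFrame(
    {"usertype": ["Customer", "Subscriber", "Subscriber", "Customer", "Subscriber"]}
)

customer_counts = df["usertype"].str.count("Customer").sum()

# len(df) is the total number of rows; whatever is not a Customer is a Subscriber
subscriber_counts = len(df) - customer_counts
print(subscriber_counts)
```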
Find all the names in the first name column that start with the letter "Z". Store the result in the variable starts_with_z.
Be careful! It's capital Z, not lowercase z.
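A sketch of the filter on made-up data. It assumes the expected result is the matching names themselves rather than a boolean mask, which the activity does not spell out.

```python
import pandas as pd

# Hypothetical sample data; only the column name "first name" comes from the activity
df = pd.DataFrame({"first name": ["Zoe", "zack", "Alexis", "Zachary"]})

# str.startswith() is case-sensitive, so "zack" is not matched
mask = df["first name"].str.startswith("Z")
starts_with_z = df.loc[mask, "first name"]
print(starts_with_z)
```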
Use the str.join() method to join the characters of each bikeid in the column bikeid with a separator, and store the output in the variable spaced_bikeids.
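A sketch of str.join() on made-up bike IDs. The separator is an assumption: a single space is used here, guessed from the variable name spaced_bikeids. The astype(str) step is also an assumption, needed only if the IDs are stored as numbers.

```python
import pandas as pd

# Hypothetical sample data; only the column name "bikeid" comes from the activity
df = pd.DataFrame({"bikeid": [16013, 23020]})

# The .str accessor needs text, so convert numeric IDs to strings first.
# str.join(sep) then inserts sep between the characters of each value.
# The single-space separator is an assumption, not taken from the activity text.
spaced_bikeids = df["bikeid"].astype(str).str.join(" ")
print(spaced_bikeids)
```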
Once you split all the emails in the column emails on @, the second value of each split (index 1) will be the domain of the email.
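The split can be sketched like this, with made-up addresses. The variable name domains is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column name "emails" comes from the activity
df = pd.DataFrame({"emails": ["alexis@example.com", "jodi@example.org"]})

# str.split("@") turns each email into a list: [local part, domain]
parts = df["emails"].str.split("@")

# .str[1] picks the second element of each list, which is the domain
domains = parts.str[1]
print(domains)
```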