Samba Pandas

This lab explores the history of the SMB protocol and Samba. You will replicate the task of finding words that contain specific characters, this time using pandas. Activities include reading a word list into a DataFrame, selecting rows that match a character sequence, and writing a generic search function. Along the way you will practice pandas string handling and regular expressions.
Project Created by

Santiago Basulto

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

Read the words from `/usr/share/dict` into a pandas DataFrame

Read the words from Linux's dictionary into the variable `df`, with a single column named `word`. The word list has no header row, so you'll need to pass extra parameters to handle this.

And WARNING! Your df should NOT contain null values afterwards. There's a special trick that you'll need to apply.
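
One possible approach, shown as a minimal sketch rather than the official solution: a headerless file can be read with `header=None` and `names=[...]`, and words such as `null` or `NA` are otherwise parsed as missing values unless `keep_default_na=False` is passed. The in-memory `sample_words` text below is a made-up stand-in for the real dictionary file, so swap in the actual path when running this against your system.

```python
import io

import pandas as pd

# Hypothetical stand-in for the dictionary file: one word per line, no header.
sample_words = "apple\nnull\nNA\nsamba\nnan\n"

df = pd.read_csv(
    io.StringIO(sample_words),  # replace with the path to the real word list
    header=None,                # the file has no header row
    names=["word"],             # name the single column
    keep_default_na=False,      # keep "null", "NA", "nan" as ordinary strings
)

print(df["word"].isna().sum())
```

Without `keep_default_na=False`, pandas would convert those three sample words to `NaN`, which is the kind of null value the warning above is about.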

Select all the rows that have the chars `s`, `m`, `b`, with a twist

Your task in this activity is to find ONLY the words that start with an `s` and in which the first `m` comes before the first `b`.

Store the result in the variable `smb_df`.
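
As a hedged sketch of how the rule could be expressed, the regular expression `^s[^b]*m.*b` matches a word that starts with `s`, reaches an `m` before any `b` has appeared, and has a `b` somewhere after that `m`. The sample word list is invented for illustration, and the match is case-sensitive here, which is an assumption since the activity does not say how capital letters should be treated.

```python
import pandas as pd

# Hypothetical sample data; the real activity uses the dictionary DataFrame.
df = pd.DataFrame(
    {"word": ["samba", "sabm", "symbol", "submarine", "smb", "tomb", "sam"]}
)

# ^s      word starts with "s"
# [^b]*m  an "m" is reached before any "b" has appeared
# .*b     a "b" occurs somewhere after that "m"
pattern = r"^s[^b]*m.*b"

smb_df = df[df["word"].str.contains(pattern, regex=True)]

print(smb_df["word"].tolist())  # ['samba', 'symbol', 'smb']
```

Words like `submarine` are excluded because their first `b` comes before their first `m`, and `sam` is excluded because it has no `b` at all.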

Write a generic function that can search any combination of characters

Write the function `find_words_with_chars`, which finds words containing any given combination of characters, following the same rules as the previous activity. Make sure your function does NOT modify the DataFrame passed in:

def find_words_with_chars(df, chars):
    aux_df = df.copy()
    # Your code here
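
The stub above could be completed in several ways. The following is one sketch, not the reference solution, and it rests on two assumptions: that "our previous rules" means the word starts with the first character and the first occurrences of the remaining characters appear in the given order, and that the function returns the filtered rows. The helper `first_positions_in_order` is a hypothetical name introduced only for this example.

```python
import pandas as pd


def first_positions_in_order(word, chars):
    """Return True if word starts with chars[0] and the first occurrences
    of the remaining characters appear in the same order as in chars."""
    if not chars or not word.startswith(chars[0]):
        return False
    # First occurrence of each remaining character, searching after index 0.
    positions = [word.find(c, 1) for c in chars[1:]]
    if any(p == -1 for p in positions):
        return False  # at least one character is missing
    return all(a < b for a, b in zip(positions, positions[1:]))


def find_words_with_chars(df, chars):
    aux_df = df.copy()  # work on a copy so the caller's DataFrame is untouched
    mask = aux_df["word"].apply(lambda w: first_positions_in_order(w, chars))
    return aux_df[mask]


# Hypothetical sample data for a quick check.
words = pd.DataFrame({"word": ["samba", "sabm", "symbol", "submarine", "tomb"]})
print(find_words_with_chars(words, "smb")["word"].tolist())  # ['samba', 'symbol']
print(find_words_with_chars(words, "tmb")["word"].tolist())  # ['tomb']
```

A plain Python helper is used instead of a generated regular expression so that the ordering rule stays easy to read; building a pattern from `chars` with `re.escape` would be an alternative.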
This project is part of

Data Cleaning with Pandas