DataFrames on Film: Concatenating San Francisco's Cinematic Locations

This project takes you through the sets and locations of films shot in San Francisco. By exploring various methods of `concatenation` in pandas, you will work through the film locations catalogued in the dataset.
Project Created by

Vidhi Shah

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

multiplechoice

Basic understanding of concatenation

What is the primary function of the pd.concat() method in pandas?
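As background for this question, here is a minimal sketch of what `pd.concat()` does along each axis. The two small frames and their values are invented for illustration and are not taken from the project dataset.

```python
import pandas as pd

# Two tiny, made-up frames that share the same columns.
first = pd.DataFrame({"Title": ["Film A", "Film B"], "Release Year": [1995, 2001]})
second = pd.DataFrame({"Title": ["Film C"], "Release Year": [2012]})

# axis=0 (the default) stacks rows; ignore_index=True renumbers them 0..n-1.
stacked = pd.concat([first, second], ignore_index=True)

# axis=1 places objects side by side, aligning them on the index.
side_by_side = pd.concat([first["Title"], first["Release Year"]], axis=1)

print(stacked)
print(side_by_side)
```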

codevalidated

Concatenate data by year

Concatenate DataFrames for movies released in even and odd years to analyse trends based on year. Use the Release Year column to filter the data.

Store the result in the concatenated_years variable.

The result should match the following output: [image: act2]
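One possible approach is sketched below on a toy frame. The titles and years are invented, and the order (even years first) and the untouched index are assumptions, since the expected output is not reproduced here.

```python
import pandas as pd

# Toy stand-in for the film locations data; titles and years are invented.
df = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Release Year": [1994, 1995, 2002, 2011],
})

# Split on the parity of the release year.
even_years = df[df["Release Year"] % 2 == 0]
odd_years = df[df["Release Year"] % 2 == 1]

# Stack the two subsets row-wise; the original index labels are kept.
concatenated_years = pd.concat([even_years, odd_years])
print(concatenated_years)
```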

codevalidated

Concatenate data by specific director

Concatenate DataFrames for movies directed by Blake Edwards and Zachary Shedd to compare their works. Use the Director column to filter the data.

Store the result in the concatenated_directors variable.

The result should match the following output: [image: act3]
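A hedged sketch of this filter-then-concatenate pattern follows. The film titles are placeholders rather than real credits, and the `labelled` variant using `keys=` is an optional extra, not something the activity asks for.

```python
import pandas as pd

# Placeholder rows; the titles are invented and are not real filmographies.
df = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Director": ["Blake Edwards", "Zachary Shedd", "Someone Else", "Blake Edwards"],
})

edwards = df[df["Director"] == "Blake Edwards"]
shedd = df[df["Director"] == "Zachary Shedd"]

# Row-wise concatenation of the two filtered subsets.
concatenated_directors = pd.concat([edwards, shedd])

# Optional: keys= adds an outer index level recording which subset each row came from.
labelled = pd.concat([edwards, shedd], keys=["Blake Edwards", "Zachary Shedd"])
print(labelled)
```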

codevalidated

Concatenate two decades

Create DataFrames for movies released in the 1990s and 2000s. Concatenate them to analyze trends over these decades using the Release Year column.

Store the result in the concatenated_decades variable.

The result should match the following output: [image: act4]
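The decade filter could be written with `Series.between`, which includes both endpoints by default. This is a sketch on invented data; taking the 1990s as 1990 to 1999 and the 2000s as 2000 to 2009 is an assumption about what the activity intends, as is the use of `ignore_index=True`.

```python
import pandas as pd

# Invented rows standing in for the real dataset.
df = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D", "Film E"],
    "Release Year": [1989, 1993, 1999, 2004, 2010],
})

# between() is inclusive on both ends by default.
nineties = df[df["Release Year"].between(1990, 1999)]
two_thousands = df[df["Release Year"].between(2000, 2009)]

# ignore_index=True gives the combined frame a fresh 0..n-1 index.
concatenated_decades = pd.concat([nineties, two_thousands], ignore_index=True)
print(concatenated_decades)
```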

codevalidated

Concatenate rows of historical films (pre-1970) and recent films (post-2010)

Concatenate rows of historical films (pre-1970) and recent films (post-2010) to compare trends across eras using the Release Year column.

Store the result in the historical_recent variable.

The result should match the following output: [image: act5]

codevalidated

Concatenate series of location names and fun facts side by side

Concatenate the Locations and Fun Facts columns side by side to create a combined view of film locations and their fun facts.

Store the result in the locations_fun_facts variable.

The result should match the following output: [image: act6]
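Placing two columns side by side is a column-wise concatenation (`axis=1`). The sketch below uses made-up locations and facts; a missing fun fact is simply carried through as a missing value.

```python
import pandas as pd

# Made-up values; the real dataset has many more rows and columns.
df = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C"],
    "Locations": ["Location X", "Location Y", "Location Z"],
    "Fun Facts": ["Fact about X", None, "Fact about Z"],
})

# axis=1 aligns the two Series on their shared index and puts them in adjacent columns.
locations_fun_facts = pd.concat([df["Locations"], df["Fun Facts"]], axis=1)
print(locations_fun_facts)
```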

codevalidated

Concatenate series of release years and locations side by side

Concatenate the Release Year and Locations columns side by side to see the release years of movies along with their filming locations.

Store the result in the year_location variable.

The result should match the following output: [image: act7]

codevalidated

Concatenate director and writer pairs

Extract Director and Writer columns and concatenate them to explore frequent collaborations in the film industry.

Store the result in the director_writer_pairs variable.

The result should match the following output: [image: act8]
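A sketch of the pairing step on invented names is shown below. The follow-up count with `DataFrame.value_counts()` (available in pandas 1.1 and later) is an assumption about how "frequent collaborations" might be explored and is not part of the stated task.

```python
import pandas as pd

# Invented director and writer names.
df = pd.DataFrame({
    "Director": ["Director P", "Director P", "Director R"],
    "Writer": ["Writer Q", "Writer Q", "Writer S"],
})

# Column-wise concatenation of the two Series.
director_writer_pairs = pd.concat([df["Director"], df["Writer"]], axis=1)

# Count how often each (Director, Writer) combination occurs, most frequent first.
pair_counts = director_writer_pairs.value_counts()
print(pair_counts)
```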

multiplechoice

How would you concatenate two DataFrames that include films released before 1990 and after 2000, ensuring the concatenation is along the rows?

codevalidated

Concatenate actor lists

Extract the Actor 1, Actor 2, and Actor 3 columns and concatenate them to create a comprehensive list of actors.

Store the result in the actors variable.

The result should match the following output: [image: act9]
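To build one long list, the three columns can be stacked end to end as a single Series. The names below are placeholders, and whether the expected output keeps missing values or resets the index is not stated, so both choices here are assumptions.

```python
import pandas as pd

# Placeholder actor names; None marks a missing credit.
df = pd.DataFrame({
    "Actor 1": ["Actor A", "Actor B"],
    "Actor 2": ["Actor C", None],
    "Actor 3": [None, "Actor A"],
})

# Row-wise concatenation of three Series yields one Series of length 3 * len(df).
actors = pd.concat([df["Actor 1"], df["Actor 2"], df["Actor 3"]], ignore_index=True)

# Optional clean-up: drop missing entries and count the distinct names.
distinct_actors = actors.dropna().nunique()
print(actors)
print(distinct_actors)
```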

codevalidated

Concatenate director to movie titles

Create a Series by concatenating Director names directly with Title to form listings in this format: "Director Name, directed Title".

Store the result in the directed_titles variable.

The result should match the following output: [image: act10]
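This activity is string concatenation rather than `pd.concat()`: adding string Series with `+` joins them element by element. A minimal sketch with invented names, following the quoted format:

```python
import pandas as pd

# Invented names and titles.
df = pd.DataFrame({
    "Director": ["Director P", "Director R"],
    "Title": ["Film A", "Film B"],
})

# Element-wise string concatenation; a missing value on either side would give NaN.
directed_titles = df["Director"] + ", directed " + df["Title"]
print(directed_titles)
```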

input

Concatenation with transformation

Group by Title and calculate the count of unique filming Locations per title.

Enter the number of unique location counts across all films.
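The grouping step might look like the sketch below, on invented data. The final line reads "the number of unique location counts" as the number of distinct count values, which is only one possible interpretation of the prompt.

```python
import pandas as pd

# Invented titles and locations; Film A was shot at two distinct places, Film B at one.
df = pd.DataFrame({
    "Title": ["Film A", "Film A", "Film A", "Film B", "Film B"],
    "Locations": ["Location X", "Location Y", "Location X", "Location Z", "Location Z"],
})

# Number of distinct locations for each title.
locations_per_title = df.groupby("Title")["Locations"].nunique()

# Assumed reading of the question: how many different count values occur.
distinct_counts = locations_per_title.nunique()
print(locations_per_title)
print(distinct_counts)
```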

multiplechoice

If you want to create a single DataFrame that combines information about films by the same director but filmed in different decades, which concatenation strategy would be most appropriate?

codevalidated

Concatenate top and bottom filmed locations

Find the 5 most filmed and the 5 least filmed locations, then concatenate their records to analyze the differences.

Store the result in the concatenated_locations variable.

The result should match the following output: [image: act15]
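One way to sketch this is with `value_counts()` followed by `isin()`. The toy data below has only five locations, so `n = 2` stands in for the activity's 5. In the real data many locations may tie at the lowest count, in which case `tail()` picks among them by row order, and the expected output may resolve such ties differently.

```python
import pandas as pd

# Toy data: Location X appears 5 times, Y 4, Z 3, W 2, V 1 (all invented).
locations = (["Location X"] * 5 + ["Location Y"] * 4 + ["Location Z"] * 3
             + ["Location W"] * 2 + ["Location V"] * 1)
df = pd.DataFrame({
    "Title": [f"Film {i}" for i in range(len(locations))],
    "Locations": locations,
})

n = 2  # the activity uses 5; 2 keeps the toy example readable

# value_counts() sorts by frequency, most common first.
counts = df["Locations"].value_counts()
top_names = counts.head(n).index
bottom_names = counts.tail(n).index

# Pull the full records for each group, then stack them.
top_rows = df[df["Locations"].isin(top_names)]
bottom_rows = df[df["Locations"].isin(bottom_names)]
concatenated_locations = pd.concat([top_rows, bottom_rows])
print(concatenated_locations)
```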

codevalidated

Urban versus Natural set analysis through concatenation

Differentiate films shot in distinctly urban settings from those shot in natural or park-like settings within San Francisco.

Filter records by Locations containing keywords like Financial District and Golden Gate Park.

Store the result in the urban_natural_sets variable.

The result should match the following output: [image: act16]
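The keyword filter can be sketched with `Series.str.contains`. The location strings are invented. `regex=False` treats each keyword literally and `na=False` treats missing locations as non-matches; both are choices made for this sketch, not requirements stated by the activity.

```python
import pandas as pd

# Invented location strings, including one missing value.
df = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Locations": [
        "Financial District, Example St",
        "Golden Gate Park, Example Meadow",
        "Some Other Place",
        None,
    ],
})

# Literal substring matches; rows with a missing location are excluded.
urban = df[df["Locations"].str.contains("Financial District", regex=False, na=False)]
natural = df[df["Locations"].str.contains("Golden Gate Park", regex=False, na=False)]

# Stack the urban rows above the natural rows.
urban_natural_sets = pd.concat([urban, natural])
print(urban_natural_sets)
```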

This project is part of

Data Wrangling with Pandas