Pandas Basics

Series, DataFrame, Rows, Columns, Filtering

Published

2025-01-27

Python Pandas Basics

Pandas is a powerful library for data analysis and manipulation in Python.

It provides two main data structures:

- Series: A one-dimensional array-like object.
- DataFrame: A two-dimensional table with labeled axes (rows and columns).


# Importing pandas
import pandas as pd

Creating a Series


# Creating a Series from a list
data = [10, 20, 30, 40, 50]
series = pd.Series(data)
series
    0
0  10
1  20
2  30
3  40
4  50
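
The left-hand column in the output above is the default integer index. A Series can also carry its own labels. The following is a minimal sketch, assuming hypothetical labels "a" to "c" that are not part of the original notebook:

```python
import pandas as pd

# Hypothetical labels, used only for illustration
labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = labeled["b"]        # look up a value by its index label
by_position = labeled.iloc[0]  # look up a value by its integer position

print(by_label, by_position)
```

Label-based and position-based access return the same kind of value; they differ only in how the element is located.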

Creating a DataFrame


# Creating a DataFrame from a dictionary
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"]
}
df = pd.DataFrame(data)
df
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
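
To connect the two data structures introduced earlier: each column of a DataFrame is itself a Series that shares the DataFrame's row index. A small sketch, re-creating the same DataFrame so the block runs on its own (the variable name `ages` is just illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

ages = df["Age"]  # a single column is a Series named after the column

print(type(ages).__name__)
print(ages.index.equals(df.index))  # the column shares the DataFrame's row index
print(list(df.columns))
```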

Exploring Data


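# Preview the first few rows (not rendered here: a notebook shows only the cell's last expression)
df.head()

# Display the shape of the DataFrame as (rows, columns)
print("Shape:", df.shape)

# Display summary statistics for the numeric columns
df.describe()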
Shape: (3, 3)
        Age
count   3.0
mean   30.0
std     5.0
min    25.0
25%    27.5
50%    30.0
75%    32.5
max    35.0
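
The summary above lists only Age because describe() skips non-numeric columns by default. Here is a sketch of two ways to look at the remaining columns, re-creating the same DataFrame so the block runs on its own (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

column_types = df.dtypes                   # one dtype per column
full_summary = df.describe(include="all")  # also summarize non-numeric columns

print(column_types)
print(full_summary)
```

With include="all", text columns get rows such as count and unique, while statistics that do not apply to a column are filled with NaN.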

Selecting Data

# Selecting a single column
df["Name"]
      Name
0    Alice
1      Bob
2  Charlie

# Selecting multiple columns
df[["Name", "City"]]
      Name         City
0    Alice     New York
1      Bob  Los Angeles
2  Charlie      Chicago

# Selecting a single row by integer position (0 is the first row)
df.iloc[0]
             0
Name     Alice
Age         25
City  New York
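
iloc selects by integer position, while loc selects by index label. With the default index the two look alike, so this sketch assumes a hypothetical step, not in the original notebook, that uses the Name column as the row index to make the difference visible:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Hypothetical: index the rows by name so labels and positions differ
by_name = df.set_index("Name")

row_by_label = by_name.loc["Bob"]  # label-based lookup
row_by_position = by_name.iloc[1]  # position-based lookup (second row)
first_two = df.iloc[0:2]           # positional slice; the end is exclusive

print(row_by_label.equals(row_by_position))
print(first_two)
```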

Filtering Data

# Filtering rows where Age is greater than 25
filtered_df = df[df["Age"] > 25]
filtered_df
      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
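
The expression df["Age"] > 25 produces a boolean Series, and indexing with it keeps the rows marked True. Conditions can be combined in the same way. A minimal sketch, with illustrative variable names and the same DataFrame re-created so the block runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Each condition needs its own parentheses; & means "and", | means "or"
older_in_chicago = df[(df["Age"] > 25) & (df["City"] == "Chicago")]

# isin() keeps rows whose value appears in the given list
ny_or_la = df[df["City"].isin(["New York", "Los Angeles"])]

# ~ negates a boolean mask
not_bob = df[~(df["Name"] == "Bob")]

print(older_in_chicago)
print(ny_or_la)
print(not_bob)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, which is why the operators &, |, and ~ are used here.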

Adding a New Column


# Adding a new column
df["Salary"] = [50000, 60000, 70000]
df
      Name  Age         City  Salary
0    Alice   25     New York   50000
1      Bob   30  Los Angeles   60000
2  Charlie   35      Chicago   70000
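
A new column can also be computed from existing ones instead of typed in as a list. The sketch below assumes two hypothetical column names, SalaryK and AgeNextYear, that do not appear in the original notebook:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
    "Salary": [50000, 60000, 70000],
})

# Hypothetical derived column: salary in thousands, computed element-wise
df["SalaryK"] = df["Salary"] / 1000

# assign() returns a new DataFrame and leaves df unchanged
with_next_year = df.assign(AgeNextYear=df["Age"] + 1)

print(df)
print(with_next_year)
```

Assigning with df[...] modifies the DataFrame in place, while assign() is convenient when the original should stay untouched.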

Conclusion

This notebook covers the basic operations of pandas. You can explore more advanced features, such as merging, joining, and working with time series data, in the pandas documentation: https://pandas.pydata.org/docs/