Pandas Basics

Series, DataFrame, Rows, Columns, Filtering

Published

2025-01-27

Python Pandas Basics

Pandas is a powerful library for data analysis and manipulation in Python.

It provides two main data structures:

- Series: A one-dimensional array-like object.
- DataFrame: A two-dimensional table with labeled axes (rows and columns).


# Importing pandas
import pandas as pd

Creating a Series


# Creating a Series from a list
data = [10, 20, 30, 40, 50]
series = pd.Series(data)
series

	0
0	10
1	20
2	30
3	40
4	50

dtype: int64
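The left-hand column in the output above is the index, which pandas generates as 0, 1, 2, ... when none is given. As a minimal sketch (the labels `"a"`, `"b"`, `"c"` and the name `labeled` are made up for illustration), a Series can also be built with explicit labels and then read by label:

```python
import pandas as pd

# Hypothetical labels, chosen only for illustration
labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Access a value by its label rather than by position
value_b = labeled["b"]

# The index and the underlying values can be inspected separately
index_labels = labeled.index.tolist()
values = labeled.tolist()

print(value_b)
print(index_labels)
print(values)
```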

Creating a DataFrame


# Creating a DataFrame from a dictionary
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"]
}
df = pd.DataFrame(data)
df

	Name	Age	City
0	Alice	25	New York
1	Bob	30	Los Angeles
2	Charlie	35	Chicago

Exploring Data


# Display the first few rows (a notebook renders only a cell's last expression, so this call shows nothing here)
df.head()

# Display the shape of the DataFrame
print("Shape:", df.shape)

# Display summary statistics
df.describe()

Shape: (3, 3)

	Age
count	3.0
mean	30.0
std	5.0
min	25.0
25%	27.5
50%	30.0
75%	32.5
max	35.0
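Only the Age column appears in the summary because `describe()` covers numeric columns by default. The sketch below rebuilds the same DataFrame so it runs on its own and passes `include="all"` to bring the text columns into the summary as well; the variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Default behaviour: only numeric columns are summarized
numeric_summary = df.describe()
numeric_columns = list(numeric_summary.columns)

# include="all" also summarizes non-numeric columns (count, unique, top, freq)
full_summary = df.describe(include="all")
all_columns = list(full_summary.columns)

print(numeric_columns)
print(all_columns)

# dtypes shows the type pandas inferred for each column
print(df.dtypes)
```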

Selecting Data

# Selecting a single column
df["Name"]

	Name
0	Alice
1	Bob
2	Charlie

dtype: object

# Selecting multiple columns
df[["Name", "City"]]

	Name	City
0	Alice	New York
1	Bob	Los Angeles
2	Charlie	Chicago

# Selecting a single row by integer position
df.iloc[0]

	0
Name	Alice
Age	25
City	New York

dtype: object
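`iloc` selects by integer position, while the related `loc` accessor selects by label. With the default index used here the labels and positions happen to coincide. A minimal sketch, rebuilding the same DataFrame so the block runs on its own (the names `first_two` and `bob_name` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Slice by position: rows at positions 0 and 1 (the end of the slice is excluded)
first_two = df.iloc[0:2]

# Select a single cell by row label and column name
bob_name = df.loc[1, "Name"]

print(first_two)
print(bob_name)
```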

Filtering Data

# Filtering rows where Age is greater than 25
filtered_df = df[df["Age"] > 25]
filtered_df

	Name	Age	City
1	Bob	30	Los Angeles
2	Charlie	35	Chicago
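The expression `df["Age"] > 25` produces a boolean Series, and indexing with it keeps the rows where the value is True. Several conditions can be combined with `&` (and) and `|` (or), with each condition wrapped in parentheses. A small sketch on the same data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# The comparison yields one True/False value per row
mask = df["Age"] > 25
print(mask.tolist())

# Combine conditions; the parentheses are required because & binds tighter than > and ==
older_in_chicago = df[(df["Age"] > 25) & (df["City"] == "Chicago")]

# | keeps rows that satisfy either condition
young_or_chicago = df[(df["Age"] < 30) | (df["City"] == "Chicago")]

print(older_in_chicago)
print(young_or_chicago)
```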

Adding a New Column


# Adding a new column
df["Salary"] = [50000, 60000, 70000]
df

	Name	Age	City	Salary
0	Alice	25	New York	50000
1	Bob	30	Los Angeles	60000
2	Charlie	35	Chicago	70000
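A new column can also be computed from existing ones, since arithmetic on a column applies to every row at once. The sketch below assumes a hypothetical `SalaryK` column (salary in thousands) that is not part of the original notebook:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})
df["Salary"] = [50000, 60000, 70000]

# Hypothetical derived column: the division is applied element-wise
df["SalaryK"] = df["Salary"] / 1000

salary_k = df["SalaryK"].tolist()
print(df)
```

The list assigned to a new column must have the same length as the DataFrame; otherwise pandas raises a `ValueError`.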

Conclusion

This notebook covers the basic operations of pandas. You can explore more advanced features, such as merging, joining, and working with time series data, in the pandas documentation: https://pandas.pydata.org/docs/