Learn how to use the Pandas module in Python for data analysis. Explore functions, examples, and best practices for efficient data manipulation.
Handling CSV (Comma-Separated Values) files is a common task in Python, especially for data analysis, automation, and working with spreadsheets. This guide covers reading, writing, and manipulating CSV files with beginner-friendly examples and real-world use cases.
# Read CSV and calculate average
def calculate_average_scores():
with open('scores.csv', 'r') as file:
lines = file.readlines()
total = 0
count = 0
for line in lines:
name, score = line.strip().split(',')
total += int(score)
count += 1
average = total / count
print(f"Average score: {average:.2f}")
# Sample scores.csv:
# Alice,92
# Bob,85
# Charlie,78
csv.reader()
(Basic Method)import csv
with open('employees.csv', 'r') as file:
csv_reader = csv.reader(file)
for row in csv_reader:
print(row) # Each row is a list
Output Example:
['Name', 'Age', 'Department']
['Alice', '28', 'HR']
['Bob', '32', 'Engineering']
['Charlie', '45', 'Finance']
Real-Life Use Case:
csv.DictReader()
(Easier for Column Access)import csv
with open('employees.csv', 'r') as file:
csv_reader = csv.DictReader(file)
for row in csv_reader:
print(row['Name'], row['Department']) # Access columns by name
Output Example:
Alice HR
Bob Engineering
Charlie Finance
Real-Life Use Case:
csv.writer()
(Basic Method)import csv
data = [
["Product", "Price", "Stock"],
["Laptop", "999", "50"],
["Phone", "699", "100"],
["Tablet", "299", "30"]
]
with open('inventory.csv', 'w', newline='') as file:
csv_writer = csv.writer(file)
csv_writer.writerows(data)
Real-Life Use Case:
csv.DictWriter()
(Better for Structured Data)import csv
employees = [
{"Name": "Alice", "Age": 28, "Department": "HR"},
{"Name": "Bob", "Age": 32, "Department": "Engineering"},
{"Name": "Charlie", "Age": 45, "Department": "Finance"}
]
with open('employees_new.csv', 'w', newline='') as file:
fieldnames = ["Name", "Age", "Department"]
writer = csv.DictWriter(file, fieldnames=fieldnames)
writer.writeheader() # Writes column names
writer.writerows(employees)
Real-Life Use Case:
import csv
# Read CSV and filter
with open('employees.csv', 'r') as file:
csv_reader = csv.DictReader(file)
engineering_employees = [row for row in csv_reader if row['Department'] == 'Engineering']
# Write filtered data to a new CSV
with open('engineering_employees.csv', 'w', newline='') as file:
fieldnames = ["Name", "Age", "Department"]
writer = csv.DictWriter(file, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(engineering_employees)
Real-Life Use Case:
Pandas is a powerful library for handling CSV files efficiently.
pip install pandas
import pandas as pd
df = pd.read_csv('employees.csv')
print(df.head()) # Shows first 5 rows
df.to_csv('updated_employees.csv', index=False) # index=False avoids extra column
Real-Life Use Case:
| Task | Code Example |
|———-|——————|
| Read CSV | csv.reader(file)
or pd.read_csv()
|
| Write CSV | csv.writer(file)
or df.to_csv()
|
| Skip Header | next(csv_reader)
(before loop) |
| Read as Dictionary | csv.DictReader(file)
|
| Write with Headers | csv.DictWriter(file, fieldnames)
|
| Append to CSV | Open in 'a'
(append) mode |
csv.reader()
and csv.writer()
for basic CSV handling.csv.DictReader()
and csv.DictWriter()
for column-based access.pd.read_csv()
, df.to_csv()
) for advanced data manipulation.