CSV Handling in Python

« Back Next »

Learn how to use the Pandas module in Python for data analysis. Explore functions, examples, and best practices for efficient data manipulation.

Handling CSV (Comma-Separated Values) files is a common task in Python, especially for data analysis, automation, and working with spreadsheets. This guide covers reading, writing, and manipulating CSV files with beginner-friendly examples and real-world use cases.

Example Processing CSV Data (Simple Version)

# Read CSV and calculate average
def calculate_average_scores():
    with open('scores.csv', 'r') as file:
        lines = file.readlines()
    
    total = 0
    count = 0
    
    for line in lines:
        name, score = line.strip().split(',')
        total += int(score)
        count += 1
    
    average = total / count
    print(f"Average score: {average:.2f}")

# Sample scores.csv:
# Alice,92
# Bob,85
# Charlie,78

2. Reading a CSV File

Using `csv.reader()` (Basic Method)

import csv

with open('employees.csv', 'r') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)  # Each row is a list

Output Example:

['Name', 'Age', 'Department']
['Alice', '28', 'HR']
['Bob', '32', 'Engineering']
['Charlie', '45', 'Finance']

Real-Life Use Case:

Reading an exported list of employees from an HR system.

Using `csv.DictReader()` (Easier for Column Access)

import csv

with open('employees.csv', 'r') as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        print(row['Name'], row['Department'])  # Access columns by name

Output Example:

Alice HR
Bob Engineering
Charlie Finance

Real-Life Use Case:

Extracting specific columns (e.g., only names and emails from a contact list).

2. Writing to a CSV File

Using `csv.writer()` (Basic Method)

import csv

data = [
    ["Product", "Price", "Stock"],
    ["Laptop", "999", "50"],
    ["Phone", "699", "100"],
    ["Tablet", "299", "30"]
]

with open('inventory.csv', 'w', newline='') as file:
    csv_writer = csv.writer(file)
    csv_writer.writerows(data)

Real-Life Use Case:

Generating an inventory report for an e-commerce store.

Using `csv.DictWriter()` (Better for Structured Data)

import csv

employees = [
    {"Name": "Alice", "Age": 28, "Department": "HR"},
    {"Name": "Bob", "Age": 32, "Department": "Engineering"},
    {"Name": "Charlie", "Age": 45, "Department": "Finance"}
]

with open('employees_new.csv', 'w', newline='') as file:
    fieldnames = ["Name", "Age", "Department"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()  # Writes column names
    writer.writerows(employees)

Real-Life Use Case:

Exporting structured data (e.g., employee records) to a CSV file.

3. Modifying CSV Data (Real-Life Example)

Example: Filtering Data (Only Keep Engineering Dept.)

import csv

# Read CSV and filter
with open('employees.csv', 'r') as file:
    csv_reader = csv.DictReader(file)
    engineering_employees = [row for row in csv_reader if row['Department'] == 'Engineering']

# Write filtered data to a new CSV
with open('engineering_employees.csv', 'w', newline='') as file:
    fieldnames = ["Name", "Age", "Department"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(engineering_employees)

Real-Life Use Case:

Extracting only relevant rows (e.g., employees in a specific department).

4. Working with CSV & Pandas (For Data Analysis)

Pandas is a powerful library for handling CSV files efficiently.

Install Pandas (if not installed)

pip install pandas

Reading CSV with Pandas

import pandas as pd

df = pd.read_csv('employees.csv')
print(df.head())  # Shows first 5 rows

Writing CSV with Pandas

df.to_csv('updated_employees.csv', index=False)  # index=False avoids extra column

Real-Life Use Case:

Cleaning and analyzing large datasets (e.g., sales records, survey responses).

5. Common CSV Tasks (Cheat Sheet)

Real-World CSV Use Cases

Importing/Exporting Data (Excel, Google Sheets)
Data Cleaning (Removing duplicates, fixing errors)
Automating Reports (Sales, inventory, logs)
Machine Learning Datasets (Storing training data)
Web Scraping Results (Saving scraped data to CSV)

Summary

Use csv.reader() and csv.writer() for basic CSV handling.
Use csv.DictReader() and csv.DictWriter() for column-based access.
Use Pandas (pd.read_csv(), df.to_csv()) for advanced data manipulation.
Real-world applications include HR records, inventory, sales data, and more.

CSV Handling in Python

Example Processing CSV Data (Simple Version)

2. Reading a CSV File

Using csv.reader() (Basic Method)

Using csv.DictReader() (Easier for Column Access)

2. Writing to a CSV File

Using csv.writer() (Basic Method)

Using csv.DictWriter() (Better for Structured Data)

3. Modifying CSV Data (Real-Life Example)

Example: Filtering Data (Only Keep Engineering Dept.)

4. Working with CSV & Pandas (For Data Analysis)

Install Pandas (if not installed)

Reading CSV with Pandas

Writing CSV with Pandas

5. Common CSV Tasks (Cheat Sheet)

Real-World CSV Use Cases

Summary

Using `csv.reader()` (Basic Method)

Using `csv.DictReader()` (Easier for Column Access)

Using `csv.writer()` (Basic Method)

Using `csv.DictWriter()` (Better for Structured Data)