4-Week Python for Finance Learning Plan

A structured, zero-to-job-ready Python curriculum for finance professionals. No CS theory, no LeetCode — just the skills that show up in job descriptions.

Who this is for: A finance professional with zero coding experience who wants to land analyst, operations, or data-oriented roles that list Python as a requirement.

Time commitment: 1–1.5 hours per day, 6 days a week, with one catch-up day built in. You can shift days around to fit your schedule — consistency matters more than sequence.

What this is NOT: This is not a software engineering curriculum. No algorithms, no LeetCode, no computer science theory. Everything here is chosen because it shows up in finance job descriptions or makes you dangerous with data.


Before You Start

Setup (Day 0 — 45–75 minutes)

  1. Install Python 3.12+ via python.org
  2. Install VS Code (code.visualstudio.com) — then add the Python extension
  3. Or skip local setup entirely — use Google Colab to write and run Python in your browser for free. No installs, no configuration. Great for Weeks 1–2 while you're learning. You can switch to a local setup later when you need to work with local files and databases.
  4. Set up a virtual environment (local installs only — skip this if using Colab):
    # In your terminal, create a project folder and virtual environment
    mkdir python-finance
    cd python-finance
    python -m venv venv
    
    # Activate it (do this every time you open a new terminal)
    # macOS/Linux:
    source venv/bin/activate
    # Windows:
    venv\Scripts\activate
    
    # You'll see (venv) in your prompt — that means it's active.
    # Always activate before running pip install.
    
    A virtual environment keeps your project's packages separate from your system Python. This prevents version conflicts and makes pip freeze > requirements.txt actually useful later.
  5. Create a GitHub account (github.com) — this becomes your portfolio
  6. Install Git (git-scm.com) — you'll push projects here starting Week 2
  7. Bookmark the resources in the Resource Summary table at the end of this plan — you'll reference them constantly

Mindset

You don't need to memorize syntax. You need to build muscle memory for looking things up and solving problems. Every professional developer Googles things daily. Your goal is fluency, not memorization.


Week 1: Python Fundamentals

Goal: Be comfortable writing simple scripts that process data. Understand variables, loops, conditionals, functions, basic data structures, and how to read an error message.

Day 1 — Variables, Types & Basic Operations (~60 min)

Learn:

  • Variables and assignment (name = "Sarah", revenue = 150000.50)
  • Data types: str, int, float, bool
  • Basic math operations (+, -, *, /, //, %, **)
  • print() and f-strings (f"Revenue is ${revenue:,.2f}")

Resource: Automate the Boring Stuff — Chapter 1

Practice:

# Calculate compound interest
principal = 10000
rate = 0.07
years = 5
future_value = principal * (1 + rate) ** years
print(f"${principal:,.2f} at {rate:.0%} for {years} years = ${future_value:,.2f}")

Day 2 — Strings & User Input (~60 min)

Learn:

  • String methods: .upper(), .lower(), .strip(), .replace(), .split()
  • String slicing: ticker[:3]
  • input() for user interaction
  • Type conversion: int(), float(), str()

Resource: Automate the Boring Stuff — Chapter 6

Practice:

# Clean up messy ticker symbols
raw_input = "  aapl, MSFT , goog,  AMZN  "
tickers = [t.strip().upper() for t in raw_input.split(",")]
print(tickers)  # ['AAPL', 'MSFT', 'GOOG', 'AMZN']

# Interactive currency converter
usd = float(input("Enter amount in USD: "))
rate = 0.92  # EUR/USD rate
print(f"${usd:,.2f} = €{usd * rate:,.2f}")

Day 3 — Conditionals & Comparisons (~60 min)

Learn:

  • if, elif, else
  • Comparison operators: ==, !=, >, <, >=, <=
  • Logical operators: and, or, not
  • Truthiness (empty strings/lists are False)

Resource: Automate the Boring Stuff — Chapter 2

Practice:

# Flag transactions for review
amount = 15000
category = "wire_transfer"

if amount > 10000 and category == "wire_transfer":
    print("FLAG: Large wire transfer — requires compliance review")
elif amount > 10000:
    print("FLAG: Large transaction — manager approval needed")
else:
    print("APPROVED: Within normal limits")
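Truthiness from the Learn list deserves its own quick demo, since it trips up most newcomers: empty strings, lists, and dicts count as False, and zero does too. A minimal sketch (the variable names are illustrative):

```python
# Empty containers and empty strings are "falsy"; non-empty ones are "truthy"
pending_reviews = []

if pending_reviews:
    print(f"{len(pending_reviews)} transactions awaiting review")
else:
    print("No transactions awaiting review")  # this branch runs

# Zero is falsy too — useful for catching empty balances
balance = 0
if not balance:
    print("Warning: zero balance")
```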

Day 4 — Lists & Loops (~75 min)

Learn:

  • Lists: creating, indexing, slicing, .append(), .remove(), .sort()
  • for loops (iterating over lists)
  • while loops
  • range() and enumerate()
  • List comprehensions (the Pythonic shortcut)

Resource: Automate the Boring Stuff — Chapter 4 (covers lists, loops, and range())

Practice:

# Identify which months beat target
monthly_revenue = [82000, 91000, 78000, 95000, 88000, 102000,
                   97000, 85000, 110000, 93000, 99000, 115000]
target = 90000

above_target = [i+1 for i, rev in enumerate(monthly_revenue) if rev > target]
print(f"Months above ${target:,} target: {above_target}")
print(f"Hit rate: {len(above_target)}/{len(monthly_revenue)}")

Day 5 — Dictionaries & Sets (~75 min)

Learn:

  • Dictionaries: key-value pairs, .keys(), .values(), .items(), .get()
  • Nested dictionaries (common in financial data/JSON)
  • Sets: unique values, intersections, unions
  • When to use each: list vs dict vs set

Resource: Automate the Boring Stuff — Chapter 5

Practice:

# Portfolio holdings
portfolio = {
    "AAPL": {"shares": 50, "avg_cost": 142.50},
    "MSFT": {"shares": 30, "avg_cost": 285.00},
    "GOOG": {"shares": 10, "avg_cost": 2750.00},
}

current_prices = {"AAPL": 178.50, "MSFT": 310.00, "GOOG": 2900.00}

for ticker, holding in portfolio.items():
    cost_basis = holding["shares"] * holding["avg_cost"]
    market_value = holding["shares"] * current_prices[ticker]
    pnl = market_value - cost_basis
    print(f"{ticker}: P&L = ${pnl:,.2f}")
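The practice above exercises dictionaries; the sets from the Learn list can be sketched like this (the tickers are illustrative):

```python
# Sets automatically deduplicate and support fast overlap checks
my_holdings = {"AAPL", "MSFT", "GOOG"}
benchmark = {"AAPL", "MSFT", "AMZN", "NVDA"}

print(my_holdings & benchmark)  # intersection: held AND in the benchmark
print(my_holdings - benchmark)  # holdings NOT in the benchmark
print(my_holdings | benchmark)  # union of both

# Deduplicate a list by round-tripping through a set
trades = ["AAPL", "AAPL", "MSFT", "AAPL"]
print(sorted(set(trades)))  # ['AAPL', 'MSFT']
```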

Day 6 — Functions & Debugging Basics (~75 min)

Learn:

  • Defining functions with def
  • Parameters, arguments, return values
  • Default arguments
  • Docstrings (brief descriptions of what a function does)
  • Why functions matter: reusability, readability, testing
  • Reading tracebacks: when Python throws an error, read it bottom to top. The last line is the error type and message; the lines above show the call stack leading to it. Get comfortable with common errors:
    • NameError — you used a variable that doesn't exist (typo?)
    • TypeError — you passed the wrong type (string where a number was expected?)
    • IndexError — you went past the end of a list
    • KeyError — you asked for a dictionary key that doesn't exist

Resource: Automate the Boring Stuff — Chapter 3

Practice:

def calculate_loan_payment(principal, annual_rate, years):
    """Calculate monthly payment for a fixed-rate loan."""
    monthly_rate = annual_rate / 12
    num_payments = years * 12
    payment = principal * (monthly_rate * (1 + monthly_rate)**num_payments) / \
              ((1 + monthly_rate)**num_payments - 1)
    return round(payment, 2)

# Test it
print(calculate_loan_payment(300000, 0.065, 30))  # ~$1,896.20
print(calculate_loan_payment(300000, 0.055, 15))  # ~$2,451.25

# Debugging exercise: intentionally break the function call below
# and practice reading the traceback Python gives you.
# Try: calculate_loan_payment("300000", 0.065, 30)  # TypeError — string, not number
# Try: calculate_loan_payment(300000, 0, 30)          # ZeroDivisionError — 0% rate

Day 7 — File I/O, Error Handling & Git Basics (~90 min)

Learn:

  • Reading/writing text files (open(), with statement)
  • Reading/writing CSV files with the csv module
  • try/except for error handling
  • Common exceptions: FileNotFoundError, ValueError, KeyError
  • Git basics (just enough to push code to GitHub):
    • git init — turn a folder into a Git repository
    • git add . — stage your files
    • git commit -m "message" — save a snapshot
    • git remote add origin <url> — connect to GitHub
    • git push -u origin main — upload your code

Practice:

import csv

# Write sample transaction data
transactions = [
    ["date", "description", "amount", "category"],
    ["2026-01-15", "Office Supplies", 234.50, "Operations"],
    ["2026-01-16", "Client Dinner", 187.25, "Entertainment"],
    ["2026-01-17", "Software License", 1200.00, "Technology"],
]

with open("transactions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(transactions)

# Read it back with error handling
try:
    with open("transactions.csv", "r") as f:
        reader = csv.DictReader(f)
        for row in reader:
            print(f"{row['date']}: {row['description']} — ${float(row['amount']):,.2f}")
except FileNotFoundError:
    print("Error: transactions.csv not found. Check the file path.")
except ValueError as e:
    print(f"Error parsing data: {e}")

Then push your work to GitHub:

cd python-finance
git init
git add .
git commit -m "Week 1: Python fundamentals exercises"
# Create a new repo on github.com, then:
git remote add origin https://github.com/YOUR_USERNAME/python-finance.git
git push -u origin main

Week 1 Milestone

You should be able to:

  • Write a script from scratch that reads a CSV, processes the data, and outputs results
  • Use functions to organize your code
  • Handle basic errors with try/except
  • Read a Python traceback and identify what went wrong
  • Push your code to GitHub using basic Git commands
  • Feel comfortable reading Python code even if you couldn't write it from memory yet

Week 2: pandas — Your New Excel

Goal: Replace 80% of what you do in Excel with pandas. This is the single most valuable skill for finance Python roles.

Note: Many of this week's code snippets build on each other. Work in one continuous notebook (or Colab file) so variables like df carry forward between exercises.

Day 8 — pandas Fundamentals (~75 min)

Learn:

  • Install pandas (if working locally, activate your venv first): pip install pandas openpyxl
  • DataFrame and Series — the two core objects
  • Reading data: pd.read_csv(), pd.read_excel()
  • Inspecting data: .head(), .tail(), .shape, .dtypes, .describe(), .info()

Resource: Kaggle — pandas Course (Lessons 1–2) (free, interactive, browser-based)

Practice:

import pandas as pd

# Create a DataFrame from scratch (simulating a ledger)
data = {
    "date": ["2026-01-05", "2026-01-12", "2026-01-19", "2026-01-26", "2026-02-02"],
    "client": ["Acme Corp", "Globex", "Acme Corp", "Initech", "Globex"],
    "amount": [15000, 8500, 22000, 12000, 9500],
    "status": ["paid", "pending", "paid", "overdue", "paid"],
}
df = pd.DataFrame(data)
df["date"] = pd.to_datetime(df["date"])

print(df.describe())
print(df.dtypes)

Day 9 — Selecting, Filtering & Sorting (~75 min)

Learn:

  • Column selection: df["col"], df[["col1", "col2"]]
  • Row filtering: df[df["amount"] > 10000]
  • Multiple conditions: & (and), | (or), ~ (not) — always wrap conditions in parentheses
  • Sorting: .sort_values(), .sort_index()
  • .loc[] and .iloc[] for precise indexing

Resource: Kaggle — pandas Course (Lesson 3)

Practice:

# Filter for large paid invoices
large_paid = df[(df["amount"] > 10000) & (df["status"] == "paid")]
print(large_paid)

# Sort by amount descending
print(df.sort_values("amount", ascending=False))
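The practice above only uses &; the | (or) and ~ (not) operators from the Learn list work the same way, with each condition still wrapped in its own parentheses. A self-contained sketch (the data is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "client": ["Acme Corp", "Globex", "Initech"],
    "amount": [15000, 8500, 12000],
    "status": ["paid", "pending", "overdue"],
})

# | (or): rows that are pending OR overdue
needs_attention = df[(df["status"] == "pending") | (df["status"] == "overdue")]
print(needs_attention)

# ~ (not): everything that is NOT paid — same rows, different phrasing
print(df[~(df["status"] == "paid")])
```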

Day 10 — GroupBy & Aggregation (~75 min)

Learn:

  • .groupby() — the pandas equivalent of pivot tables
  • Aggregation functions: .sum(), .mean(), .count(), .min(), .max()
  • .agg() for multiple aggregations at once
  • .pivot_table() — literally pivot tables in code

Resource: Kaggle — pandas Course (Lesson 4)

Practice:

# Revenue by client
print(df.groupby("client")["amount"].agg(["sum", "mean", "count"]))

# Pivot: clients vs status
pivot = df.pivot_table(values="amount", index="client", columns="status",
                       aggfunc="sum", fill_value=0)
print(pivot)

Day 11 — Data Cleaning (~90 min)

Learn:

  • Handling missing data: .isna(), .fillna(), .dropna()
  • Renaming columns: .rename(), .columns
  • Changing data types: .astype()
  • String operations: df["col"].str.upper(), .str.contains(), .str.strip()
  • Removing duplicates: .drop_duplicates()

Resource: Kaggle — Data Cleaning Course (Lessons 1–3) (free)

Practice:

# Simulate messy data
messy = pd.DataFrame({
    "Name": ["  john smith", "JANE DOE", "john smith", "Bob Jones  "],
    "Department": ["Finance", None, "Finance", "Operations"],
    "Salary": ["85000", "92000", "85000", None],
})

# Clean it
cleaned = messy.copy()
cleaned["Name"] = cleaned["Name"].str.strip().str.title()
cleaned = cleaned.drop_duplicates(subset="Name")
cleaned["Salary"] = pd.to_numeric(cleaned["Salary"], errors="coerce")
cleaned["Department"] = cleaned["Department"].fillna("Unknown")
print(cleaned)

Day 12 — Dates & Time Series for Finance (~75 min)

Learn:

  • pd.to_datetime() for converting date columns
  • .dt accessors: .dt.year, .dt.month, .dt.day, .dt.day_name()
  • Month-end dates: pd.offsets.MonthEnd(), .dt.to_period("M")
  • Grouping by month/quarter: .groupby(df["date"].dt.to_period("M"))
  • Aging buckets (how many days since an event)

Resource: pandas docs — Time series / date functionality

Practice:

import pandas as pd

# Invoice aging report
invoices = pd.DataFrame({
    "invoice_id": [101, 102, 103, 104, 105, 106],
    "client": ["Acme", "Globex", "Initech", "Acme", "Globex", "Initech"],
    "invoice_date": pd.to_datetime([
        "2025-12-01", "2025-12-15", "2026-01-05",
        "2026-01-20", "2026-02-10", "2026-02-28"
    ]),
    "amount": [5000, 12000, 3200, 8500, 15000, 4100],
    "status": ["paid", "overdue", "overdue", "paid", "pending", "overdue"],
})

# Calculate days outstanding
today = pd.Timestamp("2026-03-06")
invoices["days_outstanding"] = (today - invoices["invoice_date"]).dt.days

# Create aging buckets (standard AR aging)
bins = [0, 30, 60, 90, float("inf")]
labels = ["Current", "31-60 Days", "61-90 Days", "90+ Days"]
invoices["aging_bucket"] = pd.cut(invoices["days_outstanding"], bins=bins, labels=labels)

# Summary by aging bucket
print(invoices.groupby("aging_bucket", observed=True)["amount"].agg(["sum", "count"]))
print()

# Monthly revenue summary using .dt accessor
invoices["month"] = invoices["invoice_date"].dt.to_period("M")
print(invoices.groupby("month")["amount"].sum())

Day 13 — Merging, Joining & Reconciliation (~90 min)

Learn:

  • pd.merge() — SQL-style joins (inner, left, right, outer)
  • pd.concat() — stacking DataFrames vertically
  • Merge on single or multiple keys
  • Validating merges: indicator=True to flag matched/unmatched rows
  • Reconciliation basics: matching two data sources and reporting exceptions

Resource: pandas docs — Merge, join, concatenate

Practice:

# --- Basic merge (VLOOKUP equivalent) ---
employees = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "name": ["Alice", "Bob", "Carol", "Dave"],
    "dept_id": [10, 20, 10, 30],
})

departments = pd.DataFrame({
    "dept_id": [10, 20, 30],
    "dept_name": ["Finance", "Operations", "Compliance"],
    "budget": [500000, 350000, 275000],
})

merged = pd.merge(employees, departments, on="dept_id", how="left")
print(merged)

# --- Reconciliation: match bank statement to internal ledger ---
bank_statement = pd.DataFrame({
    "ref": ["TXN-001", "TXN-002", "TXN-003", "TXN-005"],
    "bank_amount": [1500.00, 2300.00, 875.50, 4200.00],
    "bank_date": pd.to_datetime(["2026-01-05", "2026-01-06", "2026-01-07", "2026-01-09"]),
})

internal_ledger = pd.DataFrame({
    "ref": ["TXN-001", "TXN-002", "TXN-003", "TXN-004"],
    "ledger_amount": [1500.00, 2300.00, 875.50, 3100.00],
    "ledger_date": pd.to_datetime(["2026-01-05", "2026-01-06", "2026-01-07", "2026-01-08"]),
})

# Outer merge with indicator to find matches and exceptions
recon = pd.merge(bank_statement, internal_ledger, on="ref", how="outer", indicator=True)
recon.columns = ["ref", "bank_amount", "bank_date", "ledger_amount", "ledger_date", "match_status"]

print("\n--- Full Reconciliation ---")
print(recon)

# Exception report
matched = recon[recon["match_status"] == "both"]
bank_only = recon[recon["match_status"] == "left_only"]   # in bank, not in ledger
ledger_only = recon[recon["match_status"] == "right_only"]  # in ledger, not in bank

print(f"\nMatched: {len(matched)}  |  Bank only: {len(bank_only)}  |  Ledger only: {len(ledger_only)}")

Day 14 — Excel Integration & Mini Project (~90 min)

Learn:

  • Reading Excel files: pd.read_excel() with sheet_name, skiprows, usecols
  • Writing to Excel: .to_excel() with multiple sheets using ExcelWriter
  • Reading from multiple sheets in one workbook
  • Exporting to CSV: .to_csv()

Resource: Real Python — A Guide to Excel Spreadsheets in Python with openpyxl

Mini project — Financial Data Cleaner (~60 min):

Build a script that:

  1. Reads a messy CSV or Excel file (create sample data with intentional issues)
  2. Cleans column names (lowercase, remove spaces)
  3. Handles missing values and removes duplicates
  4. Standardizes date formats
  5. Prints before/after row counts as a sanity check
  6. Outputs a clean Excel file with a summary sheet

Push this to GitHub. This is your first portfolio piece.

# Starter template for the mini project
with pd.ExcelWriter("quarterly_report.xlsx") as writer:
    df.to_excel(writer, sheet_name="All Transactions", index=False)
    large_paid.to_excel(writer, sheet_name="Large Paid", index=False)
    pivot.to_excel(writer, sheet_name="Pivot Summary")

print("Report saved to quarterly_report.xlsx")
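Step 2 of the mini project (cleaning column names) can be sketched like this — the messy column names are illustrative:

```python
import pandas as pd

# Simulated messy headers: stray spaces, mixed case
df = pd.DataFrame({
    " Invoice Date ": ["2026-01-05"],
    "Client Name": ["Acme"],
    "Amount USD": [15000],
})

# Strip whitespace, lowercase, and replace spaces with underscores in one pass
df.columns = (
    df.columns.str.strip()
              .str.lower()
              .str.replace(" ", "_")
)
print(df.columns.tolist())  # ['invoice_date', 'client_name', 'amount_usd']
```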

Week 2 Milestone

You should be able to:

  • Open any CSV/Excel file in pandas and explore it
  • Filter, sort, group, and aggregate data
  • Work with dates: extract month/quarter, calculate aging, group by period
  • Merge multiple datasets and validate the merge results
  • Match two data sources and identify exceptions (reconciliation)
  • Clean messy data and export polished results
  • Explain what pandas does to a non-technical interviewer

Week 3: SQL, Databases & Automation

Goal: Learn to query databases with SQL, connect Python to SQLite, and automate one repetitive file-based task. These skills separate "I know pandas" from "I can build things."

Day 15 — SQL Fundamentals Part 1 (~75 min)

Learn:

  • What SQL is and why it matters (most finance data lives in databases)
  • SELECT, FROM, WHERE
  • ORDER BY, LIMIT
  • Comparison operators, LIKE, IN, BETWEEN
  • NULL handling: IS NULL, IS NOT NULL

Resource: SQLBolt — Lessons 1–4 (free, interactive, no setup)

Also recommended: Mode SQL Tutorial (free, finance-friendly examples)

Practice on SQLBolt — complete Lessons 1 through 4 interactively.

Day 16 — SQL Fundamentals Part 2 (~75 min)

Learn:

  • GROUP BY and aggregate functions: COUNT(), SUM(), AVG(), MIN(), MAX()
  • HAVING (filtering after grouping)
  • Aliases with AS
  • CASE WHEN — conditional logic inside queries (like IF in Excel)
  • COALESCE() — replace NULLs with a default value

Resource: SQLBolt — Lessons 5–7

Practice:

-- Find each department's total and average salary, only showing depts over $200k total
SELECT department,
       COUNT(*) AS headcount,
       SUM(salary) AS total_salary,
       ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
HAVING SUM(salary) > 200000
ORDER BY total_salary DESC;

-- CASE WHEN: categorize transactions by size
SELECT vendor,
       amount,
       CASE
           WHEN amount >= 10000 THEN 'Large'
           WHEN amount >= 1000  THEN 'Medium'
           ELSE 'Small'
       END AS size_category
FROM transactions;

-- COALESCE: fill in missing department names
SELECT emp_id,
       name,
       COALESCE(department, 'Unassigned') AS department
FROM employees;

Day 17 — SQL Joins (~75 min)

Learn:

  • INNER JOIN — only matching rows
  • LEFT JOIN — all rows from the left table, matches from the right
  • Joining on multiple columns
  • Table aliases for readability
  • When to use each join type

Resource: SQLBolt — Lessons 10–12

Practice:

-- Join employees to departments, showing all employees even without a department
SELECT e.name,
       e.salary,
       COALESCE(d.dept_name, 'No Department') AS department
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
ORDER BY e.salary DESC;

Day 18 — Python + SQLite Together (~90 min)

Learn:

  • sqlite3 module (built into Python — no install needed)
  • Creating a database, creating tables, inserting data
  • Running SQL queries from Python
  • Loading SQL results directly into pandas: pd.read_sql()

Practice:

import sqlite3
import pandas as pd

# Create a file-backed SQLite database
conn = sqlite3.connect("finance.db")

# Create and populate a table
conn.execute("""
    CREATE TABLE IF NOT EXISTS transactions (
        id INTEGER PRIMARY KEY,
        date TEXT,
        vendor TEXT,
        amount REAL,
        category TEXT
    )
""")

transactions = [
    ("2026-01-05", "AWS", 2400.00, "Technology"),
    ("2026-01-06", "Staples", 185.50, "Supplies"),
    ("2026-01-07", "Delta Airlines", 1250.00, "Travel"),
    ("2026-01-10", "AWS", 2400.00, "Technology"),
    ("2026-01-12", "Uber", 45.00, "Travel"),
]

conn.executemany(
    "INSERT INTO transactions (date, vendor, amount, category) VALUES (?, ?, ?, ?)",
    transactions
)
conn.commit()

# Query into pandas — use the SQL you learned this week
df = pd.read_sql("""
    SELECT category,
           SUM(amount) AS total,
           COUNT(*) AS num_transactions,
           ROUND(AVG(amount), 2) AS avg_amount
    FROM transactions
    GROUP BY category
    ORDER BY total DESC
""", conn)
print(df)

conn.close()

Day 19 — Automating File Operations (~75 min)

Learn:

  • pathlib module for file/folder operations
  • Listing files in a directory, filtering by extension
  • Renaming, moving, and organizing files programmatically
  • glob for pattern matching (*.xlsx, *_2026_*.csv)

Resource: Automate the Boring Stuff — Chapter 10

Practice:

from pathlib import Path
import shutil

# Organize files by extension (simulate a messy Downloads folder)
source = Path("messy_folder")

if not source.exists():
    print(f"Folder '{source}' not found — create it and add some test files first.")
else:
    for file in source.iterdir():
        if file.is_file():
            ext = file.suffix.lstrip(".")
            ext_folder = source / (ext if ext else "no_extension")
            ext_folder.mkdir(exist_ok=True)
            shutil.move(str(file), str(ext_folder / file.name))
            print(f"Moved {file.name} -> {ext_folder.name}/")
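The glob pattern matching from the Learn list can be sketched like this (the folder and file names are illustrative):

```python
from pathlib import Path

# Set up a sample folder with a few empty files to match against
folder = Path("reports")
folder.mkdir(exist_ok=True)
for name in ["q1_2026_sales.csv", "q2_2026_sales.csv", "notes.txt"]:
    (folder / name).touch()

# glob matches by pattern — here, only CSVs with "2026" in the name
matches = sorted(folder.glob("*_2026_*.csv"))
for f in matches:
    print(f.name)
```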

Day 20 — Report Automation & Scheduling (~75 min)

Learn:

  • Generating timestamped reports programmatically
  • datetime module for timestamps and date math
  • Combining pandas + SQLite + file output into a single pipeline
  • Overview of scheduling: cron (macOS/Linux) or Task Scheduler (Windows)

Resource: Python docs — datetime

Practice:

from datetime import datetime
import pandas as pd
import sqlite3

# Generate a daily summary report from the database
today = datetime.now()
report_name = f"daily_report_{today.strftime('%Y%m%d')}.xlsx"

# Pull data from SQLite
conn = sqlite3.connect("finance.db")
summary = pd.read_sql("""
    SELECT category,
           SUM(amount) AS total,
           COUNT(*) AS count
    FROM transactions
    GROUP BY category
""", conn)
conn.close()

# Write the summary to a timestamped Excel file
summary.to_excel(report_name, index=False)
print(f"Report generated: {report_name} at {today.strftime('%H:%M')}")
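Scheduling itself happens outside Python. As a sketch of the cron approach mentioned above (macOS/Linux) — the paths are illustrative, and you should point cron at your venv's Python so the right packages are available:

```shell
# Open your user's crontab for editing
crontab -e

# Then add a line like this: run the report every weekday at 7:00 AM
# (minute hour day-of-month month day-of-week command)
0 7 * * 1-5 /home/you/python-finance/venv/bin/python /home/you/python-finance/daily_report.py
```

On Windows, Task Scheduler's "Create Basic Task" wizard achieves the same thing with a GUI.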

Day 21 — Week 3 Mini Project: Automated Monthly Report (~90 min)

Build a script that:

  1. Reads transaction data from a CSV file (create sample data — don't depend on an API)
  2. Loads the data into a SQLite database
  3. Runs SQL queries to summarize by category and month
  4. Generates a formatted Excel report with summary statistics
  5. Timestamps the file name so it can run monthly

Keep it simple — the goal is connecting the pieces (file → database → query → report), not building something production-grade.
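For step 2, pandas can write a DataFrame straight into SQLite without manual INSERT statements. The to_sql method wasn't covered above, so here's a minimal sketch (table and data are illustrative):

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({
    "date": ["2026-01-05", "2026-01-06"],
    "category": ["Technology", "Travel"],
    "amount": [2400.00, 1250.00],
})

conn = sqlite3.connect(":memory:")  # use "finance.db" for a real file

# to_sql creates the table and inserts every row in one call
df.to_sql("transactions", conn, if_exists="replace", index=False)

summary = pd.read_sql(
    "SELECT category, SUM(amount) AS total FROM transactions GROUP BY category",
    conn,
)
print(summary)
conn.close()
```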

Push to GitHub. This is portfolio piece #2.

Week 3 Milestone

You should be able to:

  • Write SQL queries to filter, aggregate, and join data
  • Use CASE WHEN and COALESCE for conditional logic and NULL handling
  • Connect Python and SQLite together with pd.read_sql()
  • Automate file organization and repetitive tasks
  • Build a data pipeline (file → database → query → report)

Week 4: Projects, Portfolio & Job Prep

Goal: Build resume-worthy projects, polish your GitHub portfolio, and learn how to talk about your skills in interviews.

Day 22 — Data Visualization Basics (~75 min)

Learn:

  • matplotlib basics: line charts, bar charts
  • Install (if local): pip install matplotlib
  • Saving figures to files: .savefig()
  • pandas built-in plotting: df.plot()
  • Making charts look professional (titles, labels, formatting)

Resource: Kaggle — Data Visualization Course (Lessons 1–3) (free) — note that Kaggle teaches seaborn as well, but focus on matplotlib and df.plot() basics first.

Practice:

import pandas as pd
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [82, 91, 78, 95, 88, 102]
expenses = [70, 75, 72, 80, 77, 85]

df = pd.DataFrame({"Revenue": revenue, "Expenses": expenses}, index=months)

ax = df.plot(kind="bar", figsize=(10, 6), color=["#2563eb", "#dc2626"])
ax.set_title("Monthly Revenue vs Expenses ($K)", fontsize=14, fontweight="bold")
ax.set_ylabel("Amount ($K)")
plt.tight_layout()
plt.savefig("revenue_vs_expenses.png", dpi=150)
print("Chart saved.")

Days 23–25 — Capstone Project: Bank Reconciliation & Exception Report (~4–5 hours total)

This is your flagship portfolio project. It mirrors a real finance-ops workflow: matching two data sources, identifying discrepancies, and producing a clear exception report.

Build a tool that:

  1. Reads two files — a bank statement CSV and an internal ledger CSV (create realistic sample data with ~50–100 rows each)
  2. Matches transactions — join on reference number (or date + amount), flag matched/unmatched rows using indicator=True
  3. Identifies exceptions:
    • Transactions in the bank but not in the ledger (and vice versa)
    • Matched transactions where the amounts differ
  4. Computes aging — how many days old is each unresolved exception?
  5. Summarizes — total matched, total unmatched, dollar value of exceptions, aging breakdown
  6. Outputs a polished Excel workbook:
    • "Matched" sheet — all reconciled transactions
    • "Exceptions" sheet — unmatched and amount mismatches
    • "Summary" sheet — control totals and aging breakdown

Tips:

  • Start with the data and merge logic first. Get that working before adding the Excel output.
  • Use pd.merge(..., indicator=True) — you practiced this on Day 13.
  • Test with small data (5 rows each) before scaling up.
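For exception type 2 (matched transactions where the amounts differ), the pattern builds directly on the Day 13 merge — a sketch with illustrative data:

```python
import pandas as pd

bank = pd.DataFrame({"ref": ["TXN-001", "TXN-002"],
                     "bank_amount": [1500.00, 2300.00]})
ledger = pd.DataFrame({"ref": ["TXN-001", "TXN-002"],
                       "ledger_amount": [1500.00, 2350.00]})

recon = pd.merge(bank, ledger, on="ref", how="outer", indicator=True)

# Matched on reference number but amounts disagree — the subtlest exception
mismatches = recon[
    (recon["_merge"] == "both")
    & (recon["bank_amount"] != recon["ledger_amount"])
]
print(mismatches)  # TXN-002: 2300.00 vs 2350.00
```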

Days 26–27 — Second Project: Expense Report Automator (~3–4 hours total)

Build a tool that:

  1. Reads expense data from multiple CSV files in a folder (one file per employee or per month)
  2. Concatenates them into one DataFrame with pd.concat()
  3. Categorizes expenses using rules (keyword matching on descriptions)
  4. Validates — flags duplicates, missing fields, amounts over a threshold
  5. Summarizes — totals by category, by employee, by month
  6. Outputs a polished Excel workbook with multiple sheets and a chart
  7. Logs any issues to a separate "exceptions" sheet
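Step 3 (rule-based categorization) can be sketched with str.contains — the keywords and categories below are illustrative, and "first matching rule wins":

```python
import pandas as pd

expenses = pd.DataFrame({
    "description": ["Delta flight to NYC", "AWS monthly bill",
                    "Team lunch at Chipotle", "Hilton hotel stay"],
    "amount": [450.00, 2400.00, 62.50, 310.00],
})

# Keyword pattern -> category; patterns use | for "any of these words"
rules = {
    "flight|hotel|uber": "Travel",
    "aws|software|license": "Technology",
    "lunch|dinner|coffee": "Meals",
}

expenses["category"] = "Uncategorized"
for pattern, category in rules.items():
    # Only label rows that no earlier rule has already categorized
    mask = (expenses["description"].str.contains(pattern, case=False)
            & (expenses["category"] == "Uncategorized"))
    expenses.loc[mask, "category"] = category

print(expenses)
```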

Day 28 — GitHub Portfolio & Interview Prep (~90 min)

No new coding today. Focus on packaging your work and preparing to talk about it.

GitHub portfolio setup:

  1. Create a GitHub profile with a professional bio
  2. Pin your 3 best repositories (data cleaner, monthly report, bank reconciliation)
  3. Each repo should have:
    • A clear README explaining what it does, how to run it, and a screenshot of output
    • Clean, commented code
    • A requirements.txt file (activate your venv first, then pip freeze > requirements.txt)
    • Sample input data so someone can actually run your code

Resume bullet points you can now honestly write:

  • Built automated financial reporting and reconciliation tools in Python using pandas, reducing manual Excel processing time
  • Developed data pipelines that read source files, clean and transform data, and output formatted Excel reports with exception tracking
  • Created bank reconciliation tool matching transactions across data sources with aging analysis and exception reporting
  • Proficient in SQL for data extraction, aggregation, joins, and conditional logic
  • Experience with Python libraries: pandas, matplotlib, sqlite3, openpyxl

Common interview questions for these roles and how to answer them:

| Question | How to Answer |
| --- | --- |
| "How do you use Python in your workflow?" | Describe your capstone projects. Emphasize replacing manual Excel work with automated, repeatable scripts. |
| "What's pandas?" | "It's a Python library for working with tabular data — think of it as Excel in code but faster, more reproducible, and it handles large datasets that would slow Excel down." |
| "Can you write a SQL query?" | Walk through a GROUP BY with JOIN. Practice writing queries on paper. |
| "Tell me about a project you built." | Use the STAR method: Situation (manual process), Task (automate it), Action (what you built), Result (time saved, errors reduced). |
| "How comfortable are you with large datasets?" | "pandas handles datasets much larger than Excel can manage comfortably. I've worked with CSVs and databases, and I know how to filter and aggregate efficiently so I'm not loading everything into memory at once." |

Practice resources for interview SQL: DataLemur and HackerRank — see the Resource Summary table below for links.


Daily Schedule Template

| Time Block | Activity |
| --- | --- |
| 0:00–0:15 | Review yesterday's code. Re-type one exercise from memory. |
| 0:15–0:45 | Learn today's new concept (read/watch resource). |
| 0:45–1:15 | Hands-on practice — type the code yourself, modify it, break it, fix it. |
| 1:15–1:30 | Write a 2-sentence summary of what you learned (in a learning_log.md file). |

Resource Summary

Most resources below are free. Where noted, some content is behind a paywall or requires an account.

| Resource | What It Covers | Used In | Cost | Link |
| --- | --- | --- | --- | --- |
| Automate the Boring Stuff (2nd ed.) | Python fundamentals, files, automation | Weeks 1, 3 | Free | automatetheboringstuff.com |
| Python Official Tutorial | Authoritative Python reference | All weeks | Free | docs.python.org/3/tutorial |
| Real Python | In-depth articles on any Python topic | Weeks 2, 3 | Freemium (some articles free, full access paid) | realpython.com |
| Google Colab | Free browser-based Python environment | Weeks 1–2 | Free | colab.research.google.com |
| Kaggle — pandas Course | pandas fundamentals, hands-on | Week 2 | Free (account required) | kaggle.com/learn/pandas |
| Kaggle — Data Cleaning | Handling messy real-world data | Week 2 | Free (account required) | kaggle.com/learn/data-cleaning |
| Kaggle — Data Visualization | matplotlib and seaborn | Week 4 | Free (account required) | kaggle.com/learn/data-visualization |
| SQLBolt | Interactive SQL fundamentals | Week 3 | Free | sqlbolt.com |
| Mode SQL Tutorial | SQL with analytics focus | Week 3 | Free | mode.com/sql-tutorial |
| Corey Schafer YouTube | Python and pandas video tutorials | Weeks 1–3 | Free | youtube.com/c/Coreyms |
| DataLemur | SQL interview practice | Week 4 | Freemium (free tier available) | datalemur.com |
| HackerRank — SQL | SQL practice problems | Week 4 | Free (account required) | hackerrank.com/domains/sql |
| GitHub Skills | Git and GitHub basics | Week 1 | Free | github.com/skills |

What You'll Have After 4 Weeks

Skills:

  • Python scripting and automation
  • pandas for data analysis (filtering, grouping, merging, cleaning, reconciliation, exporting)
  • Dates and aging analysis for finance reporting
  • SQL for querying databases (including joins, CASE WHEN, COALESCE)
  • File automation and report generation
  • Basic data visualization
  • Git and GitHub for version control

Portfolio (3 pinned GitHub projects):

  1. Financial Data Cleaner — reads messy files, standardizes and cleans data, exports polished output
  2. Automated Monthly Report — reads source data, stores in SQLite, generates timestamped Excel reports
  3. Bank Reconciliation Tool — matches transactions across two data sources, identifies exceptions, produces aging analysis

Resume additions:

  • Python (pandas, matplotlib, sqlite3, openpyxl)
  • SQL (SQLite, joins, aggregations, conditional logic)
  • Data automation, reconciliation, and pipeline development
  • GitHub portfolio with working code samples

Extension Topics (After the 4 Weeks)

In priority order:

  1. Keep building. Automate something real at your current job (even if they didn't ask for it). Nothing is more compelling in an interview than "I built this to solve an actual problem."
  2. Practice SQL on DataLemur — 15 minutes a day. Most finance Python roles will test SQL.
  3. Working with APIs — learn the requests library to pull data from web APIs. A good free practice API (exchange rates):
    import requests
    
    response = requests.get("https://open.er-api.com/v6/latest/USD")
    response.raise_for_status()  # Raise an error if the request failed
    data = response.json()
    
    import pandas as pd
    rates = pd.Series(data["rates"], name="rate_vs_usd")
    print(rates[["EUR", "GBP", "JPY", "CAD", "CHF"]])
    
  4. Learn Jupyter Notebooks — many teams use them for analysis. Install with pip install notebook and run with jupyter notebook.
  5. Explore more pandas — time series operations (.resample(), .rolling()), pd.date_range(), and working with financial date indices.
  6. Basic web scraping — BeautifulSoup for pulling data from websites (useful, but don't lead with this in interviews).

You don't need to know everything. You need to know enough to be useful on day one — and you will be.