Who this is for: A finance professional with zero coding experience who wants to land analyst, operations, or data-oriented roles that list Python as a requirement.
Time commitment: 1–1.5 hours per day, 6 days a week, with one catch-up day built in. You can shift days around to fit your schedule — consistency matters more than sequence.
What this is NOT: This is not a software engineering curriculum. No algorithms, no LeetCode, no computer science theory. Everything here is chosen because it shows up in finance job descriptions or makes you dangerous with data.
Before You Start
Setup (Day 0 — 45–75 minutes)
- Install Python 3.12+ via python.org
- Install VS Code — code.visualstudio.com — then add the Python extension
- Or skip local setup entirely — use Google Colab to write and run Python in your browser for free. No installs, no configuration. Great for Weeks 1–2 while you're learning. You can switch to a local setup later when you need to work with local files and databases.
- Set up a virtual environment (local installs only — skip this if using Colab):
A virtual environment keeps your project's packages separate from your system Python. This prevents version conflicts and makes pip freeze > requirements.txt actually useful later.

# In your terminal, create a project folder and virtual environment
mkdir python-finance
cd python-finance
python -m venv venv

# Activate it (do this every time you open a new terminal)
# macOS/Linux:
source venv/bin/activate
# Windows:
venv\Scripts\activate

# You'll see (venv) in your prompt — that means it's active.
# Always activate before running pip install.

- Create a GitHub account — github.com — this becomes your portfolio
- Install Git — git-scm.com — you'll push projects here starting Week 2
- Bookmark these — you'll reference them constantly:
- Python official tutorial — the authoritative reference
- Automate the Boring Stuff with Python — free online, your primary Week 1 textbook
- Real Python — high-quality articles and tutorials (some articles are free, some require a subscription)
- pandas documentation
- Kaggle Learn — free hands-on courses for pandas, data cleaning, and visualization
Mindset
You don't need to memorize syntax. You need to build muscle memory for looking things up and solving problems. Every professional developer Googles things daily. Your goal is fluency, not memorization.
Week 1: Python Fundamentals
Goal: Be comfortable writing simple scripts that process data. Understand variables, loops, conditionals, functions, basic data structures, and how to read an error message.
Resources This Week:
- Automate the Boring Stuff with Python — your primary textbook, free online
- Python official tutorial — reference when you want deeper explanation
- Google Colab — write and run code in your browser, no setup needed
- Corey Schafer YouTube — excellent video walkthroughs of every Python fundamental (search his channel for the day's topic)
Day 1 — Variables, Types & Basic Operations (~60 min)
Learn:
- Variables and assignment (name = "Sarah", revenue = 150000.50)
- Data types: str, int, float, bool
- Basic math operations (+, -, *, /, //, %, **)
- print() and f-strings (f"Revenue is ${revenue:,.2f}")
Resource: Automate the Boring Stuff — Chapter 1
Practice:
# Calculate compound interest
principal = 10000
rate = 0.07
years = 5
future_value = principal * (1 + rate) ** years
print(f"${principal:,.2f} at {rate:.0%} for {years} years = ${future_value:,.2f}")
Day 2 — Strings & User Input (~60 min)
Learn:
- String methods: .upper(), .lower(), .strip(), .replace(), .split()
- String slicing: ticker[:3]
- input() for user interaction
- Type conversion: int(), float(), str()
Resource: Automate the Boring Stuff — Chapter 6
Practice:
# Clean up messy ticker symbols
raw_input = " aapl, MSFT , goog, AMZN "
tickers = [t.strip().upper() for t in raw_input.split(",")]
print(tickers) # ['AAPL', 'MSFT', 'GOOG', 'AMZN']
# Interactive currency converter
usd = float(input("Enter amount in USD: "))
rate = 0.92 # EUR/USD rate
print(f"${usd:,.2f} = €{usd * rate:,.2f}")
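Slicing is easiest to feel with a structured identifier. A quick sketch (the trade-ID format here is invented for illustration):

```python
# String slicing: pull pieces out of a structured identifier
trade_id = "2026-AAPL-BUY-0042"
year = trade_id[:4]       # first four characters
ticker = trade_id[5:9]    # characters 5 through 8
sequence = trade_id[-4:]  # last four characters
print(year, ticker, sequence)  # 2026 AAPL 0042
```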
Day 3 — Conditionals & Comparisons (~60 min)
Learn:
- if, elif, else
- Comparison operators: ==, !=, >, <, >=, <=
- Logical operators: and, or, not
- Truthiness (empty strings and lists count as false in a condition)
Resource: Automate the Boring Stuff — Chapter 2
Practice:
# Flag transactions for review
amount = 15000
category = "wire_transfer"
if amount > 10000 and category == "wire_transfer":
print("FLAG: Large wire transfer — requires compliance review")
elif amount > 10000:
print("FLAG: Large transaction — manager approval needed")
else:
print("APPROVED: Within normal limits")
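Truthiness is worth a quick rep of its own: an empty string or list behaves as false inside an if. A minimal sketch:

```python
# Truthiness: empty strings and lists evaluate as False in a condition
pending_reviews = []
if pending_reviews:
    print(f"{len(pending_reviews)} transactions awaiting review")
else:
    print("Review queue is empty")

memo = "   "
if memo.strip():  # whitespace-only string strips down to "" (falsy)
    print("Memo attached")
else:
    print("No memo on this transaction")
```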
Day 4 — Lists & Loops (~75 min)
Learn:
- Lists: creating, indexing, slicing, .append(), .remove(), .sort()
- for loops (iterating over lists)
- while loops
- range() and enumerate()
- List comprehensions (the Pythonic shortcut)
Resource: Automate the Boring Stuff — Chapter 4 (covers lists, loops, and range())
Practice:
# Identify which months beat target
monthly_revenue = [82000, 91000, 78000, 95000, 88000, 102000,
97000, 85000, 110000, 93000, 99000, 115000]
target = 90000
above_target = [i+1 for i, rev in enumerate(monthly_revenue) if rev > target]
print(f"Months above ${target:,} target: {above_target}")
print(f"Hit rate: {len(above_target)}/{len(monthly_revenue)}")
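The while loop from today's list deserves its own rep. A small sketch, assuming a flat 7% annual growth rate:

```python
# while loop: how many years until a balance doubles at 7% annual growth?
balance = 10_000.0
target = 20_000.0
years = 0
while balance < target:
    balance *= 1.07  # apply one year of growth
    years += 1
print(f"Doubled in {years} years (ending balance: ${balance:,.2f})")
```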
Day 5 — Dictionaries & Sets (~75 min)
Learn:
- Dictionaries: key-value pairs, .keys(), .values(), .items(), .get()
- Nested dictionaries (common in financial data/JSON)
- Sets: unique values, intersections, unions
- When to use each: list vs dict vs set
Resource: Automate the Boring Stuff — Chapter 5
Practice:
# Portfolio holdings
portfolio = {
"AAPL": {"shares": 50, "avg_cost": 142.50},
"MSFT": {"shares": 30, "avg_cost": 285.00},
"GOOG": {"shares": 10, "avg_cost": 2750.00},
}
current_prices = {"AAPL": 178.50, "MSFT": 310.00, "GOOG": 2900.00}
for ticker, holding in portfolio.items():
cost_basis = holding["shares"] * holding["avg_cost"]
market_value = holding["shares"] * current_prices[ticker]
pnl = market_value - cost_basis
print(f"{ticker}: P&L = ${pnl:,.2f}")
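Sets get less attention in the practice above, so here is a quick sketch comparing two made-up watchlists:

```python
# Sets: compare holdings across two watchlists
my_watchlist = {"AAPL", "MSFT", "NVDA", "TSLA"}
fund_top_holdings = {"MSFT", "NVDA", "JPM", "XOM"}

overlap = my_watchlist & fund_top_holdings    # intersection: in both
only_mine = my_watchlist - fund_top_holdings  # difference: mine only
combined = my_watchlist | fund_top_holdings   # union: all unique tickers
print(sorted(overlap))    # ['MSFT', 'NVDA']
print(sorted(only_mine))  # ['AAPL', 'TSLA']
print(len(combined))      # 6
```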
Day 6 — Functions & Debugging Basics (~75 min)
Learn:
- Defining functions with def
- Parameters, arguments, return values
- Default arguments
- Docstrings (brief descriptions of what a function does)
- Why functions matter: reusability, readability, testing
- Reading tracebacks: when Python throws an error, read it bottom to top. The last line is the error type and message; the lines above show the call stack leading to it. Get comfortable with common errors:
  - NameError — you used a variable that doesn't exist (typo?)
  - TypeError — you passed the wrong type (string where a number was expected?)
  - IndexError — you went past the end of a list
  - KeyError — you asked for a dictionary key that doesn't exist
Resource: Automate the Boring Stuff — Chapter 3
Practice:
def calculate_loan_payment(principal, annual_rate, years):
"""Calculate monthly payment for a fixed-rate loan."""
monthly_rate = annual_rate / 12
num_payments = years * 12
payment = principal * (monthly_rate * (1 + monthly_rate)**num_payments) / \
((1 + monthly_rate)**num_payments - 1)
return round(payment, 2)
# Test it
print(calculate_loan_payment(300000, 0.065, 30)) # ~$1,896.20
print(calculate_loan_payment(300000, 0.055, 15)) # ~$2,451.25
# Debugging exercise: intentionally break the function call below
# and practice reading the traceback Python gives you.
# Try: calculate_loan_payment("300000", 0.065, 30) # TypeError — string, not number
# Try: calculate_loan_payment(300000, 0, 30) # ZeroDivisionError — 0% rate
Day 7 — File I/O, Error Handling & Git Basics (~90 min)
Learn:
- Reading/writing text files (open(), the with statement)
- Reading/writing CSV files with the csv module
- try/except for error handling
- Common exceptions: FileNotFoundError, ValueError, KeyError
- Git basics (just enough to push code to GitHub):
  - git init — turn a folder into a Git repository
  - git add . — stage your files
  - git commit -m "message" — save a snapshot
  - git remote add origin <url> — connect to GitHub
  - git push -u origin main — upload your code
Resources:
- Automate the Boring Stuff — Chapter 9 (files and paths)
- Automate the Boring Stuff — Chapter 11 (debugging and error handling)
- GitHub Skills — Introduction to GitHub (free, interactive Git walkthrough)
Practice:
import csv
# Write sample transaction data
transactions = [
["date", "description", "amount", "category"],
["2026-01-15", "Office Supplies", 234.50, "Operations"],
["2026-01-16", "Client Dinner", 187.25, "Entertainment"],
["2026-01-17", "Software License", 1200.00, "Technology"],
]
with open("transactions.csv", "w", newline="") as f:
writer = csv.writer(f)
writer.writerows(transactions)
# Read it back with error handling
try:
with open("transactions.csv", "r") as f:
reader = csv.DictReader(f)
for row in reader:
print(f"{row['date']}: {row['description']} — ${float(row['amount']):,.2f}")
except FileNotFoundError:
print("Error: transactions.csv not found. Check the file path.")
except ValueError as e:
print(f"Error parsing data: {e}")
Then push your work to GitHub:
cd python-finance
git init
git add .
git commit -m "Week 1: Python fundamentals exercises"
# Create a new repo on github.com, then:
git remote add origin https://github.com/YOUR_USERNAME/python-finance.git
git push -u origin main
Week 1 Milestone
You should be able to:
- Write a script from scratch that reads a CSV, processes the data, and outputs results
- Use functions to organize your code
- Handle basic errors with try/except
- Read a Python traceback and identify what went wrong
- Push your code to GitHub using basic Git commands
- Feel comfortable reading Python code even if you couldn't write it from memory yet
Week 2: pandas — Your New Excel
Goal: Replace 80% of what you do in Excel with pandas. This is the single most valuable skill for finance Python roles.
Resources This Week:
- Kaggle — pandas Course — free, interactive, browser-based exercises (your primary resource)
- Kaggle — Data Cleaning Course — free, for Day 11
- Real Python — search for any pandas topic for in-depth articles (some free, some paid)
- Corey Schafer — pandas tutorials — video companion for every pandas concept
- Google Colab — practice pandas without local installs (pandas comes pre-installed)
Note: Many of this week's code snippets build on each other. Work in one continuous notebook (or Colab file) so variables like df carry forward between exercises.
Day 8 — pandas Fundamentals (~75 min)
Learn:
- Install pandas (if working locally, activate your venv first): pip install pandas openpyxl
- DataFrame and Series — the two core objects
- Reading data: pd.read_csv(), pd.read_excel()
- Inspecting data: .head(), .tail(), .shape, .dtypes, .describe(), .info()
Resource: Kaggle — pandas Course (Lessons 1–2) (free, interactive, browser-based)
Practice:
import pandas as pd
# Create a DataFrame from scratch (simulating a ledger)
data = {
"date": ["2026-01-05", "2026-01-12", "2026-01-19", "2026-01-26", "2026-02-02"],
"client": ["Acme Corp", "Globex", "Acme Corp", "Initech", "Globex"],
"amount": [15000, 8500, 22000, 12000, 9500],
"status": ["paid", "pending", "paid", "overdue", "paid"],
}
df = pd.DataFrame(data)
df["date"] = pd.to_datetime(df["date"])
print(df.describe())
print(df.dtypes)
Day 9 — Selecting, Filtering & Sorting (~75 min)
Learn:
- Column selection: df["col"], df[["col1", "col2"]]
- Row filtering: df[df["amount"] > 10000]
- Multiple conditions: & (and), | (or), ~ (not) — always wrap conditions in parentheses
- Sorting: .sort_values(), .sort_index()
- .loc[] and .iloc[] for precise indexing
Resource: Kaggle — pandas Course (Lesson 3)
Practice:
# Filter for large paid invoices
large_paid = df[(df["amount"] > 10000) & (df["status"] == "paid")]
print(large_paid)
# Sort by amount descending
print(df.sort_values("amount", ascending=False))
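.loc[] and .iloc[] are easy to mix up, so here is a standalone sketch (it builds its own small DataFrame so it runs without the week's shared notebook state):

```python
import pandas as pd

clients = pd.DataFrame({
    "client": ["Acme Corp", "Globex", "Initech"],
    "amount": [15000, 8500, 12000],
})

# .iloc — position-based: row 0, column 1
print(clients.iloc[0, 1])  # 15000

# .loc — label/condition-based: Globex's amount
globex_amount = clients.loc[clients["client"] == "Globex", "amount"].iloc[0]
print(globex_amount)  # 8500
```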
Day 10 — GroupBy & Aggregation (~75 min)
Learn:
- .groupby() — the pandas equivalent of pivot tables
- Aggregation functions: .sum(), .mean(), .count(), .min(), .max()
- .agg() for multiple aggregations at once
- .pivot_table() — literally pivot tables in code
Resource: Kaggle — pandas Course (Lesson 4)
Practice:
# Revenue by client
print(df.groupby("client")["amount"].agg(["sum", "mean", "count"]))
# Pivot: clients vs status
pivot = df.pivot_table(values="amount", index="client", columns="status",
aggfunc="sum", fill_value=0)
print(pivot)
Day 11 — Data Cleaning (~90 min)
Learn:
- Handling missing data: .isna(), .fillna(), .dropna()
- Renaming columns: .rename(), .columns
- Changing data types: .astype()
- String operations: df["col"].str.upper(), .str.contains(), .str.strip()
- Removing duplicates: .drop_duplicates()
Resource: Kaggle — Data Cleaning Course (Lessons 1–3) (free)
Practice:
# Simulate messy data
messy = pd.DataFrame({
"Name": [" john smith", "JANE DOE", "john smith", "Bob Jones "],
"Department": ["Finance", None, "Finance", "Operations"],
"Salary": ["85000", "92000", "85000", None],
})
# Clean it
cleaned = messy.copy()
cleaned["Name"] = cleaned["Name"].str.strip().str.title()
cleaned = cleaned.drop_duplicates(subset="Name")
cleaned["Salary"] = pd.to_numeric(cleaned["Salary"], errors="coerce")
cleaned["Department"] = cleaned["Department"].fillna("Unknown")
print(cleaned)
Day 12 — Dates & Time Series for Finance (~75 min)
Learn:
- pd.to_datetime() for converting date columns
- .dt accessors: .dt.year, .dt.month, .dt.day, .dt.day_name()
- Month-end dates: pd.offsets.MonthEnd(), .dt.to_period("M")
- Grouping by month/quarter: .groupby(df["date"].dt.to_period("M"))
- Aging buckets (how many days since an event)
Resource: pandas docs — Time series / date functionality
Practice:
import pandas as pd
from datetime import datetime
# Invoice aging report
invoices = pd.DataFrame({
"invoice_id": [101, 102, 103, 104, 105, 106],
"client": ["Acme", "Globex", "Initech", "Acme", "Globex", "Initech"],
"invoice_date": pd.to_datetime([
"2025-12-01", "2025-12-15", "2026-01-05",
"2026-01-20", "2026-02-10", "2026-02-28"
]),
"amount": [5000, 12000, 3200, 8500, 15000, 4100],
"status": ["paid", "overdue", "overdue", "paid", "pending", "overdue"],
})
# Calculate days outstanding
today = pd.Timestamp("2026-03-06")
invoices["days_outstanding"] = (today - invoices["invoice_date"]).dt.days
# Create aging buckets (standard AR aging)
bins = [0, 30, 60, 90, float("inf")]
labels = ["Current", "31-60 Days", "61-90 Days", "90+ Days"]
invoices["aging_bucket"] = pd.cut(invoices["days_outstanding"], bins=bins, labels=labels)
# Summary by aging bucket
print(invoices.groupby("aging_bucket", observed=True)["amount"].agg(["sum", "count"]))
print()
# Monthly revenue summary using .dt accessor
invoices["month"] = invoices["invoice_date"].dt.to_period("M")
print(invoices.groupby("month")["amount"].sum())
Day 13 — Merging, Joining & Reconciliation (~90 min)
Learn:
- pd.merge() — SQL-style joins (inner, left, right, outer)
- pd.concat() — stacking DataFrames vertically
- Merge on single or multiple keys
- Validating merges: indicator=True to flag matched/unmatched rows
- Reconciliation basics: matching two data sources and reporting exceptions
Resource: pandas docs — Merge, join, concatenate
Practice:
# --- Basic merge (VLOOKUP equivalent) ---
employees = pd.DataFrame({
"emp_id": [1, 2, 3, 4],
"name": ["Alice", "Bob", "Carol", "Dave"],
"dept_id": [10, 20, 10, 30],
})
departments = pd.DataFrame({
"dept_id": [10, 20, 30],
"dept_name": ["Finance", "Operations", "Compliance"],
"budget": [500000, 350000, 275000],
})
merged = pd.merge(employees, departments, on="dept_id", how="left")
print(merged)
# --- Reconciliation: match bank statement to internal ledger ---
bank_statement = pd.DataFrame({
"ref": ["TXN-001", "TXN-002", "TXN-003", "TXN-005"],
"bank_amount": [1500.00, 2300.00, 875.50, 4200.00],
"bank_date": pd.to_datetime(["2026-01-05", "2026-01-06", "2026-01-07", "2026-01-09"]),
})
internal_ledger = pd.DataFrame({
"ref": ["TXN-001", "TXN-002", "TXN-003", "TXN-004"],
"ledger_amount": [1500.00, 2300.00, 875.50, 3100.00],
"ledger_date": pd.to_datetime(["2026-01-05", "2026-01-06", "2026-01-07", "2026-01-08"]),
})
# Outer merge with indicator to find matches and exceptions
recon = pd.merge(bank_statement, internal_ledger, on="ref", how="outer", indicator=True)
recon.columns = ["ref", "bank_amount", "bank_date", "ledger_amount", "ledger_date", "match_status"]
print("\n--- Full Reconciliation ---")
print(recon)
# Exception report
matched = recon[recon["match_status"] == "both"]
bank_only = recon[recon["match_status"] == "left_only"] # in bank, not in ledger
ledger_only = recon[recon["match_status"] == "right_only"] # in ledger, not in bank
print(f"\nMatched: {len(matched)} | Bank only: {len(bank_only)} | Ledger only: {len(ledger_only)}")
Day 14 — Excel Integration & Mini Project (~90 min)
Learn:
- Reading Excel files: pd.read_excel() with sheet_name, skiprows, usecols
- Writing to Excel: .to_excel() with multiple sheets using ExcelWriter
- Reading from multiple sheets in one workbook
- Exporting to CSV: .to_csv()
Resource: Real Python — A Guide to Excel Spreadsheets in Python with openpyxl
Mini project — Financial Data Cleaner (~60 min):
Build a script that:
- Reads a messy CSV or Excel file (create sample data with intentional issues)
- Cleans column names (lowercase, remove spaces)
- Handles missing values and removes duplicates
- Standardizes date formats
- Prints before/after row counts as a sanity check
- Outputs a clean Excel file with a summary sheet
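One way to sketch the column-name cleanup and date-standardization steps (the messy column names below are invented for illustration):

```python
import pandas as pd

# Hypothetical messy input for illustration
raw = pd.DataFrame({
    " Invoice Date ": ["2026-01-05", "2026-01-06"],
    "Amount USD": [1500.0, 2300.0],
})

# Standardize column names: strip whitespace, lowercase, underscores for spaces
raw.columns = raw.columns.str.strip().str.lower().str.replace(" ", "_")
print(raw.columns.tolist())  # ['invoice_date', 'amount_usd']

# Standardize the date column in one pass
raw["invoice_date"] = pd.to_datetime(raw["invoice_date"])
```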
Push this to GitHub. This is your first portfolio piece.
# Starter template for the mini project
with pd.ExcelWriter("quarterly_report.xlsx") as writer:
df.to_excel(writer, sheet_name="All Transactions", index=False)
large_paid.to_excel(writer, sheet_name="Large Paid", index=False)
pivot.to_excel(writer, sheet_name="Pivot Summary")
print("Report saved to quarterly_report.xlsx")
Week 2 Milestone
You should be able to:
- Open any CSV/Excel file in pandas and explore it
- Filter, sort, group, and aggregate data
- Work with dates: extract month/quarter, calculate aging, group by period
- Merge multiple datasets and validate the merge results
- Match two data sources and identify exceptions (reconciliation)
- Clean messy data and export polished results
- Explain what pandas does to a non-technical interviewer
Week 3: SQL, Databases & Automation
Goal: Learn to query databases with SQL, connect Python to SQLite, and automate one repetitive file-based task. These skills separate "I know pandas" from "I can build things."
Resources This Week:
- SQLBolt — free interactive SQL exercises, no setup required (primary resource for Days 15–17)
- Mode Analytics SQL Tutorial — free, finance-friendly SQL examples
- Automate the Boring Stuff — Chapter 10 — file automation
- Corey Schafer YouTube — search "SQLite" on his channel for a clear video walkthrough
- Real Python — Python SQLite — in-depth reference (free portion)
Day 15 — SQL Fundamentals Part 1 (~75 min)
Learn:
- What SQL is and why it matters (most finance data lives in databases)
- SELECT, FROM, WHERE
- ORDER BY, LIMIT
- Comparison operators, LIKE, IN, BETWEEN
- NULL handling: IS NULL, IS NOT NULL
Resource: SQLBolt — Lessons 1–4 (free, interactive, no setup)
Also recommended: Mode SQL Tutorial (free, finance-friendly examples)
Practice on SQLBolt — complete Lessons 1 through 4 interactively.
Day 16 — SQL Fundamentals Part 2 (~75 min)
Learn:
- GROUP BY and aggregate functions: COUNT(), SUM(), AVG(), MIN(), MAX()
- HAVING (filtering after grouping)
- Aliases with AS
- CASE WHEN — conditional logic inside queries (like IF in Excel)
- COALESCE() — replace NULLs with a default value
Resource: SQLBolt — Lessons 9–12 (queries with expressions and aggregates)
Practice:
-- Find each department's total and average salary, only showing depts over $200k total
SELECT department,
COUNT(*) AS headcount,
SUM(salary) AS total_salary,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
HAVING SUM(salary) > 200000
ORDER BY total_salary DESC;
-- CASE WHEN: categorize transactions by size
SELECT vendor,
amount,
CASE
WHEN amount >= 10000 THEN 'Large'
WHEN amount >= 1000 THEN 'Medium'
ELSE 'Small'
END AS size_category
FROM transactions;
-- COALESCE: fill in missing department names
SELECT emp_id,
name,
COALESCE(department, 'Unassigned') AS department
FROM employees;
Day 17 — SQL Joins (~75 min)
Learn:
- INNER JOIN — only matching rows
- LEFT JOIN — all rows from the left table, matches from the right
- Joining on multiple columns
- Table aliases for readability
- When to use each join type
Resource: SQLBolt — Lessons 6–8 (JOINs and NULLs)
Practice:
-- Join employees to departments, showing all employees even without a department
SELECT e.name,
e.salary,
COALESCE(d.dept_name, 'No Department') AS department
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id
ORDER BY e.salary DESC;
Day 18 — Python + SQLite Together (~90 min)
Learn:
- sqlite3 module (built into Python — no install needed)
- Creating a database, creating tables, inserting data
- Running SQL queries from Python
- Loading SQL results directly into pandas: pd.read_sql()
Practice:
import sqlite3
import pandas as pd
# Create a file-backed SQLite database
conn = sqlite3.connect("finance.db")
# Create and populate a table
conn.execute("""
CREATE TABLE IF NOT EXISTS transactions (
id INTEGER PRIMARY KEY,
date TEXT,
vendor TEXT,
amount REAL,
category TEXT
)
""")
transactions = [
("2026-01-05", "AWS", 2400.00, "Technology"),
("2026-01-06", "Staples", 185.50, "Supplies"),
("2026-01-07", "Delta Airlines", 1250.00, "Travel"),
("2026-01-10", "AWS", 2400.00, "Technology"),
("2026-01-12", "Uber", 45.00, "Travel"),
]
conn.executemany(
"INSERT INTO transactions (date, vendor, amount, category) VALUES (?, ?, ?, ?)",
transactions
)
conn.commit()
# Query into pandas — use the SQL you learned this week
df = pd.read_sql("""
SELECT category,
SUM(amount) AS total,
COUNT(*) AS num_transactions,
ROUND(AVG(amount), 2) AS avg_amount
FROM transactions
GROUP BY category
ORDER BY total DESC
""", conn)
print(df)
conn.close()
Day 19 — Automating File Operations (~75 min)
Learn:
- pathlib module for file/folder operations
- Listing files in a directory, filtering by extension
- Renaming, moving, and organizing files programmatically
- glob for pattern matching (*.xlsx, *_2026_*.csv)
Resource: Automate the Boring Stuff — Chapter 10
Practice:
from pathlib import Path
import shutil
# Organize files by extension (simulate a messy Downloads folder)
source = Path("messy_folder")
if not source.exists():
print(f"Folder '{source}' not found — create it and add some test files first.")
else:
for file in source.iterdir():
if file.is_file():
ext = file.suffix.lstrip(".")
ext_folder = source / (ext if ext else "no_extension")
ext_folder.mkdir(exist_ok=True)
shutil.move(str(file), str(ext_folder / file.name))
print(f"Moved {file.name} -> {ext_folder.name}/")
Day 20 — Report Automation & Scheduling (~75 min)
Learn:
- Generating timestamped reports programmatically
- datetime module for timestamps and date math
- Combining pandas + SQLite + file output into a single pipeline
- Overview of scheduling: cron (macOS/Linux) or Task Scheduler (Windows)
Resource: Python docs — datetime
Practice:
from datetime import datetime
import pandas as pd
import sqlite3
# Generate a daily summary report from the database
today = datetime.now()
report_name = f"daily_report_{today.strftime('%Y%m%d')}.xlsx"
# Pull data from SQLite
conn = sqlite3.connect("finance.db")
summary = pd.read_sql("""
SELECT category,
SUM(amount) AS total,
COUNT(*) AS count
FROM transactions
GROUP BY category
""", conn)
conn.close()
# Add report metadata
summary.to_excel(report_name, index=False)
print(f"Report generated: {report_name} at {today.strftime('%H:%M')}")
Day 21 — Week 3 Mini Project: Automated Monthly Report (~90 min)
Build a script that:
- Reads transaction data from a CSV file (create sample data — don't depend on an API)
- Loads the data into a SQLite database
- Runs SQL queries to summarize by category and month
- Generates a formatted Excel report with summary statistics
- Timestamps the file name so it can run monthly
Keep it simple — the goal is connecting the pieces (file → database → query → report), not building something production-grade.
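One way the pieces can connect, as a minimal sketch (the sample rows and table name are placeholders, and it uses an in-memory database so it runs anywhere; swap in pd.read_csv() and a real .db file for your version):

```python
import sqlite3
from datetime import datetime
import pandas as pd

# Stand-in for pd.read_csv("transactions.csv") — placeholder data
df = pd.DataFrame({
    "date": ["2026-01-05", "2026-02-06"],
    "category": ["Travel", "Supplies"],
    "amount": [1250.00, 185.50],
})

# Load into SQLite (use "finance.db" instead of ":memory:" for a real file)
conn = sqlite3.connect(":memory:")
df.to_sql("transactions", conn, index=False, if_exists="replace")

# Summarize with SQL
summary = pd.read_sql(
    "SELECT category, SUM(amount) AS total FROM transactions GROUP BY category",
    conn,
)
conn.close()

# Timestamped output name so the script can run monthly
report_name = f"monthly_report_{datetime.now():%Y%m%d}.xlsx"
# summary.to_excel(report_name, index=False)  # uncomment to write the file
print(summary)
```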
Push to GitHub. This is portfolio piece #2.
Week 3 Milestone
You should be able to:
- Write SQL queries to filter, aggregate, and join data
- Use CASE WHEN and COALESCE for conditional logic and NULL handling
- Connect Python and SQLite together with pd.read_sql()
- Automate file organization and repetitive tasks
- Build a data pipeline (file → database → query → report)
Week 4: Projects, Portfolio & Job Prep
Goal: Build resume-worthy projects, polish your GitHub portfolio, and learn how to talk about your skills in interviews.
Resources This Week:
- Kaggle — Data Visualization Course — free, for Day 22
- DataLemur — free SQL interview practice with finance-relevant problems (free tier; premium is paid)
- HackerRank — SQL — more interview-style SQL problems (free; some features require account)
Day 22 — Data Visualization Basics (~75 min)
Learn:
- matplotlib basics: line charts, bar charts
- Install (if local): pip install matplotlib
- Saving figures to files: .savefig()
- pandas built-in plotting: df.plot()
- Making charts look professional (titles, labels, formatting)
Resource: Kaggle — Data Visualization Course (Lessons 1–3) (free) — note that Kaggle teaches seaborn as well, but focus on matplotlib and df.plot() basics first.
Practice:
import pandas as pd
import matplotlib.pyplot as plt
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [82, 91, 78, 95, 88, 102]
expenses = [70, 75, 72, 80, 77, 85]
df = pd.DataFrame({"Revenue": revenue, "Expenses": expenses}, index=months)
ax = df.plot(kind="bar", figsize=(10, 6), color=["#2563eb", "#dc2626"])
ax.set_title("Monthly Revenue vs Expenses ($K)", fontsize=14, fontweight="bold")
ax.set_ylabel("Amount ($K)")
plt.tight_layout()
plt.savefig("revenue_vs_expenses.png", dpi=150)
print("Chart saved.")
Days 23–25 — Capstone Project: Bank Reconciliation & Exception Report (~4–5 hours total)
This is your flagship portfolio project. It mirrors a real finance-ops workflow: matching two data sources, identifying discrepancies, and producing a clear exception report.
Build a tool that:
- Reads two files — a bank statement CSV and an internal ledger CSV (create realistic sample data with ~50–100 rows each)
- Matches transactions — join on reference number (or date + amount), flag matched/unmatched rows using indicator=True
- Identifies exceptions:
- Transactions in the bank but not in the ledger (and vice versa)
- Matched transactions where the amounts differ
- Computes aging — how many days old is each unresolved exception?
- Summarizes — total matched, total unmatched, dollar value of exceptions, aging breakdown
- Outputs a polished Excel workbook:
- "Matched" sheet — all reconciled transactions
- "Exceptions" sheet — unmatched and amount mismatches
- "Summary" sheet — control totals and aging breakdown
Tips:
- Start with the data and merge logic first. Get that working before adding the Excel output.
- Use pd.merge(..., indicator=True) — you practiced this on Day 13.
- Test with small data (5 rows each) before scaling up.
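Detecting amount mismatches among matched rows can look like this — a sketch reusing the Day 13 column names, with tiny made-up inputs:

```python
import pandas as pd

# Tiny stand-ins for the two source files (Day 13 naming)
bank = pd.DataFrame({"ref": ["TXN-001", "TXN-002", "TXN-003"],
                     "bank_amount": [1500.00, 2300.00, 875.50]})
ledger = pd.DataFrame({"ref": ["TXN-001", "TXN-002", "TXN-004"],
                       "ledger_amount": [1500.00, 2350.00, 3100.00]})

recon = pd.merge(bank, ledger, on="ref", how="outer", indicator=True)

# Matched rows where the amounts differ (beyond a rounding tolerance)
matched = recon[recon["_merge"] == "both"]
mismatches = matched[
    (matched["bank_amount"] - matched["ledger_amount"]).abs() > 0.01
]
print(mismatches[["ref", "bank_amount", "ledger_amount"]])
```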
Days 26–27 — Second Project: Expense Report Automator (~3–4 hours total)
Build a tool that:
- Reads expense data from multiple CSV files in a folder (one file per employee or per month)
- Concatenates them into one DataFrame with pd.concat()
- Categorizes expenses using rules (keyword matching on descriptions)
- Validates — flags duplicates, missing fields, amounts over a threshold
- Summarizes — totals by category, by employee, by month
- Outputs a polished Excel workbook with multiple sheets and a chart
- Logs any issues to a separate "exceptions" sheet
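The rule-based categorization step can be sketched with a simple keyword lookup (the keywords and categories below are made up for illustration):

```python
import pandas as pd

# Hypothetical keyword -> category rules
RULES = {"airline": "Travel", "uber": "Travel", "hotel": "Travel",
         "staples": "Supplies", "aws": "Technology"}

def categorize(description: str) -> str:
    """Return the first matching category, else 'Uncategorized'."""
    desc = description.lower()
    for keyword, category in RULES.items():
        if keyword in desc:
            return category
    return "Uncategorized"

expenses = pd.DataFrame({"description": ["Delta Airlines ticket", "AWS hosting", "Lunch"]})
expenses["category"] = expenses["description"].apply(categorize)
print(expenses)
```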
Day 28 — GitHub Portfolio & Interview Prep (~90 min)
No new coding today. Focus on packaging your work and preparing to talk about it.
GitHub portfolio setup:
- Create a GitHub profile with a professional bio
- Pin your 3 best repositories (data cleaner, monthly report, bank reconciliation)
- Each repo should have:
- A clear README explaining what it does, how to run it, and a screenshot of output
- Clean, commented code
- A requirements.txt file (activate your venv first, then pip freeze > requirements.txt)
- Sample input data so someone can actually run your code
Resume bullet points you can now honestly write:
- Built automated financial reporting and reconciliation tools in Python using pandas, reducing manual Excel processing time
- Developed data pipelines that read source files, clean and transform data, and output formatted Excel reports with exception tracking
- Created bank reconciliation tool matching transactions across data sources with aging analysis and exception reporting
- Proficient in SQL for data extraction, aggregation, joins, and conditional logic
- Experience with Python libraries: pandas, matplotlib, sqlite3, openpyxl
Common interview questions for these roles and how to answer them:
| Question | How to Answer |
|---|---|
| "How do you use Python in your workflow?" | Describe your capstone projects. Emphasize replacing manual Excel work with automated, repeatable scripts. |
| "What's pandas?" | "It's a Python library for working with tabular data — think of it as Excel in code but faster, more reproducible, and it handles large datasets that would slow Excel down." |
| "Can you write a SQL query?" | Walk through a GROUP BY with JOIN. Practice writing queries on paper. |
| "Tell me about a project you built." | Use the STAR method: Situation (manual process), Task (automate it), Action (what you built), Result (time saved, errors reduced). |
| "How comfortable are you with large datasets?" | "pandas handles datasets much larger than Excel can manage comfortably. I've worked with CSVs and databases, and I know how to filter and aggregate efficiently so I'm not loading everything into memory at once." |
Practice resources for interview SQL:
- DataLemur — SQL Interview Questions (free tier — finance-relevant problems)
- HackerRank — SQL Easy/Medium (free)
Daily Schedule Template
| Time Block | Activity |
|---|---|
| 0:00–0:15 | Review yesterday's code. Re-type one exercise from memory. |
| 0:15–0:45 | Learn today's new concept (read/watch resource). |
| 0:45–1:15 | Hands-on practice — type the code yourself, modify it, break it, fix it. |
| 1:15–1:30 | Write a 2-sentence summary of what you learned (in a learning_log.md file). |
Resource Summary
Most resources below are free. Where noted, some content is behind a paywall or requires an account.
| Resource | What It Covers | Used In | Cost | Link |
|---|---|---|---|---|
| Automate the Boring Stuff (2nd ed.) | Python fundamentals, files, automation | Weeks 1, 3 | Free | automatetheboringstuff.com |
| Python Official Tutorial | Authoritative Python reference | All weeks | Free | docs.python.org/3/tutorial |
| Real Python | In-depth articles on any Python topic | Weeks 2, 3 | Freemium (some articles free, full access paid) | realpython.com |
| Google Colab | Free browser-based Python environment | Weeks 1–2 | Free | colab.research.google.com |
| Kaggle — pandas Course | pandas fundamentals, hands-on | Week 2 | Free (account required) | kaggle.com/learn/pandas |
| Kaggle — Data Cleaning | Handling messy real-world data | Week 2 | Free (account required) | kaggle.com/learn/data-cleaning |
| Kaggle — Data Visualization | matplotlib and seaborn | Week 4 | Free (account required) | kaggle.com/learn/data-visualization |
| SQLBolt | Interactive SQL fundamentals | Week 3 | Free | sqlbolt.com |
| Mode SQL Tutorial | SQL with analytics focus | Week 3 | Free | mode.com/sql-tutorial |
| Corey Schafer YouTube | Python and pandas video tutorials | Weeks 1–3 | Free | youtube.com/c/Coreyms |
| DataLemur | SQL interview practice | Week 4 | Freemium (free tier available) | datalemur.com |
| HackerRank — SQL | SQL practice problems | Week 4 | Free (account required) | hackerrank.com/domains/sql |
| GitHub Skills | Git and GitHub basics | Week 1 | Free | github.com/skills |
What You'll Have After 4 Weeks
Skills:
- Python scripting and automation
- pandas for data analysis (filtering, grouping, merging, cleaning, reconciliation, exporting)
- Dates and aging analysis for finance reporting
- SQL for querying databases (including joins, CASE WHEN, COALESCE)
- File automation and report generation
- Basic data visualization
- Git and GitHub for version control
Portfolio (3 pinned GitHub projects):
- Financial Data Cleaner — reads messy files, standardizes and cleans data, exports polished output
- Automated Monthly Report — reads source data, stores in SQLite, generates timestamped Excel reports
- Bank Reconciliation Tool — matches transactions across two data sources, identifies exceptions, produces aging analysis
Resume additions:
- Python (pandas, matplotlib, sqlite3, openpyxl)
- SQL (SQLite, joins, aggregations, conditional logic)
- Data automation, reconciliation, and pipeline development
- GitHub portfolio with working code samples
Extension Topics (After the 4 Weeks)
In priority order:
- Keep building. Automate something real at your current job (even if they didn't ask for it). Nothing is more compelling in an interview than "I built this to solve an actual problem."
- Practice SQL on DataLemur — 15 minutes a day. Most finance Python roles will test SQL.
- Working with APIs — learn the requests library to pull data from web APIs. Good practice APIs:
  - ExchangeRate-API — free tier, currency conversion data
  - Alpha Vantage — free API key, stock/market data

import requests
import pandas as pd

response = requests.get("https://open.er-api.com/v6/latest/USD")
response.raise_for_status()  # Raise an error if the request failed
data = response.json()

rates = pd.Series(data["rates"], name="rate_vs_usd")
print(rates[["EUR", "GBP", "JPY", "CAD", "CHF"]])

- Learn Jupyter Notebooks — many teams use them for analysis. Install with pip install notebook and run with jupyter notebook.
- Explore more pandas — time series operations (.resample(), .rolling()), pd.date_range(), and working with financial date indices.
- Basic web scraping — BeautifulSoup for pulling data from websites (useful but don't lead with this in interviews).
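As a taste of the time-series operations mentioned above, a minimal sketch on synthetic daily data:

```python
import pandas as pd

# Synthetic daily revenue over ~3 months
dates = pd.date_range("2026-01-01", periods=90, freq="D")
daily = pd.Series(range(100, 190), index=dates, dtype=float)

# Resample to weekly totals
weekly = daily.resample("W").sum()

# 7-day rolling average (first 6 values are NaN by design)
rolling7 = daily.rolling(window=7).mean()

print(weekly.head())
print(rolling7.dropna().head())
```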
You don't need to know everything. You need to know enough to be useful on day one — and you will be.