Guide

Getting Started With Python Excel AutomationDeep dive

Working with Multiple Excel Sheets in Python

Read, transform, and write multi-sheet Excel workbooks with pandas and openpyxl — load all tabs into a dict, merge across sheets, and export to named sheets.

A single workbook usually holds several related tables — sales on one tab, returns on another, a summary on a third. Working with multiple sheets means loading the tabs you need, joining or aggregating across them, and writing results back to clearly named sheets. pandas handles the tabular work; openpyxl is the engine underneath for .xlsx. This guide walks the full cycle on a sample workbook the first step creates, so every snippet runs in order.

Step 1: Install the toolchain

Bash
pip install pandas openpyxl

openpyxl is the engine pandas uses to read and write .xlsx. pandas does not ship it, so install both.

Step 2: Create a multi-sheet workbook

Write three DataFrames to three sheets with a single ExcelWriter. Each to_excel call with a different sheet_name adds a tab:

Python
import pandas as pd

sales = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "product": ["Widget", "Gadget", "Gizmo", "Doohickey"],
    "units": [120, 80, 45, 30],
    "unit_price": [9.99, 14.50, 22.00, 5.25],
})
returns = pd.DataFrame({
    "product_id": [1, 3],
    "return_units": [10, 5],
})
regions = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "region": ["North", "South", "West", "East"],
})

with pd.ExcelWriter("workbook.xlsx", engine="openpyxl") as writer:
    sales.to_excel(writer, sheet_name="Sales", index=False)
    returns.to_excel(writer, sheet_name="Returns", index=False)
    regions.to_excel(writer, sheet_name="Regions", index=False)

print("Created workbook.xlsx with 3 sheets")

Step 3: List the sheets without loading data

pd.ExcelFile opens the workbook once and exposes sheet_names without parsing any rows — useful for validating a file before you commit to reading it:

Python
xls = pd.ExcelFile("workbook.xlsx")
print("Sheets:", xls.sheet_names)

Step 4: Read all sheets into a dictionary

Pass sheet_name=None to get a dict mapping each sheet name to its DataFrame. Iteration order matches the workbook:

Python
all_sheets = pd.read_excel("workbook.xlsx", sheet_name=None)
print(type(all_sheets), list(all_sheets.keys()))

for name, frame in all_sheets.items():
    print(f"{name}: {frame.shape[0]} rows x {frame.shape[1]} cols")

Step 5: Read only the sheets you need

Loading every tab is wasteful when you need a few. Pass a single name for one DataFrame, or a list of names for a dict of just those sheets:

Python
# One sheet -> one DataFrame
sales_only = pd.read_excel("workbook.xlsx", sheet_name="Sales")

# A subset -> a dict of just those sheets
subset = pd.read_excel("workbook.xlsx", sheet_name=["Sales", "Returns"])
print("Loaded subset:", list(subset.keys()))

When individual sheets need different parsing — custom headers, skipped metadata rows, parsed dates — apply those per-sheet options. See How to Read Excel With Pandas: Step by Step for header, skiprows, and parse_dates.

Step 6: Merge data across sheets

The dict from Step 4 lets you join tabs in memory. Here we attach return quantities and region to sales, then compute net revenue. A left join keeps every sales row; fillna handles products with no returns:

Python
sales_df = all_sheets["Sales"]
returns_df = all_sheets["Returns"]
regions_df = all_sheets["Regions"]

merged = (
    sales_df
    .merge(returns_df, on="product_id", how="left")
    .merge(regions_df, on="product_id", how="left")
)
merged["return_units"] = merged["return_units"].fillna(0)
merged["net_revenue"] = merged["unit_price"] * (merged["units"] - merged["return_units"])

print(merged[["product", "region", "units", "return_units", "net_revenue"]])

Step 7: Build a summary sheet

Aggregate the merged data — for example, net revenue per region — to drive a summary tab:

Python
summary = (
    merged.groupby("region", as_index=False)["net_revenue"]
    .sum()
    .sort_values("net_revenue", ascending=False)
)
print(summary)

Step 8: Write multiple DataFrames to named sheets

Use one ExcelWriter so all sheets land in a single file. mode="w" (the default) creates a fresh workbook; each to_excel writes a named tab. The same writer exposes the underlying openpyxl worksheets through writer.sheets, so you can auto-fit columns in the same pass:

Python
with pd.ExcelWriter("report.xlsx", engine="openpyxl") as writer:
    summary.to_excel(writer, sheet_name="Summary", index=False)
    merged.to_excel(writer, sheet_name="Detail", index=False)

    # Auto-fit column widths on every sheet
    for ws in writer.sheets.values():
        for column_cells in ws.columns:
            width = max(len(str(c.value)) for c in column_cells if c.value is not None)
            ws.column_dimensions[column_cells[0].column_letter].width = width + 2

print("Wrote report.xlsx with Summary and Detail sheets")

Step 9: Append a sheet to an existing workbook

To add a tab to a workbook that already exists, open the writer with mode="a". In pandas 3.0 you must set if_sheet_exists to say what happens on a name clash — "replace", "overlay", or "error" (the default):

Python
audit = pd.DataFrame({
    "step": ["load", "merge", "summarize", "export"],
    "status": ["ok", "ok", "ok", "ok"],
})

with pd.ExcelWriter("report.xlsx", engine="openpyxl",
                    mode="a", if_sheet_exists="replace") as writer:
    audit.to_excel(writer, sheet_name="Audit", index=False)

print("Appended Audit sheet; now:", pd.ExcelFile("report.xlsx").sheet_names)

Processing large workbooks sheet by sheet

sheet_name=None materializes every tab at once. For very large workbooks, read one sheet at a time and reduce each before moving on, so only one sheet's data sits in memory:

Python
target_sheets = ["Sales", "Returns"]
results = {}
for sheet in target_sheets:
    df = pd.read_excel("workbook.xlsx", sheet_name=sheet)
    # Reduce immediately — keep only what the report needs
    results[sheet] = df.head(100).copy()

print("Processed sequentially:", list(results.keys()))

Common errors and fixes

ErrorCauseFix
ValueError: Excel file format cannot be determinedWrong/missing extension or a renamed non-Excel filePass engine="openpyxl" for .xlsx; confirm the file is real .xlsx
KeyError: 'Sales'Sheet name case or whitespace mismatchCheck pd.ExcelFile(path).sheet_names; strip names before lookup
ValueError: Sheet 'X' already exists ...Appending in mode="a" without conflict handlingPass if_sheet_exists="replace" or "overlay"
Merge produced unexpected NaNJoin keys differ in name or type across sheetsRename to a common key and align dtypes before merge

Conclusion

The pattern for multi-sheet work is always the same: read into a dict keyed by sheet name, operate in memory, and write through a single ExcelWriter. Reading as a dict keeps sheets independent and easy to inspect individually; writing through one writer keeps the output as one file. Auto-fitting columns during the same writer context is a small addition that makes every output immediately usable without manual column-width adjustment.

Where to go next