Guide
Working with Multiple Excel Sheets in Python
Read, transform, and write multi-sheet Excel workbooks with pandas and openpyxl — load all tabs into a dict, merge across sheets, and export to named sheets.
A single workbook usually holds several related tables — sales on one tab, returns on another, a summary on a third. Working with multiple sheets means loading the tabs you need, joining or aggregating across them, and writing results back to clearly named sheets. pandas handles the tabular work; openpyxl is the engine underneath for .xlsx. This guide walks the full cycle on a sample workbook the first step creates, so every snippet runs in order.
Step 1: Install the toolchain
pip install pandas openpyxl
openpyxl is the engine pandas uses to read and write .xlsx. pandas does not ship it, so install both.
Step 2: Create a multi-sheet workbook
Write three DataFrames to three sheets with a single ExcelWriter. Each to_excel call with a different sheet_name adds a tab:
import pandas as pd
sales = pd.DataFrame({
"product_id": [1, 2, 3, 4],
"product": ["Widget", "Gadget", "Gizmo", "Doohickey"],
"units": [120, 80, 45, 30],
"unit_price": [9.99, 14.50, 22.00, 5.25],
})
returns = pd.DataFrame({
"product_id": [1, 3],
"return_units": [10, 5],
})
regions = pd.DataFrame({
"product_id": [1, 2, 3, 4],
"region": ["North", "South", "West", "East"],
})
with pd.ExcelWriter("workbook.xlsx", engine="openpyxl") as writer:
sales.to_excel(writer, sheet_name="Sales", index=False)
returns.to_excel(writer, sheet_name="Returns", index=False)
regions.to_excel(writer, sheet_name="Regions", index=False)
print("Created workbook.xlsx with 3 sheets")
Step 3: List the sheets without loading data
pd.ExcelFile opens the workbook once and exposes sheet_names without parsing any rows — useful for validating a file before you commit to reading it:
xls = pd.ExcelFile("workbook.xlsx")
print("Sheets:", xls.sheet_names)
Step 4: Read all sheets into a dictionary
Pass sheet_name=None to get a dict mapping each sheet name to its DataFrame. Iteration order matches the workbook:
all_sheets = pd.read_excel("workbook.xlsx", sheet_name=None)
print(type(all_sheets), list(all_sheets.keys()))
for name, frame in all_sheets.items():
print(f"{name}: {frame.shape[0]} rows x {frame.shape[1]} cols")
Step 5: Read only the sheets you need
Loading every tab is wasteful when you need a few. Pass a single name for one DataFrame, or a list of names for a dict of just those sheets:
# One sheet -> one DataFrame
sales_only = pd.read_excel("workbook.xlsx", sheet_name="Sales")
# A subset -> a dict of just those sheets
subset = pd.read_excel("workbook.xlsx", sheet_name=["Sales", "Returns"])
print("Loaded subset:", list(subset.keys()))
When individual sheets need different parsing — custom headers, skipped metadata rows, parsed dates — apply those per-sheet options. See How to Read Excel With Pandas: Step by Step for header, skiprows, and parse_dates.
Step 6: Merge data across sheets
The dict from Step 4 lets you join tabs in memory. Here we attach return quantities and region to sales, then compute net revenue. A left join keeps every sales row; fillna handles products with no returns:
sales_df = all_sheets["Sales"]
returns_df = all_sheets["Returns"]
regions_df = all_sheets["Regions"]
merged = (
sales_df
.merge(returns_df, on="product_id", how="left")
.merge(regions_df, on="product_id", how="left")
)
merged["return_units"] = merged["return_units"].fillna(0)
merged["net_revenue"] = merged["unit_price"] * (merged["units"] - merged["return_units"])
print(merged[["product", "region", "units", "return_units", "net_revenue"]])
Step 7: Build a summary sheet
Aggregate the merged data — for example, net revenue per region — to drive a summary tab:
summary = (
merged.groupby("region", as_index=False)["net_revenue"]
.sum()
.sort_values("net_revenue", ascending=False)
)
print(summary)
Step 8: Write multiple DataFrames to named sheets
Use one ExcelWriter so all sheets land in a single file. mode="w" (the default) creates a fresh workbook; each to_excel writes a named tab. The same writer exposes the underlying openpyxl worksheets through writer.sheets, so you can auto-fit columns in the same pass:
with pd.ExcelWriter("report.xlsx", engine="openpyxl") as writer:
summary.to_excel(writer, sheet_name="Summary", index=False)
merged.to_excel(writer, sheet_name="Detail", index=False)
# Auto-fit column widths on every sheet
for ws in writer.sheets.values():
for column_cells in ws.columns:
width = max(len(str(c.value)) for c in column_cells if c.value is not None)
ws.column_dimensions[column_cells[0].column_letter].width = width + 2
print("Wrote report.xlsx with Summary and Detail sheets")
Step 9: Append a sheet to an existing workbook
To add a tab to a workbook that already exists, open the writer with mode="a". In pandas 3.0 you must set if_sheet_exists to say what happens on a name clash — "replace", "overlay", or "error" (the default):
audit = pd.DataFrame({
"step": ["load", "merge", "summarize", "export"],
"status": ["ok", "ok", "ok", "ok"],
})
with pd.ExcelWriter("report.xlsx", engine="openpyxl",
mode="a", if_sheet_exists="replace") as writer:
audit.to_excel(writer, sheet_name="Audit", index=False)
print("Appended Audit sheet; now:", pd.ExcelFile("report.xlsx").sheet_names)
Processing large workbooks sheet by sheet
sheet_name=None materializes every tab at once. For very large workbooks, read one sheet at a time and reduce each before moving on, so only one sheet's data sits in memory:
target_sheets = ["Sales", "Returns"]
results = {}
for sheet in target_sheets:
df = pd.read_excel("workbook.xlsx", sheet_name=sheet)
# Reduce immediately — keep only what the report needs
results[sheet] = df.head(100).copy()
print("Processed sequentially:", list(results.keys()))
Common errors and fixes
| Error | Cause | Fix |
|---|---|---|
ValueError: Excel file format cannot be determined | Wrong/missing extension or a renamed non-Excel file | Pass engine="openpyxl" for .xlsx; confirm the file is real .xlsx |
KeyError: 'Sales' | Sheet name case or whitespace mismatch | Check pd.ExcelFile(path).sheet_names; strip names before lookup |
ValueError: Sheet 'X' already exists ... | Appending in mode="a" without conflict handling | Pass if_sheet_exists="replace" or "overlay" |
Merge produced unexpected NaN | Join keys differ in name or type across sheets | Rename to a common key and align dtypes before merge |
Conclusion
The pattern for multi-sheet work is always the same: read into a dict keyed by sheet name, operate in memory, and write through a single ExcelWriter. Reading as a dict keeps sheets independent and easy to inspect individually; writing through one writer keeps the output as one file. Auto-fitting columns during the same writer context is a small addition that makes every output immediately usable without manual column-width adjustment.
Where to go next
- Combine Multiple Excel Files into One in Python — scale from many sheets to many files.
- Writing DataFrames to Excel with Pandas — deeper control over the export step.
- Merging and Joining Excel DataFrames — join strategies beyond the left join used here.