Guide
Fill Missing Values in Excel with Pandas fillna
Fill missing values in an Excel file with pandas fillna: normalize blanks to NaN, then apply a scalar, a per-column dictionary, forward/back fill, or interpolation, and export.
To fill missing values in an Excel workbook with pandas, load the file with pd.read_excel(), turn Excel blanks into real NaN, apply DataFrame.fillna() (or ffill/bfill/interpolate), and write the result with df.to_excel(). The examples below run in order and share a namespace; the first block writes a sample workbook.
Create a sample workbook
import pandas as pd
seed = pd.DataFrame({
"Region": ["North", "South", "West", "East"],
"Revenue": [12000, None, 9800, None],
"Status": ["Closed", None, "Open", None],
"Date": ["2024-01-01", None, "2024-01-03", "2024-01-04"],
})
seed.to_excel("monthly_report.xlsx", index=False)
Load and normalize blanks
Excel frequently exports an empty cell as an empty string, which pandas keeps as valid text — fillna() skips it. Convert whitespace-only cells to NaN first so every gap is fillable:
df = pd.read_excel("monthly_report.xlsx", engine="openpyxl")
df = df.replace(r"^\s*$", pd.NA, regex=True)
print(df.isna().sum())
Fill with a per-column dictionary
A dictionary fills each column with a type-appropriate value in one call. This is the safest default because it never forces one column's fill onto another:
fill_map = {
"Revenue": df["Revenue"].median(),
"Status": "Pending",
}
df = df.fillna(fill_map)
print(df)
Fill strategies at a glance
| Strategy | Syntax | Best for |
|---|---|---|
| Global scalar | df.fillna(0) | A single uniform default |
| Per-column dict | df.fillna({"A": 0, "B": "n/a"}) | Mixed dtypes, categorical defaults |
| Forward / back fill | df.ffill() / df.bfill() | Ordered logs, time series |
| Interpolation | df.interpolate() | Continuous numeric trends |
Forward and back fill for ordered data
For time series, carry the last known value forward, then backfill any leading gap. In pandas 3.0 the method= argument to fillna() was removed — use the dedicated ffill() and bfill() methods:
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
df = df.sort_values("Date")
df["Date"] = df["Date"].ffill().bfill()
print(df[["Region", "Date"]])
Interpolate numeric trends
When a numeric column represents a smooth trend, interpolate() estimates gaps from neighboring values instead of repeating one number:
series = pd.Series([10.0, None, None, 40.0])
print(series.interpolate(method="linear"))
Export the filled workbook
Write without the index so the output matches the input layout:
df.to_excel("monthly_report_filled.xlsx", index=False, engine="openpyxl")
print("Wrote monthly_report_filled.xlsx")
Troubleshooting
A numeric fill turns the column into text. A column mixing numbers and text is object dtype; filling it with a number can coerce values to strings. Isolate numerics first:
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].fillna(0)
print(df.dtypes)
fillna() leaves blanks untouched. They are empty strings, not NaN. Run df.replace(r"^\s*$", pd.NA, regex=True) after loading. Also avoid keep_default_na=False on import unless you map blanks yourself.
Formula cells lose their formulas on export. to_excel() writes computed values, not formulas. If downstream consumers need live formulas, write into a pre-built template with openpyxl instead of overwriting the sheet wholesale.
Validation checklist — confirm df.isna().sum() is zero where it must be, check df.dtypes to ensure numeric columns stayed numeric, and open the exported file to confirm no #VALUE! cells.
Where to go next
- For profiling, classifying, and validating gaps end to end, see Handling Missing Data in Excel Reports.
- For the broader cleaning toolkit, see Advanced Data Transformation and Cleaning.
- Joins are a common source of new
NaN— see Merging and Joining Excel DataFrames.