Guide
Read Specific Columns From Excel With Pandas
Use read_excel usecols to load only the columns you need by name, index, Excel-letter range, or callable — combine it with sheet_name and dtype to save memory.
pandas.read_excel() loads every column on a sheet by default, which wastes memory and time when you only need four columns out of forty. The fix is the usecols parameter: it tells pandas to materialize a subset and skip the rest. This guide is part of Reading Excel Files with Pandas, and it covers all four forms usecols accepts — a list of names, a list of indices, an Excel-letter range, and a callable — plus how to combine the selection with sheet_name and dtype. Every snippet writes its own sample workbook so you can paste and run it.
Prerequisites
Install pandas and the .xlsx engine:
pip install pandas openpyxl
Create a sample workbook
Each example below reads this file. It has more columns than any one read needs, which is exactly when usecols earns its keep:
import pandas as pd
sample = pd.DataFrame({
"Order_ID": [1001, 1002, 1003],
"SKU": ["A-100", "B-200", "C-300"],
"Quantity": [3, 1, 5],
"Unit_Price": [19.99, 49.50, 8.75],
"Region": ["North", "South", "West"],
"Notes": ["rush", "", "backorder"],
})
sample.to_excel("orders.xlsx", sheet_name="Raw", index=False)
print("wrote orders.xlsx")
Select columns by name
Pass a list of header names. The names must match the header row exactly — same casing, no leading or trailing spaces. Order in your list does not matter; pandas returns columns in their on-sheet order:
import pandas as pd
df = pd.read_excel("orders.xlsx", usecols=["SKU", "Quantity", "Unit_Price"])
print(df.head())
Select columns by integer index
When headers are unreliable or absent, select by zero-based position instead. [0, 2, 3] reads the first, third, and fourth columns:
import pandas as pd
df = pd.read_excel("orders.xlsx", usecols=[0, 2, 3])
print(df.columns.tolist())
Select columns by Excel letter range
usecols accepts a string of Excel column letters — the same A, B, C you see in the spreadsheet UI. Use a colon for a contiguous range, commas for individual columns, or mix both:
import pandas as pd
contiguous = pd.read_excel("orders.xlsx", usecols="A:C") # A, B, C
sparse = pd.read_excel("orders.xlsx", usecols="A,C,E") # A, C, E
mixed = pd.read_excel("orders.xlsx", usecols="A:C,E") # A, B, C, E
print(contiguous.columns.tolist())
print(sparse.columns.tolist())
print(mixed.columns.tolist())
The letters map to physical spreadsheet positions, so they ignore the header names entirely — handy when an export shuffles column titles but keeps the layout stable.
Select columns with a callable
Pass a function and pandas calls it once per column name, keeping the column when it returns True. This is the right tool for pattern-based selection — every column starting with a prefix, or every name except a blocklist:
import pandas as pd
# Keep every column except free-text "Notes"
df = pd.read_excel("orders.xlsx", usecols=lambda name: name != "Notes")
print(df.columns.tolist())
# Keep only price-like columns
prices = pd.read_excel(
"orders.xlsx",
usecols=lambda name: "Price" in name or name == "Quantity",
)
print(prices.columns.tolist())
Combine usecols with sheet_name and dtype
usecols composes with the rest of read_excel. Target a tab with sheet_name, then pin the types of the columns you kept so downstream math doesn't break when Excel stores a number as text:
import pandas as pd
df = pd.read_excel(
"orders.xlsx",
sheet_name="Raw",
usecols=["Order_ID", "Quantity", "Unit_Price"],
dtype={"Order_ID": "int64", "Quantity": "int64", "Unit_Price": "float64"},
)
print(df.dtypes)
The keys in dtype only need to cover columns that survive usecols; a dtype entry for a dropped column is silently ignored.
Why usecols saves memory on wide files
read_excel builds the full DataFrame in RAM. On a sheet with hundreds of columns, restricting to the handful you actually use cuts both the parse work and the resident memory roughly in proportion to the columns dropped. There is no chunksize for Excel, so column pruning is one of the few levers you have to keep a wide-file read inside a memory budget. Pair it with nrows when you only need a sample.
Common pitfalls
| Symptom | Cause | Fix |
|---|---|---|
ValueError: Usecols do not match columns | A name in the list isn't an exact header match (case or whitespace differs) | Print pd.read_excel(path, nrows=0).columns.tolist() and copy the exact spelling |
| Empty or wrong columns returned | Mixed names and indices in one list | Use either all names or all indices — never both in the same list |
| Selection is off by one column | Header sits below metadata rows, shifting positions | Apply skiprows/header first so indices and letters align to the real header |
dtype cast has no effect | The targeted column was dropped by usecols | Make sure the column is in your usecols list before pinning its type |
A subtle one: a list mixing names and indices (["SKU", 2]) raises a ValueError. The string form ("A,C") and the callable are the only ways to express "these specific columns" without committing to a single key style.
Frequently asked questions
Does usecols order control the DataFrame column order?
No. Whether you pass names, indices, or letters, pandas returns the selected columns in their original on-sheet order. Reorder afterward with df = df[["Unit_Price", "SKU"]].
Can I use usecols with no header row?
Yes — pass header=None and select by index list or letter range. Name-based selection needs a header to match against.
What's the difference between "A:C" and [0, 1, 2]?
They select the same physical columns. The letter range mirrors the spreadsheet UI and reads naturally; the index list is easier to build programmatically from a computed set of positions.
Why does my callable receive names I didn't expect?
The callable receives the parsed header values, including any auto-generated Unnamed: N labels for blank header cells. Guard against those in your predicate.
Conclusion
usecols has four interchangeable forms — names, indices, an Excel-letter string, and a callable — and the right one depends on what's stable in your file. Names read clearest when headers are reliable; letters and indices survive header churn; the callable handles pattern-based selection. All four compose with sheet_name and dtype, so you can target a tab, prune columns, and pin types in a single call while keeping wide-file reads inside a memory budget.
Where to go next
For the end-to-end reading workflow, return to Reading Excel Files with Pandas. To see usecols in the context of sheet targeting, header realignment, and dtypes, read How to Read Excel Files With Pandas: A Step-by-Step Guide. When you're ready to write results back out, see Write a Pandas DataFrame to Excel Without the Index.