Guide

Getting Started With Python Excel AutomationDeep dive

Read Specific Columns From Excel With Pandas

Use read_excel usecols to load only the columns you need by name, index, Excel-letter range, or callable — combine it with sheet_name and dtype to save memory.

pandas.read_excel() loads every column on a sheet by default, which wastes memory and time when you only need four columns out of forty. The fix is the usecols parameter: it tells pandas to materialize a subset and skip the rest. This guide is part of Reading Excel Files with Pandas, and it covers all four forms usecols accepts — a list of names, a list of indices, an Excel-letter range, and a callable — plus how to combine the selection with sheet_name and dtype. Every snippet writes its own sample workbook so you can paste and run it.

Prerequisites

Install pandas and the .xlsx engine:

Bash
pip install pandas openpyxl

Create a sample workbook

Each example below reads this file. It has more columns than any one read needs, which is exactly when usecols earns its keep:

Python
import pandas as pd

sample = pd.DataFrame({
    "Order_ID": [1001, 1002, 1003],
    "SKU": ["A-100", "B-200", "C-300"],
    "Quantity": [3, 1, 5],
    "Unit_Price": [19.99, 49.50, 8.75],
    "Region": ["North", "South", "West"],
    "Notes": ["rush", "", "backorder"],
})
sample.to_excel("orders.xlsx", sheet_name="Raw", index=False)
print("wrote orders.xlsx")

Select columns by name

Pass a list of header names. The names must match the header row exactly — same casing, no leading or trailing spaces. Order in your list does not matter; pandas returns columns in their on-sheet order:

Python
import pandas as pd

df = pd.read_excel("orders.xlsx", usecols=["SKU", "Quantity", "Unit_Price"])
print(df.head())

Select columns by integer index

When headers are unreliable or absent, select by zero-based position instead. [0, 2, 3] reads the first, third, and fourth columns:

Python
import pandas as pd

df = pd.read_excel("orders.xlsx", usecols=[0, 2, 3])
print(df.columns.tolist())

Select columns by Excel letter range

usecols accepts a string of Excel column letters — the same A, B, C you see in the spreadsheet UI. Use a colon for a contiguous range, commas for individual columns, or mix both:

Python
import pandas as pd

contiguous = pd.read_excel("orders.xlsx", usecols="A:C")        # A, B, C
sparse = pd.read_excel("orders.xlsx", usecols="A,C,E")          # A, C, E
mixed = pd.read_excel("orders.xlsx", usecols="A:C,E")           # A, B, C, E
print(contiguous.columns.tolist())
print(sparse.columns.tolist())
print(mixed.columns.tolist())

The letters map to physical spreadsheet positions, so they ignore the header names entirely — handy when an export shuffles column titles but keeps the layout stable.

Select columns with a callable

Pass a function and pandas calls it once per column name, keeping the column when it returns True. This is the right tool for pattern-based selection — every column starting with a prefix, or every name except a blocklist:

Python
import pandas as pd

# Keep every column except free-text "Notes"
df = pd.read_excel("orders.xlsx", usecols=lambda name: name != "Notes")
print(df.columns.tolist())

# Keep only price-like columns
prices = pd.read_excel(
    "orders.xlsx",
    usecols=lambda name: "Price" in name or name == "Quantity",
)
print(prices.columns.tolist())

Combine usecols with sheet_name and dtype

usecols composes with the rest of read_excel. Target a tab with sheet_name, then pin the types of the columns you kept so downstream math doesn't break when Excel stores a number as text:

Python
import pandas as pd

df = pd.read_excel(
    "orders.xlsx",
    sheet_name="Raw",
    usecols=["Order_ID", "Quantity", "Unit_Price"],
    dtype={"Order_ID": "int64", "Quantity": "int64", "Unit_Price": "float64"},
)
print(df.dtypes)

The keys in dtype only need to cover columns that survive usecols; a dtype entry for a dropped column is silently ignored.

Why usecols saves memory on wide files

read_excel builds the full DataFrame in RAM. On a sheet with hundreds of columns, restricting to the handful you actually use cuts both the parse work and the resident memory roughly in proportion to the columns dropped. There is no chunksize for Excel, so column pruning is one of the few levers you have to keep a wide-file read inside a memory budget. Pair it with nrows when you only need a sample.

Common pitfalls

SymptomCauseFix
ValueError: Usecols do not match columnsA name in the list isn't an exact header match (case or whitespace differs)Print pd.read_excel(path, nrows=0).columns.tolist() and copy the exact spelling
Empty or wrong columns returnedMixed names and indices in one listUse either all names or all indices — never both in the same list
Selection is off by one columnHeader sits below metadata rows, shifting positionsApply skiprows/header first so indices and letters align to the real header
dtype cast has no effectThe targeted column was dropped by usecolsMake sure the column is in your usecols list before pinning its type

A subtle one: a list mixing names and indices (["SKU", 2]) raises a ValueError. The string form ("A,C") and the callable are the only ways to express "these specific columns" without committing to a single key style.

Frequently asked questions

Does usecols order control the DataFrame column order? No. Whether you pass names, indices, or letters, pandas returns the selected columns in their original on-sheet order. Reorder afterward with df = df[["Unit_Price", "SKU"]].

Can I use usecols with no header row? Yes — pass header=None and select by index list or letter range. Name-based selection needs a header to match against.

What's the difference between "A:C" and [0, 1, 2]? They select the same physical columns. The letter range mirrors the spreadsheet UI and reads naturally; the index list is easier to build programmatically from a computed set of positions.

Why does my callable receive names I didn't expect? The callable receives the parsed header values, including any auto-generated Unnamed: N labels for blank header cells. Guard against those in your predicate.

Conclusion

usecols has four interchangeable forms — names, indices, an Excel-letter string, and a callable — and the right one depends on what's stable in your file. Names read clearest when headers are reliable; letters and indices survive header churn; the callable handles pattern-based selection. All four compose with sheet_name and dtype, so you can target a tab, prune columns, and pin types in a single call while keeping wide-file reads inside a memory budget.

Where to go next

For the end-to-end reading workflow, return to Reading Excel Files with Pandas. To see usecols in the context of sheet targeting, header realignment, and dtypes, read How to Read Excel Files With Pandas: A Step-by-Step Guide. When you're ready to write results back out, see Write a Pandas DataFrame to Excel Without the Index.