Reading Excel Files (Spreadsheets) in Python

This is a detailed tutorial on reading excel files in python. Learn to extract the data from an Excel Workbook or spreadsheet using the xlrd module.

Reading an Excel File in Python

Spreadsheets are used worldwide to store tabular data. You can easily read and write these files in python. Recently, I published an article on reading and writing CSV files. The spreadsheets are quite different from the CSV files. Therefore, I found another good module for specifically parsing the excel files.

In this tutorial, we’ll make use of the xlrd module to read spreadsheets in python. So, you need to install this module first. Simply run the following command to install this module.

pip install xlrd

Python Xlrd Module Installation

Example Excel File.

I’ll use the Excel File shown in the screenshot given below for an illustration of performing different reading operations on the file using the xlrd module in python.

Example Excel Sheet

This excel sheet is named demo.xlsx

Reading A Particular Cell Value

The following python code illustrates the use of the xlrd module to read a particular cell value from the example excel sheet.

#Importing xlrd Module
import xlrd 
  
#Open Workbook, Provide file path as argument to the function open_workbook()
workBook = xlrd.open_workbook("demo.xlsx") 

#Select the desired sheet from the workbook
#sheet_by_index method fetches the sheets using numerical index values, starting from 0
spreadSheet = workBook.sheet_by_index(0) 
  
#Fetching Cell Value at Location 0, 0
cell = spreadSheet.cell_value(0, 0)

#Printing the Cell Value
print(cell)

Reading Cell Values From An Excel File Python

This way, you can fetch the value of any particular cell from any particular excel sheet of an Excel Workbook.

Extracting the Number of Rows & Columns

This module also allows you to find the total number of rows and columns of the spreadsheet in an excel workbook. The following example illustrates this concept.

#Importing xlrd Module
import xlrd 

#Open Workbook, Provide file path as argument to the function open_workbook()
workBook = xlrd.open_workbook("demo.xlsx")

#Select the desired sheet from the workbook
#sheet_by_index method fetches the sheets using numerical index values, starting from 0
spreadSheet = workBook.sheet_by_index(0) 

#Fetching No. of Rows
rows_count = spreadSheet.nrows

#Fetching No. of Columns
columns_count = spreadSheet.ncols

#Printing the Values
print("No. of Rows: " + str(rows_count))
print("No. of Columns: " + str(columns_count))

Extracting Number Of Rows & Columns From An Excel Sheet

Extracting Column Names

You can easily extract the name of all columns using the cell_value() function used in the first example by doing iteration over the number of columns. The following example illustrates the same.

#Importing xlrd Module
import xlrd 

#Open Workbook, Provide file path as argument to the function open_workbook()
workBook = xlrd.open_workbook("demo.xlsx")

#Select the desired sheet from the workbook
#sheet_by_index method fetches the sheets using numerical index values, starting from 0
spreadSheet = workBook.sheet_by_index(0) 

#Fetching No. of Columns
columns_count = spreadSheet.ncols

#Iterating all column name values
for i in range(columns_count):
col_name = spreadSheet.cell_value(0,i)
print(col_name)

Fetch All Column Names In A Spreadsheet Excel Example

Extracting Particular Column & Row Values

The following example illustrates how you can fetch all of the values for a particular column. Again, we’ll do this using iteration over the range of the number of rows. Here we’re reading the values for the second column, Full Name. As the column and row index starts from 0, so the index for the second column here will be 1.

#Importing xlrd Module
import xlrd 

#Open Workbook, Provide file path as argument to the function open_workbook()
workBook = xlrd.open_workbook("demo.xlsx")

#Select the desired sheet from the workbook
#sheet_by_index method fetches the sheets using numerical index values, starting from 0
spreadSheet = workBook.sheet_by_index(0) 

#Fetching No. of Rows
row_count = spreadSheet.ncols

#Fetching the Value For Second Column
#Iterating all column name values
for i in range(row_count):
value = spreadSheet.cell_value(i,1)
print(value)

Reading Particular Column Values Example Python

The example given below illustrates how you can fetch the values for an entire row. We need not do iteration here, rather we’ll make use of the row_values() function here. This function takes the row index as the parameter and returns the list of values of that particular row.

#Importing xlrd Module
import xlrd 

#Open Workbook, Provide file path as argument to the function open_workbook()
workBook = xlrd.open_workbook("demo.xlsx")

#Select the desired sheet from the workbook
#sheet_by_index method fetches the sheets using numerical index values, starting from 0
spreadSheet = workBook.sheet_by_index(0) 

#Fetching Row Values
secondRow = spreadSheet.row_values(1)

#Printing the Row Values
print(secondRow)

Fetching Row Values From Spreadsheet Python Example

Apart from the simple reading of the Excel files, you can do much more with the spreadsheets using the xlrd module. You can view the full documentation of the xlrd module here.

Leave a Reply

Your email address will not be published. Required fields are marked *