Working with Excel and Python together has never been easier, but with so many different tools to choose from how can you know what to use when? This post gives an overview of some of the most popular and useful tools out there to help you choose.
At the end of this post I’ve compiled a feature matrix outlining the different features with the available packages for calling Python from Excel.
Broadly, the Python Excel tools break down into two main camps:
Reading and Writing Excel workbooks
For working with Excel 2007 onwards (.xlsx files), this is a great all round choice.
If you only need to write Excel workbooks and not read them then xlsxwriter is an easy to use package to use that works well. If you are working with large files or are particularly concerned about speed then you may find xlsxwriter a better choice than openpyxl.
For working with ranges of data and reading or writing them to Excel workbooks with no frills, using pandas can be a very quick and effective method. If you don’t need much in the way of formatting and just care about getting data into or out of Excel workbooks then the pandas functions “read_excel” and “write_excel” may be just what you need.
xltable is a higher level library for building Excel reports from pandas DataFrames. Rather than writing the workbook cell by cell or row by row, whole tables are added and can include formulas that reference other tables without having to know ahead of time where those tables will be. For more complex reports involving formulas xltable can be very useful.
xlrd and xlwt read and write the old Excel .xls files respectively. These are included in this list for completeness, but are now really only used when you are forced to deal with the legacy xls file format. They are both extremely mature packages that are very capable and stable, but xlwt will never be extended to support the newer xlsx/xlsm file formats and so for new code dealing with modern Excel file formats they are no longer the best choice.
Building Interactive Tools Using Excel as a Front-End
Interactive tools using Excel as a familiar user interface but using Python code to do the heavy lifting can be really powerful. Virtually everyone already has experience using Excel, and many people spend a large part of their working day in Excel already; so if you are writing tools for others to use delivering them via Excel can be a great combination.
There are a few tools available that can be used to bring Python to Excel, and it can be difficult to know which one is right for different situations. We’ll look at each and attempt to answer some common questions.
PyXLL is currently the only package that enables developers to write fully featured Excel addins in Python. It embeds the Python interpreter into Excel so that it can be used as a complete VBA replacement. You can think of it conceptually to being similar to something like ExcelDNA for C#, except that it is dynamic and imports your Python code while Excel is running – so there’s no addin to build and no need to restart Excel when modifying your Python code.
Using PyXLL Python code can be used to write:
- Worksheet functions (user defined functions, called from Excel worksheet formulas)
- Custom Ribbon Bars
- Real Time Data feeds
pywin32 / comtypes
It’s sometimes overlooked that the entire Excel API (or Object Model) is exposed via COM. Everything that can be written as a VBA macro can also be written using the Excel COM API in Python by using pywin32 or comtypes.
The Excel COM API can be used from outside of Excel (e.g. from a running Python prompt, script or Jupyter notebook. If you already know how to do something in VBA then doing the equivalent task in Python via the COM API is generally quite straightforward. Calling a routine using pywin32 or comtypes from Excel (e.g. from a button on the ribbon bar, menu item or macro) can be done using PyXLL.
xlwings provides a convenient wrapper around the Excel COM API for simplifying many common tasks, such as writing pandas dataframes to an open Excel workbook. It uses pywin32’s COM wrappers and gives you access to those, so you can always drop down to using the normal Excel API should you need to.
In the same way as pywin32 and comtypes, xlwings can talk to Excel from a normal Python prompt or Jupyter notebook. For calling code using xlwings from Excel itself, PyXLL provides a convenient way of getting the Excel Application object as an xlwings object (see here). This allows you to script Excel in Python and trigger running your code from a ribbon button or menu item. An example use-case could be a ribbon button for fetching data from a database, building a report, and writing it straight into the running Excel.
xlwings includes a way of writing user defined functions (UDFs) or worksheet functions in Python that are called from a formula in Excel, similar to the user defined functions offered by PyXLL. These rely on a server process running outside of Excel and VBA wrappers to call into that server. It’s an effective solution but has some drawbacks (such as performance, and the fact that those functions are only available from the workbook containing the VBA wrappers).
DataNitro is another API to control Excel from Python. It’s not clear what the advantage over its API and the existing and well understood Microsoft Excel COM API is, but it does allow you to write and run scripts without leaving Excel. It has rudimentary support for user defined functions (worksheet functions), but they run outside of the Excel process and only work if there is only one Excel process running.
It’s currently unknown whether DataNitro is still under active development or not.
Feature Matrix For Calling Python From Excel
|Basic worksheet functions||✔||✔||✔||DataNitro and xlwings use an external Python process, xlwings requires VBA wrapper code|
|Reload without restarting Excel||✔||✔||✔||Modules can be reloaded without restarting Excel. PyXLL also supports 'deep reloading' where all module dependencies are also reloaded.|
|Function documentation||✘||✔||✔||Include Python function docstrings in the Excel function wizard|
|IntelliSense||✘||✘||✔||IntelliSense tooltip as you type – PyXLL integrates with the ExcelDNA Intellisense Addin|
|Object Cache||✘||✘||✔||Pass Python objects between worksheet functions seamlessly via an object cache|
|Ribbon customisation||✘||✘||✔||Give users a rich user experience with custom ribbon menus|
|Real time data||✘||✘||✔||Stream real time data into Excel worksheets|
|Menu functions||✘||✘||✔||Call Python code from the Excel menu|
|Automatically resize arrays||✘||✔||✔||Array functions can resize automatically|
|Volatile Functions||✘||✔||✔||Volatile functions are called every time a worksheet is recalculated|
|Macros||✘||✔||✔||Macros are functions that can be attached to UI elements like buttons or called from VBA|
|Keyboard shortcuts||✘||✘||✔||Keyboard shortcuts can be assigned to macros with PyXLL|
|Thread safe worksheet functions||✘||✘||✔||Improve worksheet responsiveness by using Excel's own threadpool to run worksheet functions concurrently|
|Asynchronous functions||✘||✘||✔||Don't block Excel waiting for long running functions|
|Macro sheet equivalent functions||✘||✘||✔||Call back into Excel from a worksheet function|
|Full Excel API exposed||✘||✔||✔||xlwings uses pywin32, PyXLL users can choose between pywin32, comtypes or xlwings|