Python has become the analytical programming language of choice for data analysis and is reported to be used by over 10M people worldwide. The most popular analytical tool, measured by quantity of users, is the spreadsheet and as Python has grown in popularity, it was inevitable the concept of a Python spreadsheet would arise. Created in 2001, PyExcelerator was the first Python package for interacting with spreadsheets but it is no longer maintained and new libraries have become more popular.
The preference for a Python spreadsheet is largely due to the complimentary benefits of Python and a spreadsheet. Python is great for analysis of big data sets, automating processes, connecting to hosted data sets, and complex analyses. Spreadsheets on the other hand, generally do not share the same strengths as Python and are more useful for viewing and visualizing data, simple computations, and sharing with team members.
What are Python Spreadsheets?
A real spreadsheet is one that accepts normal spreadsheet functions and gives users the flexibility to copy/paste, format, visualize, and modify data. There are other applications that provide a tabular view of data sets and allow users to write Python but those are not the focus of this article as they are not real spreadsheets.
Another class of software not covered here are many of the Python libraries that enable interactions with spreadsheets. The list includes XlsxWriter, pyexcel, pycel, PyExcelerate, openpyxl, and many others.
Criteria When Choosing a Product
When considering what Python spreadsheet to use, think through the following questions:
Will the user know how to get Python running on your computer?
Getting Python running on a computer requires downloading and installing the latest version of Python. Then installing a code editor and maintaining versioning across any packages used. The tasks aren’t terribly complicated but it is not trivial.
Will the user want more spreadsheet capabilities or Python freedom?
Software engineers may prefer to do more work in Python and only display a summary or results in a spreadsheet whereas others may prefer to use Python to import data or run modeling packages with the rest of the analysis done in a spreadsheet.
How big are the data sets that will be used?
Using Python for data analysis enables a user to load many millions of rows of data whereas Excel and Google Sheets top out at 1M rows and suffer from slow performance well before they hit that total. If millions of rows are needed for the analysis, traditional spreadsheets may struggle to support the data but could still be used to display summary statistics or graphs.
Should the analysis be run on a local computer or in the cloud?
A local spreadsheet application can be accessed offline, may be preferable for certain data security concerns, and is limited to the processing power of the local hardware. A hosted application requires an internet connections, necessitates data being uploaded to servers, and will have access to great processing power.
Will the analysis be shared?
A static view or image of the final results can always be shared in a ppt or an email; however, if the analysis is intended to be shared in an Excel file, some programs require installation of the Python add-in wherever the file will be shared. Hosted solutions make sharing easy as there is often a ‘Share’ button and the ability to assign read-only or editor permissions.
Top 4 Python Spreadsheets
The following list outlines the most popular options for using Python in a real spreadsheet to automate tasks, connect to data sets, use python libraries, and other tasks in a spreadsheet.
1. Row Zero
Row Zero is a spreadsheet application that runs in the cloud, supports millions of rows of data, and lets users write Python directly into the application’s code window or spreadsheet cells. No installations are needed on a user’s local machine as Row Zero runs entirely in the cloud. The product can be accessed by logging in on the website. The application consists of a highly performant spreadsheet alongside a Python code window. In the code window, users can write Python functions that are consumed in the spreadsheet, import Python packages, like numpy, Scipy, and Pandas, or connect to hosted data sets and APIs. The product is currently free.
Pyxll is an Excel add-in that allows developers to use python in their spreadsheets, build applications, stream data sets, and create models. Pyxll also facilitates running a Jupyter notebook inside an Excel spreadsheet. The application runs locally on a computer and users will need to have Python running on their local machine. Python IDEs, like Conda, Pycharm, or VSCode, can be used to write and debug the code that is consumed in a spreadsheet and pyxll provides a wide range of documentation and support resources to help users share workbooks, perform analysis, and debug code. The product comes with a free 30 day trial and is $29/month beyond 30 days.
Xlwings is an open source Python package that comes pre-installed with Anaconda and WinPython. An Excel add-in can also be installed. The package enables automating excel with Python scripts, writing macros in Python, writing user defined functions, and building custom add-ins. To use the free version a Python environment must be maintained on a user’s local machine. A professional plan is available that hosts the Python environment and provides increased functionality, like 1-click installers, web-based add-ins that run everywhere and support for Excel 365 and Google Sheets. The professional plan costs $1490 annually.
Pyspread is an open source python spreadsheet. Unlike the two previously mentioned products that plug into Excel, Pyspread has its own spreadsheet interface. The product supports python code in cells, accessing python libraries, importing and exporting CSVs, SVGs and PDFs, and charts. To run the application, a user must already have Python installed on their computer, then download Pyspread and run it. Pyspread is free for all users.
As you can see from the list above, there are a number of great Python spreadsheet options. For additional information on any of the applications, visit their respective websites. For more information on Row Zero, a blazingly fast, connected, programmable Python spreadsheet visit the website.