fbpx

Writing a Custom Excel Function Formatter: “Fill Down” Formatting

Fill formatting of Excel array function using PyXLL custom formatter

PyXLL makes returning arrays and DataFrames of data from Python to Excel simple. Python functions can return entire tables of data directly into Excel as user defined function (or UDF). All we require to do that is PyXLL’s @xl_func decorator.

Python Array Functions in Excel: https://www.pyxll.com/docs/userguide/udfs/arrayfuncs.html

But what if you want to format your data in Excel? Often you will want to return columns in a specific format, as percentages or with date formatting, for example. PyXLL has a solution to that! Use a formatter when declaring your worksheet function and PyXLL will format the result when returning Excel.

Cell Formatting: https://www.pyxll.com/docs/userguide/formatting/index.html

PyXLL includes several standard formatters for common cases such as formatting DataFrames. Formatters are also composable. Try adding them together if you want to apply multiple formatters at the same time.

And if none of the standard formatters, or combinations of standard formatters, are capable of doing what you need? Write a custom formatter! In the rest of this article we will explore how to write a custom formatter.

The Challenge: Writing a “Fill Down” Function Formatter

We will have a Python function that returns a table of data to Excel as a user defined function. This table will have a row of column headers as the first row, followed by a table of data. As the inputs to the function change, the size of the table will also change.

We want to allow the Excel user to be able to format the headers and the data in Excel. When the table updates or changes size, we want the formatting the user has applied to be copied (or filled) to the rest of the table.

Using any of the standard formatters, the formatting specified by the formatter would overwrite any formatting set by the user when the table updates. If we don’t use a formatter, when the table increases in size the user has to apply the formatting to the newly added rows.

To solve this, we will write a customer formatter that copies any formatting set by the user to the entire table in Excel. If the table expands, we want the formatter to copy to existing formatting to the new range. If the table contracts, the we want to clear the formatting from the cells where the previous table was.

Custom Formatters: https://www.pyxll.com/docs/userguide/formatting/customformatters.html

Starting Off: Writing a test Excel dynamic array function

Before we write our customer formatter we need a test function in order to try it out. For this we’ll write a function that returns a list of lists (list of rows of values), where the number of rows and columns are inputs to our function.

from pyxll import xl_func

@xl_func("int rows, int columns: var[][]", auto_resize=True)
def fill_formatter_example(rows, columns):
    """Function that returns a table of data for
    the purpose of demonstrating the FillFormatter class.
    """
    header = [f"COL{i+1}" for i in range(columns)]
    rows = [
        [c + r * columns for c in range(columns)]
        for r in range(rows)
    ]
    return [header] + rows

The above function returns a list of data and when called from Excel looks as follows:

No formatting yet, it looks very plain!

Let’s get started! Write a custom function formatter

To add formatting to our function we could use one of PyXLL’s existing formatters, but here we’re going to write our own formatter.

To write a customer formatter we create a new Python class derived from PyXLL’s “Formatter” class. The Formatter class has a few methods we can override to customize how the formatting works.

In our case, we’re going to override the Formatter.clear and the Formatter.apply methods. The Formatter.clear method gets called before Formatter.apply to clear any previous styles. The Formatter.apply method is what applies the formatting.

For simple styles we could use the Formatter.apply_style method. This is a convenient way to apply a style, but it cannot be used to copy and paste styles as we will need later. Instead, we will use the Excel Object Model API directly. This is the same API that you can use in VBA to automate Excel, and the entire API is also available to Python.

Python as a VBA Replacement: https://www.pyxll.com/docs/userguide/vba.html

To start, let’s just override the apply method on a new Formatter class:

from pyxll import Formatter

class FillFormatter(Formatter):
    """Our custom formatter, based on pyxll.Formatter"""

    # Our overridden apply method. This takes several additional arguments, but
    # we're not going to use them in this example so *args and **kwargs is used here.
    def apply(self, cell, *args, **kwargs):
        # 'cell' is a pyxll.XLCell object corresponding to the range where we need to apply our style.
        #
        # We could use Formatter.apply_style here, but instead we'll use the Excel Object Model
        # and convert the cell to an Excel "Range" COM object.
        #
        # PyXLL can use different packages for working with COM objects. "win32com" is the default,
        # but I have specified it explicitly here.
        xl_range = cell.to_range(com_package="win32com")

        # xl_range is an Excel "Range" object, which is the same as a VBA Range object.
        # We can use this to update the style of the Range.
        # Using the VBA macro recorder is a good way to find out how to do what we need,
        # the generated VBA code can be translated to Python quite easily!
        #
        # For now, we'll just set the interior cell color to yellow.
        xl_range.Interior.Color = 0x00FFFF

Next we need to add our new formatter to the Excel function we wrote in the previous section:

# Create an instance of our custom formatter
my_formatter = FillFormatter()

# Pass it to @xl_func using the 'formatter' option
@xl_func("int rows, int columns: var[][]", auto_resize=True, formatter=my_formatter)
def fill_formatter_example(rows, columns):
    ... same as above ...

Now when we call the function in Excel our formatter applies the yellow interior color automatically! If the size of the array changes, the formatter updates the style automatically.

If you don’t see the style being applied, check your PyXLL log file for errors.

The main bit: Writing our “fill down” formatter

Now we understand how to write a custom formatter, and that we can use the entire Excel Object Model in Python in just the same way as from VBA, we’re ready to write our fill formatter.

What we want to do is to copy the style from the previous range to the new range whenever the array being formatted changes size.

You might have seen that with the code above, when the array shrinks the yellow formatting is automatically cleared from where the array was. The “clear” method of the formatter is what does that. In our case, we don’t want to clear the formatting as we’re going to copy that to the new range. So, we will override the clear method and save the range that should be cleared to use in the apply method. Since the apply method is always called directly after the clear method, we can store this temporarily between these two calls.

from pyxll import Formatter

class FillFormatter(Formatter):
    """Our custom formatter, based on pyxll.Formatter"""

    def __init__(self):
        self.__prev_cell = None

    def clear(self, cell, *args, **kwargs):
        # Instead of clearing any formatting keep a reference to the cell that should be cleared
        self.__prev_cell = cell

    def apply(self, cell, *args, **kwargs):        
        # Get the previous cell and clear the temporary reference to it
        prev_cell = self.__prev_cell
        self.__prev_cell = None

        # Here we want to copy the style from "prev_cell" to "cell"
        # TODO!

We saw in the last section that we can get an Excel “Range” object from the cell. Using that, we have access to all of the same properties and methods that are also available in VBA.

Specifically, we can use Range.Copy and Range.PasteSpecial to copy and paste the formatting from one range to the other.

To call Range.PasteSpecicial we need some Excel constants, which we get from the “win32com.client.constants” module.

from pyxll import Formatter
from win32com.client import constants

class FillFormatter(Formatter):
    """Our custom formatter, based on pyxll.Formatter"""

    def __init__(self):
        self.__prev_cell = None

    def clear(self, cell, *args, **kwargs):
        # Instead of clearing any formatting keep a reference to the cell that should be cleared
        self.__prev_cell = cell

    def apply(self, cell, *args, **kwargs):        
        # Get the previous cell and clear the temporary reference to it
        prev_cell = self.__prev_cell
        self.__prev_cell = None

        # Here we copy the style from "prev_cell" to "cell"
        prev_range = prev_cell.to_range(com_package="win32com")
        new_range = cell.to_range(com_package="win32com")

        prev_range.Copy()
        new_range.PasteSpecial(Paste=constants.xlPasteFormats, Operation=constants.xlNone)

        # End copy/paste mode.
        new_range.Application.CutCopyMode = False

The last line above is to end Excel’s copy/paste mode. Without that line, a dotted border will be left around the cells being copied as there is when you manually copy a range.

Using this formatter, if we apply formatting to the range and then change the size of the returned array the formatting gets copied to the entire range.

So close, but not quite right! Above you can see that the style from the headers gets copied to as the range expands. We need to make some changes so that the header is treated separately from the rest of the table.

Improvement 1: Getting the header right

To prevent the header style from being copied when we copy the style from the old range to the new one, we need to get the range excluding the header.

To do that we can make use of the Range.Offset and Range.Resize properties. VBA is unusual in that properties can take arguments. In Python, properties never take arguments. Instead of accessing the properties directly we use the “GetX” accessor methods “Range.GetOffset” and “Range.GetResize” so that we can pass the arguments required.

from pyxll import Formatter
from win32com.client import constants

class FillFormatter(Formatter):
    """Our custom formatter, based on pyxll.Formatter"""

    def __init__(self, num_headers=1):
        self.__num_headers = num_headers
        self.__prev_cell = None

    def clear(self, cell, *args, **kwargs):
        # Instead of clearing any formatting keep a reference to the cell that should be cleared
        self.__prev_cell = cell

    def apply(self, cell, *args, **kwargs):        
        # Get the previous cell and clear the temporary reference to it
        prev_cell = self.__prev_cell
        self.__prev_cell = None

        # Here we copy the style from "prev_cell" to "cell"
        prev_range = prev_cell.to_range(com_package="win32com")
        new_range = cell.to_range(com_package="win32com")

        # Get just the headers as the top N rows of the ranges
        prev_header = prev_range.GetResize(RowSize=self.__num_headers)
        new_header = new_range.GetResize(RowSize=self.__num_headers)

        # Copy the header style as there could be more columns than previously
        prev_header.Copy()
        new_header.PasteSpecial(Paste=constants.xlPasteFormats, Operation=constants.xlNone)

        # Now get the rest of the ranges, excluding the headers
        prev_rows = prev_rows.GetOffset(RowOffset=self.__num_headers)
        prev_rows = prev_rows.GetResize(RowSize=prev_range.Rows.Count - self.__num_headers)
                    
        new_rows = new_rows.GetOffset(RowOffset=self.__num_headers)
        new_rows = new_rows.GetResize(RowSize=new_range.Rows.Count - self.__num_headers)

        # And copy the style for the table below the headers
        prev_rows.Copy()
        new_rows.PasteSpecial(Paste=constants.xlPasteFormats, Operation=constants.xlNone)

        # End copy/paste mode.
        new_range.Application.CutCopyMode = False

Now we’re getting somewhere! Not only does the header style not get copied to the data below, if the table expands to get wider the header style gets copied across too.

Improvement 2: Clearing old formatting

So far our formatter copied the formatting from the previous range to the new range. This works great when the Excel range is expanding, but what about when the range shrinks?

You will remember that previously we added our own override to the Formatter.clear method. The “clear” method is what would usually clear any formatting for the previous range, before the formatter applies the new format to the current range. Our formatter can’t clear the previous range as it needs that in order to copy the formatting to the updated range. Instead, we need to modify our formatter to clear just the areas that are no longer needed.

To check if the range is expanding or contracting we can use the Range.Rows.Count and Range.Columns.Count properties. If the number of rows or columns in the new range is smaller than the previous range, it is contracting. To get just the part of the range for the formatter to clear, we can use Range.Offset and Range.Resize.

Note: Range.Offset and Range.Resize are properties. VBA has a rather strange feature where properties can take arguments, just like methods! In Python, properties never take arguments. When calling these properties with arguments in Python we have to use their accessor methods, GetOffset and GetResize instead. In general, you can call all Excel Object Model properties with arguments from Python via an accessor method. Accessor methods have the same name as the property with a “Get” prefix.

Once we have identified that the range is contracting, and we have the sub-range to clear using Range.GetOffset and Range.GetResize, we call Range.ClearFormats to clear the formatting.

Our updated “apply” method now looks as follows:

    def apply(self, cell, *args, **kwargs):
        # Reset prev_cell ahead of any future calls.
        prev_cell = self.__prev_cell
        self.__prev_cell = None

        # Get the previous and new Range objects from the cells as win32com objects.
        prev_range = prev_cell.to_range(com_package="win32com")
        new_range = cell.to_range(com_package="win32com")
        
        # Check if the range is shrinking and clear the formatting if it is.
        if prev_range.Rows.Count > new_range.Rows.Count:
            num_rows = prev_range.Rows.Count - new_range.Rows.Count
            rows = prev_range.GetOffset(RowOffset=new_range.Rows.Count)
            rows = rows.GetResize(RowSize=num_rows)
            rows.ClearFormats()
        
        if prev_range.Columns.Count > new_range.Columns.Count:
            num_cols = prev_range.Columns.Count - new_range.Columns.Count
            cols = prev_range.GetOffset(ColumnOffset=new_range.Columns.Count)
            cols = cols.GetResize(ColumnSize=num_cols)
            cols.ClearFormats()

        # .... as previously ....

With this change, when reducing the number of rows or columns, the formatter automatically clears the formatting from the cells no longer in use.

The Final Code: Our complete ‘Fill Down’ custom Excel formatter!

The complete Python code accompanying this post can be found on GitHub here https://github.com/pyxll/pyxll-examples/tree/master/formatters.

The FillFormatter class available on GitHub is almost the same as the code we’ve worked through in this post, but with some small additions to handle some corner cases we’ve not considered above.

With what you’ve learned in this post, now you should be able to see that any formatting you want to achieve for your user defined Excel array functions is possible!

References