Writing User Defined Functions (UDFs)

Exposing functions as UDFs

Python functions to be exposed as UDFs are decorated with the xl_func decorator, imported from the pyxll module. The pyxll module is compiled into the pyxll.xll addin so don't worry that you don't have it.

xl_func(signature, category='PyXLL', help_topic='', thread_safe=False, macro=False):

  • xl_func is a function that returns a decorator for exposing python functions to Excel.
  •  
  • signature is a string that specifies the argument types and, optionally, their names and the return type. If the return type isn't specified the var type is assumed. eg:
    • "int x, string y: double" for a function that takes two arguments, x and y and returns a double
    • "float x" or "float x: var" for a function that takes a float x and returns a variant type.
  •  
  • category is a string used to set what category in Excel the function will belong to.
  •  
  • help_topic is an optional string to the path of a help file that will be displayed when the user chooses the Help button when the function is displayed in Excel.
  •  
  • thread_safe is a boolean that indicates whether the function is thread-safe or not. Thread-safe functions may be called by Excel from different threads if the Excel version is 2007 or newer, and depending on how many CPU cores are available.
  •  
  • macro is a boolean that indicates whether the function should be registered as a macro sheet equivalent function or not. Macro sheet equivalent functions are less restricted in what they can do, and in particular they can call the Excel macro sheet functions. There is an additional overhead for using them though, so you should only use this option when you need it.

Example usage:

from pyxll import xl_func

@xl_func("string name: string", thread_safe=True)
def hello(name):
    """return a familiar greeting"""
    return "Hello, %s" % name

@xl_func("int n: int", category="fibonacci", thread_safe=True)
def fibonacci(n):
    """naive iterative implementation of fibonacci"""
    a, b = 0, 1
    for i in xrange(n):
        a, b = b, a + b
    return a

_fibs = {0 : 0, 1 : 1}
@xl_func("int n: int", category="fibonacci", thread_safe=True)
def fibonacci_fast(n):
    """Dijkstra's implementation of fibonacci"""
    if n in _fibs:
        return _fibs[n]
    if n % 2 == 0:
        fib_2 = fibonacci_fast(n/2)
        _fibs[n] = fib_n = ((2 * fibonacci_fast((n/2)-1)) + fib_2) * fib_2
        return fib_n
    _fibs[n] = fib_n = (fibonacci_fast((n-1) / 2) ** 2) + (fibonacci_fast((n+1) / 2) ** 2)
    return fib_n

Standard argument and return types

Several standard types may be used in the signature specifed when exposing a python UDF. It is also possible to pass arrays and custom types, which are discussed later.

Below is a list of the standard types. Any of these can be specified as an argument type or return type in a function signature. If a type passed from Excel or returned from Python is not (or cannot be converted to) the Python type in this list, an error will be written to the log file and NaN will be returned to Excel if possible.

PyXLL type Python type
int int
float float
string str
bool bool
datetime datetime.datetime
date datetime.date
time datetime.time

In addition there is also the var type. This can be used when the argument or return type isn't fixed. Using the strong types has the advantage that arguments passed from Excel will get coerced correctly, so for example if your function takes an int, you'll always get an int and there's no need to do type checking in your function. If you use a var, you may get a float if a number is passed to your function, and if the user passes a non-numeric value your function will still get called so you need to check the type and raise an exception yourself.

Using arrays

Ranges of cells can be passed from Excel to Python as a 2d array, represented in python as a lists of lists.

Any type can be used as an array type by appending [], as shown in the following example:

from pyxll import xl_func

@xl_func("float[] array: float")
def py_sum(array):
    """return the sum of a range of cells"""
    total = 0.0

    # array is a list of lists of floats
    for row in array:
        for cell_value in row:
            total += cell_value

    return total

Arrays can be used as return values as well. When returning an array remember that it has to be a list of lists. This means to return a row of data you would return [[1,2,3,4]], for example. To enter an array forumla in Excel you select the cells, enter the formula and then press Ctrl+Shift+Enter. Please refer to the Excel documentation for more information about array formula.

Any type can be used as an array type, but float[] requires the least marshalling between Excel and python and is therefore the fastest of the array types.

If you use the var type in your function signature then an array type will be used if you return a list of lists, or if the argument to your function is a range of data.

Using NumPy arrays

To be able to use NumPy arrays you must have NumPy installed and in your pythonpath.

You can use NumPy 1d and 2d arrays as argument types to pass ranges of data into your function, and as return types for returing for array functions. Only up to 2d arrays are supported, as higher dimension arrays don't fit well with how data is arranged in a spreadsheet.

The most common type of NumPy array to use is a 2d array of floats, for which the type to use in the function signature is numpy_array. For 1d arrays, the types numpy_row and numpy_column may be used.

Types other than floating point arrays are supported too, and are listed below for numpy_array. The same applies to the 1d array types.

PyXLL type Python type
numpy_array numpy.array of floats
numpy_array<float>numpy.array of floats
numpy_array<int> numpy.array of ints
numpy_array<bool>numpy.array of bools

Passing errors as values

Sometimes it is useful to be able to pass a cell value from Excel to python when the cell value is actually an error, or vice-versa.

PyXLL has two different ways of doing this.

The first is to use the var type, which passes Excel errors as python exception objects. Below is a table that shows how Excel errors are converted to python exception objects when the var type is used.

Excel errorPython exception type
#NULL! LookupError
#DIV/0! ZeroDivisionError
#VALUE! ValueError
#REF! ReferenceError
#NAME! NameError
#NUM! ArithmeticError
#NA! RuntimeError

The second is to use the special type: float_nan.

float_nan behaves in almost exactly the same way as the normal float type. It can be used as an array type, or as an element type in a numpy array, e.g. numpy_array<float_nan>. The only difference is that if the Excel value is an error, the value passed to python will be float('nan') or 1.#QNAN, which is equivalent to numpy.nan.

The two different float types exist because sometimes you don't want your function to be called if there's an error with the inputs, but sometimes you do. There is also a slight performance penalty for using the float_nan type when compared to a plain float.

Errors can also be returned to Excel using the same types. This way, it is possible to return arrays where some values are errors but some aren't.

Custom types

As well as the standard types listed above, it's also possible to define your own argument and return types that can then be used in your function signatures.

Custom argument types need a function that will convert a standard type to the custom type, which will then be passed to your function. For example, if you have a function that takes an instance of type X, you can declare a function to convert from a standard type to X and then use X as a type in your function signature. When called from Excel, your conversion function will be called with an instance of the base type, and then your exposed UDF will be called with the result of that conversion.

To declare a custom type, you use the xl_arg_type decorator on your conversion function. The xl_arg_type decorator takes two arguments, the name of your custom type and the base type.

Here's an example of a simple custom type:

from pyxll import xl_arg_type

class CustomType:
    def __init__(self, x):
        self.x = x

@xl_arg_type("custom", "string")
def string_to_customtype(x):
    return CustomType(x)

custom can now be used as an argument type in a function signature. The Excel UDF will take a string, but before your Python function is called the conversion function will be used to convert that string to a CustomType instance.

To use a custom type as a return type you also have to specify the conversion function from your custom type to a base type. This is exactly the reverse of the custom argument type conversion described previously.

The custom return type conversion function is decorated with the xl_return_type decorator. For the previous example it could look like:

from pyxll import xl_return_type

@xl_return_type("custom", "string")
def customtype_to_string(x):
    # x is an instance of CustomType
    return x.x

Any recognized type can be used as a base type. That can be a standard type, an array type or another custom type (or even an array of a custom type!). The only restriction is that it must resolve to a standard type eventually.

There are more examples of custom types included in the PyXLL download.

Passing cell metadata

As well as passing arguments from Excel to Python by value, it's also possible to pass data about the cell being passed to your function using the xl_cell type.

When you use the xl_cell type you get passed an object with the properties value, address, formula and note.

value is the value of the cell argument, passed in the same way as the var type.

address is a string representing the address of the cell argument, or None if a value was passed to the function and not a cell reference.

formula is formula of the cell argument, or None if a value was passed to the function and not a cell reference or if the cell has no formula.

note is note on the cell argument, or None if a value was passed to the function and not a cell reference or if the cell has no note.

from pyxll import xl_func

@xl_func("xl_cell cell: string")
def xl_cell_test(cell):
    return "[value=%s, address=%s, formula=%s, note=%s]" % (
                cell.value,
                cell.address,
                cell.formula,
                cell.note)