PyXLL The Python Excel Add-In
  • Product
    • Features
    • Get Started
    • Request Demo
    • Download
  • Pricing
  • Resources
    • Documentation
    • Blog
    • Videos
    • FAQ
    • Learn Python
    • Customer Portal
    • About Us
  • Support
    • Documentation
    • Videos
    • FAQ
    • Contact Us
  • Contact Us
Table of Contents
  • PyXLL Documentation
  • Introduction to PyXLL
  • User Guide
    • Installing PyXLL
    • Configuring PyXLL
    • Worksheet Functions
      • Introduction
      • Argument and Return Types
      • Cached Objects
      • Cached Functions
      • Array Functions
      • NumPy Array Types
      • Pandas Types
      • Polars DataFrames
      • Asynchronous Functions
      • Function Documentation
      • Variable and Keyword Arguments
      • Recalculating On Open
      • Recalculating On Reload
      • Interrupting Functions
    • Macro Functions
    • Real Time Data
    • Cell Formatting
    • Charts and Plotting
    • Custom Task Panes
    • ActiveX Controls
    • Using Pandas in Excel
    • Customizing the Ribbon
    • Context Menu Functions
    • Working with Tables
    • Python as a VBA Replacement
    • Menu Functions
    • Reloading and Rebinding
    • Error Handling
    • Deploying your add-in
    • Workbook Metadata
  • Video Guides and Tutorials
  • API Reference
  • What’s new in PyXLL 5
  • Changelog
Close

Pandas Types¶

Pandas DataFrames and Series can be used as function arguments and return types for Excel worksheet functions.

For polars DataFrames, see Polars DataFrames.

Tip

See Using Pandas in Excel for a more complete explaination of how pandas can be used in Excel.

When used as an argument, the range specified in Excel will be converted into a Pandas DataFrame or Series as specified by the function signature.

When returning a DataFrame or Series, a range of data will be returned to Excel. PyXLL will automatically resize the range of the array formula to match the returned data if auto_resize=True is set in xl_func.

The following shows returning a random dataframe, including the index:

from pyxll import xl_func
import pandas as pd
import numpy as np

@xl_func("int rows, int columns: dataframe<index=True>", auto_resize=True)
def random_dataframe(rows, columns):
    data = np.random.rand(rows, columns)
    column_names = [chr(ord('A') + x) for x in range(columns)]
    return pd.DataFrame(data, columns=column_names)

Type Annotations¶

Python type annoations can be used instead of the type signature shown above.

Where type parameters are required you can use xl_arg and xl_return in conjunction with type annotations to specify the type parameters.

The above functions can be written using type annoations as follows:

from pyxll import xl_func, xl_return
import pandas as pd
import numpy as np

@xl_func(auto_resize=True)
@xl_return(index=True)  # pass index=True to the DataFrame return type
def random_dataframe(rows: int, columns: int) -> pd.DataFrame:
    data = np.random.rand(rows, columns)
    column_names = [chr(ord('A') + x) for x in range(columns)]
    return pd.DataFrame(data, columns=column_names)

Type Parameters¶

The following type parameters are available for the dataframe and series argument and return types:

  • dataframe, when used as an argument type

    dataframe<kind=None, index=0, columns=1, dtype=None, dtypes=None, index_dtype=None>

    kind

    Set to pandas to specify that a pandas DataFrame is required. If not set the default DataFrame type will be used.

    The default DataFrame type can be set by the default_dataframe_kind option in the [PYXLL] section of the pyxll.cfg file.

    Defaults to pandas if not set otherwise.

    index

    Number of columns to use as the DataFrame’s index. Specifying more than one will result in a DataFrame where the index is a MultiIndex.

    columns

    Number of rows to use as the DataFrame’s columns. Specifying more than one will result in a DataFrame where the columns is a MultiIndex. If used in conjunction with index then any column headers on the index columns will be used to name the index.

    trim

    New in PyXLL 5.11

    Trim leading and trailing empty rows and columns from the data before constructing the DataFrame.

    See Trimming DataFrame Arguments.

    dtype

    Datatype for the values in the dataframe. May not be set with dtypes.

    dtypes

    Dictionary of column name -> datatype for the values in the dataframe. May not be set with dtype.

    The dictionary can be specified using standard Python dictionary syntax as part of a function signature string. However, often it is more convenient to use the xl_arg or xl_return decorators. These allow you to set type multiple complex parameters more easily, for example:

    @xl_func
    @xl_arg("df", dtypes={"A": "date"})
    def your_function(df: pd.DataFrame):
        ....
    

    Not all column dtypes need to be specified. Any that are not specified will default to var.

    index_dtype

    Datatype for the values in the dataframe’s index.

    multi_sparse [1]

    Return sparse results for MultiIndexes. Can be set to True or False, or 'index' or 'columns' if it should only apply to one or the other.

  • dataframe, when used as a return type

    dataframe<kind=None, index=None, columns=True>

    kind

    Set to pandas force the kind of DataFrame expected to be a pandas DataFrame.

    If not set, any supported kind of DataFrame can be used.

    index

    If True include the index when returning to Excel, if False don’t. If None, only include if the index is named.

    columns

    If True include the column headers, if False don’t.

  • series, when used as an argument type

    series<index=1, transpose=None, dtype=None, index_dtype=None>

    index

    Number of columns (or rows, depending on the orientation of the Series) to use as the Series index.

    transpose

    Set to True if the Series is arranged horizontally or False if vertically. By default the orientation will be guessed from the structure of the data.

    dtype

    Datatype for the values in the Series.

    index_dtype

    Datatype for the values in the Series’ index.

    multi_sparse [1]

    Return sparse results for MultiIndexes.

  • series, when used as a return type

    series<index=True, transpose=False>

    index

    If True include the index when returning to Excel, if False don’t.

    transpose

    Set to True if the Series should be arranged horizontally, or False if vertically.

Tip

For specifying multiple or complex type parameters it can be easier to use the xl_arg and xl_return decorators.

See @xl_arg and @xl_return Decorators for more details about how to use xl_arg and xl_return to specify type parameters.

When passing large DataFrames between Python functions, it is not always necessary to return the full DataFrame to Excel and it can be expensive reconstructing the DataFrame from the Excel range each time. In those cases you can use the object return type to return a handle to the Python object. Functions taking the dataframe and series types can accept object handles.

See Using Pandas in Excel for more information.

Footnotes

[1] (1,2)

The multi_sparse parameter is new in PyXLL 5.3.0.

« NumPy Array Types
Polars DataFrames »
  • Home
  • Product
  • Features
  • Documentation
  • Download
  • Pricing
  • Support
  • Documentation
  • Videos
  • FAQ
  • Learn Python
  • Contact Us
  • About
  • About Us
  • Legal
  • Blog
© Copyright PyXLL Ltd