{"id":665,"date":"2018-09-05T16:06:05","date_gmt":"2018-09-05T16:06:05","guid":{"rendered":"https:\/\/www.pyxll.com\/blog\/?p=665"},"modified":"2023-12-20T09:35:28","modified_gmt":"2023-12-20T09:35:28","slug":"how-to-profile-python-code-in-excel","status":"publish","type":"post","link":"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/","title":{"rendered":"How to profile Python code in Excel"},"content":{"rendered":"\n<ol class=\"wp-block-list\">\n<li><a href=\"#intro\">Introduction<\/a><\/li>\n\n\n\n<li><a href=\"#outside-excel\">Write code to run inside <em>and outside<\/em> of Excel<\/a><\/li>\n\n\n\n<li><a href=\"#setup\">Setting up the Example<\/a>\n<ol class=\"wp-block-list\">\n<li><a href=\"#func\">An example function<\/a><\/li>\n\n\n\n<li><a href=\"#setup-excel\">Setting up Excel<\/a><\/li>\n<\/ol>\n<\/li>\n\n\n\n<li><a href=\"#how-long\">How long is it taking?<\/a>\n<ol class=\"wp-block-list\">\n<li><a href=\"#how-long2\">Measuring how long a function takes<\/a><\/li>\n\n\n\n<li><a href=\"#prof\">Profiling an Excel Application<\/a><\/li>\n\n\n\n<li><a href=\"#yappi\">Profiling multi-threaded Python code<\/a><\/li>\n<\/ol>\n<\/li>\n\n\n\n<li><a href=\"#speed\">How to speed up your Python code<\/a>\n<ol class=\"wp-block-list\">\n<li><a href=\"#line-prof\">Line Profiling<\/a><\/li>\n\n\n\n<li><a href=\"#line-prof-excel\">Using the Line Profiler in Excel<\/a><\/li>\n\n\n\n<li><a href=\"#numpy\">Use NumPy or Pandas to replace for loops<\/a><\/li>\n\n\n\n<li><a href=\"#numba\">Speed up your code using Numba or Cython<\/a><\/li>\n\n\n\n<li><a href=\"#threads\">Multi-threading<\/a><\/li>\n\n\n\n<li><a href=\"#async\">Asynchronous Functions<\/a><\/li>\n<\/ol>\n<\/li>\n\n\n\n<li><a href=\"#resources\">Resources<\/a><\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"intro\">Introduction<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><em>All the code from this blog is on <a href=\"https:\/\/github.com\/pyxll\/pyxll-examples\/tree\/master\/profiling\">github<\/a><\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When creating any kind of application, there sometimes comes a point where things are not running as smoothly as you want and you need to think about the performance of your application. The same is true when developing Excel spreadsheets. Using PyXLL to add functionality to Excel is an extremely powerful way to create rich analytics and applications in Excel, but what do you do when your Python based spreadsheet is having performance problems?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When tackling any sort of performance problem, the first step is always to profile the code! This post covers some pointers to make it easier for you to identify performance bottlenecks (or hotspots) and how to profile a running Excel application.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"outside-excel\">Write code to run inside <em>and outside<\/em> of Excel<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">When writing Python code for Excel it&#8217;s helpful to be able to run it outside of Excel as well. Doing so means that you can apply automated testing (e.g. unit testing) to your code, and also you can examine any performance bottlenecks and debug problems without having to run your code in Excel.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To help you do this, PyXLL provides as <em>stubs<\/em> package that declares all the same functions and decorators you use to expose your Python functions to Excel (e.g. <em>@xl_func<\/em>), but that do nothing. This means you can run the same code both in Excel and from any Python prompt.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When you download PyXLL, the stubs package is provided as a <em>.whl<\/em> file, or Python Wheel. It will be named <em>pyxll-{version}-{pyversion}-none-win{32\/64}.whl<\/em>, depending on which version of Python you have downloaded PyXLL for. To install it, open a command prompt and if necessary active your Python environment and use the command <em>pip<\/em> to install the whl file:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&gt;&gt; activate myenv  ;; only if using conda or a venv\n&gt;&gt; pip install .\/pyxll-3.5.4-cp36-none-win_amd64.whl &#x5B;\/code]\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Note:<\/strong><em>Use the path to the file, including &#8220;.\/&#8221; if in the current directory. Without this, pip will look for a wheel with that name from the remote PyPI repository instead of installing the local file.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now you will be able to import pyxll from Python even if not running in Excel via the PyXLL add-in.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"setup\">Setting up the Example<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Before we can look at profiling, we need to set up some code to profile \ud83d\ude42 For this we&#8217;ll use a simple function that calculates the average of a range of numbers, with a tolerance so only numbers above that tolerance are included.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"func\">An example function<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Start your preferred Python IDE (I will be using PyCharm for this blog), or a simple text editor will do, create a new file named &#8220;average_tol.py&#8221; and enter the following:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom pyxll import xl_func\n\n@xl_func\ndef average_tol_simple(values, tolerance):\n    &quot;&quot;&quot;Compute the mean of values where value is &gt; tolerance\n\n    :param values: Range of values to compute average over\n    :param tolerance: Only values greater than tolerance will be considered\n    &quot;&quot;&quot;\n    total = 0.0\n    count = 0\n    for row in values:\n        for value in row:\n            if not isinstance(value, float):\n                continue\n            if value &lt;= tolerance:\n                continue\n            total += value\n            count += 1\n    return total \/ count\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">This function will compute the average of a range of numbers above the specified tolerance, and will also be available to be called from Excel because it&#8217;s decorated with the <em>@xl_func<\/em> annotation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We can test this by importing the Python module and calling the function. If you are using PyCharm you can right click and select &#8220;Run File in Console&#8221;. This will run the file and give you an IPython prompt so you can test the function:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nIn&#x5B;2]: average_tol_simple(&#x5B;&#x5B;1,2,3],&#x5B;4,5,6],&#x5B;7,8,9]], 5)\nOut&#x5B;2]: 7.5\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">If you are not using PyCharm, you can import the function and run it as normal from an IPython prompt. If you don&#8217;t have IPython installed, install it now as we will be using some of its timing features later. IPython can be installed by running &#8220;pip install ipython&#8221; or &#8220;conda install ipython&#8221; if you are using Anaconda.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"setup-excel\">Setting up Excel<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Now we&#8217;ve got a basic function we can try calling it from Excel. If you&#8217;ve not already installed PyXLL, install it now. See the PyXLL <a href=\"https:\/\/www.pyxll.com\/docs\/introduction.html\" target=\"_blank\" rel=\"noopener noreferrer\">Introduction<\/a> for installation instructions. Ensure you point it at the same Python you have been using so far by setting &#8220;executable&#8221; in the pyxll.cfg config file.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Once you&#8217;ve got PyXLL installed successfully, add the path where you&#8217;ve saved &#8220;average_tol.py&#8221; to your Python path and add &#8220;average_tol&#8221; to the list of modules for PyXLL to import:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;PYTHON]\nexecutable = {path to your Python executable}\npythonpath =\n    {path to the folder where average_tol.py is saved}\n\n&#x5B;PYXLL]\nmodules =\n    average_tol\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Now when you reload PyXLL (or restart Excel) our new &#8220;average_tol_simple&#8221; function will be available to call as a worksheet function.<\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"536\" height=\"338\" data-attachment-id=\"670\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/average_tol_simple\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/average_tol_simple.png\" data-orig-size=\"536,338\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"average_tol_simple\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/average_tol_simple.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/average_tol_simple.png\" alt=\"Simple Average With Tolerance Python Function\" class=\"wp-image-670\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/average_tol_simple.png 536w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/average_tol_simple-300x189.png 300w\" sizes=\"(max-width: 536px) 100vw, 536px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"how-long\">How long is it taking?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"how-long2\">Measuring how long a function takes<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Now we&#8217;ve got a basic function working in Excel, how long does it take to run? If it&#8217;s very slow, we can just time it roughly using a watch or the clock on the computer. That&#8217;s error prone and won&#8217;t really give us the precision we need. Luckily, IPython has a way to time how long it takes to run a function accurately. It does this by running the function a number of times and taking the average.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To time the function in IPython we use the &#8220;%timeit&#8221; command. To make the function take a little longer, we&#8217;ll use a range with 32,000 values. To get that range, we use numpy&#8217;s arange function and reshape it into a 2d array.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;In2]: values = np.arange(1, 32001, dtype=float).reshape((320, 100))\n&#x5B;In3]: average_tol_simple(values, 500)\nOut&#x5B;2]: 16250.5\n&#x5B;In4]: %timeit average_tol_simple(values, 500)\n14.9 ms \u00b1 135 \u00b5s per loop (mean \u00b1 std. dev. of 7 runs, 100 loops each)\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Great! So on my laptop, it takes about 15ms to run that function for 32,000 values. But, how long does the same function take in Excel? We&#8217;d expect it to be a slower since there is inevitably some overhead in constructing the Python array from the Excel values and calling the Python function, but how can we test that?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One way is to set up the calculation we want, and then add a menu function to trigger that calculation and time how long Excel takes to do it. There will be some cost involved in calling the Excel API to run the calculation, but if we run a long enough calculation that time will be insignificant and so we can get a reasonably accurate picture of how long it&#8217;s taking.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The following triggers the recalculation of the selected cell 100 times and calculates the mean time and standard deviation. This is roughly the same as what we did previously using &#8220;%timeit&#8221; in IPython. To use it, copy the following code to a new file called &#8220;profiling_tools.py&#8221; and add &#8220;profiling_tools&#8221; to the list of modules in your pyxll.cfg file. When you reload PyXLL you will see a new menu &#8220;Profiling Tools&#8221; (under the Add-Ins tab) with our new menu item &#8220;Time to Calculate.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom pyxll import xl_menu, xl_app, xlcAlert\nfrom win32com.client import constants\nimport win32clipboard\nimport math\nimport time\n\n@xl_menu(\"Time to Calculate\", menu=\"Profiling Tools\")\ndef time_calculation():\n    \"\"\"Recalculates the selected range and times how long it takes\"\"\"\n    xl = xl_app()\n\n    # switch Excel to manual calculation and disable screen updating\n    orig_calc_mode = xl.Calculation\n    try:\n        xl.Calculation = constants.xlManual\n        xl.ScreenUpdating = False\n\n        # get the current selection\n        selection = xl.Selection\n\n        # run the calculation a few times\n        timings = &#x5B;]\n        for i in range(100):\n            # Start the timer and calculate the range\n            start_time = time.clock()\n            selection.Calculate()\n            end_time = time.clock()\n            duration = end_time - start_time\n            timings.append(duration)\n    finally:\n        # restore the original calculation mode and enable screen updating\n        xl.ScreenUpdating = True\n        xl.Calculation = orig_calc_mode\n\n    # calculate the mean and stddev\n    mean = math.fsum(timings) \/ len(timings)\n    stddev = (math.fsum(&#x5B;(x - mean) ** 2 for x in timings]) \/ len(timings)) ** 0.5\n    best = min(timings)\n    worst = max(timings)\n\n    # copy the results to the clipboard\n    data = &#x5B;\n        &#x5B;\"mean\", mean],\n        &#x5B;\"stddev\", stddev],\n        &#x5B;\"best\", best],\n        &#x5B;\"worst\", worst]\n    ]\n    text = \"\\n\".join(&#x5B;\"\\t\".join(map(str, x)) for x in data])\n    win32clipboard.OpenClipboard()\n    win32clipboard.EmptyClipboard()\n    win32clipboard.SetClipboardText(text)\n    win32clipboard.CloseClipboard()\n\n    # report the results\n    xlcAlert((\"%0.2f ms \\xb1 %d \\xb5s\\n\"\n              \"Best: %0.2f ms\\n\"\n              \"Worst: %0.2f ms\\n\"\n              \"(Copied to clipboard)\") % (mean * 1000, stddev * 1000000, best * 1000, worst * 1000))\n\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Note:<\/strong><em>If you can&#8217;t import win32clipboard you need to install the pywin32 package. You can do that with pip, &#8220;pip install pypiwin32&#8221; or with conda, &#8220;conda install pywin32&#8221;.<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"574\" height=\"258\" data-attachment-id=\"673\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/time-to-calc\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/time-to-calc.png\" data-orig-size=\"574,258\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"time-to-calc\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/time-to-calc.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/time-to-calc.png\" alt=\"Time to Calculate Python Menu Item\" class=\"wp-image-673\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/time-to-calc.png 574w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/time-to-calc-300x135.png 300w\" sizes=\"(max-width: 574px) 100vw, 574px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In Excel we can now use this menu item to time how long an individual cell (or range of cells) takes to calculate. Modifying the call to &#8220;average_tol_simple&#8221; to take a range with 32,000 values and using this, on my PC I get a result that it takes about 11 ms to compute. This is actually lower than it took in IPython! However, that&#8217;s because in IPython we were iterating over a numpy array which is slower than iterating over the list that we get with PyXLL. For a fair comparison we need to use a list in both cases and then we see that the time take in IPython is lower:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;In5]: values2 = values.tolist()\n&#x5B;In6]: %timeit average_tol_simple(values2, 500)\n8.14 ms \u00b1 37.2 \u00b5s per loop (mean \u00b1 std. dev. of 7 runs, 100 loops each)\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Now the time in Excel is slightly longer than in IPython, as expected due to passing 32,000 values between Excel and Python.<\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"435\" height=\"136\" data-attachment-id=\"675\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/simple-timings\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/simple-timings.png\" data-orig-size=\"435,136\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"simple-timings\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/simple-timings.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/simple-timings.png\" alt=\"Comparison of Python Timings in Excel\" class=\"wp-image-675\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/simple-timings.png 435w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/simple-timings-300x94.png 300w\" sizes=\"(max-width: 435px) 100vw, 435px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"prof\">Profiling an Excel Application<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Sometimes when a workbook is running slowly it&#8217;s not obvious where that time is being spent. It&#8217;s easy to speculate and guess what function is taking a long time, but to be sure it&#8217;s always good practice to profile. Even if you think you know where the problem is, you can often be surprised after profiling!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Python comes with its own profiler called <a href=\"https:\/\/docs.python.org\/2\/library\/profile.html\" target=\"_blank\" rel=\"noopener noreferrer\">cProfile<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To profile a Python script, cProfile can be run as follows:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\npython -m cProfile -s cumtime your_script.py\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This prints out all the functions called, with the time spend in each and the number of times they have been called.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As mentioned earlier, if you can run your code outside of Excel and can build a script that runs your functions with representative data then that makes things nice and easy \ud83d\ude42 You can just run that script using cProfile and see where the bottlenecks are!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Real world cases are rarely that straightforward, and in order to get to the point where we can focus on a few functions outside of Excel we need to profile <em>what Excel is actually doing<\/em>. Starting to optimize your code without profiling first will waste a lot of time.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We can call the cProfile module from inside Excel from a couple of menu functions. By starting and stopping the profiler in Excel we can get some details about what&#8217;s actually being called. The following code adds two new menu items, one for starting the profiler and another for stopping it and collecting the results.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nfrom pyxll import xl_menu, xlcAlert\r\nfrom io import StringIO\r\nimport win32clipboard\r\nimport cProfile\r\nimport pstats\r\n\r\n_active_cprofiler = None\r\n\r\n@xl_menu(&quot;Start&quot;, menu=&quot;Profiling Tools&quot;, sub_menu=&quot;cProfile&quot;)\r\ndef start_profiling():\r\n    &quot;&quot;&quot;Start the cProfile profiler&quot;&quot;&quot;\r\n    global _active_cprofiler\r\n    if _active_cprofiler is not None:\r\n        _active_cprofiler.disable()\r\n    _active_cprofiler = cProfile.Profile()\r\n\r\n    xlcAlert(&quot;cProfiler Active\\n&quot;\r\n             &quot;Recalcuate the workbook and then stop the profiler\\n&quot;\r\n             &quot;to see the results.&quot;)\r\n\r\n    _active_cprofiler.enable()\r\n\r\n@xl_menu(&quot;Stop&quot;, menu=&quot;Profiling Tools&quot;, sub_menu=&quot;cProfile&quot;)\r\ndef stop_profiling():\r\n    &quot;&quot;&quot;Stop the cProfile profiler and print the results&quot;&quot;&quot;\r\n    global _active_cprofiler\r\n    if not _active_cprofiler:\r\n        xlcAlert(&quot;No active profiler&quot;)\r\n        return\r\n\r\n    _active_cprofiler.disable()\r\n\r\n    # print the profiler stats\r\n    stream = StringIO()\r\n    stats = pstats.Stats(_active_cprofiler, stream=stream).sort_stats(&quot;cumulative&quot;)\r\n    stats.print_stats()\r\n\r\n    # print the results to the log\r\n    print(stream.getvalue())\r\n\r\n    # and copy to the clipboard\r\n    win32clipboard.OpenClipboard()\r\n    win32clipboard.EmptyClipboard()\r\n    win32clipboard.SetClipboardText(stream.getvalue())\r\n    win32clipboard.CloseClipboard()\r\n\r\n    _active_cprofiler = None\r\n\r\n    xlcAlert(&quot;cProfiler Stopped\\n&quot;\r\n             &quot;Results have been written to the log and clipboard.&quot;)\r\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Copy the code above into &#8220;profiling_tools.py&#8221; and add &#8220;profiling_tools&#8221; to the list of modules in your pyxll.cfg. Restart PyXLL and you&#8217;ll find the new menu items should have been added to the Add-Ins menu.<\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"573\" height=\"260\" data-attachment-id=\"693\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/cprofile-start\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/cprofile-start.png\" data-orig-size=\"573,260\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"cprofile-start\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/cprofile-start.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/cprofile-start.png\" alt=\"Start Python CProfiler\" class=\"wp-image-693\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/cprofile-start.png 573w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/cprofile-start-300x136.png 300w\" sizes=\"(max-width: 573px) 100vw, 573px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">To profile your workbook, start the profiler using the menu item and then recalculate everything you want to be included in the profile. When finished, stop the profiler and the results are printed to the log file and copied to the clipboard.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n         32024 function calls (32023 primitive calls) in 0.020 seconds\n\n   Ordered by: cumulative time\n\n   ncalls  tottime  percall  cumtime  percall filename:lineno(function)\n        1    0.009    0.009    0.012    0.012 c:\\github\\pyxll-examples\\profiling\\average_tol.py:8(average_tol_simple)\n      2\/1    0.005    0.002    0.005    0.005 c:\\github\\pyxll-examples\\profiling\\average_tol.py:49(average_tol_numba)\n        1    0.003    0.003    0.003    0.003 c:\\github\\pyxll-examples\\profiling\\average_tol.py:28(average_tol_try_except)\n    32002    0.003    0.000    0.003    0.000 {built-in method builtins.isinstance}\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Once you&#8217;ve identified what functions are taking most time you can focus on improving them.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"yappi\">Profiling multi-threaded Python code<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The Python profiler <em>cProfile<\/em> used in the section above is great for single threaded code, but it only collects profiling data for the thread it&#8217;s called on. It is common to use one or more background threads when writing RTD or async functions, and using cProfile in these cases will not include timing data for the background threads.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Fortunately there is another third-party profiling tool, <a href=\"https:\/\/pypi.org\/project\/yappi\/\">yappi<\/a> that will profile code running in all threads. Yappi can be installed using pip:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&gt;&gt; pip install yappi\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">Profiling using <em>yappi<\/em> is straightforward and can be done in a similar way to what was shown above. All that is needed is to modify the <em>start_profiling<\/em> and <em>stop_profiling<\/em> menu functions to use <em>yappi<\/em> instead of <em>cPython<\/em>.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n@xl_menu(&quot;Start&quot;, menu=&quot;Profiling Tools&quot;, sub_menu=&quot;Yappi&quot;)\r\ndef start_profiling():\r\n    yappi.start()\r\n\r\n@xl_menu(&quot;Stop&quot;, menu=&quot;Profiling Tools&quot;, sub_menu=&quot;Yappi&quot;)\r\ndef stop_profiling():\r\n    yapp.stop()\r\n    yappi.get_func_stats().print_all()\r\n    yappi.get_thread_stats().print_all()\r\n    yappi.clear_stats()\r\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The procedure to profiling your code is the same as above. Start the profiler, recalculate everything you want to profile and then stop the profiler. The results will be written to the log file, only this time everything from any background threads will also be included.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"speed\">How to speed up your Python code<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Optimizing Python code is a whole series of blog posts in itself, but since now we know how to time an individual function in both IPython and Excel it&#8217;s a good time to dig a little deeper and see how we can improve things.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"line-prof\">Line Profiling<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If you already know what function you need to spend time optimizing then the &#8220;line_profiler&#8221; Python package is a useful tool to drill into the function and see where the time&#8217;s going. In our case, since we only have one function we know which one to optimize, but in more complex cases it&#8217;s advisable to spend some time before this stage to uncover exactly which functions are problematic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Install the &#8220;line_profiler&#8221; package using &#8220;pip install line_profiler&#8221; or &#8220;conda install line_profiler&#8221; if using Anaconda.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The line profiler works by running a Python script and looking for functions decorated with a special &#8216;@profile&#8217; decorator. The easiest way to run it for our function is to turn the module into script that will call the function we&#8217;re interested in a number of times. We do this by adding some code at the end that only runs if the module is being run as a script by checking <em>__name__ == &#8220;__main__&#8221;<\/em>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom pyxll import xl_func\n\n@xl_func\n@profile\ndef average_tol_simple(values, tolerance):\n    &quot;&quot;&quot;Compute the mean of values where value is &gt; tolerance\n\n    :param values: Range of values to compute average over\n    :param tolerance: Only values greater than tolerance will be considered\n    &quot;&quot;&quot;\n    total = 0.0\n    count = 0\n    for row in values:\n        for value in row:\n            if not isinstance(value, (float, int)):\n                continue\n            if value &lt;= tolerance:\n                continue\n            total += value\n            count += 1\n    return total \/ count\n\nif __name__ == &quot;__main__&quot;:\n    import numpy as np\n    values = np.arange(1, 32001, dtype=float).reshape(100, 320).tolist()\n    for i in range(100):\n        avg = average_tol_simple(values, 500)\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">Note that now &#8220;@profile&#8221; has been added to the average_tol_simple function.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To run the line profiler we do the following:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&gt;&gt; kernprof -v -l average_tol.py\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">The &#8220;-v&#8221; option tells the profiler to print the results rather than save them to a file. On my PC, the results look like this:<\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"1008\" height=\"423\" data-attachment-id=\"677\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/line-profile\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/line-profile.png\" data-orig-size=\"1008,423\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"line-profile\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/line-profile.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/line-profile.png\" alt=\"Python Line Profiler Output\" class=\"wp-image-677\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/line-profile.png 1008w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/line-profile-300x126.png 300w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/line-profile-768x322.png 768w\" sizes=\"(max-width: 1008px) 100vw, 1008px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Looking at the percentage time we can see that the biggest contributor is testing whether the value is a number of not. In Python, using a try\/except block can be faster than doing explicit checks like this so we can try removing that check and catching and TypeError exceptions that might be raised as follows:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndef average_tol_try_except(values, tolerance):\n    total = 0.0\n    count = 0\n    for row in values:\n        for value in row:\n            try:\n                if value &lt;= tolerance:\n                    continue\n                total += value\n                count += 1\n            except TypeError:\n                continue\n    return total \/ count\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">Using IPython&#8217;s %timeit again we can see that change has really improved the performance of this function.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;In8]: %timeit average_tol_try_except(values, 500)\n3.53 ms \u00b1 60.8 \u00b5s per loop (mean \u00b1 std. dev. of 7 runs, 100 loops each)\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Running the same test as before in Excel also shows a similar speed up, though because we are passing a large range of data and the function the time taken to convert the data from Excel to Python means there is always some small overhead.<\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"675\" height=\"129\" data-attachment-id=\"684\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/try-except-timings\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/try-except-timings.png\" data-orig-size=\"675,129\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"try-except-timings\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/try-except-timings.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/try-except-timings.png\" alt=\"Try\/Except Timings\" class=\"wp-image-684\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/try-except-timings.png 675w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/try-except-timings-300x57.png 300w\" sizes=\"(max-width: 675px) 100vw, 675px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"line-prof-excel\">Using the Line Profiler in Excel<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">As well as using the line profiler from the command line as described above, it&#8217;s also possible to use the line_profiler API to profile functions without leaving Excel.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The following code adds two new menu items to start and stop the line profiler. Once the line profiler is started, if any function decorated with the @enable_line_profiler decorator (also defined below) will be profiled. When the profiler is stopped the results are written to the log file and to the clipboard to be pasted into Excel.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nfrom pyxll import xl_menu\r\nfrom functools import wraps\r\nfrom io import StringIO\r\nimport win32clipboard\r\nimport line_profiler\r\n\r\n# Current active line profiler\r\n_active_line_profiler = None\r\n\r\n@xl_menu(&quot;Start&quot;, menu=&quot;Profiling Tools&quot;, sub_menu=&quot;Line Profiler&quot;)\r\ndef start_line_profiler():\r\n    &quot;&quot;&quot;Start the line profiler&quot;&quot;&quot;\r\n    global _active_line_profiler\r\n    _active_line_profiler = line_profiler.LineProfiler()\r\n\r\n    xlcAlert(&quot;Line Profiler Active\\n&quot;\r\n             &quot;Run the function you are interested in and then stop the profiler.\\n&quot;\r\n             &quot;Ensure you have decoratored the function with @enable_line_profiler.&quot;)\r\n\r\n@xl_menu(&quot;Stop&quot;, menu=&quot;Profiling Tools&quot;, sub_menu=&quot;Line Profiler&quot;)\r\ndef stop_line_profiler():\r\n    &quot;&quot;&quot;Stops the line profiler and prints the results&quot;&quot;&quot;\r\n    global _active_line_profiler\r\n    if not _active_line_profiler:\r\n        return\r\n\r\n    stream = StringIO()\r\n    _active_line_profiler.print_stats(stream=stream)\r\n    _active_line_profiler = None\r\n\r\n    # print the results to the log\r\n    print(stream.getvalue())\r\n\r\n    # and copy to the clipboard\r\n    win32clipboard.OpenClipboard()\r\n    win32clipboard.EmptyClipboard()\r\n    win32clipboard.SetClipboardText(stream.getvalue())\r\n    win32clipboard.CloseClipboard()\r\n\r\n    xlcAlert(&quot;Line Profiler Stopped\\n&quot;\r\n             &quot;Results have been written to the log and clipboard.&quot;)\r\n\r\ndef enable_line_profiler(func):\r\n    &quot;&quot;&quot;Decorator to switch on line profiling for a function.\r\n\r\n    If using line_profiler from the command line, use the built-in @profile\r\n    decorator instead of this one.\r\n    &quot;&quot;&quot;\r\n    @wraps(func)\r\n    def wrapper(*args, **kwargs):\r\n        nonlocal func\r\n        if _active_line_profiler:\r\n            func = _active_line_profiler(func)\r\n        return func(*args, **kwargs)\r\n    return wrapper\r\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Copy the code above into your profiling_tools.py file, and add the @enable_line_profiler decorator to the average_tol_simple function.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nfrom pyxll import xl_func\nfrom profiling_tools import enable_line_profiler\n\n@xl_func\n@enable_line_profiler\ndef average_tol_simple(values, tolerance):\n    total = 0.0\n    count = 0\n    for row in values:\n        for value in row:\n            if not isinstance(value, (float, int)):\n                continue\n            if value &lt;= tolerance:\n                continue\n            total += value\n            count += 1\n    return total \/ count\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">Now reload PyXLL and you&#8217;ll see the two new menu items. Start the line profiler, call average_tol_simple, then stop the line profiler and copy the results to Excel or look at the log file.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"numpy\">Use NumPy or Pandas to replace for loops<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">For operations involving large arrays of data it&#8217;s often much faster (and simpler) to use numpy or pandas rather than writing for loops to process data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Our example function can easily be re-written using numpy:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n@xl_func(&quot;numpy_array&lt;float&gt;, float: float&quot;)\ndef average_tol_numpy(values, tolerance):\n    return values&#x5B;values &gt; tolerance].mean()\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">Note the use of the signature in @xl_func. That tells PyXLL what types to pass to our function, and so handles the conversion of the array of data into a numpy array for us.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This function is significantly faster than our original, and now the time taken is dominated by the cost of converting the Excel data into Python objects.<\/p>\n\n\n\n<figure class=\"wp-block-image is-style-default\"><img decoding=\"async\" width=\"687\" height=\"127\" data-attachment-id=\"687\" data-permalink=\"https:\/\/www.pyxll.com\/blog\/how-to-profile-python-code-in-excel\/numpy-timings\/\" data-orig-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/numpy-timings.png\" data-orig-size=\"687,127\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"numpy-timings\" data-image-description=\"\" data-image-caption=\"\" data-large-file=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/numpy-timings.png\" src=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/numpy-timings.png\" alt=\"NumPy Function Timings\" class=\"wp-image-687\" srcset=\"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/numpy-timings.png 687w, https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/numpy-timings-300x55.png 300w\" sizes=\"(max-width: 687px) 100vw, 687px\" \/><\/figure>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&#x5B;In9]: %timeit average_tol_numpy(values, 500.0)\n61.5 \u00b5s \u00b1 451 ns per loop (mean \u00b1 std. dev. of 7 runs, 10000 loops each)\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This version of the function however doesn&#8217;t handle non-numeric types being passed to it, whereas the previous ones would skip non-numeric values. Adding those checks back in negates the improvement gained by using numpy, which goes to show why profiling is so important!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"numba\">Speed up your code using Numba or Cython<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">For certain types of CPU bound functions you can get significant speedups using <a href=\"https:\/\/numba.pydata.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Numba<\/a> or <a href=\"https:\/\/cython.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Cython<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Numba is a Python JIT compiler that can speed up any array and math heavy calculations with simple annotations. For example, all that&#8217;s need to speed up our average function is to do the following:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom pyxll import xl_func\nimport numba\n\n@xl_func\n@numba.jit(locals={&quot;value&quot;: numba.double, &quot;total&quot;: numba.double, &quot;count&quot;: numba.int32})\ndef average_tol_numba(values, tolerance):\n    total = 0.0\n    count = 0\n    for row in values:\n        for value in row:\n            if value &lt;= tolerance:\n                continue\n            total += value\n            count += 1\n    return total \/ count\n<\/pre><\/div>\n\n\n<p class=\"wp-block-paragraph\">By telling Numba some of the types of the local variables it can the code faster. However, for a function like this where most of the time is spent iterating over the lists, the overhead of using the JIT compiler means there is little or no performance benefit. Real world functions are likely to have better results!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Cython works by converting Python code into C code, which is then compiled into a C extension. When used carefully this can speed up code dramatically. As with all optimization techniques though, constant profiling is required to understand where the time is being spent.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"threads\">Multi-threading<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Excel can run calculations concurrently across multiple threads. This is only applicable if the functions can actually be run concurrently and don&#8217;t depend on each other. For sheets with many functions this can improve the overall time to calculate the sheet by making better use of a modern PCs&#8217; multiple cores.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To enable a function to be run in a background calculation thread, all that&#8217;s required is to add &#8220;thread_safe=True&#8221; to the &#8220;@xl_func&#8221; decorator:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom pyxll import xl_func\n\n@xl_func(thread_safe=True)\ndef thread_safe_func():\n    pass\n<\/pre><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Although thread safe functions may be run concurrently across multiple threads, Python code can&#8217;t always take advantage of this because of something called the &#8220;GIL&#8221; or &#8220;Global Interpreter Lock&#8221;. Basically, while the Python interpreter is doing anything it always acquires the GIL, preventing any other thread from running any Python code at the same time! This means that although your Python code is now running on multiple threads, it only runs on one thread at once and so you see no benefit from this.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">All is not lost however. Many libraries will release the GIL while running long tasks. For example, if you are reading a large file from disk, then Python will release the GIL while that IO operation takes place meaning that it&#8217;s freed up for other threads to continue running Python code. Packages like numpy and scipy also release the GIL while performing numerical computations, so if you are making heavy use of those your code may benefit from being multi-threaded too. If you are using numba or cython to speed up your code, both of those have options to release the GIL so you can make your code thread friendly.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If you have IO bound functions (e.g. reading from disk, a network or a database), or if you&#8217;re packages that release the GIL (like numpy or scipy), then using multiple threads may help. If your functions are purely CPU bound and are just running plain Python code then it may not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For some more information about multi-threading and multi-processing in Python see this post: <a href=\"https:\/\/www.geeksforgeeks.org\/difference-between-multithreading-vs-multiprocessing-in-python\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.geeksforgeeks.org\/difference-between-multithreading-vs-multiprocessing-in-python\/<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For a more detailed explanation of the GIL see <a href=\"https:\/\/wiki.python.org\/moin\/GlobalInterpreterLock\">https:\/\/wiki.python.org\/moin\/GlobalInterpreterLock<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"async\">Asynchronous Functions<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">For IO bound functions asynchronous functions allow Excel to continue working on other functions while the asynchronous function continues in the background. This can be useful in some situations where you don&#8217;t want to block one of the calculation threads from doing other calculations while you wait for a response from a remote server, for example.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For more information about asynchronous functions in PyXLL see <a href=\"https:\/\/www.pyxll.com\/docs\/userguide\/udfs\/asyncfuncs.html\" target=\"_blank\" rel=\"noopener noreferrer\">Asynchronous Functions<\/a> in the user guide.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Note that asynchronous functions must still return during Excel&#8217;s calculation phase, and so they are not a way to free up the user interface while a long running calculation is being performed. For that, use an RTD (<a href=\"https:\/\/www.pyxll.com\/docs\/userguide\/rtd.html#rtd\" target=\"_blank\" rel=\"noopener noreferrer\">Real Time Data<\/a>) function instead.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"resources\">Resources<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>All the code from this blog is on <a href=\"https:\/\/github.com\/pyxll\/pyxll-examples\/tree\/master\/profiling\">github<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/docs.python.org\/2\/library\/profile.html\" target=\"_blank\" rel=\"noopener noreferrer\">cProfile<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/rkern\/line_profiler\">line_profiler<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/numba.pydata.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Numba<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/cython.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Cython<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.geeksforgeeks.org\/difference-between-multithreading-vs-multiprocessing-in-python\/\">Python Multi-Threading vs Multi-Processing<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/wiki.python.org\/moin\/GlobalInterpreterLock\">GlobalInterpreterLock from wiki.python.org<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This post explains how to profile your code so you can understand where it is not performing, and help you to make it run faster.<\/p>\n","protected":false},"author":1,"featured_media":701,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[14,2],"tags":[],"class_list":["post-665","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-performance","category-pyxll"],"acf":[],"jetpack_featured_media_url":"https:\/\/www.pyxll.com\/blog\/wp-content\/uploads\/2018\/09\/shutterstock_672142231.jpg","jetpack_shortlink":"https:\/\/wp.me\/p7l3LP-aJ","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/posts\/665","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/comments?post=665"}],"version-history":[{"count":1,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/posts\/665\/revisions"}],"predecessor-version":[{"id":1729,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/posts\/665\/revisions\/1729"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/media\/701"}],"wp:attachment":[{"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/media?parent=665"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/categories?post=665"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.pyxll.com\/blog\/wp-json\/wp\/v2\/tags?post=665"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}