Author Topic: thread_safe xl_func argument  (Read 825 times)

kris.metha

  • Newbie
  • *
  • Posts: 13
  • Karma: +0/-0
    • View Profile
thread_safe xl_func argument
« on: July 24, 2014, 10:58:17 AM »
Hi Tony,

Thanks again for your help previously.

I have a question regarding  thread_safe . My understanding is that if thread_safe = True is passed to xl_func, the function will be called concurrently in MS Excel (2010)? I am experiencing some inconsistency as some of the functions are executed concurrently, while some are sequentially though these are  registered with thread_safe = True . 

In PyXLL may I know how the thread_safe is implemented ? Any pointer on where to look would be very helpful.

My setting is:
  • Excel 2010
  • PyXLL : 2.2.0.beta2

Kind regards,
Kris

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 574
  • Karma: +13/-1
    • View Profile
Re: thread_safe xl_func argument
« Reply #1 on: July 24, 2014, 12:03:19 PM »
Hi Kris,

'thread_safe' is more like a hint to excel that your function is thread-safe, and so if it chooses to it may call it from a different thread than the main thread. Excel creates a pool of threads (you can set the number of threads it uses in the Excel options) and then uses those to run thread safe functions when possible. It doesn't start a new thread for each function, and some thread safe functions will still be run on the main thread.

There are some things that will prevent your functions from being run in a thread even if you specify the function is thread safe:
  • If any of the argument types or return type you function uses are not thread safe then the whole function will not be run in a thread (all the standard argument conversion functions that pyxll uses are thread safe - see https://www.pyxll.com/docs/udfs.html?highlight=thread#pyxll.xl_arg_type for details about custom types).
  • If your function is a macro sheet equivalent function it cannot be run in a thread

If you set the pyxll log level to 'debug' you will get warnings in your log file for either of these cases.

The actual deciding of which calls to make on what threads and the management of the thread pool is done by Excel; PyXLL simply tells Excel that the function may be called from a separate thread when it registered the function using the Excel C API function xlfRegister. You can see more details about that here (look for 'Registering Worksheet Functions as Thread-Safe'):
http://msdn.microsoft.com/en-us/library/office/bb687900(v=office.15).aspx

I hope this helps,
Tony

kris.metha

  • Newbie
  • *
  • Posts: 13
  • Karma: +0/-0
    • View Profile
Re: thread_safe xl_func argument
« Reply #2 on: July 25, 2014, 02:04:13 AM »
Hi Tony,

Thank you for the info. That is very helpful.

Now I understand where the issue is. Essentially, what I am trying to do is to get data from feed and create Pandas.DataFrame (return as cached_object in Excel Cell) in parallel.

Now the current return type registration is:
@xl_return_type("cached_object", "string", macro=True, allow_arrays=False, thread_safe=False)

Since I try to make it thread_safe, I amended as below:
@xl_return_type("cached_object", "string", macro=False, allow_arrays=False, thread_safe=True)
This obviously fails due to xlfCaller() which requires macro=True

I am also not sure whether the event handler would work correctly if macro=False :
_setup_event_handler(_global_cache)

In this case, do you know whether there is a work around ? I foresee that creating (return type) cache_object in thread_safe mode would be a rather common scenario?

Kind regards,
Kris

 

kris.metha

  • Newbie
  • *
  • Posts: 13
  • Karma: +0/-0
    • View Profile
Re: thread_safe xl_func argument
« Reply #3 on: July 25, 2014, 06:05:47 AM »
Hi Tony,

So I just realised that xlfCaller().address can be bypassed by using other attributes of XLCell

I have rewritten the @xl_return_type for cache_object to be thread safe as shown below :


#------------ THREAD SAFE ----------------------------------------

def baseExcel(num,b=26,numerals='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    return ((num < b) and numerals[num]) or (baseExcel(num // b - 1, b, numerals) + numerals[num % b])


@xl_return_type("cached_object", "string", macro=False, allow_arrays=False, thread_safe=True) #KRIS making it thread safe, macro has to be False
def cached_object_return_func(x):
    """
    custom return type for objects that should be cached for use as
    parameters to other xl functions
    """
    # this requires the function to be registered as a macro sheet equivalent
    # function because it calls xlfCaller, hence macro=True in
    # the xl_return_type decorator above.
    #
    # As xlfCaller returns the individual cell a function was called from, it's
    # not possible to return arrays of cached_objects using the cached_object[]
    # type in a function signature. allow_arrays=False prevents a function from
    # being registered with that return type. Arrays of cached_objects as an
    # argument type is fine though.

    if _have_win32com:
        # _setup_event_handler creates an event handler for Excel events to
        # ensure the cache is kept up to date with cell changes
        _setup_event_handler(_global_cache)

    # get the calling cell in [book]sheet!address format
    caller = xlfCaller()
    rowNum = caller.rect.first_row
    colNum = caller.rect.first_col
    colXL = baseExcel(colNum)
    cell =  colXL + str(rowNum+1)
   
    wb_and_sheet = caller.sheet_name
    assert wb_and_sheet.startswith("[") and "]" in wb_and_sheet, \
        "Calling cell not in [book]sheet!address format: %s" % wb_and_sheet
    workbook, sheet = wb_and_sheet.strip("[").split("]", 1)
    while "''" in sheet:
        sheet = sheet.replace("''", "'")   

    # update the cache and return the cached object id
    return _global_cache.update(workbook, sheet, cell, x)

Hopefully this is helpful for those who want to make cache_object thread safe.

Kind regards,
Kris

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 574
  • Karma: +13/-1
    • View Profile
Re: thread_safe xl_func argument
« Reply #4 on: July 25, 2014, 10:30:31 AM »
Hi Kris,

great, thanks for this. I'll take a look at getting this example updated!

Best regards,
Tony

dblob

  • Newbie
  • *
  • Posts: 16
  • Karma: +0/-0
    • View Profile
Re: thread_safe xl_func argument
« Reply #5 on: September 26, 2014, 09:26:40 PM »
Kris (hope you are reading this)-

I have a few questions:

1) Did you go through the exercise of making _global_cache thread safe?  That seems important
2) Why did you have to remove xlfCaller().address and replace it with your own equivalent code?   Is it not thread-safe?  How did you know?

Thanks,
Dave