Author Topic: Add-in function that takes "numpy_array<float_nan>" type array doesnt work  (Read 1934 times)

ozonosphere

  • Newbie
  • *
  • Posts: 6
  • Karma: +0/-0
    • View Profile
Hi

When I write the following function in python, it works. But when I give a range in Excel that contains empty strings and run it in Excel, the function returns #NUM! error.

@xl_func("numpy_array<float_nan> array, numpy_array weights, numpy_array no_of_events, int null_size  : cached_object")

def Get_Null_Dist_WeightedAvg(array, weights, no_of_events, null_size):
    '''
    Return a cached null distribution array which is computed from weighted average method
    array: input array (nparray)
    weights: 1d nparray contains all the weights
    no_of_events: 1d nparray contains the no. of events (datapoints) that will be randomly chosen
    null_size: No. of datapoints in null distribution
    '''
   
   
    Null_dist = np.zeros((null_size,))
    Sample_dist = np.zeros((array.shape[1],))
       
    for j in range(0, null_size):
       
        for i in range(0, array.shape[1]):
            if np.sum(~np.isnan(array[:,i])) == 0:
                Sample_dist = 0
            else:
                Sample_dist = np.average(np.random.choice(array[:,i][~np.isnan(array[:,i])], no_of_events))
        Null_dist[j] = np.dot(Sample_dist, weights)
   
    Null_dist.sort()
    return Null_dist


« Last Edit: August 07, 2015, 02:17:52 PM by ozonosphere »

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 615
  • Karma: +14/-1
    • View Profile
Hi,

any exception throw should be written to the log file. If you take a look at that you should be able to see what the problem is, or if not you can post it here.

Best regards,
Tony

ozonosphere

  • Newbie
  • *
  • Posts: 6
  • Karma: +0/-0
    • View Profile
Hi Tony,

Thanks for reply

How do I find this log file?

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 615
  • Karma: +14/-1
    • View Profile
Hi,

you set the path you want the log written to in the config file, see https://www.pyxll.com/docs/index.html#configuration

Best regards,
Tony

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 615
  • Karma: +14/-1
    • View Profile
Actually, this link is better: https://www.pyxll.com/docs/index.html#logging

ozonosphere

  • Newbie
  • *
  • Posts: 6
  • Karma: +0/-0
    • View Profile
Hi Tony

I just checked the log file.

Here are some error msg.

2015-08-07 14:50:19,776 - WARNING : cached_object is already a custom type - overridding
2015-08-07 14:50:19,776 - WARNING : cached_object is already a custom type - overriding
2015-08-07 14:50:19,776 - WARNING : cached_object is already a custom type - overriding
2015-08-07 14:50:19,792 - WARNING : pydevd failed to import - eclipse debugging won't work
2015-08-07 14:50:19,792 - WARNING : Check the eclipse path in C:\Users\charles.li\AppData\Local\Continuum\Anaconda\examples\tools\eclipse_debug.pyc
2015-08-07 14:50:19,792 - WARNING : function cached_object_count is duplicated - overriding
2015-08-07 14:52:31,849 - ERROR : Error calling function Get_Null_Dist_WeightedAvg
2015-08-07 14:52:31,849 - ERROR : Traceback (most recent call last):
2015-08-07 14:52:31,849 - ERROR :   File "pyxll", line 1545, in _xl_func_wrapper
2015-08-07 14:52:31,849 - ERROR :   File "pyxll", line 1536, in _xl_func_wrapper
2015-08-07 14:52:31,849 - ERROR :   File "C:\Users\charles.li\Documents\Pyxll add-in modules\my_excel_addin.py", line 70, in Get_Null_Dist_WeightedAvg
2015-08-07 14:52:31,849 - ERROR :     Sample_dist = np.average(np.random.choice(array[:,i][~np.isnan(array[:,i])], no_of_events))
2015-08-07 14:52:31,849 - ERROR :   File "mtrand.pyx", line 1111, in mtrand.RandomState.choice (numpy\random\mtrand\mtrand.c:10789)
2015-08-07 14:52:31,849 - ERROR :   File "mtrand.pyx", line 949, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:9688)
2015-08-07 14:52:31,849 - ERROR : ValueError: array is too big.





I don't understand why array is too big..
I only used a range of 3979x24 in Excel in which there are empty strings
« Last Edit: August 07, 2015, 02:59:43 PM by ozonosphere »

ozonosphere

  • Newbie
  • *
  • Posts: 6
  • Karma: +0/-0
    • View Profile
and even if I try to evaluate by using a small array, there is still a "array is too big" error. :(

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 615
  • Karma: +14/-1
    • View Profile
Try printing out some debug information about the array, to check it is coming through as you expect it to. If you use a small array it would be fine just to print the array. I expect somewhere your code is not doing exactly what you want, and that could possibly be because the input data to your function is not quite what you're expecting. Anything you print will be written to the log file.

Best regards,
Tony

ozonosphere

  • Newbie
  • *
  • Posts: 6
  • Karma: +0/-0
    • View Profile
Thanks for reply Tony

I ll have a try.

ozonosphere

  • Newbie
  • *
  • Posts: 6
  • Karma: +0/-0
    • View Profile
Solved!.

Tony, thanks for guiding me through using the log file.

Basically, I should have used "numpy_row" type for both "weights" and "no_of_events" arguments because they are 1d array.

I was using "numpy_array" type for these two argument before, and Python automatically interpreted as 2d array, and then when I use the np.random.choice later, it is using too much memory.

I have one final question, I noticed that both "numpy_row" and "numpy_column" can be passed to arguments as 1d numpy array type. What is the difference?

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 615
  • Karma: +14/-1
    • View Profile
Great, glad you managed to figure it out.

numpy_row is a row of data (horizontal) and numpy_column is a column of data (vertical).

Best regards,
Tony