Author Topic: multiprocessing and queue, Excel crash on worker at xl = xl_app()  (Read 1366 times)

Patrick

  • Newbie
  • *
  • Posts: 5
  • Karma: +0/-0
    • View Profile
Hi
This is my first program with pyxll (and I am newby on Python)
I think everything is in the code,
I get keys from an excel sheet and from this keys I am looking for information on the intranet. The program is working fine stand alone, and is working fine until I want to write back into the excel sheet. I just want to write the result on the next column from were I got the key, but Excel crashes inthe worker at  xl = xl_app()
Thank you for your help. :-[

Code: [Select]
#import sys
#import os
#import pyxll
#from pyxll import xl_func
from queue import Queue
from threading import Thread
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException


import logging
_log = logging.getLogger(__name__)

try:
    import win32com.client
except ImportError:
    _log.warning("*** win32com.client could not be imported          ***")
    _log.warning("*** some of the automation examples will not work  ***")
    _log.warning("*** to fix this, install the pywin32 extensions.   ***")


import pyxll
from pyxll import xl_macro
import logging
_log = logging.getLogger(__name__)

q = Queue(maxsize=0)
num_worker_threads=3
 

def xl_app():
    """returns a Dispatch object for the current Excel instance"""
    # get the Excel application object from PyXLL and wrap it
    xl_window = pyxll.get_active_object()
    xl_app = win32com.client.Dispatch(xl_window).Application

    # it's helpful to make sure the gen_py wrapper has been created
    # as otherwise things like constants and event handlers won't work.
    win32com.client.gencache.EnsureDispatch(xl_app)

    return xl_app


   
@xl_macro()
def button_example5():
xl = xl_app()
FIELD="F6"
range = xl.Range(FIELD)
range.Value = range.Value + 1


def getweb(line):
    row = []
    result=[]
    _log.info(line[0])       
    addr=line[0]
    _log.info(line[1])

    link=line[1]
   
    try:
        browser = webdriver.Firefox()
        #browser =webdriver.PhantomJS(executable_path='C:\phantomjs\\phantomjs.exe')
        browser.implicitly_wait(1)
        browser.get(link)
                                       
        h1=browser.find_element_by_tag_name('h1').text
        last=browser.find_element_by_id('last-price-value').text
        volume=browser.find_element_by_id('volume').text

       
        row.append(h1)
        try:
            row.append(float(last))
        except:
            row.append(last)

        row.append(volume)
        #row.append(worker.active_count())
        result.append(addr)
        result.append(row)

    except NoSuchElementException:
        result.append(link)
        result.append(['Not found'])

    # return the result to Excel
    browser.close()
    browser.quit()
   
    return result

def worker():                       # Worker from t = Thread(target=worker)
    while True:
        item = q.get()              # queue is filled from Excel
        result=getweb(item)         # Get values from net
                                    # Values are comming back correctly
        Addr=str(result[0])         # first is address where to write result   
        _log.info(Addr)             # Field Excel address OK
        _log.info(result[1][0])     # Filed value         OK

        [b]xl = xl_app()      [/b]         # This line make Excel Crash
        range2=xl.Range(Addr)
        #_log.info(range2)       
        range2.Value=result[1][0]
       
        q.task_done()


@xl_macro()
def morning_button():

   xl = xl_app()
   
   for i in range(num_worker_threads):
       t = Thread(target=worker) #, args=(xl,)
       t.daemon = True
       t.start()


   range0=xl.Range("A13")
   zz=range0.Value
   range1=xl.Range("A14:A24")
   #_log.info(range0)

   x=14
   for sticker in range1:   
       if sticker.Value !=None:
           yy=sticker.Value
           #_log.info("yy:"+yy)
           Link=zz.replace("####",yy)
           #_log.info("Link:"+Link)
           q.put(["B"+str(x),Link])
       x=x+1

   q.join()       # block until all tasks are done

2014-10-01 18:10:02,147 - ERROR : Exception in thread Thread-4:
2014-10-01 18:10:02,147 - ERROR : Traceback (most recent call last):
2014-10-01 18:10:02,147 - ERROR :   File "c:\Anaconda3\Lib\threading.py", line 920, in _bootstrap_inner
2014-10-01 18:10:02,147 - ERROR :     self.run()
2014-10-01 18:10:02,147 - ERROR :   File "c:\Anaconda3\Lib\threading.py", line 868, in run
2014-10-01 18:10:02,147 - ERROR :     self._target(*self._args, **self._kwargs)
2014-10-01 18:10:02,147 - ERROR :   File "C:\Anaconda3\pyxll\morning\morningxlsx1.py", line 105, in worker
2014-10-01 18:10:02,147 - ERROR :     xl = xl_app()
2014-10-01 18:10:02,147 - ERROR :   File "C:\Anaconda3\pyxll\morning\morningxlsx1.py", line 34, in xl_app
2014-10-01 18:10:02,147 - ERROR :     xl_window = pyxll.get_active_object()
2014-10-01 18:10:02,147 - ERROR : RuntimeError: error getting IDispatch from hWnd

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 594
  • Karma: +14/-1
    • View Profile
Re: multiprocessing and queue, Excel crash on worker at xl = xl_app()
« Reply #1 on: October 01, 2014, 06:52:13 PM »
Hi,

that's odd.. not sure why it wouldn't be able to get the COM object if it's found the window... I'll have to look into that.

In the meantime, you could try using win32com to get the excel application object directly using:

Code: [Select]
xl = win32com.client.Dispatch("Excel.Application")

The reason for the pyxll xl_app function is because it always returns the correct excel application if you have multiple running, whereas the method above may sometimes return the wrong one if you have more than one excel session open.

Another option would be to pass the xl app instance you already have from the main thread through to your worker thread. You can't just pass the object to the thread function though, COM objects have to be marshaled across thread boundaries. Here's some code that will marshal and unmarshal a win32com Dispatch object:

Code: [Select]
import pythoncom

stream = pythoncom.CoMarshalInterThreadInterfaceInStream(pythoncom.IID_IDispatch, xl._oleobj_)
# pass stream to your thread func

# in the thread func get the xl app object back
xl = pythoncom.CoGetInterfaceAndReleaseStream(stream, pythoncom.IID_IDispatch)


Please let me know if either of these two solutions work for you.

Best regards,
Tony



Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 594
  • Karma: +14/-1
    • View Profile
Re: multiprocessing and queue, Excel crash on worker at xl = xl_app()
« Reply #2 on: October 01, 2014, 06:55:15 PM »
Hi (again),

just another thought, but it might be neater to collect the results from your worker threads into a queue and then write them out after they've all finished. That way you won't run into problems with multiple threads trying to write back to excel at the same time.

Even if the two workarounds I provided in the previous post work, I think this will be a better solution to your problem.

Take a look at the standard Queue class for a good way to collect the data from the worker threads:
https://docs.python.org/2/library/queue.html

Best regards,
Tony

Patrick

  • Newbie
  • *
  • Posts: 5
  • Karma: +0/-0
    • View Profile
Re: multiprocessing and queue, Excel crash on worker at xl = xl_app()
« Reply #3 on: October 02, 2014, 09:17:11 AM »
Hi Tony,
Thank you for your help,
The first two solution do not work. Excel do not crash but stay locked indefinitely at the same point (writing back to excel).
I will try your solution with the queue and will give you feed back

Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 594
  • Karma: +14/-1
    • View Profile
Re: multiprocessing and queue, Excel crash on worker at xl = xl_app()
« Reply #4 on: October 02, 2014, 09:35:32 AM »
Hi,

but you were able to get the xl object? That's what those two workaround were supposed to do. If you have multiple threads all writing back to excel at the same time I'm not too surprised you get a deadlock, you'd need to use a mutex/lock to ensure only one thread writes to excel at once - but the queue will be much simpler.

cheers,
Tony

Patrick

  • Newbie
  • *
  • Posts: 5
  • Karma: +0/-0
    • View Profile
Re: multiprocessing and queue, Excel crash on worker at xl = xl_app()
« Reply #5 on: October 02, 2014, 10:35:12 AM »
Hi
 :)
The solution with the queue is the right solution.
Thank you
 :)
I am filling the qout wihtin the worker and I am working the queue at the end of the program.

Code: [Select]
   xl = xl_app()
   while not qout.empty():
       item = qout.get()
       _log.info(item)
     
       range1=xl.Range(item[0]+":D"+item[0][1:])
   
       range1.Value=item[1]


Tony

  • Administrator
  • Hero Member
  • *****
  • Posts: 594
  • Karma: +14/-1
    • View Profile
Re: multiprocessing and queue, Excel crash on worker at xl = xl_app()
« Reply #6 on: October 02, 2014, 11:04:15 AM »
Excellent! Glad you were able to get it working :)

cheers,
Tony