Sunday 15 August 2010

Filling a queue and managing multiprocessing in Python


I am having this problem in Python:

  • I have a queue of URLs that I need to check periodically
  • If the queue fills up, I need to process every item in it
  • Each item in the queue must be processed by a single process (multiprocessing)

So far I have managed to do this "manually", like this:

    while 1:
        self.updateQueue()

        while not self.mainUrlQueue.empty():
            domain = self.mainUrlQueue.get()
            # If we have not launched any process yet, we need to do so
            if len(self.jobs) < maxprocess:
                self.startJob(domain)
                #time.sleep(1)
            else:
                # If we already have processes started, we need to clear the old
                # processes from our pool and start new ones
                jobdone = 0

                # We cycle through each process until we find a free one;
                # only then do we leave the loop
                while jobdone == 0:
                    for p in self.jobs:
                        #print "entering loop"
                        # If the process has finished
                        if not p.is_alive() and jobdone == 0:
                            #print str(p.pid) + " job dead, starting a new one"
                            self.jobs.remove(p)
                            self.startJob(domain)
                            jobdone = 1

However, it leads to a lot of problems and errors, and I wonder whether I would be better off using a pool of processes. What would be the right way to do this?

Also, a lot of the time my queue is empty, but it can be filled with 300 items in a second, so I am not too sure how to handle things here.

You can use the blocking capability of the queue to spawn multiple processes at startup (with multiprocessing.Pool) and let them sleep until some data is available on the queue for them to process. If you are not familiar with that, you could try to "play" with this simple program:

    import multiprocessing
    import os
    import time

    the_queue = multiprocessing.Queue()


    def worker_main(queue):
        print os.getpid(), "working"
        while True:
            item = queue.get(True)   # blocks until an item is available
            print os.getpid(), "got", item
            time.sleep(1)            # simulate a "long" operation

    the_pool = multiprocessing.Pool(3, worker_main, (the_queue,))
    #                          don't forget the comma here  ^

    for i in range(5):
        the_queue.put("hello")
        the_queue.put("world")

    time.sleep(10)

Tested with Python 2.7.3 on Linux.

This will spawn 3 processes (in addition to the parent one). Each child executes the worker_main function, which is a simple loop that gets a new item from the queue on each iteration; the workers block if nothing is ready to process.

At startup, all 3 processes sleep until the queue is fed with some data. When data becomes available, one of the waiting workers gets that item and starts processing it. After that, it tries to get another item from the queue, waiting again if nothing is available.
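Applied to the original URL problem, the same pattern might look roughly like the sketch below (a minimal sketch in the same Python 2.7 style as the example above; fetch_new_domains and check_domain are hypothetical placeholders, not functions from the question's code). The main loop only refills the queue, and the fixed pool of workers drains it.

    import multiprocessing
    import os
    import time

    url_queue = multiprocessing.Queue()

    def check_domain(domain):
        # hypothetical placeholder for the real per-URL work
        print os.getpid(), "checking", domain

    def worker_main(queue):
        # each worker blocks on queue.get() until a domain is available
        while True:
            domain = queue.get(True)
            check_domain(domain)

    def fetch_new_domains():
        # hypothetical stand-in for updateQueue(): return whatever new URLs
        # need checking on this pass (often none, sometimes hundreds)
        return ["example.com", "example.org"]

    # start a fixed pool of workers once, instead of spawning and reaping
    # processes by hand in the main loop
    pool = multiprocessing.Pool(4, worker_main, (url_queue,))

    while True:
        for domain in fetch_new_domains():
            url_queue.put(domain)
        time.sleep(1)   # workers simply sleep whenever the queue is empty

Because queue.get(True) blocks, an empty queue costs nothing, and a burst of 300 items is simply consumed as fast as the workers free up.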
