Parallel data processing in Python
I have an architecture with a queue of URL addresses and classes that process the content of those URLs. At the moment the code works, but slowly: it sequentially pulls a URL out of the queue, sends it to the corresponding class, downloads the URL's content, and processes it.
It would be faster, and a better use of resources, if I could for example read N URLs out of the queue and spawn N processes or threads to handle the downloading and processing.
I would appreciate help with the following:
- What packages could be used to solve this problem?
- What other approaches can you think of?
You might want Python's multiprocessing library. With multiprocessing.Pool, you can give it a function and a list, and it will call the function on each value of the list in parallel, using as many or as few processes as you specify.
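A minimal sketch of that idea is below. The worker function `process_url` is a hypothetical placeholder (the original question's downloading and handler classes aren't shown), so it just returns the URL and its length as a stand-in result:

```python
from multiprocessing import Pool

def process_url(url):
    # Placeholder worker: in the real system this would download the
    # URL's content and dispatch it to the appropriate handler class.
    # Here we simply return the URL together with its length.
    return (url, len(url))

def process_batch(urls, n_workers=4):
    # Pool.map blocks until every URL has been handled and returns the
    # results in the same order as the input list, distributing the
    # work across n_workers processes.
    with Pool(processes=n_workers) as pool:
        return pool.map(process_url, urls)

if __name__ == "__main__":
    urls = ["http://example.com/a", "http://example.com/bb"]
    print(process_batch(urls))
```

Note that since downloading is I/O-bound rather than CPU-bound, threads (e.g. `concurrent.futures.ThreadPoolExecutor`, which has the same map-style interface) can be a lighter-weight alternative to processes for the download step.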