Python download manager, mayapy, multithreading and the Global Interpreter Lock

Hi,
I have improved a tool that connects to the asset server and sequentially downloads all the files the artist needs to start their task. This is in Maya, running mayapy.exe on Windows.

The improvement consists of running multiple download threads in the background instead of just one.

Reading about the GIL in the Python interpreter, I learnt that the GIL prevents Python code from running more than one thread simultaneously. Apparently, the interpreter executes a fixed amount of bytecode for one thread, then releases the lock and lets the next thread acquire it.

So my question is: my download manager works faster in multithreaded mode, but because of the GIL it presumably runs slower than it could… does the mayapy interpreter also rely on the GIL? Is there a way to disable it?

thanks.

You cannot disable the GIL in Maya or in any CPython implementation. Very simply put, only one thread at a time can have control of the Python interpreter, and the GIL is the mechanism that enforces that. However, C-based functions are often not affected by this (C-coded modules can release the GIL), and they can run in parallel; blocking I/O is the classic case where the lock gets released. Many functions in the standard library are written in C, and so are large parts of Qt. If you want to speed up your code, try to find functionality written in C and use it rather than rolling your own pure-Python version. The less you keep in pure Python, the faster your code will be. Note that C-based functions must be thread safe or else they will still block other threads.
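For illustration, here is a minimal sketch (not from the original post, assuming a Python 3 mayapy; the URLs and paths are placeholders) of why download threads help despite the GIL: the blocking socket reads inside the standard library's urllib are C-level and drop the lock while waiting, so several downloads can overlap their network waits.

```
import threading
import urllib.request

def download(url, dest_path):
    # urlretrieve blocks on C-level socket I/O; the GIL is released
    # while this thread waits for data, so other threads keep running.
    urllib.request.urlretrieve(url, dest_path)

# hypothetical asset URLs and local paths
jobs = {
    "http://assets.example.com/model_a.ma": r"C:\temp\model_a.ma",
    "http://assets.example.com/tex_a.png": r"C:\temp\tex_a.png",
}

threads = [threading.Thread(target=download, args=(url, path))
           for url, path in jobs.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()
```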

This adds the problem of portability though. Re-compiling C modules for Maya ain’t fun. Fortunately the standard library and Qt are already built in and offer a lot of functionality. Otherwise you have to compromise between speed and portability.

From my own experience using the requests module for HTTP transfers, I see a speed-up with up to 4 concurrent threads. Beyond that the gains diminish quickly as you add more threads.
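As a rough sketch of that setup (not actual production code; the URLs are placeholders, and requests has to be installed somewhere mayapy can import it), a pool of 4 download threads could look like this:

```
from multiprocessing.dummy import Pool   # thread-based pool, same API as multiprocessing
import requests

URLS = ["http://assets.example.com/file_%02d.zip" % i for i in range(10)]

def fetch(url):
    # requests blocks on C-level socket I/O, which releases the GIL,
    # so the 4 workers overlap their network waits.
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return url, len(response.content)

pool = Pool(4)                 # roughly where the gains flatten out
results = pool.map(fetch, URLS)
pool.close()
pool.join()
```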

For tasks like I/O (particularly if you're bound by the network) you can invoke a separate instance of mayapy (or some other Python) and run it in a different process, polling periodically to see when it is done. That makes it non-blocking and allows your user to continue working. Generally Maya (not just Maya Python) is not great with threads - but if your problem is I/O bound then you can use separate processes instead of threads and get most of what you want.
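A bare-bones version of that approach (a sketch only: the mayapy path and the download_assets.py script name are invented, and the check function is meant to be called from something non-blocking like a QTimer or an idle scriptJob) might look like:

```
import subprocess

MAYAPY = r"C:\Program Files\Autodesk\Maya2018\bin\mayapy.exe"   # adjust to your install

# launch the standalone downloader in its own process
proc = subprocess.Popen([MAYAPY, r"D:\tools\download_assets.py", "--shot", "sh010"])

def check_downloads():
    """Poll the worker; call this periodically instead of blocking on it."""
    if proc.poll() is None:
        return False                 # still running, let the user keep working
    return proc.returncode == 0      # finished; 0 means success
```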

http://stackoverflow.com/questions/3044580/multiprocessing-vs-threading-python lays out most of the differences between multiprocessing and threads. You'll need to be careful with the multiprocessing module in Maya though – it spins up a new Maya, which is slow. What I've done in the past is a very small program which I start using mayapy (not a full Maya!) that then invokes multiprocessing to distribute the jobs without the Maya overhead. In your main Maya thread you will just need to check – probably with something like an idle-time scriptJob – to see when the downloads are complete.
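The idle-time check could be as simple as the sketch below (the done.flag marker file is an invented convention that the worker process would have to write once everything has finished downloading):

```
import os
import maya.cmds as cmds

DONE_FLAG = r"C:\temp\downloads\done.flag"   # hypothetical marker written by the worker
_job_id = None

def _check_done():
    if os.path.exists(DONE_FLAG):
        cmds.scriptJob(kill=_job_id, force=True)   # stop polling once the flag appears
        print("Asset downloads complete")
        # ...refresh the UI, load the assets, etc.

_job_id = cmds.scriptJob(event=["idle", _check_done])
```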

At the risk of being Captain Obvious, the only answer to the question "does this change I made speed up my program?" is "benchmark". If "asset server" means the local network, it's probably not worth downloading more than 2 things at a time; for a lot of small files over the internet the optimum might be 4-6 simultaneous connections per server; in either case, that's just speculation. Try different approaches and see what works best. Cranking the numbers really high for things like downloads might result in slowdowns and timeouts.
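A throwaway benchmark in that spirit (placeholder URLs, requests assumed to be importable; swap in whatever download function you actually use) could be as small as:

```
import time
from multiprocessing.dummy import Pool   # thread-based pool
import requests

URLS = ["http://assets.example.com/file_%02d.zip" % i for i in range(20)]

def fetch(url):
    return len(requests.get(url, timeout=30).content)

def time_batch(n_threads):
    pool = Pool(n_threads)
    start = time.time()
    pool.map(fetch, URLS)
    pool.close()
    pool.join()
    return time.time() - start

for n in (1, 2, 4, 6, 8):
    print("%d connection(s): %.1f s" % (n, time_batch(n)))
```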

And, yeah, you should not really worry about the GIL when it comes to I/O operations. The GIL only matters in the context of "I'm doing these computations on the processor, and I know only one thread executes at a time and I don't have any race conditions on my data structures".

Thank you all for your explanations and knowledge.

I would like to try using Python's multiprocessing module instead of threads, to make the asset downloader truly parallel. But here I have a concern: with QThreads I am able to emit signals whenever I want, at every step of the running code, and so update the GUI with progress info.
With multiprocessing I cannot emit signals. I have checked the PySide docs a bit and there doesn't seem to be anything similar; QProcess doesn't match, it's meant for executing other programs. All I have seen you can do with multiprocessing is define a callback that will be executed when a task has finished running (it could, for example, be a callback that updates the GUI), but for my purposes that is not enough.
Is this something that can be done, or if I want to update the GUI progressively am I bound to QThreads?

Have you considered a callback that causes a QObject to emit a signal?
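To illustrate that suggestion (a sketch with invented names, assuming a Python 3 mayapy and PySide-style signals): the apply_async callback fires in the pool's result-handler thread inside the parent process, and a QObject can re-emit it as a signal that Qt queues over to the GUI thread. Note that this only reports per-task completion, not progress during a task – which is exactly the limitation raised below.

```
import multiprocessing
from PySide import QtCore   # PySide2 / PySide6 in newer Maya versions

def download_one(url):
    # placeholder for the real download work done in the child process;
    # on Windows this function must live in an importable module
    return url

class DownloadNotifier(QtCore.QObject):
    task_finished = QtCore.Signal(object)   # carries one task's result

    def on_result(self, result):
        # apply_async calls this once per completed task (not during it);
        # emitting here queues the signal over to the GUI thread
        self.task_finished.emit(result)

def start_downloads(urls, on_task_finished):
    # meant to be called inside Maya, where the Qt event loop is already running
    notifier = DownloadNotifier()
    notifier.task_finished.connect(on_task_finished)   # e.g. a progress-bar update
    # caveat from earlier in the thread: each child process spins up its own
    # (maya)python, which is slow to start, so keep the pool small
    pool = multiprocessing.Pool(4)
    for url in urls:
        pool.apply_async(download_one, (url,), callback=notifier.on_result)
    pool.close()              # no join() here - that would block the GUI
    return notifier, pool     # keep references alive so callbacks still arrive
```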

Hi Claudio,

My problem is not emitting a signal from the callback; my problem is that I want to report the progress of the multiprocessing tasks while they run. The callback is only called when a task has finished, which is not what I want.

First try to write something that runs outside Maya (just a vanilla Python interpreter) and check whether multiprocessing really gives you the boost you want. My bet is that it does not, as your application is not limited by a single CPU but by network I/O.
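A quick standalone check along those lines (plain Python 3, run with a vanilla python rather than mayapy; the URLs are placeholders) could just time the same batch with a thread pool and a process pool and compare:

```
import time
import urllib.request
import multiprocessing
import multiprocessing.dummy

URLS = ["http://assets.example.com/file_%02d.zip" % i for i in range(20)]

def fetch(url):
    return len(urllib.request.urlopen(url, timeout=30).read())

def timed(pool_factory, workers=4):
    # pool_factory is either a thread pool (multiprocessing.dummy.Pool)
    # or a process pool (multiprocessing.Pool); same API for both
    pool = pool_factory(workers)
    start = time.time()
    pool.map(fetch, URLS)
    pool.close()
    pool.join()
    return time.time() - start

if __name__ == "__main__":
    print("threads:   %.1f s" % timed(multiprocessing.dummy.Pool))
    print("processes: %.1f s" % timed(multiprocessing.Pool))
```

If the two numbers come out about the same, the job is network-bound and threads (or a single extra process) are all you need.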