APLNext Supervisor - Threads Stuck in Queue

General discussions related to APLNext's APLNextSupervisor product.

Postby shoncharik » July 23rd, 2015, 12:44 pm

I am currently running a multithreaded application on a server with 24 threads. The application gets stuck without creating all the threads it needs (e.g. 13 out of 24). The code walker shows that the program is currently in the final :While loop below (the one that polls xQueueSize), and the value of xQueueSize is 11. What changes do I need to make to ensure that all threads get created?

Thanks,
Steve

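⍝ When '#' is destroyed, invoke Stop, Close and Delete on the Supervisor object 'S'
AddEventHandler '#' 'onDestroy' "←'S' ⎕wi ¨'Stop' 'Close' 'Delete'"

(workspacepath function variables)←specs
workspace←ConfigMultiCore workspacepath

⍝ Submit one processing request per thread
thread←1
:While thread≤threads
    rightarg←variables
    leftarg←⊂2⍴(thread,threads)
    ⍝ The result of XBeginCall is sunk (not displayed)
    ←'S' ⎕wi 'XBeginCall' thread workspace function rightarg leftarg
    ⍝ Yield to Windows so pending events can be processed
    ⎕wgive 0
    'Starting Thread # ',(⍕thread),' @ ',⎕ts
    thread←thread+1
:EndWhile

⍝ Wait until the Supervisor queue is empty
:While 0<'S' ⎕wi 'xQueueSize'
    ⎕wgive 0
:EndWhile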

Postby joe_blaze » July 23rd, 2015, 2:42 pm

Hi Steve,

I see from your sample code that you are 'sinking' the result of the 'XBeginCall' method. It would be useful to display the result of this method along with the thread# and show us the output.
It would be useful to establish a Supervisor log file in this code and provide the log file contents to us.
What version of the APLNext Supervisor are you using for this sample?
What is the value of the 'xnprocessors' property in this sample?
Please provide the Supervisor xml-format configuration you are using for this sample.

Thanks.

Postby shoncharik » August 7th, 2015, 11:51 am

Joe,

The result of the 'XBeginCall' method is always 1 (one result per thread requested).
I've attached the log.
The sample is using version 1.9.4.0 of APLNext Supervisor.
The value of 'xnprocessors' is 4 24.
Here is the xml configuration:
<?xml version='1.0'?>
<config>
  <workspaces>
    <workspace id='CHANGE IMPACT AND ASSESSMENT - VERSION 2.10.3.1 - STEVE'>
      <minpool>1</minpool>
      <maxpool>1000</maxpool>
      <timeout>3000000</timeout>
      <debug>1</debug>
      <visible>0</visible>
      <wslocation>Q:\ACTUARIAL ANALYTICS\CHANGE IMPACT AND ASSESSMENT\BETA TESTING\STEVEN\CHANGE IMPACT AND ASSESSMENT - VERSION 2.10.3.1 - STEVE.w3</wslocation>
      <wssize>32000000</wssize>
      <evlevel>2</evlevel>
      <busyid />
      <user_state>started</user_state>
    </workspace>
  </workspaces>
</config>

Any suggestions?

Thanks,
Steve
Attachments
APLNextSupervisor.doc
(1.18 MiB) Downloaded 301 times

Postby joe_blaze » August 29th, 2015, 5:45 am

Hi Steve,

Jairo Lopez and I examined the information you provided, and the Supervisor configuration and log information appear reasonable. We believe that the Supervisor is operating properly and that the number of processors allocated to instances of the APL+Win ActiveX engine is at its maximum for the machine you are using. This conclusion is based on the consideration that when the application-specific 'Controlling Application' submits a processing request to the Supervisor, the Supervisor must request two threads from the Windows operating system for that request. The first thread is needed to create an instance of the APL+Win ActiveX engine, which runs the application-specific 'kernel' function. The second thread is needed to establish an asynchronous wait for that 'kernel' function to complete and return its result. The asynchronous wait is what allows the Supervisor to handle multiple processing requests from the 'Controlling Application'.

So for your machine with 24 processors, the maximum number of simultaneously running instances of the APL+Win ActiveX engine would generally be less than (24-2)÷2, i.e. 11. Any additional requests submitted to the Supervisor by the 'Controlling Application' would be queued until a processor became available and the Windows operating system granted the Supervisor the necessary two threads per processing request. The 'timeout' property of the Supervisor configuration also affects how long a processing request remains in the queue.
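
As a rough illustration of the arithmetic in this post, the following Python sketch computes the (processors-2)÷2 rule of thumb for a few machine sizes mentioned in this thread. The function name and the use of Python are assumptions made for illustration only. The Supervisor exposes nothing like this, and the number of threads Windows actually grants may differ.

```python
def max_concurrent_instances(processors: int) -> int:
    """Rule-of-thumb ceiling quoted in the thread: (processors - 2) ÷ 2.

    Hypothetical helper for illustration; never returns a negative value.
    """
    return max((processors - 2) // 2, 0)


# Machine sizes mentioned in this thread: 8, 24 and 64 processors.
for processors in (8, 24, 64):
    print(processors, "processors ->", max_concurrent_instances(processors))
```

The results for 8, 24 and 64 processors are 3, 11 and 31, which are the figures used elsewhere in this discussion.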

The attached pdf-format document discusses this topic. It will be incorporated into the next release of the APLNext Supervisor documentation and will revise the Q&A section.

Thanks for asking this question.
Attachments
APLNext Supervisor - Multi-threading and Processors (Cores).pdf
Supervisor documentation update
(312.78 KiB) Downloaded 362 times

Postby Davin Church » August 29th, 2015, 12:11 pm

Just out of curiosity, why is it necessary for each allocated thread to entirely consume the resources of a single core? Each core can time-slice many threads at once, can it not? I can surely run several applications on a machine with only one core, as we've done since Windows and similar OSes came out, right? So why can't the OS allocate you multiple threads on any given core, especially if some of them are just sitting and "twiddling their thumbs"?

Postby joe_blaze » August 29th, 2015, 5:12 pm

Hi Davin,

Thanks for the question. The short answer is that how cores are used by Windows is controlled by the Microsoft Windows operating system and not the APLNext Supervisor.

The Supervisor does not directly access the target machine's processors, as that is entirely controlled by the Microsoft Windows operating system. Instead, for each task submitted by the application-specific 'Controlling Application' to the Supervisor, the Supervisor requests two new threads from the Windows operating system. If granted, the Supervisor uses one of those threads to create an instance of the APL+Win ActiveX engine to execute the application-specific 'Kernel' function and the other thread to wait for the response from the 'Kernel' function. Running these two functions on threads distinct from the thread which is running the 'Controlling Application' and the Supervisor itself means that the Supervisor can continue receiving processing requests from the 'Controlling Application' and have those requests satisfied by instances of the APL+Win ActiveX engine executing the 'Kernel' function. Each of these two threads, along with the thread running the 'Controlling Application', is executing a linear program. Neither APL+Win nor the Supervisor directly manages the target machine's processors.

When the Windows operating system provides a thread to an application system, it is up to the operating system and not the application system to associate that thread with a processor (core). The work that thread performs may consume all the resources of that processor. If so, then that core cannot effectively 'time-slice'. If not, I would assume that the Windows operating system would utilize that processor for a different thread. The only way I know to conveniently observe this condition is to use the Performance tab of the Windows Task Manager.

Thread priority options are another interesting topic. Too little priority for a thread running on a processor, e.g. too much 'time-slicing', would mean that the program running in that thread would be unresponsive. Too much priority for such a thread might cause other applications on the target machine to become unresponsive. The Microsoft Windows operating system deals with this issue and not the Supervisor.

One could use APL+Win and the Supervisor on a machine with very few cores, but the performance is likely to be worse than if the Supervisor were not used. This is similar to using many cores to process tiny pieces of data where little work is to be performed on each piece. In both cases the 'data marshalling' costs are likely to exceed the processing costs for the operations involved. Parallel processing schemes are likely to yield significant performance gains only when the application-specific work the 'Kernel' function performs significantly exceeds the 'data marshalling' costs of the 'Supervisor' to send the request to a thread and receive a result from a thread. For example, 'parallelizing' the APL+Win expression +/My_Array is unlikely to perform better than the existing linear version of that algorithm, even for arrays much larger than those observed in most APL+Win-based application systems.
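
The trade-off between 'Kernel' work and 'data marshalling' cost can be made concrete with a deliberately simple cost model. This is a sketch under stated assumptions, not a measurement of the Supervisor: it assumes the work divides evenly among the workers and that each request pays a fixed marshalling cost in series. The function name and all the numbers are hypothetical.

```python
def parallel_speedup(work_seconds: float, requests: int,
                     marshal_seconds_per_request: float, workers: int) -> float:
    """Estimated speedup of a parallel run over a linear run.

    Assumed model: parallel time = work / workers + requests * marshalling cost.
    A result above 1 means the parallel run is faster.
    """
    parallel_time = (work_seconds / workers
                     + requests * marshal_seconds_per_request)
    return work_seconds / parallel_time


# Heavy kernels: 600 s of total work split into 10 requests.
heavy = parallel_speedup(600.0, 10, 0.05, 10)
# Tiny kernels: 0.01 s of total work split into 10 requests.
tiny = parallel_speedup(0.01, 10, 0.05, 10)
print(f"heavy kernels: {heavy:.2f}x, tiny kernels: {tiny:.2f}x")
```

Under these assumed numbers the heavy case speeds up almost tenfold, while the tiny case runs far slower in parallel than linearly because the marshalling cost dominates.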

This issue of 'data marshalling' costs illustrates why the Supervisor can be successful in significantly improving application system performance. The APL+Win programmer can know, and the APL+Win interpreter cannot know, which section of the application system can be run in parallel. These sections are generally implicit or explicit loops. The APL+Win programmer can consolidate that section into an APL+Win 'Kernel' function which is run by the Supervisor. Since that 'Kernel' function is likely to include many APL+Win executable expressions, the 'work' performed significantly exceeds the 'data marshalling' costs associated with using the Supervisor, resulting in a performance gain.

Postby Davin Church » August 29th, 2015, 10:53 pm

I guess my question (in ignorance) boils down to "why do you have to 'request' threads from the OS at all", and "how is this different from just starting another task/thread (like every application does) and letting the OS sort it out as best as it will fit?"

Postby joe_blaze » August 30th, 2015, 4:12 am

Good question!

Starting an additional thread is always a 'request' because the resources of the target machine are limited. The word 'request' indicates that Windows may not grant the application another thread, which can happen because there are simply no unallocated processing resources available when the thread request is made. If no additional threads are available from the Windows operating system, the Supervisor queues up the incoming requests until threads are available or the programmer-defined timeout in the Supervisor instance configuration expires.
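
The queueing behaviour described here can be sketched by analogy with a fixed-size thread pool in Python. This is not the Supervisor's implementation. The pool size of 13 and the 24 requests simply mirror the numbers in Steve's original post, the `kernel` function is a hypothetical stand-in that blocks until released, and no timeout is modelled.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_demo(requests=24, workers=13):
    """Submit `requests` tasks to a pool of `workers` threads.

    Returns (running, queued, results) observed while every worker is busy.
    """
    started = threading.Semaphore(0)   # counts tasks that have begun running
    release = threading.Event()        # lets the running tasks finish
    running = []
    lock = threading.Lock()

    def kernel(i):
        with lock:
            running.append(i)
        started.release()
        release.wait()                 # simulate long-running work
        return i

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(kernel, i) for i in range(requests)]
        for _ in range(min(workers, requests)):
            started.acquire()          # wait until the pool is saturated
        with lock:
            running_now = len(running)
        queued_now = requests - running_now
        release.set()                  # workers finish and drain the queue
        results = [f.result() for f in futures]
    return running_now, queued_now, results


running_now, queued_now, results = run_demo()
print(running_now, "running,", queued_now, "queued,", len(results), "completed")
```

While the pool is saturated, the remaining requests simply wait their turn. Once running tasks finish, the queued ones are picked up and all 24 complete.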

I can't speak for every other application, but in APL+Win one can:

Start another .exe-type program, which runs in a separate thread, e.g. using the Win32 API CreateProcess method, but once started it is not under the direct control of the APL+Win program which started it. Interaction, if any, between APL+Win and the running .exe must be done with a programmer-defined protocol comprehended by both APL+Win and the .exe, e.g. files. Such an interface is likely to be synchronous with respect to APL+Win.

Create an instance of an ActiveX component and use the object model of that component, i.e. methods, properties and events. APL+Win interacts with the methods and properties in a synchronous manner, i.e. APL+Win must wait for the response from the ActiveX component. The events, if any, in the object model of the ActiveX component provide the asynchronous [independent thread] capability and not the APL+Win calling environment.

Now think of using an APL+Win [client] to start one or more APL+Win ActiveX instances [servers] in an attempt to spread the processing load to increase application system performance. When the client uses the 'Call' method to run an APL+Win function in one of the APL+Win ActiveX servers, the client must now wait for that server-side function to complete execution. While that server-side function is running, the client cannot execute other APL statements, because the Call method is synchronous. While the server-side function is running the client cannot use the Call method to run another server-side function in another APL+Win ActiveX instance.

The APLNext Supervisor extends APL+Win so that it can asynchronously interact with multiple APL+Win-based tasks, effectively providing application-level parallel processing. Using the Supervisor, multiple tasks, e.g. 'Kernel' functions, can be simultaneously running in multiple APL+Win ActiveX instances, while the client, i.e. the 'Controlling Application', which started those ActiveX instances can continue executing additional APL+Win statements. No special programming has to be added to the 'Kernel' functions since the Supervisor provides the asynchronous interface to those 'Kernel' functions.
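
The difference between a synchronous call and an asynchronous submission with a completion callback can be sketched in Python. This is only an analogy for the pattern described here, not the Supervisor or APL+Win ActiveX API. The `kernel` workload and both function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(n):
    """Stand-in for a heavy 'Kernel' function (hypothetical workload)."""
    return sum(i * i for i in range(n))


def run_synchronously(args):
    # Each call blocks the client until its result comes back.
    return [kernel(a) for a in args]


def run_asynchronously(args, on_complete):
    client_log = []
    with ThreadPoolExecutor(max_workers=len(args)) as pool:
        for a in args:
            future = pool.submit(kernel, a)
            # The callback fires when this particular kernel finishes.
            future.add_done_callback(lambda f: on_complete(f.result()))
        # The client reaches this line without waiting for any kernel.
        client_log.append("client continued after submitting all requests")
    # Leaving the 'with' block waits for the outstanding kernels.
    return client_log


completed = []
log = run_asynchronously([10, 100, 1000], completed.append)
print(log[0])
print(sorted(completed) == sorted(run_synchronously([10, 100, 1000])))
```

Both approaches produce the same results. The asynchronous version differs only in that the client is not blocked while each kernel runs and is notified through the callback instead.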

In the latest version of the APLNext Supervisor, the asynchronous feedback to the 'Controlling Application' from the 'Kernel' function can occur when that function completes execution, using the ProcessCompleteCallback Supervisor event, and at any stage in the execution of that function, using the ProcessProgressCallback Supervisor event.

Postby brent hildebrand » August 30th, 2015, 2:45 pm

joe_blaze wrote:Now think of using an APL+Win [client] to start one or more APL+Win ActiveX instances [servers] in an attempt to spread the processing load to increase application system performance. When the client uses the 'Call' method to run an APL+Win function in one of the APL+Win ActiveX servers, the client must now wait for that server-side function to complete execution. While that server-side function is running, the client cannot execute other APL statements, because the Call method is synchronous. While the server-side function is running the client cannot use the Call method to run another server-side function in another APL+Win ActiveX instance.


There is a way to use the ActiveX in a multi-threaded way, and that is to use the "Defer" method on the server side. Create an ActiveX instance, load up your functions and data, then use the Defer method on the server to perform your calculations, and use the Notify method to fire an event on the client side when the calculation is done. True, if you just use Call, then the client side is locked up. But with the Defer method, the client is freed up and execution begins in the server.

Postby Davin Church » August 30th, 2015, 2:58 pm

Yes, I understand that much, and have written my own asynchronously multithreaded APL applications using ActiveX techniques. But I'm wondering about the OS side of the puzzle into which you're having to fit the APL Supervisor operation. It would seem to me that you could (in theory) ask for any number of threads from the OS and it should be able to grant them all (up to some very large number) and just share them across whatever cores are available as each thread gets switched back in. In that way you could have dozens of threads running at once on a (say) four-core machine and Windows would just give each one its allotment of CPU time in turn, like everything else. From the OS point of view, I don't see any real difference between one application with a dozen threads and a dozen applications with one thread each, yet you're telling me (I think) that they work entirely differently. I just can't figure out why they shouldn't both be able to work in the same way (from the OS point of view), unless you've got the APL Supervisor explicitly requesting each of its threads to be assigned to a single core and to allow nothing else to run on that core at the same time. Of course, you wouldn't want to oversaturate the cores by assigning way too many threads at once, but that's a management problem rather than a technical one.
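
The general point that an operating system will grant a process more threads than it has cores can be checked with a short Python sketch. The threads here are idle, so this says nothing about how heavy 'Kernel' functions would perform when oversubscribed, nor about any limit inside the Supervisor. The factor of four and the function name are arbitrary choices for illustration.

```python
import os
import threading


def start_waiting_threads(count):
    """Start `count` threads that idle until released; return how many ran at once."""
    release = threading.Event()
    ready = threading.Barrier(count + 1)   # all workers plus the caller

    def worker():
        ready.wait()      # confirm this thread is really running
        release.wait()    # then idle until released

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    ready.wait()          # returns only after every worker has started
    alive = sum(t.is_alive() for t in threads)
    release.set()
    for t in threads:
        t.join()
    return alive


cores = os.cpu_count() or 1
requested = 4 * cores
print(cores, "cores,", requested, "threads requested,",
      start_waiting_threads(requested), "alive at once")
```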

Postby joe_blaze » August 30th, 2015, 8:52 pm

Basically having 'dozens of threads' assigned to a core would reduce performance since the numerous switch outs of programs and data would increase the data marshaling costs. That is why there are Supervisor configuration settings for minpool and maxpool which the application programmer can tune to obtain the best performance.

Postby joe_blaze » August 30th, 2015, 8:58 pm

The Defer and Notify methods can be used, but for most application programmers that is a lot of plumbing to develop. The APL+Win ActiveX engine Notify method is how the Supervisor provides the ProcessProgress event.

Postby Davin Church » August 31st, 2015, 12:07 am

joe_blaze wrote:Basically having 'dozens of threads' assigned to a core would reduce performance since the numerous switch outs of programs and data would increase the data marshaling costs. That is why there are Supervisor configuration settings for minpool and maxpool which the application programmer can tune to obtain the best performance.

I understand that you wouldn't normally want terribly many threads assigned to a given core, but that doesn't mean that it is (or should be) prohibited. It sounds like the Supervisor won't LET you assign a large maxpool (or even a moderate one). If you have an 8-core machine and can't use but 3 of them for work, that sounds like a serious problem. I'm trying to understand why it's necessary to have such a stringent maximum limit. Based on my understanding of the processes going on under the covers, I see no reason why (for instance) I couldn't assign 8 worker threads on an 8-core machine. You tell me that only 3 are available for real work, and I can't understand why every thread (both worker and management) must exist solely on a single CPU without ANY possible sharing of resources. For instance, why does an "asynchronous wait" thread need to consume 100% of a core, making it unavailable to do anything else with at the same time? That kind of work should only need (I'm guessing) about 0.001% of a core's resources, wasting 99.999% of the whole core if it is forced to run nothing else at the same time. That sounds very wasteful to me.

Somehow I feel like I'm not expressing myself well enough to ask my question. Is there something that I'm saying poorly that might be confusing?

Postby joe_blaze » August 31st, 2015, 2:56 am

The Supervisor maxpool maximum value is System.Int32.MaxValue, a big number. One would find that increasing it beyond a machine-specific, application-specific amount would not improve performance. On a machine with 64 cores, running only APL+Win and the Supervisor with heavy-work 'Kernel' functions, I would expect that performance would not increase, and might diminish, if the maxpool were set higher than 31, i.e. (64-2)÷2. One can desire any number of simultaneous, independent threads, but machines are limited in their processing capabilities, so beyond a certain number of threads requested, performance will level off or diminish. This depends on the machine's resources and the APL+Win programmer's design of the application system.

Remember that we are talking threads which are likely to completely consume a core's processing capability, i.e. heavy work done by the 'Kernel' function(s). Processor-limited applications in certain enterprises are common, the classic example being an application using stochastic modeling. In these cases there will be significant time periods when processing requests are submitted to the Supervisor by the 'Controlling Application' and there are no available resources on the target machine to satisfy those requests, so that those requests must be queued up.

Also when waiting for execution completion of a 'Kernel' function, the frequency of the checking will affect how much 'work' the waiting thread performs and thus how much processor time it consumes.
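
How the checking frequency affects the work done by a waiting thread can be illustrated in Python by comparing a polling wait with a blocking wait. This is a generic sketch and makes no claim about how the Supervisor itself waits. The timings and function names are hypothetical.

```python
import threading
import time


def wait_by_polling(done, interval_seconds):
    """Check the flag repeatedly; return how many checks were needed."""
    checks = 0
    while not done.is_set():
        checks += 1
        time.sleep(interval_seconds)
    return checks


def wait_by_blocking(done):
    """Sleep inside the OS until signalled; a single wait call suffices."""
    done.wait()
    return 1


def demo(kernel_seconds=0.3, interval_seconds=0.01):
    # A timer stands in for a 'Kernel' function that finishes later.
    polled_done = threading.Event()
    threading.Timer(kernel_seconds, polled_done.set).start()
    polled = wait_by_polling(polled_done, interval_seconds)

    blocked_done = threading.Event()
    threading.Timer(kernel_seconds, blocked_done.set).start()
    blocked = wait_by_blocking(blocked_done)
    return polled, blocked


polled, blocked = demo()
print("polling checks:", polled, "| blocking waits:", blocked)
```

A shorter polling interval means more checks, and so more processor time, for the same wait. A blocking wait leaves the thread idle until it is signalled.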

The Supervisor uses threads, but does not control the allocation of threads to cores, which is the domain of the Microsoft Windows operating system. Look at the Performance tab of the Windows Task Manager to view how processors are being used. Certainly the Windows operating system attempts to share processors among threads, which can be directly observed if, in addition to APL+Win and the Supervisor, there are other applications running on the target machine. In this case the APL-based application runs slower than if the other applications were not running, because that APL-based application would otherwise use all available processors on the target machine, but now some are being used by the other running applications. For enterprise-significant, heavy processing using APL+Win and the Supervisor it would not be prudent to simultaneously run other applications on the target machine.

Davin asks: I can't understand why every thread (both worker and management) must exist solely on a single CPU without ANY possible sharing of resources.
The Microsoft Windows operating system controls whether a thread runs on one or more processors, not the Supervisor. The Supervisor ensures that the worker and management functions are running on separate threads. These threads are likely to be running on separate processors because the situation we are discussing involves 'Kernel' functions which perform processing heavy enough to consume all the processing capability of the cores to which they are assigned by Windows.

Postby Davin Church » August 31st, 2015, 3:34 pm

Well, that finally makes more sense (if I'm understanding you correctly). I thought you'd been saying all along that you CANNOT set the maximum threads to more than (n-2)÷2, which would only be possible if you were restricting each thread to separate cores, which is why I was so confused. I think that it would normally be perfectly reasonable to run anywhere up to n kernel function threads, even those that use most of a core, because other Supervisor work is likely to require little overhead and I would expect its impact to be minimal.

So if you can set the maxpool count to anything, why then was the original poster only getting to use about half the available threads on his CPU? His description sounded to me like the Supervisor was refusing to start more threads even though he requested more to be created.