high cpu usage for "Process Files" step, other options?


daveronson
06-07-2004, 10:01 AM
I have noticed that the "Process Files" step is a very CPU-intensive operation. It causes 89%-100% CPU usage (2.8 GHz Xeon processor).

Typically I am recursively searching subfolders for *.sql (in one folder there are over 600 stored procedures that I process)

I was wondering:
1) Is anyone else seeing this behaviour, and is it normal?
2) Are there any other options besides writing my own code or scripts to improve performance?

Thanks.

kevina
06-07-2004, 01:30 PM
I certainly don't see that behaviour in any testing here [I'm seeing maybe 3-5% CPU usage on a P4 2.4 GHz for a Process Files step looking for all *.exe files on an entire partition]. There is nothing about a Process Files step that should make it CPU-intensive; a Process Files step is basically the equivalent of dir /s (recursive dir).

Are you sure it isn't a child step of the process files step that is consuming CPU cycles (which is being invoked once for each matching file)? You can verify this by unchecking the child steps and running the Process Files step again (which will then just recursively obtain a list of files to process and terminate).

daveronson
06-07-2004, 07:41 PM
hmmm, I must either be misunderstanding something or have a serious config problem.

I'm using version 5.4. I create a new project with a single step, which is "Process Files". The directory is C:\, the mask is *.exe and "recursively search subfolders" is checked off.

I start the build and after a short period of time, it processes each executable (it finds 1592 of them) (1 of 1592, 2 of 1592, etc....).

I ran perfmon during this and the VisBuildPro process averaged 94% cpu.

Does this sound abnormal?

kevina
06-08-2004, 09:23 AM
Actually, I probably didn't replicate what you were describing very completely. I didn't let the Process Files step complete (I didn't let the step finish loading the list of files and iterate over them). During the iteration stage of execution of a Process Files step, my CPU did peg at 100%.

Obviously the first stage of the Process Files action is disk-intensive and consequently does not use a high percentage of CPU cycles (because typically it is waiting on the disk). Once this stage of execution is complete, it then iterates over this list of files (in memory at this point) which becomes a CPU intensive operation.
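To make the two stages concrete, here is a rough Python sketch that times them separately. It is only an illustration and says nothing about how Visual Build Pro is implemented internally; `collect_files` and `timed` are hypothetical helper names, and the per-file work in stage 2 is a stand-in. Comparing wall-clock time with CPU time for each stage hints at whether it is mostly waiting on the disk or mostly busy on the processor.

```python
import fnmatch
import os
import time


def collect_files(root, mask):
    """Stage 1: recursively gather matching paths (mostly disk-bound)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare in lower case so the mask matches regardless of case.
            if fnmatch.fnmatch(name.lower(), mask.lower()):
                matches.append(os.path.join(dirpath, name))
    return matches


def timed(func, *args):
    """Call func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    return (result,
            time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


if __name__ == "__main__":
    files, wall, cpu = timed(collect_files, ".", "*.exe")
    print(f"stage 1: {len(files)} files, wall={wall:.2f}s cpu={cpu:.2f}s")

    # Stage 2: iterate over the in-memory list (placeholder per-file work).
    _, wall, cpu = timed(lambda: [len(path) for path in files])
    print(f"stage 2: wall={wall:.2f}s cpu={cpu:.2f}s")
```

If wall time is much larger than CPU time, the stage is mostly waiting (typically on the disk); if the two are close, the stage is CPU-bound.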

Any running application requesting CPU cycles will be given any idle cycles (hence CPU usage will typically be 100% in this scenario). If multiple apps request CPU time, then any available cycles will be time-shared between these apps. This is normal behaviour and should be expected.

During my test, the system did not stop being responsive even though the CPU usage was at 100% (because Visual Build Pro was not hogging resources and was only using CPU time that was not needed by any other process).

For an application to use less than the available CPU cycles when needed would be a non-standard throttling of application performance (for any application). Unless you are seeing system non-responsiveness during this test, I believe what you are seeing is normal.

Of course, if a child step of the Process Files step is hogging resources, then that is an entirely different matter (a compiler, for example).

daveronson
06-08-2004, 10:11 AM
Ok, this sounds comparable to my situation then.

I should state that the real reason I was focussed on CPU was that I am concerned about the amount of time the Process Files step takes. Even with no sub-step it takes a long time to process the 1592 exe files (16 minutes, 5 seconds to be precise).

Back to my real world example, I am processing slightly over 600 sql files and passing the filename to osql. This part of my build is very time consuming and I'm looking for ways to optimize it.

kinook
06-08-2004, 03:44 PM
That does seem a bit slow. On a 1.1Ghz P3 laptop here, with file logging enabled and using the GUI app, a Process Files step (w/ a single Log Message child to log the filenames) for *.exe in C:\Program Files (1270 files) takes about 5 minutes to iterate over all files after the Process Files step has loaded the filenames; the Console app runs the same steps in about 3 minutes. Can you describe the hw/sw configuration of the box in a little more detail?

One alternative method would be to use a single Run Program step with a command of:

%DOSCMD% FOR /R "drive:\path" %%i IN (*.sql) DO osql "%%i" <add'l osql switches>

(see http://www.kinook.com/Forum/showthread.php?threadid=362 for caveats on the exitcode returned when using FOR).
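For anyone who would rather script the loop than rely on FOR, the same idea can be sketched in Python. This is an untested outline rather than a recommended replacement: the function names are made up for illustration, and it assumes each script is handed to osql through its `-i` (input file) switch, with any other switches supplied by the caller.

```python
import os
import subprocess


def find_sql_files(root):
    """Yield every *.sql file under root, in a stable sorted order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subfolders in a predictable order
        for name in sorted(filenames):
            if name.lower().endswith(".sql"):
                yield os.path.join(dirpath, name)


def build_osql_command(sql_path, extra_switches=()):
    """Build the argument list for one osql call (assumes -i for the script)."""
    return ["osql", "-i", sql_path, *extra_switches]


def run_all(root, extra_switches=()):
    """Run osql once per script and stop at the first non-zero exit code."""
    for sql_path in find_sql_files(root):
        result = subprocess.run(build_osql_command(sql_path, extra_switches))
        if result.returncode != 0:
            return result.returncode
    return 0
```

Unlike the FOR loop, `run_all` stops at the first non-zero exit code and returns it, which may make failures easier to detect from a Run Program step. Whether osql itself reports SQL errors through its exit code depends on the switches passed to it.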

daveronson
06-09-2004, 06:12 PM
The hardware story is a little complicated. Our build server is running in a VM on VMware's ESX server. There are two 2.8 GHz Xeon procs with 2.5 GB of RAM. Depending on what the other VMs are doing, there may or may not be a performance hit.

I have done some tweaking to the VM to give it a greater share of the processors which improved my build time. I have also switched to the VisualBuildCmd.exe application after you mentioned it. That change lopped off about 20 minutes on my build right there which is great.

I am curious about the Process Files performance though. In a few cases I've been forced to use non-VisualBuild steps to accomplish a task, for example using the PuTTY tools (pscp, plink), but I'd hate to have to do that for this step, which is pretty core functionality IMHO that I'd love to get working better. Just out of curiosity I wrote a small C# app that accomplishes the *.exe search and outputs each filename to the console. The first time I ran it, it took about a minute with approximately 30% CPU usage. Subsequent runs took 7 seconds, I'm guessing due to OS caching. Is there anything special that the VisualBuild "Process Files" step does that would explain the overhead?
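The C# source isn't posted here, but the kind of test described is roughly equivalent to the following Python sketch (the function names are mine, chosen for illustration). It recursively finds files matching a pattern, prints each one, and reports how long that took, so running it twice in a row should show the cold-cache versus warm-cache difference.

```python
import sys
import time
from pathlib import Path


def list_matching(root, pattern):
    """Return all files under root whose name matches pattern."""
    return [p for p in Path(root).rglob(pattern) if p.is_file()]


def timed_search(root, pattern, out=sys.stdout):
    """Print each match to out and return (count, elapsed_seconds)."""
    start = time.perf_counter()
    matches = list_matching(root, pattern)
    for path in matches:
        print(path, file=out)
    return len(matches), time.perf_counter() - start
```

Calling `timed_search("C:\\", "*.exe")` twice and comparing the two elapsed times is the experiment; the first run pays for reading the directory entries from disk, and the second is mostly served from the OS cache.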

kinook
06-10-2004, 06:32 AM
A simple C# app like you mentioned and Visual Build Pro aren't directly comparable. VBP provides a multi-threaded, extensible build environment (extensible at runtime via scripting events, custom actions, build rules, macros, custom logging components, etc.), and all of this infrastructure adds some overhead to the process (not to mention the work that is done to maintain backwards compatibility with earlier versions).

However, I agree that there may be some room for improvement in raw performance here. For our builds, the processing done for each file when using a Process Files action typically dwarfs the iterating overhead of VBP itself, and there haven't been complaints about it before now, so we weren't aware of this being an issue. Thanks for the feedback; it's on our list to investigate whether any optimizations can be implemented.

daveronson
06-10-2004, 07:48 AM
That makes sense. Thanks for your time.

BTW, v5.4 adds some nice features, keep up the good work!