The goal of the assignment was to improve some code and prove the improvement with gprof reports. For the code to profile and enhance, I chose a random, small-scale project from Sourceforge.net. Finddouble, by Bernard "bsegones", is a small tool to detect duplicate files within a directory structure. Available: http://sourceforge.net/projects/finddouble/

The tool traverses the directory structure and, for each file, finds the files with the same size and then proceeds to a binary comparison.

First profile information

      %   cumulative   self              self     total
     time   seconds   seconds    calls   s/call   s/call  name
     76.77      1.25     1.25       87     0.01     0.02  findDoubleOf
     11.67      1.44     0.19  1562955     0.00     0.00  compareFiles
      4.91      1.52     0.08                             stat
      1.84      1.55     0.03                             __i686.get_pc_thunk.bx
      1.84      1.58     0.03                             call_gmon_start
      1.84      1.61     0.03  3125910     0.00     0.00  clean_filename
      0.61      1.62     0.01        1     0.01     0.01  str_replace
      0.61      1.63     0.01                             atexit
      0.00      1.63     0.00        1     0.00     1.47  findDoubleInDirectory

After analyzing the source code and the profiling information, three functions concerned me the most: findDoubleOf, compareFiles and clean_filename.

The third one, clean_filename, gets a huge number of calls, but it only involves a few string manipulations and is not very processor intensive.

The second one, compareFiles, which I expected to be more demanding, only becomes really interesting when there are multiple files with the same size. I ran the tool with different sets of data, and I used the code of the efficient file copier to generate large, similar files. I wanted to rewrite the function so that it would read small blocks and compare them one by one, stopping at the first difference, but the method already in use (mmap and memcmp) seems to perform best. Also, the most common scenario will not involve a lot of files with the same size, so I decided to focus on the function that was really accounting for most of the processor usage.

The first function, findDoubleOf, is a recursive function that traverses the directory tree.
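For reference, the mmap and memcmp style of comparison mentioned above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Finddouble's actual compareFiles: the name files_equal and its return convention (1 equal, 0 different, -1 error) are hypothetical and chosen only for illustration. It checks the sizes first, so the mapping and byte comparison only happen for same-size files, which matches the scenario described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not Finddouble's compareFiles).
   Returns 1 if both files have identical content, 0 if they differ,
   and -1 if a file cannot be opened, inspected or mapped. */
int files_equal(const char *path_a, const char *path_b)
{
    int result = -1;
    int fd_a, fd_b;
    struct stat st_a, st_b;

    assert(path_a != NULL && path_b != NULL);

    fd_a = open(path_a, O_RDONLY);
    fd_b = open(path_b, O_RDONLY);

    if (fd_a >= 0 && fd_b >= 0 &&
        fstat(fd_a, &st_a) == 0 && fstat(fd_b, &st_b) == 0) {
        if (st_a.st_size != st_b.st_size) {
            result = 0;                 /* different sizes: cannot be equal */
        } else if (st_a.st_size == 0) {
            result = 1;                 /* two empty files; mmap rejects length 0 */
        } else {
            size_t len = (size_t)st_a.st_size;
            void *map_a = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd_a, 0);
            void *map_b = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd_b, 0);

            /* Compare the two read-only mappings byte for byte. */
            if (map_a != MAP_FAILED && map_b != MAP_FAILED)
                result = (memcmp(map_a, map_b, len) == 0);

            if (map_a != MAP_FAILED) munmap(map_a, len);
            if (map_b != MAP_FAILED) munmap(map_b, len);
        }
    }

    if (fd_a >= 0) close(fd_a);
    if (fd_b >= 0) close(fd_b);
    return result;
}
```

Since the pages of a mapping are only read from disk when they are touched, a comparison that finds a difference early does not need to read the rest of either file, which may be why a hand-written block-by-block loop did not do better.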
Looking at the code, a few improvements seem very obvious: keeping track of the files already identified as doubles and skipping their already-found matches, and building a database to cache some information about the files already scanned, and/or hash tables for the content. Anyway, I focused on spotting which higher-level functions were being called, in order to find lower-level, faster equivalents. I tried to replace the sprintf call with a set of strcpy and strcat calls, but the performance didn't really improve. I tried commenting out a couple more function calls to check whether they were taking a long execution time.

I realized that there was redundant code at line 153, where the tool performs a stat to read the current element's information and decide whether it is a file or a directory. This information should already be available in the struct dirent *sourceFile obtained from the readdir call at line 136. So I changed the code:

    // get file stat : is it a file or directory ?
    struct stat fileStat;
    if( stat (sourceFileName, &fileStat ) !=0) continue;
    switch(fileStat.st_mode & S_IFMT)
    {
        case S_IFDIR: findDoubleInDirectory( sourceFileName); break; // directory
        case S_IFREG: findDoubleOf(sourceFileName, current_directory); break; // regular files or links
    } // end of switch()

for the code

    // get file stat : is it a file or directory ?
    if(destinationFile->d_type==DT_REG){
        compareFiles( sourceFileName, destinationFileName);
        continue;
    }
    if(destinationFile->d_type==DT_DIR){
        findDoubleOf( sourceFileName, destinationFileName);
        continue;
    }

The same change can be applied to the function findDoubleInDirectory.

Here is the profiling report after the changes. It shows a radical improvement in the performance of the functions analyzed. My guess is that some sort of OS or VMware caching is also partly to blame. I ran the reports again with both the original and the modified code. The results shown at the beginning are already updated.
      %   cumulative   self              self     total
     time   seconds   seconds    calls   s/call   s/call  name
     70.00      0.79     0.79       87     0.01     0.01  findDoubleOf
     22.15      1.04     0.25  1550949     0.00     0.00  compareFiles
      4.43      1.09     0.05                             stat
      1.77      1.11     0.02        1     0.02     0.02  str_replace
      0.89      1.12     0.01  3101898     0.00     0.00  clean_filename
      0.89      1.13     0.01        1     0.01     1.06  findDoubleInDirectory

The results show an improvement of about 30% in performance (total time went from 1.63 s to 1.13 s), due to the removal of unnecessary disk accesses that read metadata, for a bunch of small files, that was already available. I'm attaching the gprof reports and the modified code. The original can be downloaded from Sourceforge.
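One caveat about the d_type change is worth illustrating. The d_type field is not filled in by every filesystem (readdir may report DT_UNKNOWN), and a symbolic link is reported as DT_LNK, whereas stat follows the link. The sketch below shows one way to keep the fast path while falling back to stat in those two cases. It is not the modified Finddouble code: classify_entry, count_entries, the entry_kind enum and the 4096-byte path buffer are all hypothetical names and choices made for illustration.

```c
#define _DEFAULT_SOURCE /* exposes d_type and the DT_* macros on glibc */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical classification of a directory entry. */
enum entry_kind { ENTRY_OTHER, ENTRY_FILE, ENTRY_DIR };

/* Use d_type when it is conclusive; call stat only for DT_UNKNOWN and
   DT_LNK, so symbolic links are still followed as stat would do. */
enum entry_kind classify_entry(const char *dir_path, const struct dirent *entry)
{
    char full_path[4096]; /* assumed large enough for this sketch */
    struct stat st;

    if (entry->d_type == DT_REG) return ENTRY_FILE;
    if (entry->d_type == DT_DIR) return ENTRY_DIR;
    if (entry->d_type != DT_UNKNOWN && entry->d_type != DT_LNK)
        return ENTRY_OTHER; /* socket, FIFO, device, ... */

    /* Slow path: the type is unknown or a link, so ask stat. */
    snprintf(full_path, sizeof full_path, "%s/%s", dir_path, entry->d_name);
    if (stat(full_path, &st) != 0) return ENTRY_OTHER;
    if (S_ISREG(st.st_mode)) return ENTRY_FILE;
    if (S_ISDIR(st.st_mode)) return ENTRY_DIR;
    return ENTRY_OTHER;
}

/* Count the regular files and subdirectories directly inside dir_path.
   Returns 0 on success, -1 if the directory cannot be opened. */
int count_entries(const char *dir_path, int *files, int *dirs)
{
    DIR *dir;
    struct dirent *entry;

    assert(dir_path != NULL && files != NULL && dirs != NULL);
    *files = 0;
    *dirs = 0;

    dir = opendir(dir_path);
    if (dir == NULL) return -1;

    while ((entry = readdir(dir)) != NULL) {
        /* Skip the self and parent entries. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        switch (classify_entry(dir_path, entry)) {
        case ENTRY_FILE: ++*files; break;
        case ENTRY_DIR:  ++*dirs;  break;
        default: break;
        }
    }
    closedir(dir);
    return 0;
}
```

On filesystems that do fill in d_type, the stat call is skipped for ordinary files and directories, so the saving measured above should be preserved for them.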