Version 2 (modified by cameron, 5 years ago)


Using LLVM Tools with Parabix

Let's see what we can do with the LLVM tools. We will apply Clang, the LLVM C++ compiler, as well as several back-end tools such as llc, opt, and llvm-dis.
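As a rough orientation, here is how these four tools typically chain together on a single source file. This is only a sketch under assumptions: the function names run and llvm_pipeline and the input name demo.cpp are hypothetical and not part of the Parabix checkout, and the optimization levels are arbitrary choices. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take one C++ source file through Clang and the LLVM back-end tools.
# The file name passed in is an assumption; substitute a real source file.
llvm_pipeline() {
    src="$1"            # e.g. demo.cpp (hypothetical name)
    base="${src%.*}"    # strip the extension: demo
    run clang++ -O1 -emit-llvm -c "$src" -o "$base.bc"  # C++ to LLVM bitcode
    run llvm-dis "$base.bc" -o "$base.ll"               # bitcode to readable IR text
    run opt -O3 "$base.bc" -o "$base.opt.bc"            # optimize the bitcode
    run llc -O3 "$base.opt.bc" -o "$base.s"             # bitcode to native assembly
}
```

For example, DRY_RUN=1 llvm_pipeline demo.cpp lists the four commands without needing the tools installed; dropping DRY_RUN runs them and leaves demo.ll and demo.s to inspect.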

We demonstrate the tools on some prototype Parabix code for regular expression matching.

First, get yourself a copy of what you need, installed on a 64-bit Ubuntu machine.

mkdir proto
cd proto
svn co
svn co
svn co

Now let's compile code for matching the regular expression ([a-zA-Z][a-zA-Z0-9]*)://([^ /]+)(/[^ ]*)?|([^ @]+)@([^ @]+). It should be in the file RE/data/test/; modify it as needed. Then run the regular compilation chain with gcc.

cd RE
cd output
cd src

Hmm. A lot of work, but now we have the executable in re. Let's look at how it performs.

perf stat -e cycles:u,instructions:u ./re ../../performance/data/howto -c
Matching Lines:32539

 Performance counter stats for './re ../../performance/data/howto -c':

        95,090,095 cycles:u                  #    0.000 GHz                    
       216,535,343 instructions:u            #    2.28  insns per cycle        

       0.034277223 seconds time elapsed

Your results will vary, depending on your machine. For interest, let's compare with egrep.

perf stat -e cycles:u,instructions:u egrep '([a-zA-Z][a-zA-Z0-9]*)://([^ /]+)(/[^ ]*)?|([^ @]+)@([^ @]+)' ../../performance/data/howto -c

 Performance counter stats for 'egrep ([a-zA-Z][a-zA-Z0-9]*)://([^ /]+)(/[^ ]*)?|([^ @]+)@([^ @]+) ../../performance/data/howto -c':

    43,229,901,018 cycles:u                  #    0.000 GHz                    
   109,259,361,636 instructions:u            #    2.53  insns per cycle        

      11.437203814 seconds time elapsed

Well, egrep found the same number of matches, but it was about 330X slower in elapsed time (11.44 seconds versus 0.034 seconds)!
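The slowdown can be checked directly from the two perf reports above. A minimal sketch, assuming awk is available; the helper name ratio is ours and is not part of perf or Parabix.

```shell
# Print the ratio of two numbers to one decimal place (hypothetical helper).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# egrep versus re, using the figures from the perf reports above.
ratio 11.437203814 0.034277223    # elapsed seconds, prints 333.7
ratio 43229901018 95090095        # user-mode cycles, prints 454.6
ratio 109259361636 216535343      # user-mode instructions, prints 504.6
```

The elapsed-time ratio is smaller than the cycle ratio, which is unsurprising for a run as short as 0.034 seconds: fixed costs such as process startup are counted in elapsed time but are a much larger share of the shorter run.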