# Changeset 1115

Timestamp:
Apr 11, 2011, 3:07:28 PM (8 years ago)
Message:

Significant edits.

Location:
docs/PACT2011
Files:
3 edited

r1114
\section{Scalability}
\subsection{Performance}
Figure \ref{Scalability} (a) shows the performance of Parabix2 on three different cores: \CO{}, \CITHREE\ and \SB{}. The average processing time of the five workloads, which is evaluated as CPU cycles per thousand bytes, is divided into bitstream parsing and byte-space postprocessing. Bitstream parsing, which mainly consists of SIMD instructions, achieves a 17\% performance improvement moving from \CO\ to \CITHREE{} and a 22\% performance improvement moving from \CITHREE\ to \SB{}. These gains are relatively stable compared to postprocessing, which gains 18\% to 31\% performance moving from \CO\ to \CITHREE{} and 0\% to 17\% performance moving from \CITHREE\ to \SB{}.
Figure \ref{Scalability} (a) demonstrates the average XML well-formedness checking performance of Parabix2 for each of the workloads and as executed on each of the processor cores: \CO{}, \CITHREE\ and \SB{}. Processing time is shown in terms of bit stream based operations executed in `bit-space' and postprocessing operations executed in `byte-space'. In the Parabix2 parser, bit-space parallel bit stream parser operations consist primarily of SIMD instructions; byte-space operations consist of byte comparisons across arrays of values. Executing Parabix2 on \CITHREE{} over \CO\ results in an average performance improvement of 17\% in bit stream processing whereas migrating Parabix2 from \CITHREE{} to \SB{} results in a 22\% average performance gain. Bit space measurements are stable and consistent across each of the source inputs and cores. Postprocessing operations demonstrate data dependent variance. Performance gains of 18\% to 31\% are observed in migrating Parabix2 from \CO\ to \CITHREE{}, and of 0\% to 17\% from \CITHREE\ to \SB{}.
For the purpose of comparison, Figure \ref{Scalability} (b) shows the performance of the Expat parser on each of the processor cores.
A performance improvement of less than 5\% is observed when executing Expat on \CITHREE\ over \CO\ and less than 10\% on \SB\ over \CITHREE{}.
For comparison, we also measured the performance of Expat on all three cores, which is shown in Figure \ref{Scalability} (b). The performance improvement is less than 5\% when running Expat on \CITHREE\ instead of \CO\ and less than 10\% when running on \SB\ instead of \CITHREE{}.
Parabix2 scales much better than Expat and is able to achieve an overall performance improvement of up to 26\% simply by running the same code on a newer core. Further improvement on \SB\ with AVX will be discussed in the next section.
Overall, Parabix2 scales better than Expat. Simply executing identical Parabix2 object code on \SB\ results in an overall performance improvement of up to 26\%. Additional performance aspects of Parabix2 on \SB\ with AVX instructions are discussed in the following sections.
\begin{figure}
\includegraphics[width=0.40\textwidth]{plots/Expat_scalability.pdf}
}
\caption{Performance Parabix vs. Expat (y-axis: Total CPU Cycles per kB)}
\caption{Average Performance Parabix vs. Expat (y-axis: CPU Cycles per kB)}
\label{Scalability}
\end{figure}
\subsection{Power and Energy}
The newer processors are designed not only to have better performance but also to be more energy-efficient. Figure \ref{power_Parabix2} shows the average power when running Parabix2 on \CO{}, \CITHREE\ and \SB\ with different input files. On \CO{}, the average power is about 32 watts. \CITHREE\ saves 30\% of the power compared with \CO{}.
Figure \ref{power_Parabix2} shows the average power consumption of Parabix2 over each workload and as executed on each of the processor cores: \CO{}, \CITHREE\ and \SB{}. Average power consumption on \CO{} is 32 watts. Execution on \CITHREE\ results in a 30\% power saving over \CO{}.
\SB\ saves 25\% of the power compared with \CITHREE\ and consumes only 15 watts.
The energy consumption is further improved by better performance, which means a shorter processing time, as we move to the newer cores. As a result, Parabix2 on \SB\ costs 72\% to 75\% less energy than Parabix2 on \CO{}.
In XML parsing we observe that energy consumption is dependent on processing time. That is, a reduction in processing time results in a directly proportional reduction in energy consumption. With newer processor cores come improvements in application performance. As a result, Parabix2 executed on \SB\ consumes 72\% to 75\% less energy than Parabix2 on \CO{}.
\begin{figure}