# Changeset 1775 for docs/HPCA2012/final_ieee/11-conclusions.tex

Timestamp:
Dec 13, 2011, 5:37:36 PM
Message:

Minor fixes; figure placement

File:
1 edited

--- docs/HPCA2012/final_ieee/11-conclusions.tex (r1774)
+++ docs/HPCA2012/final_ieee/11-conclusions.tex (r1775)
 % Future research
-In this paper we presented Parabix, a software runtime framework for
+This paper presents Parabix as a software runtime framework for
 exploiting SIMD data units found on commodity processors for text processing.  The Parabix framework allows programmers to focus on exposing the abstract SIMD machine without worrying about or having to change code to handle processor specifics (e.g., 128-bit SIMD SSE vs 256-bit SIMD
-on AVX). We applied Parabix technology to a widely deployed application, XML parsing and demonstrate the efficiency gains that can
+on AVX). Parabix technology was applied to XML parsing to demonstrate the efficiency gains that can
 be obtained on commodity processors. Compared to the conventional XML
-parsers, Expat and Xerces, we achieve 2$\times$---7$\times$
+parsers, Expat and Xerces, a 2$\times$---7$\times$
 improvement in performance and average 4$\times$ improvement in
-energy. We achieve high compute efficiency with an overall 9$\times$---15$\times$ reduction in branches, 7$\times$---15$\times$ reduction in branch mispredictions, % ?\times$reduction in LLC misses, and increase in data parallelism and process up to 128 characters with a single operation. We used the Parabix framework and XML parsers to study the features of the new 256-bit AVX extension in Intel processors. We find that while the move to 3-operand instructions deliver significant benefit the wider operations in some cases have higher overheads compared to the existing 128-bit SSE operations. We also compare Intel's SIMD extensions against the ARM \NEON{}. Note that Parabix allowed us to
+energy was achieved. Furthermore, computational efficiency was greatly increased, with an overall 9$\times$---15$\times$reduction in branches and 7$\times$---15$\times$reduction in branch mispredictions. The Parabix framework and XML parsers was also used to study the features of the new 256-bit AVX extension in Intel processors. While the move to 3-operand instructions delivers significant benefits, the advantage of loads and bitwise logic with 256 bits at a time was negated by the need to convert to 128 bit SIMD registers for integer operations. We expect this will be remedied with AVX2. Intel's SIMD extensions were also compared with the ARM \NEON{}. Note that Parabix allowed us to
 perform these studies without having to change the application source.
-Finally, we parallelized the Parabix XML parser to take advantage of the SIMD units in every core on the chip. We demonstrate that the
+Finally, the Parabix XML parser was parallelized to take advantage of the SIMD units in every core on the chip, demonstrating that the
 benefits of thread-level-parallelism are complementary to the
-fine-grain parallelism we exploit; parallelized Parabix achieves a
+fine-grain parallelism we exploit. In this study, our parallelized Parabix achieves a
 further 2$\times\$ improvement in performance.