Changeset 1389 for docs/HPCA2012


Timestamp: Aug 26, 2011, 3:26:28 PM
Author: lindanl
Message: fix spelling mistakes
Location: docs/HPCA2012
Files: 3 edited

  • docs/HPCA2012/06-scalability.tex

    r1386  r1389
    3      3      \subsection{Performance}
    4      4      In this section, we study the performance of the XML parsers across
    5             three generations of intel architectures.  Figure \ref{Scalability}
           5      three generations of Intel architectures.  Figure \ref{Scalability}
    6      6      (a) shows the average execution time of Parabix.  We analyze the
    7      7      execution time in terms of SIMD operations that operate on bitstreams
  • docs/HPCA2012/07-avx.tex

    r1365  r1389
    3      3      In this section, we discuss the scalability and performance advantages
    4      4      of our 256-bit AVX (Advanced Vector Extensions) Parabix XML port.  The
    5             Parabix SIMD libraries originally targetted the 128-bit SSE2 SIMD
           5      Parabix SIMD libraries originally targeted the 128-bit SSE2 SIMD
    6      6      technology available on all modern 64-bit Intel and AMD processors but
    7      7      has recently been ported to AVX. AVX technology is commercially
     
    101    101    benefits.  Based on the reduction of overall Bitwise-SIMD instructions
    102    102    we expected a 11\% improvement in performance.  Instead, perhaps
    103           bizzarely, the performance of Parabix in the 256-bit AVX
           103    bizarrely, the performance of Parabix in the 256-bit AVX
    104    104    implementation does not improve significantly and actually degrades
    105           for files with higher markup density (average 11\%). Dewiki.xml, on
           105    for files with higher markup density (average 11\%). dew.xml, on
    106    106    which bitwise-SIMD instructions reduced by 39\%, saw a performance
    107    107    improvement of 8\%.  We believe that this is primarily due to the
    108           intricacies of the first generation AVX implemention in \SB{}, with
           108    intricacies of the first generation AVX implementation in \SB{}, with
    109    109    significant latency in many of the 256-bit instructions in comparison
    110    110    to their 128-bit counterparts. The 256-bit instructions also have