Ignore:
Timestamp:
Aug 8, 2011, 4:43:21 PM (8 years ago)
Author:
lindanl
Message:

Create a directory for HPCA

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/parlensort.tex

    r1187 r1302  
    8989be implemented using our existing Pablo language, avoiding low-level C programming.
    9090
     91\section{DIV2 grouping}
     92Given streams $S$ and $E$ that mark the starts and ends of Names.
     93We can partition $E$ into length-sorted groups using div2-length strategy: $G_i = n(E_{i*2-1})|E_{i*2}$
     94
     95First we calculate $T$, which marks the two positions after each name.
     96Then we can get $G_1$ as follows.
     97\begin{eqnarray*}
     98T & = & E | n(E)\\
     99S_2 & = & n(n(S))\\
     100G_1 & = & S_2 \wedge T \\
     101\end{eqnarray*}
     102
     103$G_i (i>1)$ can be similarly determined as follows.
     104
     105\begin{eqnarray*}
     106S_{i*2} & = & n(n(S_{(i-1)*2} \wedge \neg T)) \\
     107G_i & = & S_{i*2} \wedge T \\
     108\end{eqnarray*}
     109
     110Once the length sorting above has completed, then we can process
     111names using bitscan techniques.
     112
     113For each $G_i (i>0)$:
     114\begin{enumerate}
     115\item Sequentially scan $G_i$ to find positions $P_i = \{p_{i,0},p_{i,1},...,p_{i,n},\}$.
     116\item For each $k$ in $\{1,2...,n\}$, calculate hash value for name $N[p_{i,k-i*2}]...N[p_{i,k-1}]$
     117\item Look up hash table. If the name exist, process next name.
     118Otherwise, if $N[p_{i,k-1}]$ is a delimiter (name length is odd), lookup name $N[p_{i,k-i*2}]...N[p_{i,k-2}] (i>1)$ or name $N[p_{1,k-2}]$.
     119If it exists, get its GID $g$ and insert $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ with GID $g$.
     120If it dosen't exist, insert this name with a new GID and insert $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ with the same GID.
     121If $N[p_{i,k-1}]$ is not a delimiter (name length is even), insert $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ with a new GID.
     122\end{enumerate}
     123
     124% \section{Grouping Strategies}
     125% \subsection{Length Grouping}
     126% \subsubsection{ID grouping}
     127%
     128% \subsubsection{LOG grouping}
     129%
     130% \subsubsection{DIV grouping}
     131%
     132% \subsection{Length + hash Grouping}
     133%
     134% \section{Sorting Algorithms}
     135% \subsection{Sequential Sorting}
     136%
     137% \subsection{Parallel Sorting}
     138%
     139% \section{Group Processing Strategies}
     140%
     141% \subsection{Linear SIMD Search}
     142%
     143% \subsection{Trie}
     144%
     145% \subsection{Hash}
     146
    91147\end{document}
    92148
Note: See TracChangeset for help on using the changeset viewer.