# Changeset 1302 for docs/Working

Ignore:
Timestamp:
Aug 8, 2011, 4:43:21 PM (8 years ago)
Message:

Create a directory for HPCA

Location:
docs/Working
Files:
2 edited

### Legend:

Unmodified
 r1187 be implemented using our existing Pablo language, avoiding low-level C programming. \section{DIV2 grouping} Given streams $S$ and $E$ that mark the starts and ends of Names. We can partition $E$ into length-sorted groups using div2-length strategy: $G_i = n(E_{i*2-1})|E_{i*2}$ First we calculate $T$, which marks the two positions after each name. Then we can get $G_1$ as follows. \begin{eqnarray*} T & = & E | n(E)\\ S_2 & = & n(n(S))\\ G_1 & = & S_2 \wedge T \\ \end{eqnarray*} $G_i (i>1)$ can be similarly determined as follows. \begin{eqnarray*} S_{i*2} & = & n(n(S_{(i-1)*2} \wedge \neg T)) \\ G_i & = & S_{i*2} \wedge T \\ \end{eqnarray*} Once the length sorting above has completed, then we can process names using bitscan techniques. For each $G_i (i>0)$: \begin{enumerate} \item Sequentially scan $G_i$ to find positions $P_i = \{p_{i,0},p_{i,1},...,p_{i,n},\}$. \item For each $k$ in $\{1,2...,n\}$, calculate hash value for name $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ \item Look up hash table. If the name exist, process next name. Otherwise, if $N[p_{i,k-1}]$ is a delimiter (name length is odd), lookup name $N[p_{i,k-i*2}]...N[p_{i,k-2}] (i>1)$ or name $N[p_{1,k-2}]$. If it exists, get its GID $g$ and insert $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ with GID $g$. If it dosen't exist, insert this name with a new GID and insert $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ with the same GID. If $N[p_{i,k-1}]$ is not a delimiter (name length is even), insert $N[p_{i,k-i*2}]...N[p_{i,k-1}]$ with a new GID. \end{enumerate} % \section{Grouping Strategies} % \subsection{Length Grouping} % \subsubsection{ID grouping} % % \subsubsection{LOG grouping} % % \subsubsection{DIV grouping} % % \subsection{Length + hash Grouping} % % \section{Sorting Algorithms} % \subsection{Sequential Sorting} % % \subsection{Parallel Sorting} % % \section{Group Processing Strategies} % % \subsection{Linear SIMD Search} % % \subsection{Trie} % % \subsection{Hash} \end{document}