source: docs/Working/icXML/arch-namespace.tex @ 2522

Last change on this file since 2522 was 2522, checked in by nmedfort, 7 years ago


\subsection{Namespace Handling}
% Should we mention canonical bindings or speculation? it seems like more of an optimization than anything.
In XML, namespaces prevent naming conflicts when multiple vocabularies are used together.
This is especially important when a vocabulary has application-dependent meaning, such as when
XML or SVG documents are embedded within XHTML files.
Namespaces are bound to uniform resource identifiers (URIs), which are strings used to identify
specific names or resources.
On line 1 of Figure \ref{fig:namespace1}, the \verb|xmlns| attributes instruct the XML
processor to bind the prefix \verb|p| to the URI ``\verb||'' and the default (empty)
prefix to ``\verb||''. Thus, to the XML processor, the \verb|title| on line 2 and
\verb|price| on line 4 read as \verb|"":title| and \verb|"":price|
respectively, whereas on lines 3 and 5, \verb|p:name| and \verb|price| are seen as
\verb|"":name| and \verb|"":price|. Even though the actual element name is
\verb|price| in both cases, namespace scoping rules make them two uniquely named items
because the current vocabulary is determined by the namespaces that are in scope.

\begin{figure}
\begin{center}
\begin{tabular}{r l}
1. & \verb|<book xmlns:p="" xmlns="">| \\
2. & \verb|  <title>BOOK NAME</title>| \\
3. & \verb|  <p:name>PUBLISHER NAME</p:name>| \\
4. & \verb|  <price>X</price>| \\
5. & \verb|  <price xmlns="">Y</price>| \\
6. & \verb|</book>| \\
\end{tabular}
\end{center}
\caption{XML Namespace Example}
\label{fig:namespace1}
\end{figure}

In both Xerces and \icXML{}, every URI has a one-to-one mapping to a URI ID.
These mappings persist for the lifetime of the application through the use of a global URI pool.
Xerces maintains a stack of namespace scopes that is pushed (popped) every time a start tag (end tag) occurs
in the document. Because a namespace declaration affects the entire element, it must be processed prior to
grammar validation. This is a costly process considering that a typical namespaced XML document comes
in only one of two forms:
(1) those that declare a set of namespaces up front and never change them, and
(2) those that repeatedly modify the namespaces in predictable patterns.
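
As a point of comparison, the conventional scheme described above can be sketched in C++ as follows.
This is a minimal illustration and not the actual Xerces code: the class names \verb|UriPool| and
\verb|ScopeStack|, their methods, and any URI strings passed to them are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical global URI pool: each distinct URI string is interned once
// and keeps the same integer URI ID for the lifetime of the pool.
class UriPool {
public:
    std::uint32_t intern(const std::string& uri) {
        auto it = ids_.find(uri);
        if (it != ids_.end()) return it->second;
        const std::uint32_t id = static_cast<std::uint32_t>(ids_.size());
        ids_.emplace(uri, id);
        return id;
    }

private:
    std::unordered_map<std::string, std::uint32_t> ids_;
};

// Hypothetical scope stack: one frame of prefix -> URI ID bindings per open element.
class ScopeStack {
public:
    static constexpr std::uint32_t kUnbound = 0xFFFFFFFFu;

    void pushScope() { frames_.emplace_back(); }  // start tag

    void popScope() {                             // end tag
        if (!frames_.empty()) frames_.pop_back();
    }

    // Record a declaration (xmlns or xmlns:prefix) in the innermost frame.
    void declare(const std::string& prefix, std::uint32_t uriId) {
        if (frames_.empty()) frames_.emplace_back();
        frames_.back()[prefix] = uriId;
    }

    // Walk from the innermost frame outwards until the prefix is found.
    std::uint32_t resolve(const std::string& prefix) const {
        for (auto frame = frames_.rbegin(); frame != frames_.rend(); ++frame) {
            auto it = frame->find(prefix);
            if (it != frame->end()) return it->second;
        }
        return kUnbound;
    }

private:
    std::vector<std::unordered_map<std::string, std::uint32_t>> frames_;
};
```

In this sketch every start tag pays for a push and every lookup may walk several frames, even when the
bindings never change after the root element.
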

For that reason, \icXML{} contains an independent namespace stack and utilizes bit vectors to cheaply perform
% speculation and
scope resolution operations with a single XOR operation, even if many alterations are performed.
% performance advantage figure?? average cycles/byte cost?
When a prefix is declared (e.g., \verb|xmlns:p=""|), a namespace binding is created that maps
the prefix (each prefix is assigned a Prefix ID during symbol resolution) to the URI.
Each unique namespace binding has a unique namespace ID (NSID), and every prefix has a bit vector marking every
NSID that has ever been associated with it within the document. For example, in Table \ref{tbl:namespace1}, the
prefix binding sets of \verb|p| and \verb|xmlns| would be \verb|001| and \verb|110| respectively, reading bit $i$ from the right as NSID $i$.
To resolve the in-scope namespace binding for each prefix, a bit vector of the currently visible namespaces is
maintained by the system. By ANDing the prefix bit vector with the currently visible namespaces, the in-scope
NSID can be found using a bit scan instruction.
A namespace binding table, similar to Table \ref{tbl:namespace1}, provides the actual URI ID.
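
\begin{table}
\begin{center}
\begin{tabular}{|c|c|c|c|c|}
\hline
NSID & Prefix & URI & Prefix ID & URI ID \\ \hline\hline
0 & {\tt p} & {\tt} & 0 & 0 \\ \hline
1 & {\tt xmlns} & {\tt} & 1 & 1 \\ \hline
2 & {\tt xmlns} & {\tt} & 1 & 0 \\ \hline
\end{tabular}
\end{center}
\caption{Namespace Binding Table Example}
\label{tbl:namespace1}
\end{table}
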
% PrefixBindings = PrefixBindingTable[prefixID];
% VisiblePrefixBinding = PrefixBindings & CurrentlyVisibleNamespaces;
% NSid = bitscan(VisiblePrefixBinding);
% URIid = NameSpaceBindingTable[NSid].URIid;
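
The lookup described above (AND, bit scan, table read) can be sketched in C++ as follows.
This is an illustration under stated assumptions rather than the \icXML{} implementation: the names
\verb|NamespaceResolver|, \verb|NamespaceBinding|, \verb|bitScanForward| and \verb|kNoBinding| are hypothetical,
bit $i$ of each vector is assumed to stand for NSID $i$, a 64-bit word is assumed to be wide enough, and at most one
binding per prefix is assumed to be visible at a time. The portable loop stands in for a hardware bit scan;
with C++20, \verb|std::countr_zero| from \verb|<bit>| could be used instead.

```cpp
#include <cstdint>
#include <vector>

// One row of the namespace binding table, indexed by NSID (field names are illustrative).
struct NamespaceBinding {
    std::uint32_t prefixId;
    std::uint32_t uriId;
};

constexpr int kNoBinding = -1;

// Portable stand-in for a bit scan instruction: index of the lowest set bit.
// Precondition: bits != 0.
inline int bitScanForward(std::uint64_t bits) {
    int index = 0;
    while ((bits & 1u) == 0) {
        bits >>= 1;
        ++index;
    }
    return index;
}

struct NamespaceResolver {
    std::vector<std::uint64_t> prefixBindings;   // indexed by Prefix ID; bit i set = NSID i ever bound
    std::vector<NamespaceBinding> bindingTable;  // indexed by NSID
    std::uint64_t visible = 0;                   // bit i set = NSID i currently in scope

    // In-scope NSID for the prefix, or kNoBinding if the prefix is unbound here.
    int resolveNsid(std::uint32_t prefixId) const {
        const std::uint64_t candidates = prefixBindings[prefixId] & visible;
        return candidates == 0 ? kNoBinding : bitScanForward(candidates);
    }

    // Writes the URI ID of the in-scope binding and returns true, or returns false if unbound.
    bool resolveUriId(std::uint32_t prefixId, std::uint32_t& uriId) const {
        const int nsid = resolveNsid(prefixId);
        if (nsid == kNoBinding) return false;
        uriId = bindingTable[static_cast<std::size_t>(nsid)].uriId;
        return true;
    }
};
```

With the three bindings of Table \ref{tbl:namespace1}, the prefix vectors are \verb|001| for \verb|p| and
\verb|110| for \verb|xmlns|. While NSIDs 0 and 1 are visible (\verb|011|), \verb|xmlns| resolves to NSID 1;
once NSID 2 replaces NSID 1 (\verb|101|), the same lookup yields NSID 2.
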

To ensure that scoping rules are adhered to,
whenever a start tag is encountered, any modification to the currently visible namespaces is calculated and stored
within a stack of bit vectors denoting the locally modified namespace bindings. When an end tag is found, the
currently visible namespace vector is XORed with the vector at the top of the stack.
This allows any number of changes to be performed at each scope level in constant time.
% Speculation can be handled by probing the historical information within the stack but that goes beyond the scope of this paper.
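
The scope bookkeeping described above can be sketched in C++ as follows. The class name
\verb|VisibleNamespaces| and its methods are hypothetical, and two details are assumptions of this sketch
rather than statements from the text: the change vector is also applied with XOR on the start tag, and the
top of the stack is popped once it has been used on the end tag.

```cpp
#include <cstdint>
#include <vector>

// Tracks which NSIDs are in scope; bit i set means NSID i is currently visible.
class VisibleNamespaces {
public:
    // Start tag: `changed` has a 1 for every NSID whose visibility flips in this
    // element (newly declared bindings and the bindings they shadow).
    void enterElement(std::uint64_t changed) {
        visible_ ^= changed;
        undo_.push_back(changed);
    }

    // End tag: XOR with the saved vector restores the enclosing scope in one step,
    // no matter how many bindings the element altered.
    void leaveElement() {
        if (undo_.empty()) return;
        visible_ ^= undo_.back();
        undo_.pop_back();
    }

    std::uint64_t visible() const { return visible_; }

private:
    std::uint64_t visible_ = 0;
    std::vector<std::uint64_t> undo_;
};
```

For the document in Figure \ref{fig:namespace1}, entering \verb|book| would flip NSIDs 0 and 1 on
(\verb|011|), entering the second \verb|price| would flip NSIDs 1 and 2 (\verb|110|, giving \verb|101|),
and each end tag would restore the previous vector with a single XOR.
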