Oct 13, 2012, 3:36:24 PM (7 years ago)

progress on namespace section; started error handling

1 edited


  • docs/Working/icXML/arch-namespace.tex

    r2439 r2449  
    4 % Xerces stack-oriented vs icXML's bit-field oriented approach
     4% Should we mention canonical bindings or speculation? it seems like more of an optimization than anything.
     6In XML, namespaces prevent naming conflicts when multiple vocabularies are used together.
     7This is especially important when a vocabulary has application-dependent meaning, such as when
     8XML or SVG documents are embedded within XHTML files.
     9Namespaces are bound to uniform resource identifiers (URIs), which are strings used to identify
     10specific names or resources.
     11On line 1 of Figure \ref{fig:namespace1}, the \verb|xmlns| attributes instruct the XML
     12processor to bind the prefix \verb|p| to the URI ``\verb|pub.net|'' and the default (empty)
     13prefix to ``\verb|book.org|''. Thus, to the XML processor, \verb|title| on line 2 and
     14\verb|price| on line 4 read as \verb|"book.org":title| and \verb|"book.org":price|
     15respectively, whereas \verb|p:name| on line 3 and \verb|price| on line 5 are seen as
     16\verb|"pub.net":name| and \verb|"pub.net":price|. Even though the element name on lines 4 and 5
     17is \verb|price| in both cases, namespace scoping rules make them two uniquely-named items,
     18because the current vocabulary is determined by the namespace(s) that are in scope.
     221. & \verb|<book xmlns:p="pub.net" xmlns="book.org">| \\
     232. & \verb|  <title>BOOK NAME</title>| \\
     243. & \verb|  <p:name>PUBLISHER NAME</p:name>| \\
     254. & \verb|  <price>X</price>| \\
     265. & \verb|  <price xmlns="pub.net">Y</price>| \\
     276. & \verb|</book>| \\
     29\caption{XML Namespace Example}
     30\label{fig:namespace1}
     34In Xerces, every URI is mapped to a unique URI ID number.
     35These IDs persist throughout the lifetime of the application.
     36Xerces maintains a stack of namespace scopes that is pushed (popped) every time a start tag (end tag) occurs
     37in the document. Because a namespace declaration affects the entire element, it must be processed prior to
     38grammar validation. This is a costly process considering that a typical namespaced XML document only comes
     39in one of two forms:
     40(1) those that declare a set of namespaces upfront and never change them, and
     41(2) those that repeatedly modify the namespace scope within the document in predictable patterns.
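For contrast with the bit-vector scheme, a stack-oriented resolver can be sketched as follows. This is a generic illustration of pushing and popping namespace scopes, not the Xerces implementation; the class name `ScopeStack` and its methods are hypothetical, and the empty string is assumed to stand for the default prefix.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stack-of-scopes resolver (illustrative only; not Xerces code).
class ScopeStack {
public:
    // Start tag: push a scope holding the prefix -> URI declarations
    // made on that element (possibly none).
    void push(const std::map<std::string, std::string>& declared) {
        scopes_.push_back(declared);
    }

    // End tag: discard the innermost scope.
    void pop() { scopes_.pop_back(); }

    // Search from the innermost scope outward. The empty prefix ""
    // denotes the default namespace. Returns "" if the prefix is unbound.
    std::string resolve(const std::string& prefix) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto hit = it->find(prefix);
            if (hit != it->end()) return hit->second;
        }
        return "";
    }

private:
    std::vector<std::map<std::string, std::string>> scopes_;
};
```

In this sketch every start tag pays for a push and every lookup may walk several scopes, which is the kind of per-element work the text describes as costly.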
     46NSID & Prefix & URI & Prefix ID & URI ID \\ \hline\hline
     470 & {\tt p} & {\tt pub.net} & 0 & 0 \\ \hline
     481 & {\tt xmlns} & {\tt book.org} & 1 & 1 \\ \hline
     492 & {\tt xmlns} & {\tt pub.net} & 1 & 0 \\ \hline
     51\caption{Namespace Binding Table Example}
     56For that reason, icXML contains an independent namespace stack and utilizes bit vectors to cheaply perform
     57% speculation and
     58scope resolution operations with a single XOR operation, even if many alterations are performed.
     59% performance advantage figure?? average cycles/byte cost?
     60When a prefix is declared (e.g., \verb|xmlns:p="pub.net"|), a namespace binding is created that maps
     61the prefix, which is assigned a prefix ID during symbol resolution, to the URI.
     62Each unique URI is given a URI ID through the use of a global URI pool, similar to Xerces.
     63Each unique namespace binding has a unique namespace ID (NSID), and every prefix has a bit vector marking every
     64NSID that has ever been associated with it within the document. For example, in Table \ref{tbl:namespace1}, the
     65prefix binding sets of \verb|p| and \verb|xmlns| would be \verb|001| and \verb|110| respectively (NSID 0 is the rightmost bit).
     66To resolve the in-scope namespace binding for each prefix, a bit vector of the currently visible namespaces is
     67maintained by the system. By ANDing the prefix bit vector with the currently visible namespaces, the in-scope
     68NSID can be found using a bit scan instruction. A namespace binding table, similar to Table \ref{tbl:namespace1},
     69provides the actual URI ID.
     71% PrefixBindings = PrefixBindingTable[prefixID];
     72% VisiblePrefixBinding = PrefixBindings & CurrentlyVisibleNamespaces;
     73% NSid = bitscan(VisiblePrefixBinding);
     74% URIid = NameSpaceBindingTable[NSid].URIid;
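The commented pseudocode above can be fleshed out into a small C++ sketch. The type and member names (`NamespaceResolver`, `prefixBindings`, `bindingTable`, `visible`) are hypothetical, and several things are assumptions rather than icXML's actual design: bit n of a vector stands for NSID n, a 64-bit word is wide enough, at most one binding per prefix is visible at a time, and a portable loop stands in for the hardware bit scan instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One row of the namespace binding table, indexed by NSID.
struct NamespaceBinding {
    std::uint32_t prefixId;
    std::uint32_t uriId;
};

// Index of the lowest set bit. Portable stand-in for a hardware
// bit scan instruction. Precondition: v != 0.
inline unsigned bitScan(std::uint64_t v) {
    unsigned i = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        ++i;
    }
    return i;
}

struct NamespaceResolver {
    // Indexed by prefix ID: bit n is set if NSID n ever bound this prefix.
    std::vector<std::uint64_t> prefixBindings;
    // Indexed by NSID.
    std::vector<NamespaceBinding> bindingTable;
    // Bit n is set if NSID n is currently in scope.
    std::uint64_t visible = 0;

    // Returns the URI ID bound to the prefix, or -1 if no binding is in scope.
    int resolveUriId(std::uint32_t prefixId) const {
        std::uint64_t inScope = prefixBindings[prefixId] & visible;
        if (inScope == 0) return -1;
        return static_cast<int>(bindingTable[bitScan(inScope)].uriId);
    }
};
```

With the bindings of Table \ref{tbl:namespace1} loaded (prefix ID 0 for \verb|p|, 1 for \verb|xmlns|), a visible set containing NSIDs 0 and 1 resolves the default prefix to URI ID 1, and a visible set containing NSIDs 0 and 2 resolves it to URI ID 0.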
     76To ensure that scoping rules are adhered to,
     77whenever a start tag is encountered, any modification to the currently visible namespaces is calculated and stored
     78within a stack of bit vectors denoting the locally modified namespace bindings. When an end tag is found, the
     79currently visible namespace vector is XORed with the vector at the top of the stack, which is then popped.
     80% Speculation can be handled by probing the historical information within the stack but that goes beyond the scope of this paper.
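The start-tag and end-tag handling described above might look roughly like the following sketch. It is an illustration under assumptions, not the icXML source: the names `NamespaceScopes`, `prefixOfNsid`, `startTag`, and `endTag` are hypothetical, bit n stands for NSID n, and the NSIDs declared on an element are assumed to be known before `startTag` is called.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct NamespaceScopes {
    // Indexed by prefix ID: every NSID that has ever bound this prefix.
    std::vector<std::uint64_t> prefixBindings;
    // Indexed by NSID: the prefix ID that the binding belongs to.
    std::vector<std::uint32_t> prefixOfNsid;
    // Currently visible namespace bindings.
    std::uint64_t visible = 0;
    // One entry per open element: the bits that element flipped in `visible`.
    std::vector<std::uint64_t> deltas;

    // Start tag: apply the element's declarations and remember what changed.
    void startTag(const std::vector<unsigned>& declaredNsids) {
        std::uint64_t next = visible;
        for (unsigned nsid : declaredNsids) {
            // Hide whichever binding of the same prefix was visible ...
            next &= ~prefixBindings[prefixOfNsid[nsid]];
            // ... and expose the newly declared one.
            next |= std::uint64_t{1} << nsid;
        }
        deltas.push_back(visible ^ next);  // zero if nothing changed
        visible = next;
    }

    // End tag: a single XOR with the top of the stack restores the outer scope.
    void endTag() {
        visible ^= deltas.back();
        deltas.pop_back();
    }
};
```

Because XOR is its own inverse, the end tag does not need to know which bindings were hidden or exposed; elements that declare nothing simply push a zero vector.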