Tryag File Manager
Home
-
Turbo Force
Current Path :
/
proc
/
self
/
root
/
usr
/
share
/
doc
/
PyXML-0.8.4
/
xmlproc
/
Upload File :
New :
File
Dir
//proc/self/root/usr/share/doc/PyXML-0.8.4/xmlproc/xmlproc_tut.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML> <HEAD> <TITLE>An overview of xmlproc</TITLE> <META NAME="Author" CONTENT="Lars Marius Garshol"> <META NAME="Generator" CONTENT="Homemade"> <META NAME="Description" CONTENT="This page documents the rough structure of xmlproc."> <META NAME="Keywords" CONTENT="XML, Python, XML parser"> <LINK REL=stylesheet HREF="standard.css" TYPE="text/css" MEDIA=screen> </HEAD> <BODY> <H1>An overview of xmlproc</H1> <P> xmlproc has been designed in a highly modular fashion, with the intention that it should be possible to reuse the modules in different contexts and applications. Emphasis has been placed on flexibility, leading to a rather large and perhaps bewildering interface. This document attempts to explain the different pieces and how they fit together. </P> <h2>The command-line parsers</h2> <P> xmlproc ships with two command-line parser applications for XML parsing. They are not useful for incorporating xmlproc in your own applications, but can be used to check that xmlproc is working, to verify documents and also to see how the parser interprets documents, by making them output canonical XML or ESIS. </P> <P> The diagram below shows how these applications use the xmlproc APIs. </P> <img src="cmdline.gif" ALT="[Diagram #1]"> <P> As the diagram shows one application (xpcmd.py) uses the object called XMLProcessor. This is the well-formedness parser that does not read the external DTD and which does not validate. The second application (xvcmd.py) uses an object called XMLValidator, which again uses the well-formedness parser XMLProcessor for basic parsing, but provides full validation on top of this. </P> <P> The command-line interfaces are documented <A HREF="cmdline.html">here</A>. </P> <h3>The DTD parser</h3> <p> The xmlproc distribution also includes a command-line application dtdcmd.py, which only uses the DTD parsing APIs of xmlproc and can parse DTD files directly. The command-line interface is as follows: </p> <pre> python dtdcmd.py [--list] <urltodtd>+ </pre> <p> Here <tt>--list</tt> makes the parser list all declarations after parsing a DTD, and <tt>urltodtd</tt> is a list of one or more DTD file names or URLs. </p> <h3>The DTD to XML Schema converter</h3> <p> The command-line application dtd2schema.py converts DTDs into equivalent XML Schema documents. The application takes one mandatory argument (the system identifier of the DTD) and one optional argument (the file name of the output document). If no output file name is specified, one is inferred from the input file name. </p> <p> The converter tries to do some naive inference of attribute groups and will produce output with attribute groups inferred from the DTD structure. </p> <h2>The parsing API</h2> <P> If you want to use xmlproc in an application of your own what you do is basically what I did with xpcmd.py and xvcmd.py: you use the xmlproc modulse in an application to provide it with XML parsing functionality. On top of that you must build whatever you want to use xmlproc for yourself. (In the case of xvcmd.py and xpcmd.py this is simply outputting parse results to the console and interpreting command-line options. </P> <P> The diagram below shows the main objects involved in this API: </P> <img src="basicapi.gif" ALT="[Diagram #2]"> <P> The central object is the Parser object, which can be either an XMLProcessor or an XMLValidator (they have the same interface, but only the latter validates and the former is faster). When it is created the Parser creates the four objects shown in the diagram. It also provides methods that can be used to make it use objects that you provide instead of these objects. (See the <A HREF="api-doco.html">documentation</A> for details</A>.) </P> <P> The roles of these four objects are: </P> <dl> <dt>Application <dd>This object receives all data events from the parsing of the document such as handle_start_tag, handle_end_tag, handle_data and so on. <dt>ErrorHandler <dd>This object receives all error events. <dt>PubIdResolver <dd>Whenever an entity reference is encountered, this object will be given the public identifier and system identifier (file name/URL) of the entity and asked to supply the system identifier to be used. <dt>InputSourceFactory <dd>Once the PubIdResolver has returned the correct file name/URL for the entity, the InputSourceFactory will be asked to create a file-like object from which the entity contents can be read. </dl> <P> So, to act on the document content, make an Application object and tell the parser to use it. To control error reporting, make an ErrorHandler. To control the resolution of public identifiers (and also to remap system identifiers), make a PubIdResolver. To add support for new kinds of URLs or to provide your own support for a class of URLs, make an InputSourceFactory. </P> <H2>Example</H3> <PRE><CODE> from xml.parsers.xmlproc import xmlproc class MyApplication(xmlproc.Application): pass # Add some useful stuff here p=xmlproc.XMLProcessor() # Make this xmlval.XMLValidator if you want to validate p.set_application(MyApplication()) p.parse_resource("foo.xml") </CODE></PRE> <h2>The GUI parser frontend</h2> <p> xmlproc includes a GUI application that can be used to parse XML documents in the wxValidator.py application. This application uses <a href="http://alldunn.com/wxPython/">wxPython</a> to create a more user-friendly interface, as shown below: </p> <img src="wxval.gif" alt="[Screenshot of wxValidator]"> <HR> <ADDRESS> Last update 2000-05-11 14:20, by <a href="mailto:larsga@garshol.priv.no">Lars Marius Garshol</a>. </ADDRESS> </DIV> </BODY> </HTML>