Parsing XML and IMR stuff
Hi All,
FYI: today Rob and I went on an exploration of XML parsers and had some
discussions with some grad students on the IMR.
Topic 1: XML parsers SAX vs. DOM.
I e-mailed Simon a short paper discussing SAX and DOM last night. It's a
good primer. If you want I can put a copy in the group account. SAX and DOM
are the 2 standard ways most folks build XML parsers.
Okay, so what is the difference? Well, DOM builds a DOM tree which resides in
memory after the XML document is parsed. SAX is sort of like Flex: it
sends you events that you need to deal with for each token it gets to. I am
leaning toward DOM because of some nice features (including easy serialized
XML output to files).
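
A minimal sketch of the contrast, assuming the JAXP parsers that ship with
Java. The tiny GXL-like document, the class name SaxVsDom, and the method
names are made up for illustration; real GXL files carry a DOCTYPE and more
structure. Both methods count the <node> elements: the DOM version queries
the finished in-memory tree, while the SAX version counts start-element
events as they stream past.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVsDom {
    // Hypothetical, simplified GXL-like input (no DOCTYPE, so no DTD lookup).
    public static final String XML =
        "<gxl><graph id=\"g\">"
      + "<node id=\"a\"/><node id=\"b\"/>"
      + "<edge from=\"a\" to=\"b\"/>"
      + "</graph></gxl>";

    // DOM: parse the whole document into an in-memory tree, then query it.
    public static int countNodesDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("node").getLength();
    }

    // SAX: no tree is built; the handler reacts to each start-element event.
    public static int countNodesSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("node".equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("DOM: " + countNodesDom(XML));
        System.out.println("SAX: " + countNodesSax(XML));
    }
}
```

With DOM the tree is still around after parsing, so it can be modified and
written back out; with SAX anything you want to keep has to be stored by the
handler itself.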
Topic 2: Java Data Bindings
Okay this was the exploration (or journey of discovery) I did today that
unfortunately came up fruitless for reasons I'll get to below.
The idea is this: you want data structures in your native programming
language that correspond to the "things" you define in a DTD or XML Schema
(for GXL, things like "Node", "Graph", "Edge", etc.). What the data binding
tool does is automatically create a class for each "thing" you define.
So if you had one for Java and ran it on the GXL DTD, you'd
get a Node class with fields corresponding to values defined in the DTD.
Pretty nice stuff and widely used.
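
To give a feel for what that would look like, here is a hand-written sketch.
It is not the output of any real binding tool. The classes GxlNode, GxlEdge,
and GxlGraph, their fields, and the unmarshal method are all hypothetical,
and the input is a simplified GXL-like snippet rather than a document valid
against the real GXL DTD. A binding tool would generate both the classes and
the XML-to-object step; here the latter is done by hand on top of DOM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class GxlBindingSketch {
    // Hypothetical stand-ins for classes a binding tool would generate.
    public static class GxlNode { public String id; }
    public static class GxlEdge { public String from; public String to; }
    public static class GxlGraph {
        public String id;
        public List<GxlNode> nodes = new ArrayList<>();
        public List<GxlEdge> edges = new ArrayList<>();
    }

    // Hypothetical, simplified GXL-like input.
    public static final String SAMPLE =
        "<gxl><graph id=\"g\">"
      + "<node id=\"a\"/><node id=\"b\"/>"
      + "<edge from=\"a\" to=\"b\"/>"
      + "</graph></gxl>";

    // Hand-rolled equivalent of the "unmarshal" step: XML in, objects out.
    public static GxlGraph unmarshal(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
        // Assumes exactly one <graph> element under the root.
        Element g = (Element) root.getElementsByTagName("graph").item(0);

        GxlGraph graph = new GxlGraph();
        graph.id = g.getAttribute("id");

        NodeList nodeElems = g.getElementsByTagName("node");
        for (int i = 0; i < nodeElems.getLength(); i++) {
            GxlNode n = new GxlNode();
            n.id = ((Element) nodeElems.item(i)).getAttribute("id");
            graph.nodes.add(n);
        }

        NodeList edgeElems = g.getElementsByTagName("edge");
        for (int i = 0; i < edgeElems.getLength(); i++) {
            Element e = (Element) edgeElems.item(i);
            GxlEdge edge = new GxlEdge();
            edge.from = e.getAttribute("from");
            edge.to = e.getAttribute("to");
            graph.edges.add(edge);
        }
        return graph;
    }

    public static void main(String[] args) throws Exception {
        GxlGraph graph = unmarshal(SAMPLE);
        System.out.println(graph.id + ": " + graph.nodes.size()
            + " nodes, " + graph.edges.size() + " edges");
    }
}
```

The point of a binding tool is that nobody has to write or maintain code like
unmarshal by hand; it falls out of the schema.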
Well, there ended up being a few big problems with trying to do it with
GXL. First, almost all data binding generators require a valid XML Schema
definition and not a DTD :-( Second, the GXL XML Schema does not seem to be
valid. I ran it through W3C's XSV validator and it comes up as invalid in
several locations. I tried a few quick fixes, but Sun's JAXB does not like
it, and I'm still getting errors according to the W3C. I have not tried XML
Spy, but its validation tool is stricter than XSV, so I don't think I can
generate a valid GXL XML Schema in the short term :-(
So at this point, the IMR is looking harder to build than I expected on
Thursday.
So you all might want to think about lots of opportunities to work on the
IMR :-)
Yuzo