
Parsing XML and IMR stuff



Hi All,

FYI, today Rob and I went on an exploration of XML parsers and had some discussions with some grad students about the IMR.

Topic 1: XML parsers, SAX vs. DOM

I e-mailed Simon a short paper discussing SAX and DOM last night. It's a good primer. If you want, I can put a copy in the group account. SAX and DOM are the two standard APIs that most XML parsers expose.

Okay, so what is the difference? Well, DOM builds a tree that stays in memory after the XML document is parsed. SAX is sort of like Flex: it sends you events that you need to handle for each token it reaches. I am leaning toward DOM because of some nice features (including easy serialized XML output to files).
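To make the difference concrete, here is a rough sketch using the JAXP classes that ship with the JDK. It counts elements both ways: the DOM version parses everything into a tree and then queries it, while the SAX version just reacts to start-element events as they go by. The class name, the helper methods, and the sample document are all made up for illustration; the sample is a GXL-like fragment, not something checked against the real GXL DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVsDomSketch {
    // Hypothetical GXL-like fragment (an assumption, not validated against GXL).
    static final String SAMPLE =
        "<gxl><graph id=\"g\"><node id=\"a\"/><node id=\"b\"/>"
        + "<edge from=\"a\" to=\"b\"/></graph></gxl>";

    // DOM: parse the whole document into an in-memory tree, then query it.
    static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX: no tree is kept; the handler is called back once per element start.
    static int countWithSax(String xml, String tag) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("DOM node count: " + countWithDom(SAMPLE, "node"));
        System.out.println("SAX node count: " + countWithSax(SAMPLE, "node"));
    }
}
```

With DOM the parsed tree is still around afterwards to walk, modify, or write back out, which is the feature I like. With SAX you only get what your handler chose to keep.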

Topic 2: Java Data Bindings

Okay this was the exploration (or journey of discovery) I did today that unfortunately came up fruitless for reasons I'll get to below.

The idea is this: you want data structures in your native programming language that correspond to the "things" you define in a DTD or XML Schema (for GXL, things like "Node", "Graph", "Edge", etc.). What the data binding tool does is automatically create a class for each "thing" you define. So if you had one for Java and ran it on the GXL DTD, you'd get a Node class with fields corresponding to the values defined in the DTD. Pretty nice stuff and widely used.
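As a rough picture of what such a tool would hand you, here is a hand-written sketch of the kind of classes I mean. This is not actual generator output and not the real GXL definition; the class names, the fields (id, from, to), and the sampleGraph helper are all assumptions picked just to show the shape.

```java
import java.util.ArrayList;
import java.util.List;

public class GxlBindingSketch {
    // One class per "thing"; fields stand in for attributes in the definition.
    static class Node {
        final String id;
        Node(String id) { this.id = id; }
    }

    static class Edge {
        final String from; // id of the source node
        final String to;   // id of the target node
        Edge(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    static class Graph {
        final String id;
        final List<Node> nodes = new ArrayList<>();
        final List<Edge> edges = new ArrayList<>();
        Graph(String id) { this.id = id; }
    }

    // Hypothetical helper: builds a two-node, one-edge graph in plain Java.
    static Graph sampleGraph() {
        Graph g = new Graph("g");
        g.nodes.add(new Node("a"));
        g.nodes.add(new Node("b"));
        g.edges.add(new Edge("a", "b"));
        return g;
    }

    public static void main(String[] args) {
        Graph g = sampleGraph();
        System.out.println(g.nodes.size() + " nodes, " + g.edges.size() + " edges");
    }
}
```

A real binding tool would also generate the code that reads and writes these objects from XML, so we would work with Graph and Node objects and never touch the parser directly.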

Well, there ended up being a few big problems with trying to do it with GXL. First, almost all data binding generators require a valid XML Schema definition and not a DTD :-( Second, the GXL XML Schema does not seem to be valid. I ran it through W3C's XSV validator and it comes up as invalid in several locations. I tried a few quick fixes, but Sun's JAXB does not like it, and I'm still getting errors according to the W3C validator. I have not tried XML SPY, but its validation tool is stricter than XSV, so I don't think I can generate a valid GXL XML Schema in the short term :-(

So at this point, the IMR is looking harder to build than I expected on Thursday.

So you all might want to think about lots of opportunities to work on the IMR :-)

Yuzo