SAX parser – nasty behaviour

The other day a colleague in the office was hunting a strange error in his SAX parsing class: every now and then the data he had in his endElement() method was truncated, which caused conversion problems.

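A few searches revealed that the SAX API has a nasty behaviour: it gives no guarantee about how character data is buffered. The parser may deliver the text of a single element in several calls to characters(), so putting the pieces together is the job of the client application.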

Instead of relying on the characters() method to deliver complete data, my colleague had to buffer the data himself. With a bit of sample code it was an easy fix… but first you have to realise that this is the cause of what otherwise looks like an unrelated problem…

This page shows the sample code that we recycled:

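package some.pkg;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The class name is only a placeholder so that the method compiles.
public class BufferingHandler extends DefaultHandler {
  private StringBuilder textBuffer;

  @Override
  public void characters(char[] buf, int offset, int len) throws SAXException {
    // The parser may call this several times for one text node,
    // so append each chunk instead of treating it as the whole text.
    if (textBuffer == null) {
      textBuffer = new StringBuilder();
    }
    textBuffer.append(buf, offset, len);
  }
}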

The textBuffer can be reset in the startElement() method; by the time endElement() is called it holds the complete text.
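
To see the whole cycle in one place, here is a minimal sketch of a handler that resets the buffer in startElement(), appends in characters() and only reads the text in endElement(). The class names, the parseItems() helper and the <item> element are made up for illustration; only the SAX and JAXP calls are the real API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names are illustrative, not from the original code.
final class SaxTextDemo {

  /** Collects the complete text of every <item> element (element name is an assumption). */
  static final class ItemHandler extends DefaultHandler {
    final List<String> items = new ArrayList<>();
    private StringBuilder textBuffer;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
      // Start with an empty buffer for each new element.
      textBuffer = new StringBuilder();
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
      // May be called several times per text node: always append, never overwrite.
      if (textBuffer != null) {
        textBuffer.append(buf, offset, len);
      }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
      // Only at this point has all of the element's text been delivered.
      if ("item".equals(qName) && textBuffer != null) {
        items.add(textBuffer.toString());
      }
      textBuffer = null;
    }
  }

  /** Parses the given XML string and returns the text of each <item>. */
  static List<String> parseItems(String xml) {
    ItemHandler handler = new ItemHandler();
    try {
      SAXParserFactory.newInstance().newSAXParser()
          .parse(new InputSource(new StringReader(xml)), handler);
    } catch (Exception e) {
      throw new IllegalStateException("Parsing failed", e);
    }
    return handler.items;
  }

  public static void main(String[] args) {
    System.out.println(parseItems("<list><item>a &amp; b</item><item>42</item></list>"));
  }
}
```

The entity reference in the first item is there on purpose: text around entities is a typical place where a parser may split the data into several characters() calls, and the buffered result is the same either way. Note that this simple reset-on-start approach assumes elements that contain only text, not a mix of text and child elements.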
