Java SAX tutorial

HelloRookie
Published 2018/09/12 11:15

Java SAX tutorial shows how to use Java SAX API to read and validate XML documents.

SAX

SAX (Simple API for XML) is an event-driven algorithm for parsing XML documents and an alternative to the Document Object Model (DOM). Where DOM loads the whole document into memory before operating on it, a SAX parser reads the XML node by node and issues parsing events as it steps through the input stream. SAX processes documents state-independently: the parser does not remember the elements that came before, so any state that spans several events has to be tracked by the application. SAX parsers are read-only.
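To make the event-driven model concrete, here is a minimal sketch that records the order in which the parser fires its callbacks. The SaxEventTrace class and its trace() method are hypothetical names used only for illustration; they are not part of the tutorial's project, and the XML is read from a string rather than from a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial's project.
class SaxEventTrace {

    // Parses the XML text and returns the callbacks in the order they fired.
    static List<String> trace(String xml) {

        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {

            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            // Wrapped so that callers of this sketch need no checked exceptions.
            throw new IllegalStateException(ex);
        }

        return events;
    }

    public static void main(String[] args) {
        trace("<users><user id=\"1\"/></users>").forEach(System.out::println);
    }
}
```

Each element produces one start event and one end event, nested in document order. Nothing is kept by the parser once an event has been delivered.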

SAX parsers are generally faster and require less memory than DOM parsers. On the other hand, DOM is easier to use, and some tasks, such as sorting, rearranging, or looking up elements, are faster with DOM.

A SAX parser comes with the JDK, so there is no need to download a dependency.

Java SAX parsing example

In the following example, we read an XML file with a SAX parser.

pom.xml

<build>
    <plugins>
        <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>exec-maven-plugin</artifactId>
            <version>1.6.0</version>
            <configuration>
                <mainClass>com.zetcode.JavaReadXmlSaxEx</mainClass>
            </configuration>
        </plugin>
    </plugins>
</build>

We use the exec-maven-plugin to execute the Java main class from Maven.

users.xml

<?xml version="1.0" encoding="UTF-8"?>
<users>
    <user id="1">
        <firstname>Peter</firstname>
        <lastname>Brown</lastname>
        <occupation>programmer</occupation>
    </user>
    <user id="2">
        <firstname>Martin</firstname>
        <lastname>Smith</lastname>
        <occupation>accountant</occupation>
    </user>
    <user id="3">
        <firstname>Lucy</firstname>
        <lastname>Gordon</lastname>
        <occupation>teacher</occupation>
    </user>    
</users>

We are going to read this XML file.

User.java

package com.zetcode;

public class User {

    private int id;
    private String firstName;
    private String lastName;
    private String occupation;

    public User() {
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public String getOccupation() {
        return occupation;
    }

    public void setOccupation(String occupation) {
        this.occupation = occupation;
    }

    @Override
    public String toString() {

        StringBuilder builder = new StringBuilder();
        builder.append("User{").append("id=").append(id)
                .append(", firstName=").append(firstName)
                .append(", lastName=").append(lastName)
                .append(", occupation=").append(occupation).append("}");

        return builder.toString();
    }
}

This is the user bean; it will hold data from XML nodes.

MyRunner.java

package com.zetcode;

import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

public class MyRunner {

    private SAXParser createSaxParser() {

        SAXParser saxParser = null;

        try {

            SAXParserFactory factory = SAXParserFactory.newInstance();
            saxParser = factory.newSAXParser();

            return saxParser;
        } catch (ParserConfigurationException | SAXException ex) {
        
            Logger lgr = Logger.getLogger(MyRunner.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        }

        return saxParser;
    }

    public List<User> parseUsers() {

        MyHandler handler = new MyHandler();
        String fileName = "src/main/resources/users.xml";
        File xmlDocument = Paths.get(fileName).toFile();

        try {

            SAXParser parser = createSaxParser();
            parser.parse(xmlDocument, handler);

        } catch (SAXException | IOException ex) {

            Logger lgr = Logger.getLogger(MyRunner.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        }

        return handler.getUsers();
    }
}

MyRunner creates a SAX parser and launches parsing. The parseUsers() method returns the parsed data as a list of User objects.

SAXParserFactory factory = SAXParserFactory.newInstance();
saxParser = factory.newSAXParser();

From the SAXParserFactory, we get the SAXParser.
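The factory can also be configured before the parser is created. The following sketch turns on namespace awareness and two security-related features. The SaxParserConfig class and its methods are hypothetical names, and the disallow-doctype-decl feature is a Xerces-style setting that is assumed here to be supported by the JDK's built-in parser; other implementations may reject it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial's project.
class SaxParserConfig {

    // Creates a namespace-aware parser that rejects DOCTYPE declarations.
    static SAXParser newHardenedParser()
            throws ParserConfigurationException, SAXException {

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);

        // Standard JAXP feature: asks the parser to apply its security limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

        // Assumption: Xerces-style feature, supported by the JDK's
        // built-in parser. Other implementations may not recognize it.
        factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);

        return factory.newSAXParser();
    }

    // Returns true if the text parses, false if the parser rejects it.
    static boolean accepts(String xml) {
        try {
            newHardenedParser().parse(
                    new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("<users/>"));
        System.out.println(accepts("<!DOCTYPE users><users/>"));
    }
}
```

Rejecting DOCTYPE declarations is a common precaution when the XML comes from an untrusted source. For a trusted local file such as users.xml, the default factory settings used in the tutorial are sufficient.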

SAXParser parser = createSaxParser();
parser.parse(xmlDocument, handler);

We parse the document with the parse() method. The second parameter of the method is the handler object, which contains the event handlers.

MyHandler.java

package com.zetcode;

import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler {

    private List<User> users = new ArrayList<>();
    private User user;

    private boolean bfn = false;
    private boolean bln = false;
    private boolean boc = false;

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attributes) throws SAXException {

        if ("user".equals(qName)) {
        
            user = new User();
            
            int id = Integer.valueOf(attributes.getValue("id"));
            user.setId(id);
        }

        switch (qName) {

            case "firstname":
                bfn = true;
                break;

            case "lastname":
                bln = true;
                break;

            case "occupation":
                boc = true;
                break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {

        if (bfn) {
            user.setFirstName(new String(ch, start, length));
            bfn = false;
        }

        if (bln) {
            user.setLastName(new String(ch, start, length));
            bln = false;
        }

        if (boc) {
            user.setOccupation(new String(ch, start, length));
            boc = false;
        }
    }

    @Override
    public void endElement(String uri, String localName,
            String qName) throws SAXException {

        if ("user".equals(qName)) {
            users.add(user);
        }
    }
    
    public List<User> getUsers() {
        
        return users;
    }
}

In the MyHandler class, we have the implementations of the event handlers.

public class MyHandler extends DefaultHandler {

The handler class extends DefaultHandler, which provides default implementations of the event methods; we override only the ones we need.

@Override
public void startElement(String uri, String localName,
        String qName, Attributes attributes) throws SAXException {

    if ("user".equals(qName)) {
    
        user = new User();
        
        int id = Integer.valueOf(attributes.getValue("id"));
        user.setId(id);
    }

    switch (qName) {

        case "firstname":
            bfn = true;
            break;

        case "lastname":
            bln = true;
            break;

        case "occupation":
            boc = true;
            break;
    }
}

The startElement() method is called when the parser starts parsing a new element. We create a new user if the element is <user>. For other types of elements, we set boolean values.
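The attributes parameter of startElement() can be read by name, as above, or walked by index. Note that getValue() returns null for an attribute that is absent, in which case Integer.valueOf() would throw a NumberFormatException. The sketch below collects all attributes of an element into a map; the AttributeReader class and the firstAttributes() method are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial's project.
class AttributeReader {

    // Returns the attributes of the first element with the given name.
    static Map<String, String> firstAttributes(String xml, String elementName) {

        Map<String, String> result = new LinkedHashMap<>();

        DefaultHandler handler = new DefaultHandler() {

            private boolean done = false;

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {

                if (done || !elementName.equals(qName)) {
                    return;
                }

                // Walk the attribute list by index and copy name/value pairs.
                for (int i = 0; i < attributes.getLength(); i++) {
                    result.put(attributes.getQName(i), attributes.getValue(i));
                }
                done = true;
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            // Wrapped so that callers of this sketch need no checked exceptions.
            throw new IllegalStateException(ex);
        }

        return result;
    }

    public static void main(String[] args) {
        String xml = "<users><user id=\"1\" role=\"admin\"/></users>";
        System.out.println(firstAttributes(xml, "user"));
    }
}
```

The role attribute in the usage example is made up; the users.xml file of the tutorial only carries id.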

@Override
public void characters(char[] ch, int start, int length) throws SAXException {

    if (bfn) {
        user.setFirstName(new String(ch, start, length));
        bfn = false;
    }

    if (bln) {
        user.setLastName(new String(ch, start, length));
        bln = false;
    }

    if (boc) {
        user.setOccupation(new String(ch, start, length));
        boc = false;
    }
}

The characters() method is called when the parser encounters text inside elements. Depending on which boolean flag is set, we store the text in the corresponding field of the user object.
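One caveat: the SAX specification allows a parser to deliver the text of a single element in several characters() calls, for example around an entity reference. With the flag-based approach only the first chunk would be stored. A common alternative, sketched here under the assumption that only leaf elements carry text, is to append every chunk to a buffer and read it in endElement(). The TextBufferingHandler class and its parseFirstNames() method are hypothetical and collect only the first names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of the tutorial's project.
class TextBufferingHandler extends DefaultHandler {

    private final List<String> firstNames = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attributes) {
        // Start with an empty buffer for every element. This is fine for
        // leaf elements such as <firstname>, not for mixed content.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The text is complete only when the element ends.
        if ("firstname".equals(qName)) {
            firstNames.add(text.toString().trim());
        }
    }

    List<String> getFirstNames() {
        return firstNames;
    }

    // Convenience method: parses XML text and returns all first names.
    static List<String> parseFirstNames(String xml) {

        TextBufferingHandler handler = new TextBufferingHandler();

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            // Wrapped so that callers of this sketch need no checked exceptions.
            throw new IllegalStateException(ex);
        }

        return handler.getFirstNames();
    }

    public static void main(String[] args) {
        String xml = "<users><user id=\"1\"><firstname>O&apos;Neil</firstname>"
                + "</user></users>";
        parseFirstNames(xml).forEach(System.out::println);
    }
}
```

Because the buffer is read only at the end of the element, the result does not depend on how the parser happens to split the text.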

@Override
public void endElement(String uri, String localName,
        String qName) throws SAXException {

    if ("user".equals(qName)) {
        users.add(user);
    }
}

At the end of the <user> element, we add the user object to the list of users.

JavaReadXmlSaxEx.java

package com.zetcode;

import java.util.List;

public class JavaReadXmlSaxEx  {

    public static void main(String[] args) {

        MyRunner runner = new MyRunner();
        List<User> lines = runner.parseUsers();
        
        lines.forEach(System.out::println);
    }
}

JavaReadXmlSaxEx starts the application. It delegates the parsing tasks to MyRunner. In the end, the retrieved data is printed to the console.

$ mvn exec:java -q
User{id=1, firstName=Peter, lastName=Brown, occupation=programmer}
User{id=2, firstName=Martin, lastName=Smith, occupation=accountant}
User{id=3, firstName=Lucy, lastName=Gordon, occupation=teacher}

This is the output of the example.

Java SAX validation example

The following example uses the XSD language to validate an XML file. XSD (XML Schema Definition) is a W3C-recommended schema language for XML documents. (Alternative schema languages include DTD and RELAX NG.) An XSD schema is a set of rules to which an XML document must conform in order to be considered valid.

users.xsd

<?xml version="1.0"?>

<xs:schema version="1.0"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
    
    <xs:element name="users">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="user" maxOccurs="unbounded" minOccurs="0">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element type="xs:string" name="firstname"/>
                            <xs:element type="xs:string" name="lastname"/>
                            <xs:element type="xs:string" name="occupation"/>
                        </xs:sequence>
                        <xs:attribute name="id" type="xs:int" use="required"/>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>    

</xs:schema>

This is the XSD file for validating users. It declares, for instance, that the <user> element must be within the <users> element, and that the id attribute of <user> must be an integer and is mandatory.

JavaXmlSchemaValidationEx.java

package com.zetcode;

import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.XMLConstants;
import javax.xml.transform.sax.SAXSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class JavaXmlSchemaValidationEx {

    public static void main(String[] args) {

        File xsdFile = new File("src/main/resources/users.xsd");

        Path xmlPath = Paths.get("src/main/resources/users.xml");

        // try-with-resources closes the reader even if validation fails
        try (Reader reader = Files.newBufferedReader(xmlPath)) {

            String schemaLang = XMLConstants.W3C_XML_SCHEMA_NS_URI;
            SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
            Schema schema = factory.newSchema(xsdFile);

            Validator validator = schema.newValidator();

            SAXSource source = new SAXSource(new InputSource(reader));
            validator.validate(source);

            System.out.println("The document was validated OK");

        } catch (SAXException ex) {
            
            Logger lgr = Logger.getLogger(JavaXmlSchemaValidationEx.class.getName());
            lgr.log(Level.SEVERE, "The document failed to validate");
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        } catch (IOException ex) {
            
            Logger lgr = Logger.getLogger(JavaXmlSchemaValidationEx.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
        }
    }
}

The example uses the users.xsd schema to validate the users.xml file.

String schemaLang = XMLConstants.W3C_XML_SCHEMA_NS_URI;
SchemaFactory factory = SchemaFactory.newInstance(schemaLang);
Schema schema = factory.newSchema(xsdFile);

SchemaFactory.newInstance() takes the URI of a schema language. Here we choose W3C XML Schema, so the factory expects users.xsd to be written in XSD. The newSchema() method then parses the file into a Schema object.

Validator validator = schema.newValidator();

A new validator is generated from the schema.

SAXSource source = new SAXSource(new InputSource(reader));
validator.validate(source);

We validate the XML document against the provided schema.
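Validation can also be combined with parsing by attaching the schema to the SAXParserFactory with setSchema(), so the document is read only once. The sketch below assumes a hypothetical ValidatingParse class and a trimmed-down inline schema (SAMPLE_XSD) instead of the users.xsd file. Validation errors arrive at the handler's error() method, which DefaultHandler ignores by default, so the sketch rethrows them.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial's project.
class ValidatingParse {

    // Assumption: a reduced schema that only requires an integer id on <user>.
    static final String SAMPLE_XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"users\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"user\" minOccurs=\"0\" maxOccurs=\"unbounded\">"
            + "<xs:complexType>"
            + "<xs:attribute name=\"id\" type=\"xs:int\" use=\"required\"/>"
            + "</xs:complexType></xs:element>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    // Parses xml while validating it against xsd.
    // Returns false on any parsing, validation, or setup failure.
    static boolean parsesCleanly(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));

            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(schema);

            DefaultHandler handler = new DefaultHandler() {
                // Rethrow recoverable errors, which include validation
                // errors, so that parsing stops at the first one.
                @Override
                public void error(SAXParseException ex) throws SAXException {
                    throw ex;
                }
            };

            factory.newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesCleanly(SAMPLE_XSD, "<users><user id=\"1\"/></users>"));
        System.out.println(parsesCleanly(SAMPLE_XSD, "<users><user/></users>"));
    }
}
```

In a real application the anonymous handler would be replaced by a content handler such as MyHandler with the same error() override added.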

} catch (SAXException ex) {
    
    Logger lgr = Logger.getLogger(JavaXmlSchemaValidationEx.class.getName());
    lgr.log(Level.SEVERE, "The document failed to validate");
    lgr.log(Level.SEVERE, ex.getMessage(), ex);
}

By default, if the document is not valid, a SAXException is thrown.
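This default stops at the first problem. To report every validation error in one pass, a custom ErrorHandler can be registered with Validator.setErrorHandler(). The following is a minimal sketch; the ValidationReport class, the collectErrors() method, and the reduced inline schema are hypothetical and stand in for the tutorial's users.xsd.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of the tutorial's project.
class ValidationReport {

    // Assumption: a reduced schema that only requires an integer id on <user>.
    static final String SAMPLE_XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"users\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"user\" minOccurs=\"0\" maxOccurs=\"unbounded\">"
            + "<xs:complexType>"
            + "<xs:attribute name=\"id\" type=\"xs:int\" use=\"required\"/>"
            + "</xs:complexType></xs:element>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    // Validates xml against xsd and returns all reported error messages.
    static List<String> collectErrors(String xsd, String xml) {

        List<String> messages = new ArrayList<>();

        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));

            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {

                @Override
                public void warning(SAXParseException ex) {
                    // Warnings are ignored in this sketch.
                }

                @Override
                public void error(SAXParseException ex) {
                    // Record the error and let validation continue.
                    messages.add("line " + ex.getLineNumber() + ": " + ex.getMessage());
                }

                @Override
                public void fatalError(SAXParseException ex) throws SAXException {
                    // The document is not well-formed; give up.
                    throw ex;
                }
            });

            validator.validate(new StreamSource(new StringReader(xml)));

        } catch (SAXException | IOException ex) {
            messages.add("fatal: " + ex.getMessage());
        }

        return messages;
    }

    public static void main(String[] args) {
        collectErrors(SAMPLE_XSD, "<users><user/><user/></users>")
                .forEach(System.out::println);
    }
}
```

An empty list means the document is valid. The exact wording of the messages depends on the validator implementation.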

In this tutorial, we have read and validated an XML document with Java SAX. You might also be interested in the related tutorials: Java DOM tutorial, Java Servlet serving XML, and Java tutorial.

This article is reposted from: http://zetcode.com/java/sax/
