XML Parsing

Android

陈广禄


 2018/10/19   Share

XML Parsing in Java

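# Overview
A server typically returns data to a client in one of three forms: plain strings, XML, or JSON. What is XML? XML (Extensible Markup Language) is a subset of SGML (Standard Generalized Markup Language) that uses markup to give electronic documents structure. It is easy to read but verbose, so it is less efficient to transmit. An XML document is a tree: it must have a single root element, and elements nest level by level (think of it as a tree). Where JSON stores data as key-value pairs, XML stores it as tags and their values, which amounts to something similar. Writing XML to a file persists the data, and the format is convenient to send over a network. In practice JSON is used far more widely, but some free public APIs still return XML, so XML parsing is still worth learning.
The Android platform offers three ways to parse XML: 1. DOM parsing. 2. SAX parsing. 3. Pull parsing.
Parsing steps:

1. Create mock XML data in the assets folder

2. Create a Bean class that matches the XML

3. Parse

Extended approaches: JDOM and DOM4J, both built on top of the basic approaches.

###### [Note]: JDOM and DOM4J require adding the corresponding jar dependencies.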

Create a mock XML file in the assets folder (the code below opens it as aa.xml)

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
    <book id="1">
        <name>骆驼祥子</name>
        <author>小陈</author>
        <year>2010</year>
        <price>86</price>
    </book>
    <book id="2">
        <name>老人与海</name>
        <author>小万</author>
        <year>2008</year>
        <price>100</price>
        <language>English</language>
    </book>

</bookstore>

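The parsing code in the following sections fills a Bean object whose source the article does not show. A minimal sketch of what it might look like is given here. The field set and the all-String types are assumptions inferred from the sample XML and from the setters the examples call (`setId`, `setName`, `setAuthor`, `setYear`, `setPrice`, `setLanguage`).

```java
/** Hypothetical bean for one <book> element; fields are inferred from the sample XML. */
class Book {
    private String id;
    private String name;
    private String author;
    private String year;
    private String price;
    private String language; // optional: stays null when <language> is absent

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }

    public String getYear() { return year; }
    public void setYear(String year) { this.year = year; }

    public String getPrice() { return price; }
    public void setPrice(String price) { this.price = price; }

    public String getLanguage() { return language; }
    public void setLanguage(String language) { this.language = language; }

    @Override
    public String toString() {
        return "Book{id=" + id + ", name=" + name + ", author=" + author
                + ", year=" + year + ", price=" + price + ", language=" + language + "}";
    }
}
```

Keeping every field a String mirrors how the examples pass parsed text straight into the setters; converting `year` and `price` to numbers would be a reasonable next step in real code.
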
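I. DOM Parsing

Advantages:
1. It builds a tree structure in memory, which is easy to understand and work with, so the code is easy to write.
Disadvantages:
1. The whole XML file is read and loaded into memory in one go, so memory consumption is relatively high.
2. With large XML files this hurts parsing performance and can end in an out-of-memory error, so DOM is not recommended for them.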

##### Implementation:

// Android imports (AppCompatActivity, Bundle, R) omitted
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        String fileName = "aa.xml";
        // try-with-resources closes the asset stream once parsing is done
        try (InputStream in = getResources().getAssets().open(fileName)) {
            dom2xml(in);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public List<Book> dom2xml(InputStream is) throws Exception {
        List<Book> list = new ArrayList<>();
        // Create the DocumentBuilderFactory and a DocumentBuilder from it
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
        // parse() loads the whole XML document into memory
        Document document = documentBuilder.parse(is);
        // Collect every <book> element
        NodeList booklist = document.getElementsByTagName("book");

        for (int i = 0; i < booklist.getLength(); i++) {
            // Current <book> node
            Node bookItem = booklist.item(i);
            Book book = new Book();
            // All attributes of the <book> node
            NamedNodeMap attributes = bookItem.getAttributes();
            for (int j = 0; j < attributes.getLength(); j++) {
                Node attributeItem = attributes.item(j);
                String nodeName = attributeItem.getNodeName();
                String nodeValue = attributeItem.getNodeValue();
                System.out.println("Attribute name: " + nodeName);
                System.out.println("Attribute value: " + nodeValue);
                if ("id".equals(nodeName)) {
                    book.setId(nodeValue);
                }
            }

            NodeList childNodes = bookItem.getChildNodes();
            for (int k = 0; k < childNodes.getLength(); k++) {
                Node childItem = childNodes.item(k);
                if (childItem.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip the whitespace text nodes between elements
                }
                String value = childItem.getTextContent();

                if ("name".equals(childItem.getNodeName())) {
                    System.out.println("Name: " + value);
                    book.setName(value);
                } else if ("author".equals(childItem.getNodeName())) {
                    System.out.println("Author: " + value);
                    book.setAuthor(value);
                } else if ("year".equals(childItem.getNodeName())) {
                    System.out.println("Year: " + value);
                    book.setYear(value);
                } else if ("price".equals(childItem.getNodeName())) {
                    System.out.println("Price: " + value);
                    book.setPrice(value);
                } else if ("language".equals(childItem.getNodeName())) {
                    System.out.println("Language: " + value);
                    book.setLanguage(value);
                }
            }

            list.add(book);
        }
        return list;
    }
}

##### Output:

I/System.out: Attribute name: id
I/System.out: Attribute value: 1
I/System.out: Name: 骆驼祥子
I/System.out: Author: 小陈
I/System.out: Year: 2010
I/System.out: Price: 86
I/System.out: Attribute name: id
I/System.out: Attribute value: 2
I/System.out: Name: 老人与海
I/System.out: Author: 小万
I/System.out: Year: 2008
I/System.out: Price: 100
I/System.out: Language: English
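
To try the DOM approach without an Android device, the same idea can be run on a plain JVM. The sketch below is not the article's code: the class name `DomSketch`, the choice of a `Map` per book instead of a Bean, and the shortened inline XML are all assumptions made to keep it self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class DomSketch {
    /** Returns one map per <book>: its id attribute plus the text of each child element. */
    static List<Map<String, String>> parseBooks(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<Map<String, String>> books = new ArrayList<>();
        NodeList bookNodes = doc.getElementsByTagName("book");
        for (int i = 0; i < bookNodes.getLength(); i++) {
            Element bookEl = (Element) bookNodes.item(i);
            Map<String, String> book = new LinkedHashMap<>();
            book.put("id", bookEl.getAttribute("id"));
            NodeList children = bookEl.getChildNodes();
            for (int k = 0; k < children.getLength(); k++) {
                Node child = children.item(k);
                // Only element children carry data; text nodes here are just whitespace.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    book.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            books.add(book);
        }
        return books;
    }

    public static void main(String[] args) {
        String xml = "<bookstore>"
                + "<book id=\"1\"><name>A</name><price>86</price></book>"
                + "<book id=\"2\"><name>B</name><price>100</price><language>English</language></book>"
                + "</bookstore>";
        for (Map<String, String> book : parseBooks(xml)) {
            System.out.println(book);
        }
    }
}
```

Because every child element is copied generically, an optional element such as `<language>` simply appears as an extra key for the books that have it.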

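II. SAX Parsing

SAX walks the document in order, from the outermost element inward, and reports each node to a Handler class that you write.
Advantages:
1. It uses less memory than DOM parsing.
2. It suits cases where you only need to read the data in an XML file.
3. SAX is event-driven: it processes the document sequentially as it reads, so it does not have to load the whole document before handing you data.
Disadvantages:
1. It is more tedious to code and takes more code to do the same job.
2. It is hard to access several different parts of the XML file at the same time.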
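
One consequence of the event-driven model is that a handler can stop the parse as soon as it has what it needs. The sketch below illustrates this on a plain JVM by counting start tags until a given book id appears and then aborting. The class and method names (`SaxEarlyStop`, `StopAtBook`, `startTagsUntilBook`) are hypothetical, and throwing a `SAXException` from the callback is a common convention rather than something the article itself uses.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxEarlyStop {
    /** Counts start tags and aborts as soon as the wanted <book> shows up. */
    static final class StopAtBook extends DefaultHandler {
        private final String wantedId;
        int startTagsSeen = 0;
        boolean found = false;

        StopAtBook(String wantedId) {
            this.wantedId = wantedId;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs)
                throws SAXException {
            startTagsSeen++;
            if ("book".equals(qName) && wantedId.equals(attrs.getValue("id"))) {
                found = true;
                // Throwing from a callback ends the parse; no further events are delivered.
                throw new SAXException("stop: book " + wantedId + " found");
            }
        }
    }

    /** Returns the number of start tags seen up to the matching book, or -1 if absent. */
    static int startTagsUntilBook(String xml, String bookId) {
        StopAtBook handler = new StopAtBook(bookId);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return -1; // reached the end of the document without a match
        } catch (SAXException e) {
            if (handler.found) {
                return handler.startTagsSeen;
            }
            throw new IllegalArgumentException("Malformed XML", e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Could not run the SAX parser", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bookstore>"
                + "<book id=\"1\"><name>A</name></book>"
                + "<book id=\"2\"><name>B</name></book>"
                + "</bookstore>";
        System.out.println(startTagsUntilBook(xml, "1"));
    }
}
```

With DOM the whole tree would have been built before the first lookup; here the elements after the match never reach the handler.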

##### Implementation

// Android imports (AppCompatActivity, Bundle, R) omitted
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        String fileName = "aa.xml";
        try (InputStream in = getResources().getAssets().open(fileName)) {
            // Parse exactly once: the stream is consumed and cannot be parsed again
            sax2xml(in);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public List<Book> sax2xml(InputStream is) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Obtain a SAXParser instance
        SAXParser sp = factory.newSAXParser();
        // Create the handler that receives the parsing events
        MyHandler handler = new MyHandler();
        // Hand the stream and the handler to the parser
        sp.parse(is, handler);
        // Return the collected list
        return handler.getList();
    }

    public class MyHandler extends DefaultHandler {

        private List<Book> list;
        private Book book;
        // Holds the most recently read text
        private String tempString;

        /**
         * Called for every start tag the parser reads.
         *
         * @param uri        namespace URI
         * @param localName  local name
         * @param qName      qualified element name
         * @param attributes attributes of the element
         * @throws SAXException
         */
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            if ("book".equals(qName)) {
                System.out.println("+++++++++" + "Start of a book" + "+++++++++");
                // A <book> tag was read
                book = new Book();
                String value = attributes.getValue("id");
                System.out.println("Attribute value: " + value);
                for (int i = 0; i < attributes.getLength(); i++) {
                    System.out.println("Attribute " + (i + 1) + " name: " + attributes.getQName(i));
                    System.out.println("Attribute " + (i + 1) + " value: " + attributes.getValue(i));
                    if (attributes.getQName(i).equals("id")) {
                        book.setId(attributes.getValue(i));
                    }
                }
            }
            super.startElement(uri, localName, qName, attributes);
        }

        /**
         * Called for every end tag the parser reads.
         *
         * @param uri       namespace URI
         * @param localName local name
         * @param qName     qualified element name
         * @throws SAXException
         */
        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (qName.equals("name")) {
                book.setName(tempString);
            } else if (qName.equals("author")) {
                book.setAuthor(tempString);
            } else if (qName.equals("year")) {
                book.setYear(tempString);
            } else if (qName.equals("price")) {
                book.setPrice(tempString);
            } else if (qName.equals("language")) {
                book.setLanguage(tempString);
            } else if (qName.equals("book")) {
                System.out.println("+++++++++" + "End of a book" + "+++++++++");
                list.add(book);
            }
            super.endElement(uri, localName, qName);
        }

        /**
         * Called at the start of the document; a good place for initialization.
         *
         * @throws SAXException
         */
        @Override
        public void startDocument() throws SAXException {
            list = new ArrayList<>();
            super.startDocument();
            System.out.println("SAX parsing started");
        }

        /**
         * Called at the end of the document; a good place for cleanup.
         *
         * @throws SAXException
         */
        @Override
        public void endDocument() throws SAXException {
            super.endDocument();
            System.out.println("SAX parsing finished");
        }

        /**
         * Called when character data (element text) is read.
         * Note: a parser may deliver one text node in several calls; this simple
         * assignment assumes each value arrives in a single call.
         *
         * @param ch     character buffer
         * @param start  start offset in the buffer
         * @param length number of characters to read
         * @throws SAXException
         */
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // Keep the text of the current node
            tempString = new String(ch, start, length);
            super.characters(ch, start, length);
            if (!tempString.trim().equals("")) {
                System.out.println("Element value: " + tempString);
            }
        }

        /**
         * Returns the parsed list.
         *
         * @return the books collected during parsing
         */
        public List<Book> getList() {
            return list;
        }
    }

}

##### Output:

I/System.out: SAX parsing started
I/System.out: +++++++++Start of a book+++++++++
I/System.out: Attribute value: 1
I/System.out: Attribute 1 name: id
I/System.out: Attribute 1 value: 1
I/System.out: Element value: 骆驼祥子
I/System.out: Element value: 小陈
I/System.out: Element value: 2010
I/System.out: Element value: 86
I/System.out: +++++++++End of a book+++++++++
I/System.out: +++++++++Start of a book+++++++++
I/System.out: Attribute value: 2
I/System.out: Attribute 1 name: id
I/System.out: Attribute 1 value: 2
I/System.out: Element value: 老人与海
I/System.out: Element value: 小万
I/System.out: Element value: 2008
I/System.out: Element value: 100
I/System.out: Element value: English
I/System.out: +++++++++End of a book+++++++++
I/System.out: SAX parsing finished
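
A handler that stores only the last `characters()` chunk works for short values like these, but the SAX contract allows a parser to deliver one text node in several chunks. A more defensive variant, sketched below for a plain JVM, accumulates text in a `StringBuilder` and reads it in `endElement`. The names `SaxBufferedText` and `BookHandler`, and the use of a `Map` per book, are assumptions for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxBufferedText {
    static final class BookHandler extends DefaultHandler {
        final List<Map<String, String>> books = new ArrayList<>();
        private Map<String, String> current;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for this element
            if ("book".equals(qName)) {
                current = new LinkedHashMap<>();
                current.put("id", attrs.getValue("id"));
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may run several times for one text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("book".equals(qName)) {
                books.add(current);
                current = null;
            } else if (current != null) {
                // The buffer now holds the complete text of this child element.
                current.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    static List<Map<String, String>> parseBooks(String xml) {
        BookHandler handler = new BookHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return handler.books;
    }

    public static void main(String[] args) {
        String xml = "<bookstore>\n"
                + "  <book id=\"1\"><name>A</name><year>2010</year></book>\n"
                + "  <book id=\"2\"><name>B</name><language>English</language></book>\n"
                + "</bookstore>";
        System.out.println(parseBooks(xml));
    }
}
```

Resetting the buffer in both `startElement` and `endElement` keeps the whitespace between elements from leaking into the next value.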

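III. Pull Parsing

Pull parsing is a streaming approach like SAX, so memory use stays low, but your code asks the parser for the next event instead of receiving callbacks. That keeps the control flow simple and lets you stop at any point. It avoids the main drawbacks of both DOM and SAX, and it is the parser that the official Android documentation recommends.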

##### Implementation

// Android imports (AppCompatActivity, Bundle, R) omitted
import android.util.Xml;

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.xmlpull.v1.XmlPullParser;

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        try (InputStream in = getResources().getAssets().open("aa.xml")) {
            List<Book> listResult = pull2xml(in);
            for (Book book : listResult) {
                System.out.println("Name: " + book.getName());
                System.out.println("Author: " + book.getAuthor());
                System.out.println("Year: " + book.getYear());
                System.out.println("Price: " + book.getPrice());
                // <language> is optional, so print it only when present
                if (book.getLanguage() != null) {
                    System.out.println("Language: " + book.getLanguage());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public List<Book> pull2xml(InputStream is) throws Exception {
        List<Book> list = new ArrayList<>();
        Book book = null;
        // Create the pull parser
        XmlPullParser parser = Xml.newPullParser();
        // Give it the input stream and its encoding
        parser.setInput(is, "utf-8");
        // Type of the current event
        int type = parser.getEventType();
        // Keep reading events until the end of the document
        while (type != XmlPullParser.END_DOCUMENT) {
            switch (type) {
                // Start tag
                case XmlPullParser.START_TAG:
                    String tag = parser.getName();
                    if ("book".equals(tag)) {
                        book = new Book();
                        book.setId(parser.getAttributeValue(null, "id"));
                    } else if ("name".equals(tag)) {
                        // nextText() returns the element text and moves to its end tag
                        book.setName(parser.nextText());
                    } else if ("author".equals(tag)) {
                        book.setAuthor(parser.nextText());
                    } else if ("year".equals(tag)) {
                        book.setYear(parser.nextText());
                    } else if ("price".equals(tag)) {
                        book.setPrice(parser.nextText());
                    } else if ("language".equals(tag)) {
                        book.setLanguage(parser.nextText());
                    }
                    break;
                // End tag
                case XmlPullParser.END_TAG:
                    if ("book".equals(parser.getName())) {
                        list.add(book);
                    }
                    break;
            }
            // Move on to the next event
            type = parser.next();
        }
        return list;
    }

}

##### Output:

I/System.out: Name: 骆驼祥子
I/System.out: Author: 小陈
I/System.out: Year: 2010
I/System.out: Price: 86
I/System.out: Name: 老人与海
I/System.out: Author: 小万
I/System.out: Year: 2008
I/System.out: Price: 100
I/System.out: Language: English
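
`XmlPullParser` ships with Android, not with the standard JDK. To experiment with the pull model on a plain JVM, the JDK's StAX API (`javax.xml.stream`) follows the same idea: the caller loops and asks for the next event. The sketch below swaps in StAX for that reason only; it is not the Android API, and the class name `StaxPullSketch` and the `Map`-per-book result are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPullSketch {
    /** Pull-style parse: the loop below drives the parser one event at a time. */
    static List<Map<String, String>> parseBooks(String xml) {
        List<Map<String, String>> books = new ArrayList<>();
        Map<String, String> current = null;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if ("book".equals(tag)) {
                        current = new LinkedHashMap<>();
                        current.put("id", reader.getAttributeValue(null, "id"));
                    } else if (current != null) {
                        // getElementText() reads up to the matching end tag,
                        // playing the role of XmlPullParser.nextText().
                        current.put(tag, reader.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "book".equals(reader.getLocalName())) {
                    books.add(current);
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return books;
    }

    public static void main(String[] args) {
        String xml = "<bookstore>"
                + "<book id=\"1\"><name>A</name><price>86</price></book>"
                + "<book id=\"2\"><name>B</name><language>English</language></book>"
                + "</bookstore>";
        for (Map<String, String> book : parseBooks(xml)) {
            System.out.println(book);
        }
    }
}
```

The shape of the loop is the same as in the Android example: check the event type, react to start and end tags, then advance.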

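IV. JDOM Parsing

Advantages:
1. It is not bound by DOM's backward-compatibility constraints, so its API is simpler than DOM's.
2. It is fast and easy to use, and it follows the same Java conventions as SAX.

Disadvantages:
1. It has no equivalent of DOM's traversal package.
2. It cannot handle documents larger than available memory.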

##### Implementation

// Android imports (AppCompatActivity, Bundle, R) omitted
// Assumes JDOM 2 (package org.jdom2); with JDOM 1.x the package is org.jdom
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.jdom2.Attribute;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.input.SAXBuilder;

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        String fileName = "aa.xml";
        try (InputStream in = getResources().getAssets().open(fileName)) {
            jdom(in);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private List<Book> jdom(InputStream in) throws Exception {
        List<Book> list = new ArrayList<>();
        SAXBuilder saxBuilder = new SAXBuilder();
        // build() loads the XML into a JDOM Document
        Document document = saxBuilder.build(in);
        // Root element (<bookstore>)
        Element rootElement = document.getRootElement();
        // Child elements of the root (the <book> elements)
        List<Element> bookList = rootElement.getChildren();
        for (Element element : bookList) {
            Book book = new Book();
            List<Attribute> attrList = element.getAttributes();
            for (Attribute attr : attrList) {
                String name = attr.getName();
                String value = attr.getValue();
                System.out.println("Attribute name: " + name + "----Attribute value: " + value);
                if ("id".equals(name)) {
                    book.setId(value);
                }
            }
            List<Element> children = element.getChildren();
            for (Element child : children) {
                String name = child.getName();
                String value = child.getValue();
                System.out.println("Element name: " + name + "-----Element value: " + value);
                if ("name".equals(name)) {
                    book.setName(value);
                } else if ("author".equals(name)) {
                    book.setAuthor(value);
                } else if ("year".equals(name)) {
                    book.setYear(value);
                } else if ("price".equals(name)) {
                    book.setPrice(value);
                } else if ("language".equals(name)) {
                    book.setLanguage(value);
                }
            }
            list.add(book);
        }
        return list;
    }
}

##### Output:

I/System.out: Attribute name: id----Attribute value: 1
I/System.out: Element name: name-----Element value: 骆驼祥子
I/System.out: Element name: author-----Element value: 小陈
I/System.out: Element name: year-----Element value: 2010
I/System.out: Element name: price-----Element value: 86
I/System.out: Attribute name: id----Attribute value: 2
I/System.out: Element name: name-----Element value: 老人与海
I/System.out: Element name: author-----Element value: 小万
I/System.out: Element name: year-----Element value: 2008
I/System.out: Element name: price-----Element value: 100
I/System.out: Element name: language-----Element value: English

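V. DOM4J Parsing

1. DOM4J is an excellent Java XML API: it performs well, is powerful, and is very easy to use. It is also open source.
2. It adds many features beyond basic XML document representation.
More and more Java software uses DOM4J to read and write XML; notably, even Sun's JAXM uses DOM4J. It is well worth recommending.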

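##### Implementation

// Android imports (AppCompatActivity, Bundle, R) omitted
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class MainActivity extends AppCompatActivity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        SAXReader reader = new SAXReader();
        // Read from assets like the other examples; a relative File path such as
        // "src/res/books.xml" does not exist on an Android device
        try (InputStream in = getResources().getAssets().open("aa.xml")) {
            // read() loads the XML and returns the Document object
            Document document = reader.read(in);
            // Root element (<bookstore>)
            Element bookStore = document.getRootElement();
            // elementIterator() iterates over the child elements
            Iterator it = bookStore.elementIterator();
            // Walk through the children of the root (the books)
            while (it.hasNext()) {
                System.out.println("=====Start of a book=====");
                Element book = (Element) it.next();
                // Attribute names and values of the book
                List<Attribute> bookAttrs = book.attributes();
                for (Attribute attr : bookAttrs) {
                    System.out.println("Attribute name: " + attr.getName() + "--Attribute value: "
                            + attr.getValue());
                }
                Iterator itt = book.elementIterator();
                while (itt.hasNext()) {
                    Element bookChild = (Element) itt.next();
                    System.out.println("Element name: " + bookChild.getName() + "--Element value: " + bookChild.getStringValue());
                }
                System.out.println("=====End of a book=====");
            }
        } catch (DocumentException | IOException e) {
            e.printStackTrace();
        }
    }
}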

The code above covers the five common ways to parse XML. Thank you for taking the time to read this article, and have a great day!

Author: [@陈广禄]
October 18, 2018

