
Android WebView -> Display WebArchive

simpower
Published 2014/10/09 12:09

Android's WebView has had a saveWebArchive() method since API level 11: http://developer.android.com/.

It can save entire websites as webarchives, which is great! But how do I get the downloaded contents back into a webview? I tried

webview.loadUrl(Uri.fromFile(mywebarchivefile).toString());

But that only displays xml on the screen.

android webview webarchive


asked Oct 13 '12 at 10:02

Jouke Waleson

2 Answers


Accepted answer (16 votes)

Update Feb. 21, 2014

My answer posted below does not apply to web archive files saved under Android 4.4 KitKat and newer. The saveWebArchive() method of WebView under Android 4.4 "KitKat" (and probably newer versions too) does not save the web archive in the XML format that the reader code posted below expects. Instead, it saves pages in MHT (MHTML) format. It is easy to read the .mht files back; just use:

webView.loadUrl("file:///my_dir/mySavedWebPage.mht");

That's all, much easier than the previous method, and compatible with other platforms.
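Because the on-disk format depends on the Android version that wrote the file, a reader that must handle both kinds may want to look at the first bytes before choosing a loading path. The following is a minimal sketch, not part of the original answer. It assumes that the older XML archives begin with `<` and that MHTML files begin with mail-style headers such as `From:` or `MIME-Version:`. The class and method names are hypothetical, and the code is plain Java so that it can be tried off-device.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ArchiveFormatSniffer {
    enum Format { XML_ARCHIVE, MHTML, UNKNOWN }

    /** Guesses the archive format from the first bytes of a saved file. */
    static Format detect(byte[] head) {
        // ISO-8859-1 maps every byte to one char, so decoding cannot fail.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        // Drop a UTF-8 byte order mark if present, then leading whitespace.
        if (text.startsWith("\u00EF\u00BB\u00BF")) {
            text = text.substring(3);
        }
        text = text.trim();
        if (text.startsWith("<")) {
            return Format.XML_ARCHIVE; // assumption: XML declaration or root tag
        }
        String lower = text.toLowerCase(Locale.ROOT);
        if (lower.startsWith("from:") || lower.contains("mime-version:")) {
            return Format.MHTML; // assumption: MIME headers come first
        }
        return Format.UNKNOWN;
    }
}
```

Passing the first few hundred bytes of the file is enough. An MHTML result can go straight to webView.loadUrl() as shown above, while an XML result needs the reader class from the earlier answer.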

Previously posted

I needed it myself, and everywhere I searched, there were unanswered questions like this. So I had to work it out myself. Below is my little WebArchiveReader class and sample code on how to use it. Please note that despite the Android docs declaring that shouldInterceptRequest() was added to WebViewClient in API 11 (Honeycomb), this code works and was tested successfully in Android emulators down to API 8 (Froyo). Below is all the code that's needed. I also uploaded the full project to a GitHub repository at https://github.com/gregko/WebArchiveReader

File WebArchiveReader.java:

package com.hyperionics.war_test;

import android.util.Base64;
import android.webkit.WebResourceResponse;
import android.webkit.WebSettings;
import android.webkit.WebView;
import android.webkit.WebViewClient;

import org.w3c.dom.*;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;

public abstract class WebArchiveReader {
    private Document myDoc = null;
    private static boolean myLoadingArchive = false;
    private WebView myWebView = null;
    private ArrayList<String> urlList = new ArrayList<String>();
    private ArrayList<Element> urlNodes = new ArrayList<Element>();

    abstract void onFinished(WebView webView);

    public boolean readWebArchive(InputStream is) {
        DocumentBuilderFactory builderFactory =
                DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = null;
        myDoc = null;
        try {
            builder = builderFactory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        try {
            myDoc = builder.parse(is);
            NodeList nl = myDoc.getElementsByTagName("url");
            for (int i = 0; i < nl.getLength(); i++) {
                Node nd = nl.item(i);
                if(nd instanceof Element) {
                    Element el = (Element) nd;
                    // siblings of el (url) are: mimeType, textEncoding, frameName, data
                    NodeList nodes = el.getChildNodes();
                    for (int j = 0; j < nodes.getLength(); j++) {
                        Node node = nodes.item(j);
                        if (node instanceof Text) {
                            String dt = ((Text)node).getData();
                            byte[] b = Base64.decode(dt, Base64.DEFAULT);
                            dt = new String(b);
                            urlList.add(dt);
                            urlNodes.add((Element) el.getParentNode());
                        }
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            myDoc = null;
        }
        return myDoc != null;
    }

    private byte [] getElBytes(Element el, String childName) {
        try {
            Node kid = el.getFirstChild();
            while (kid != null) {
                if (childName.equals(kid.getNodeName())) {
                    Node nn = kid.getFirstChild();
                    if (nn instanceof Text) {
                        String dt = ((Text)nn).getData();
                        return Base64.decode(dt, Base64.DEFAULT);
                    }
                }
                kid = kid.getNextSibling();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public boolean loadToWebView(WebView v) {
        myWebView = v;
        v.setWebViewClient(new WebClient());
        WebSettings webSettings = v.getSettings();
        webSettings.setDefaultTextEncodingName("UTF-8");

        myLoadingArchive = true;
        try {
            // Find the first ArchiveResource in myDoc, should be <ArchiveResource>
            Element ar = (Element) myDoc.getDocumentElement().getFirstChild().getFirstChild();
            byte b[] = getElBytes(ar, "data");

            // Find out the web page charset encoding
            String charset = null;
            String topHtml = new String(b).toLowerCase();
            int n1 = topHtml.indexOf("<meta http-equiv=\"content-type\"");
            if (n1 > -1) {
                int n2 = topHtml.indexOf('>', n1);
                if (n2 > -1) {
                    String tag = topHtml.substring(n1, n2);
                    n1 = tag.indexOf("charset");
                    if (n1 > -1) {
                        tag = tag.substring(n1);
                        n1 = tag.indexOf('=');
                        if (n1 > -1) {
                            tag = tag.substring(n1+1);
                            tag = tag.trim();
                            n1 = tag.indexOf('\"');
                            if (n1 < 0)
                                n1 = tag.indexOf('\'');
                            if (n1 > -1) {
                                charset = tag.substring(0, n1).trim();
                            }
                        }
                    }
                }
            }

            if (charset != null)
                topHtml = new String(b, charset);
            else
                topHtml = new String(b);
            String baseUrl = new String(getElBytes(ar, "url"));
            v.loadDataWithBaseURL(baseUrl, topHtml, "text/html", "UTF-8", null);
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        }
        return true;
    }

    private class WebClient extends WebViewClient {
        @Override
        public WebResourceResponse shouldInterceptRequest(WebView view, String url) {
            if (!myLoadingArchive)
                return null;
            int n = urlList.indexOf(url);
            if (n < 0)
                return null;
            Element parentEl = urlNodes.get(n);
            byte [] b = getElBytes(parentEl, "mimeType");
            String mimeType = b == null ? "text/html" : new String(b);
            b = getElBytes(parentEl, "textEncoding");
            String encoding = b == null ? "UTF-8" : new String(b);
            b = getElBytes(parentEl, "data");
            return new WebResourceResponse(mimeType, encoding, new ByteArrayInputStream(b));
        }

        @Override
        public void onPageFinished(WebView view, String url)
        {
            // our WebClient is no longer needed in view
            view.setWebViewClient(null);
            myLoadingArchive = false;
            onFinished(myWebView);
        }
    }
}

Here is how to use this class, sample MyActivity.java class:

package com.hyperionics.war_test;

import android.app.Activity;
import android.os.Bundle;
import android.webkit.WebView;
import android.webkit.WebViewClient;

import java.io.IOException;
import java.io.InputStream;

public class MyActivity extends Activity {

    // Sample WebViewClient in case it was needed...
    // See continueWhenLoaded() sample function for the best place to set it on our webView
    private class MyWebClient extends WebViewClient {
        @Override
        public void onPageFinished(WebView view, String url)
        {
            Lt.d("Web page loaded: " + url);
        }
    }

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        WebView webView = (WebView) findViewById(R.id.webView);
        try {
            InputStream is = getAssets().open("TestHtmlArchive.xml");
            WebArchiveReader wr = new WebArchiveReader() {
                void onFinished(WebView v) {
                    // we are notified here when the page is fully loaded.
                    continueWhenLoaded(v);
                }
            };
            // To read from a file instead of an asset, use:
            // FileInputStream is = new FileInputStream(fileName);
            if (wr.readWebArchive(is)) {
                wr.loadToWebView(webView);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    void continueWhenLoaded(WebView webView) {
        Lt.d("Page from WebArchive fully loaded.");
        // If you need to set your own WebViewClient, do it here,
        // after the WebArchive was fully loaded:
        webView.setWebViewClient(new MyWebClient());
        // Any other code we need to execute after loading a page from a WebArchive...
    }
}

To make things complete, here is my little Lt.java class for debug output:

package com.hyperionics.war_test;

import android.util.Log;

public class Lt {
    private static String myTag = "war_test";
    private Lt() {}
    static void setTag(String tag) { myTag = tag; }
    public static void d(String msg) {
        // Comment out the line below to turn off debug output
        Log.d(myTag, msg == null ? "(null)" : msg);
    }
    public static void df(String msg) {
        // Forced output, do not comment out - for exceptions etc.
        Log.d(myTag, msg == null ? "(null)" : msg);
    }
}

Hope this is helpful.
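The archive layout that the reader walks can also be explored off-device. The following is a rough sketch rather than the author's code: it indexes every resource by its decoded URL, relying only on the element names that appear in the reader above (url and data inside a resource element, both Base64-encoded). It uses java.util.Base64 (Java 8+) in place of android.util.Base64 so that it runs on a desktop JVM, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class ArchiveIndexSketch {
    /** Maps each decoded resource URL to the decoded bytes of its data element. */
    static Map<String, byte[]> index(byte[] archiveXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(archiveXml));
            Map<String, byte[]> byUrl = new LinkedHashMap<>();
            NodeList urls = doc.getElementsByTagName("url");
            for (int i = 0; i < urls.getLength(); i++) {
                Element urlEl = (Element) urls.item(i);
                // The enclosing resource element also holds mimeType, data, etc.
                Element resource = (Element) urlEl.getParentNode();
                String url = new String(decode(urlEl), StandardCharsets.UTF_8);
                byUrl.put(url, decode(child(resource, "data")));
            }
            return byUrl;
        } catch (Exception e) {
            throw new IllegalStateException("Not a readable XML web archive", e);
        }
    }

    /** Returns the first direct child element with the given name, or null. */
    private static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Base64-decodes an element's text; the MIME decoder tolerates line breaks. */
    private static byte[] decode(Element el) {
        if (el == null) {
            return new byte[0];
        }
        return Base64.getMimeDecoder().decode(el.getTextContent().trim());
    }
}
```

A map keyed by URL plays the same role as the parallel urlList and urlNodes lists in the reader, and gives the lookup in shouldInterceptRequest() a single step.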

Update July 19, 2013

Some web pages don't have a meta tag specifying the text encoding, and then the code shown above does not display the characters correctly. In the GitHub version of this code I have now added a charset detection algorithm, which guesses the encoding in such cases. Again, see https://github.com/gregko/WebArchiveReader
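For illustration only, here is a compact way to express the meta-tag lookup with an explicit fallback. This is not the detection algorithm from the GitHub project: it only reads a declared charset and otherwise returns the fallback that the caller supplies. Unlike the hand-written parser above, it also accepts the short `<meta charset=...>` form. The class name, the regular expression and the 4096-byte scan window are assumptions of this sketch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetaCharsetSketch {
    // Matches charset=NAME inside a <meta ...> tag, with or without quotes.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the charset declared in a meta tag, or the fallback if none is usable. */
    static Charset detect(byte[] html, Charset fallback) {
        // Only the start of the page is scanned; ISO-8859-1 keeps ASCII intact.
        int len = Math.min(html.length, 4096);
        String head = new String(html, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Unknown or malformed charset name: fall through to the fallback.
            }
        }
        return fallback;
    }

    /** Decodes the page bytes, defaulting to UTF-8 when nothing is declared. */
    static String decode(byte[] html) {
        return new String(html, detect(html, StandardCharsets.UTF_8));
    }
}
```

Pages that declare nothing still need real detection, which is the case the GitHub version addresses.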

Greg


edited Feb 21 at 22:27

answered Nov 18 '12 at 23:34

gregko

Fantastic! I hoped there would be a built in solution, but lacking that, this seems a solid alternative. Thanks! – Jouke Waleson Nov 20 '12 at 7:12



Thank you, Jouke! I'm glad I was able to post something useful. I also uploaded the sample project to GitHub: github.com/gregko/WebArchiveReader – gregko Nov 25 '12 at 13:39



Thanks for your answer, but saveWebArchive is only available from API 11, and I want to support saving web pages from API 8 and higher. Please help me if you have any solution for this. I saw that your code works fine. – Antarix May 14 '13 at 10:17



Actually my code posted above does not use saveWebArchive(), so it could be used even under API 8 to read archives saved elsewhere. I don't have a good solution for saveWebArchive() under older platforms, other than look at Android WebView source code to see if you could copy and adapt the code for this function for older platforms yourself. If you succeed, please post it for all of us as well! –  gregko May 14 '13 at 16:02



@gregko Thanks, your Git project is really helpful, you saved me a lot of time :) – Tombeau Nov 28 '13 at 4:36


Answer (7 votes)

I've found an undocumented way of reading a saved web archive. Just do: String raw_data = (read the mywebarchivefile as a string) and then call

webview.loadDataWithBaseURL(mywebarchivefile, raw_data, "application/x-webarchive-xml", "UTF-8", null);

The reference: http://androidxref.com/4.0.4/xref/external/webkit/Source/WebCore/loader/archive/ArchiveFactory.cpp

Available from Android 3.0, API level 11.
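The "read the file as a string" step is left open in this answer. One possible way to fill it in is sketched below, assuming the archive is UTF-8 text; the class and method names are hypothetical. It sticks to java.io classes that exist on old API levels, and try-with-resources assumes a Java 7 or newer toolchain.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ArchiveTextLoader {
    /** Reads the whole file and decodes it as UTF-8 (an assumption of this sketch). */
    static String readAsString(File file) {
        try (InputStream in = new FileInputStream(file)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            // Copy until read() signals end of stream with -1.
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        } catch (IOException e) {
            // Wrapped so that callers are not forced to handle a checked exception.
            throw new RuntimeException("Could not read " + file, e);
        }
    }
}
```

The returned string is what the call above takes as raw_data. Large archives are held fully in memory this way, which may matter on low-end devices.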


answered Nov 20 '13 at 0:06

Tomasz Jarosik
This worked like a charm for me. Thanks for this. –  Espen Riskedal Dec 14 '13 at 16:31


Reposted from: http://stackoverflow.com/questions/12872043/android-webview-display-webarchive
