文档章节

Android WebView -> Display WebArchive

simpower
 simpower
发布于 2014/10/09 12:09
字数 1319
阅读 504
收藏 0
点赞 0
评论 0

Android's WebView has this saveWebArchive method since API level 11: http://developer.android.com/.

It can save entire websites as webarchives, which is great! But how do I get the downloaded contents back into a webview? I tried

webview.loadUrl(Uri.fromFile(mywebarchivefile));

But that only displays xml on the screen.

android webview webarchive

share|improve this question

asked Oct 13 '12 at 10:02

Jouke Waleson
322211



add a comment

2 Answers

activeoldestvotes

up vote16down voteaccepted

Update Feb. 21, 2014

My answer posted below does not apply to web archive files saved under Android 4.4 KitKat and newer. The saveWebArchive() method of WebView under Android 4.4 "KitKat" (and probably newer versions too) does not save the web archive in XML code that this reader code posted below. Instead it saves pages in MHT (MHTML) format. It is easy to read back the .mht files - just use:

webView.loadUrl("file:///my_dir/mySavedWebPage.mht");

That's all, much easier than the previous method, and compatible with other platforms.

Previously posted

I needed it myself, and everywhere I searched, there were unanswered questions like this. So I had to work it out myself. Below is my little WebArchiveReader class and sample code on how to use it. Please note that despite the Android docs declaring that shouldInterceptRequest() was added to WebViewClient in API11 (Honeycomb), this code works and was tested successfully in Android emulators down to API8 (Froyo). Below is all the code that's needed, I also uploaded the full project to GitHub repository athttps://github.com/gregko/WebArchiveReader

File WebArchiveReader.java:

package com.hyperionics.war_test;import android.util.Base64;import android.webkit.WebResourceResponse;import android.webkit.WebView;import android.webkit.WebViewClient;import org.w3c.dom.*;import javax.xml.parsers.DocumentBuilder;import javax.xml.parsers.DocumentBuilderFactory;import javax.xml.parsers.ParserConfigurationException;import java.io.ByteArrayInputStream;import java.io.InputStream;import java.util.ArrayList;public abstract class WebArchiveReader {
    private Document myDoc = null;
    private static boolean myLoadingArchive = false;
    private WebView myWebView = null;
    private ArrayList<String> urlList = new ArrayList<String>();
    private ArrayList<Element> urlNodes = new ArrayList<Element>();

    abstract void onFinished(WebView webView);

    public boolean readWebArchive(InputStream is) {
        DocumentBuilderFactory builderFactory =
                DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = null;
        myDoc = null;
        try {
            builder = builderFactory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        try {
            myDoc = builder.parse(is);
            NodeList nl = myDoc.getElementsByTagName("url");
            for (int i = 0; i < nl.getLength(); i++) {
                Node nd = nl.item(i);
                if(nd instanceof Element) {
                    Element el = (Element) nd;
                    // siblings of el (url) are: mimeType, textEncoding, frameName, data
                    NodeList nodes = el.getChildNodes();
                    for (int j = 0; j < nodes.getLength(); j++) {
                        Node node = nodes.item(j);
                        if (node instanceof Text) {
                            String dt = ((Text)node).getData();
                            byte[] b = Base64.decode(dt, Base64.DEFAULT);
                            dt = new String(b);
                            urlList.add(dt);
                            urlNodes.add((Element) el.getParentNode());
                        }
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            myDoc = null;
        }
        return myDoc != null;
    }

    private byte [] getElBytes(Element el, String childName) {
        try {
            Node kid = el.getFirstChild();
            while (kid != null) {
                if (childName.equals(kid.getNodeName())) {
                    Node nn = kid.getFirstChild();
                    if (nn instanceof Text) {
                        String dt = ((Text)nn).getData();
                        return Base64.decode(dt, Base64.DEFAULT);
                    }
                }
                kid = kid.getNextSibling();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public boolean loadToWebView(WebView v) {
        myWebView = v;
        v.setWebViewClient(new WebClient());
        WebSettings webSettings = v.getSettings();
        webSettings.setDefaultTextEncodingName("UTF-8");

        myLoadingArchive = true;
        try {
            // Find the first ArchiveResource in myDoc, should be <ArchiveResource>
            Element ar = (Element) myDoc.getDocumentElement().getFirstChild().getFirstChild();
            byte b[] = getElBytes(ar, "data");

            // Find out the web page charset encoding
            String charset = null;
            String topHtml = new String(b).toLowerCase();
            int n1 = topHtml.indexOf("<meta http-equiv=\"content-type\"");
            if (n1 > -1) {
                int n2 = topHtml.indexOf('>', n1);
                if (n2 > -1) {
                    String tag = topHtml.substring(n1, n2);
                    n1 = tag.indexOf("charset");
                    if (n1 > -1) {
                        tag = tag.substring(n1);
                        n1 = tag.indexOf('=');
                        if (n1 > -1) {
                            tag = tag.substring(n1+1);
                            tag = tag.trim();
                            n1 = tag.indexOf('\"');
                            if (n1 < 0)
                                n1 = tag.indexOf('\'');
                            if (n1 > -1) {
                                charset = tag.substring(0, n1).trim();
                            }
                        }
                    }
                }
            }

            if (charset != null)
                topHtml = new String(b, charset);
            else
                topHtml = new String(b);
            String baseUrl = new String(getElBytes(ar, "url"));
            v.loadDataWithBaseURL(baseUrl, topHtml, "text/html", "UTF-8", null);
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        }
        return true;
    }

    private class WebClient extends WebViewClient {
        @Override
        public WebResourceResponse shouldInterceptRequest(WebView view, String url) {
            if (!myLoadingArchive)
                return null;
            int n = urlList.indexOf(url);
            if (n < 0)
                return null;
            Element parentEl = urlNodes.get(n);
            byte [] b = getElBytes(parentEl, "mimeType");
            String mimeType = b == null ? "text/html" : new String(b);
            b = getElBytes(parentEl, "textEncoding");
            String encoding = b == null ? "UTF-8" : new String(b);
            b = getElBytes(parentEl, "data");
            return new WebResourceResponse(mimeType, encoding, new ByteArrayInputStream(b));
        }

        @Override
        public void onPageFinished(WebView view, String url)
        {
            // our WebClient is no longer needed in view
            view.setWebViewClient(null);
            myLoadingArchive = false;
            onFinished(myWebView);
        }
    }}

Here is how to use this class, sample MyActivity.java class:

package com.hyperionics.war_test;import android.app.Activity;import android.os.Bundle;import android.webkit.WebView;import android.webkit.WebViewClient;import java.io.IOException;import java.io.InputStream;public class MyActivity extends Activity {

    // Sample WebViewClient in case it was needed...
    // See continueWhenLoaded() sample function for the best place to set it on our webView
    private class MyWebClient extends WebViewClient {
        @Override
        public void onPageFinished(WebView view, String url)
        {
            Lt.d("Web page loaded: " + url);
        }
    }

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        WebView webView = (WebView) findViewById(R.id.webView);
        try {
            InputStream is = getAssets().open("TestHtmlArchive.xml");
            WebArchiveReader wr = new WebArchiveReader() {
                void onFinished(WebView v) {
                    // we are notified here when the page is fully loaded.
                    continueWhenLoaded(v);
                }
            };
            // To read from a file instead of an asset, use:
            // FileInputStream is = new FileInputStream(fileName);
            if (wr.readWebArchive(is)) {
                wr.loadToWebView(webView);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    void continueWhenLoaded(WebView webView) {
        Lt.d("Page from WebArchive fully loaded.");
        // If you need to set your own WebViewClient, do it here,
        // after the WebArchive was fully loaded:
        webView.setWebViewClient(new MyWebClient());
        // Any other code we need to execute after loading a page from a WebArchive...
    }}

To make things complete, here is my little Lt.java class for debug output:

package com.hyperionics.war_test;import android.util.Log;public class Lt {
    private static String myTag = "war_test";
    private Lt() {}
    static void setTag(String tag) { myTag = tag; }
    public static void d(String msg) {
        // Uncomment line below to turn on debug output
        Log.d(myTag, msg == null ? "(null)" : msg);
    }
    public static void df(String msg) {
        // Forced output, do not comment out - for exceptions etc.
        Log.d(myTag, msg == null ? "(null)" : msg);
    }}

Hope this is helpful.

Update July 19, 2013

Some web pages don't have meta tag specifying text encoding, and then the code we show above does not display the characters correctly. In the GitHub version of this code I now added charset detection algorithm, which guesses the encoding in such cases. Again, seehttps://github.com/gregko/WebArchiveReader

Greg

share|improve this answer

edited Feb 21 at 22:27

answered Nov 18 '12 at 23:34

gregko
1,19541330


3

Fantastic! I hoped there would be a built in solution, but lacking that, this seems a solid alternative. Thanks!–  Jouke Waleson Nov 20 '12 at 7:12



Thank you, Jouke! I'm glad I was able to post something useful. I also uploaded the sample project to github:github.com/gregko/WebArchiveReader –  gregko Nov 25 '12 at 13:39



thanks for your answer but saveWebArchive is available from api 11, i want to support save webpage form api 8 and higher. Please help me if you have any solution regarding this. I saw your code works fine – Antarix May 14 '13 at 10:17



Actually my code posted above does not use saveWebArchive(), so it could be used even under API 8 to read archives saved elsewhere. I don't have a good solution for saveWebArchive() under older platforms, other than look at Android WebView source code to see if you could copy and adapt the code for this function for older platforms yourself. If you succeed, please post it for all of us as well! –  gregko May 14 '13 at 16:02



@gregko thanks your git project is really helpful,you saved my lot of time :) –  Tombeau Nov 28 '13 at 4:36 

show 6 more comments

up vote7down vote

I've found an undocumented way of reading saved webarchive. Just do: String raw_data = (read the mywebarchivefile as a string) and then call

webview.loadDataWithBaseURL(mywebarchivefile, raw_data, "application/x-webarchive-xml", "UTF-8", null);

The reference:http://androidxref.com/4.0.4/xref/external/webkit/Source/WebCore/loader/archive/ArchiveFactory.cpp

Available from Android 3.0, api level 11.

share|improve this answer

answered Nov 20 '13 at 0:06

Tomasz Jarosik
8113




This worked like a charm for me. Thanks for this. –  Espen Riskedal Dec 14 '13 at 16:31


本文转载自:http://stackoverflow.com/questions/12872043/android-webview-display-webarchive

共有 人打赏支持
simpower
粉丝 24
博文 456
码字总数 21045
作品 0
海淀
程序员
Android UI开发之WebView简单使用

If you want to deliver a web application (or just a web page) as a part of a client application, you can do it using WebView. The WebView class is an extension of Android's View......

秋风醉了
2014/06/17
0
0
如何获取WebView的内容宽度[翻译]

原文网址:http://android.pimmos.com/2011/03/24/how-to-retrieve-the-contentwidth-of-a-webview/ The extensive Android SDK allows you to do many great things with particular views ......

拉风的道长
2013/04/23
0
5
Android的WebView与ProgressDialog结合

WebView组件支持直接加载网页,可以将其视为一个浏览器,要实现该功能,具体步骤如下: webview.xml <LinearLayout xmlns:android="http://schemas.android.com/apk/res/android" android:o...

墙头草
2011/08/05
0
0
android WebView详解

浏览器控件是每个开发环境都具备的,这为马甲神功提供了用武之地,windows的有webbrowser,android和ios都有webview。只是其引擎不同,相对于微软的webbrowser,android及ios的webview的引擎...

amigos_wu
2012/06/25
0
1
android webView使用方法

在开发过程中应该注意几点: 1.AndroidManifest.xml中必须使用许可"android.permission.INTERNET",否则会出Web page not available错误。 2.如果访问的页面中有Javascript,则webview必须设置...

QGlaunch
2015/06/17
0
0
Android Hybrid开发:这是一份详细 & 全面的WebView学习攻略

前言 现在很多里都内置了Web网页(),比如说很多电商平台,淘宝、京东、聚划算等等,如下图 那么这种该如何实现呢?其实这是里一个叫组件实现 今天,我将献上一份全面 & 详细的 攻略,含具体...

Carson_Ho
06/19
0
0
Android WebView Memory Leak WebView内存泄漏

在这次开发过程中,需要用到webview展示一些界面,但是加载的页面如果有很多图片就会发现内存占用暴涨,并且在退出该界面后,即使在包含该webview的Activity的destroy()方法中,使用webview...

Drealin
2013/01/07
0
26
WebView使用总结(应用函数与JS函数互相调用)

1.当只用WebView的时候,最先注意的当然是在配置文件中添加访问因特网的权限; 2.如果访问的页面中有Javascript,必须设置支持Javascript: Java代码 3.如果希望点击链接由自己处理而不是新开And...

带梦想一7飞
2013/01/04
0
0
使用Kotlin:让Android与JS交互的详解

先来说说什么是JS交互: 说的俗一点就是通过我们项目中的控件来调用HTML里的JS代码,也可以通过JS来调用项目中的代码。 Android与JS之间的桥梁就是WebView了,我们是通过WebView来实现他们的...

富江___
06/11
0
0
Android WebView:这是一份 详细 & 易懂的WebView学习攻略(含与JS交互、缓存构建等)

前言 现在很多里都内置了Web网页(),比如说很多电商平台,淘宝、京东、聚划算等等,如下图 那么这种该如何实现呢?其实这是里一个叫组件实现 今天,我将献上一份全面 & 详细的 攻略,含具体...

Carson_Ho
05/21
0
0

没有更多内容

加载失败,请刷新页面

加载更多

下一页

Docker Mac (三) Dockerfile 及命令

Dockerfile 最近学习docker的时候,遇到一件怪事,关于docker镜像可能会被破坏,还不知道它会有此措施 所以需要了解构建Dockerfile的正确方法 Dockerfile是由一系列命令和参数构成的脚本,这些命...

___大侠
24分钟前
0
0
NetCat Tutorials

Hacking with Netcat part 1: The Basics Hacking with Netcat part 2: Bind and reverse shells Hacking with Netcat part 3: Advanced Techniques 10 Introduction to Netcat - pdf NetCat......

zungyiu
24分钟前
0
0
Android Studio+NDK+Cmake 移植FFmpeg-4.0.2命令行工具

一、编译 参考大神的帖子,亲测一次编译成功:https://blog.csdn.net/bobcat_kay/article/details/80889398 鉴于以前查文档的经验,这里附上编写例子的时间:2018年7月22日 我用的是ubantu,...

她叫我小渝
25分钟前
0
0
mysql创建数据库

登录MYSQL mysql -u root -p 脚本创建数据库WeChat,并制定默认的字符集是utf8mb4。 CREATE DATABASE Wechat DEFAULT CHARSET utf8mb4 COLLATE utf8mb4_general_ci; 授权 grant all......

niithub
39分钟前
0
0
svn: Unable to connect to a repository URL 的解决方案

错误图示: 解决办法:清除本地保存的授权信息; 1:右键点击本地文件夹,选择设置; TortoiseSVN -> Settings 2:在弹出的对话框中选择 Saved Data, 右侧选择:授权地方清理所有。 然后点确...

宁哥实战课堂
今天
1
0
sleep与wait的区别

Thread.sleep(XXX)方法消耗CPU吗? 这个知识点是我之前认识一直有错误的一个知识点,在我以前的认识里面,我一直认为Thread.sleep(1000)的这一秒钟的时间内,线程的休眠是一直占用着CPU的时间...

码代码的小司机
今天
1
0
20位活跃在Github上的国内技术大牛 leij 何小鹏 亚信

本文列举了20位在Github上非常活跃的国内大牛,看看其中是不是很多熟悉的面孔? 1. lifesinger(玉伯) Github主页: https://github.com/lifesinger 微博:@ 玉伯也叫射雕 玉伯(王保平),...

海博1600
今天
1
0
Mybatis收集配置

一、Mybatis取Clob数据 1、Mapper.xml配置 <resultMap type="com.test.User" id="user"> <result column="id" property="id"/> <result column="json_data" property="jsonData" ......

星痕2018
今天
1
0
centos7设置以多用户模式启动

1、旧版本linux系统修改inittab文件,在新版本执行vi /etc/inittab 会有以下提示 # inittab is no longer used when using systemd. # # ADDING CONFIGURATION HERE WILL HAVE NO EFFECT ON......

haha360
今天
1
0
OSChina 周日乱弹 —— 局长:怕你不爱我

Osc乱弹歌单(2018)请戳(这里) 【今日歌曲】 @ andonny :分享周二珂的单曲《孤独她呀》 《孤独她呀》- 周二珂 手机党少年们想听歌,请使劲儿戳(这里) @孤星闵月 :没事干,看一遍红楼梦...

小小编辑
今天
395
12

没有更多内容

加载失败,请刷新页面

加载更多

下一页

返回顶部
顶部