
HBase Troubleshooting and Repair

Adel | Published 2016/05/30 11:11

General Guidelines

Always start with the master's log. Normally it just repeats the same lines over and over. If it does not, something is wrong; Google the exception you are seeing, or search for it on search-hadoop.com.

Errors rarely appear in isolation in HBase. Usually one thing goes wrong somewhere and triggers a flood of exceptions and stack traces everywhere else. The best way to handle such errors is to walk back up the log and find the original exception. For example, a region server prints some metrics as it exits; grepping around that dump should turn up the original exception.
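For example, a quick way to walk back to the first exception in a RegionServer log (a sketch; adjust the path to the layout listed in the Logs section below):

grep -n "Exception" $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log | head -20    # earliest hits first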

It is "normal" for region servers to kill themselves: when something goes wrong, they commit suicide. If ulimit and xcievers have not been raised, HDFS cannot function properly, and as far as HBase is concerned, HDFS is dead. Imagine what MySQL would do if it suddenly could not access its filesystem; the same thing happens to HBase on top of HDFS. Another common cause of region server seppuku is a GC pause that runs longer than the ZooKeeper session timeout.
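As a hedged illustration of the knobs involved (property names and values vary by Hadoop/HBase version; treat these as examples, not exact settings):

ulimit -n    # file-descriptor limit for the HBase/HDFS user; the default 1024 is far too low
# In hdfs-site.xml, raise the datanode xciever count, e.g.:
#   <property><name>dfs.datanode.max.xcievers</name><value>4096</value></property>
# In hbase-site.xml, a longer ZooKeeper session gives GC pauses more headroom, e.g.:
#   <property><name>zookeeper.session.timeout</name><value>120000</value></property>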

Logs

Locations of the important logs (<user> is the user that started the service, <hostname> is the machine's name):

  • NameNode: $HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log
  • DataNode: $HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log
  • JobTracker: $HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log
  • TaskTracker: $HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log
  • HMaster: $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log
  • RegionServer: $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log
  • ZooKeeper: TODO

Log Levels

Enabling RPC-level Logging

Enabling RPC-level logging on a RegionServer can often give insight into timings at the server. Once enabled, the amount of log spewed is voluminous. It is not recommended that you leave this logging on for more than short bursts of time. To enable RPC-level logging, browse to the RegionServer UI and click on Log Level. Set the log level to DEBUG for the package org.apache.hadoop.ipc (that's right, for hadoop.ipc, NOT hbase.ipc). Then tail the RegionServer's log. Analyze.

To disable, set the logging level back to INFO level.
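If you would rather set this statically than through the UI, the equivalent log4j setting (a sketch; the conf path depends on your install) is:

# In $HBASE_HOME/conf/log4j.properties:
log4j.logger.org.apache.hadoop.ipc=DEBUG
# and back to INFO when you are done:
# log4j.logger.org.apache.hadoop.ipc=INFO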

JVM Garbage Collection Logs

HBase is memory intensive, and using the default GC you can see long pauses in all threads, including the Juliet Pause, aka "GC of Death". To help debug this, or to confirm it is happening, GC logging can be turned on in the Java virtual machine.

To enable, in hbase-env.sh add:

export HBASE_OPTS="-XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/home/hadoop/hbase/logs/gc-hbase.log"

Adjust the log directory to wherever you log. Note: The GC log does NOT roll automatically, so you'll have to keep an eye on it so it doesn't fill up the disk.

At this point you should see logs like so:

64898.952: [GC [1 CMS-initial-mark: 2811538K(3055704K)] 2812179K(3061272K), 0.0007360 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
64898.953: [CMS-concurrent-mark-start]
64898.971: [GC 64898.971: [ParNew: 5567K->576K(5568K), 0.0101110 secs] 2817105K->2812715K(3061272K), 0.0102200 secs] [Times: user=0.07 sys=0.00, real=0.01 secs] 

In this section, the first line indicates a 0.0007360-second pause for the CMS to initially mark. This pauses the entire VM, all threads, for that period of time.

The third line indicates a "minor GC", which pauses the VM for 0.0101110 seconds - aka 10 milliseconds. It has reduced the "ParNew" from about 5.5m to 576k. Later on in this cycle we see:

64901.445: [CMS-concurrent-mark: 1.542/2.492 secs] [Times: user=10.49 sys=0.33, real=2.49 secs] 
64901.445: [CMS-concurrent-preclean-start]
64901.453: [GC 64901.453: [ParNew: 5505K->573K(5568K), 0.0062440 secs] 2868746K->2864292K(3061272K), 0.0063360 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
64901.476: [GC 64901.476: [ParNew: 5563K->575K(5568K), 0.0072510 secs] 2869283K->2864837K(3061272K), 0.0073320 secs] [Times: user=0.05 sys=0.01, real=0.01 secs] 
64901.500: [GC 64901.500: [ParNew: 5517K->573K(5568K), 0.0120390 secs] 2869780K->2865267K(3061272K), 0.0121150 secs] [Times: user=0.09 sys=0.00, real=0.01 secs] 
64901.529: [GC 64901.529: [ParNew: 5507K->569K(5568K), 0.0086240 secs] 2870200K->2865742K(3061272K), 0.0087180 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
64901.554: [GC 64901.555: [ParNew: 5516K->575K(5568K), 0.0107130 secs] 2870689K->2866291K(3061272K), 0.0107820 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 
64901.578: [CMS-concurrent-preclean: 0.070/0.133 secs] [Times: user=0.48 sys=0.01, real=0.14 secs] 
64901.578: [CMS-concurrent-abortable-preclean-start]
64901.584: [GC 64901.584: [ParNew: 5504K->571K(5568K), 0.0087270 secs] 2871220K->2866830K(3061272K), 0.0088220 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
64901.609: [GC 64901.609: [ParNew: 5512K->569K(5568K), 0.0063370 secs] 2871771K->2867322K(3061272K), 0.0064230 secs] [Times: user=0.06 sys=0.00, real=0.01 secs] 
64901.615: [CMS-concurrent-abortable-preclean: 0.007/0.037 secs] [Times: user=0.13 sys=0.00, real=0.03 secs] 
64901.616: [GC[YG occupancy: 645 K (5568 K)]64901.616: [Rescan (parallel) , 0.0020210 secs]64901.618: [weak refs processing, 0.0027950 secs] [1 CMS-remark: 2866753K(3055704K)] 2867399K(3061272K), 0.0049380 secs] [Times: user=0.00 sys=0.01, real=0.01 secs] 
64901.621: [CMS-concurrent-sweep-start]

The first line indicates that the CMS concurrent mark (finding garbage) took 2.4 seconds. But this is a concurrent 2.4 seconds; Java has not been paused at any point in time.

There are a few more minor GCs, then there is a pause at the second-to-last line:

64901.616: [GC[YG occupancy: 645 K (5568 K)]64901.616: [Rescan (parallel) , 0.0020210 secs]64901.618: [weak refs processing, 0.0027950 secs] [1 CMS-remark: 2866753K(3055704K)] 2867399K(3061272K), 0.0049380 secs] [Times: user=0.00 sys=0.01, real=0.01 secs] 

The pause here is 0.0049380 seconds (aka 4.9 milliseconds) to 'remark' the heap.

At this point the sweep starts, and you can watch the heap size go down:

64901.637: [GC 64901.637: [ParNew: 5501K->569K(5568K), 0.0097350 secs] 2871958K->2867441K(3061272K), 0.0098370 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
...  lines removed ...
64904.936: [GC 64904.936: [ParNew: 5532K->568K(5568K), 0.0070720 secs] 1365024K->1360689K(3061272K), 0.0071930 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
64904.953: [CMS-concurrent-sweep: 2.030/3.332 secs] [Times: user=9.57 sys=0.26, real=3.33 secs] 

At this point, the CMS sweep took 3.332 seconds, and the heap went from about 2.8 GB down to about 1.3 GB.

The key point here is to keep all these pauses low. CMS pauses are always low, but if your ParNew starts growing, you can see minor GC pauses approach 100ms, exceed 100ms, and hit as high as 400ms.

This can be due to the size of the ParNew, which should be relatively small. If your ParNew is very large after running HBase for a while (in one example, a ParNew of about 150MB), then you might have to constrain its size (the larger it is, the longer the collections take, but if it is too small, objects are promoted to the old generation too quickly). In the example below we constrain the new gen size to 64m.

Add this to HBASE_OPTS:

export HBASE_OPTS="-XX:NewSize=64m -XX:MaxNewSize=64m <cms options from above> <gc logging options from above>"

jstack

jstack is one of the most important Java tools (aside from reading the logs) for seeing what a particular Java process is doing. Use jps to find the process id first, then run jstack on it. It prints the threads in the order they were created, along with what each thread is doing.
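For example (the pid 12345 is hypothetical):

jps -l                                    # list running JVMs and their pids
jstack 12345 > /tmp/regionserver.jstack   # dump every thread's stack for that pid
less /tmp/regionserver.jstack             # look for BLOCKED threads or long-running handlers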

ScannerTimeoutException or UnknownScannerException

These occur when the RPC from the client to the RegionServer times out. For example, if Scan.setCaching is set to 500, the RPC fetches 500 rows of data at a time, one fetch for every 500 .next() calls. Because data is shipped to the client in such large chunks, a timeout can occur. Lowering the setCaching value is one fix, but setting it too low hurts performance.
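For instance, in the hbase shell you can lower the caching for a single scan (a sketch; the table name and value are illustrative), or call Scan.setCaching with a smaller value in client code:

scan 'mytable', {CACHE => 100}    # fetch 100 rows per RPC instead of 500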

Secure Client Cannot Connect

([Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]) There can be several causes that produce this symptom.

First, check that you have a valid Kerberos ticket. One is required in order to set up communication with a secure Apache HBase cluster. Examine the ticket currently in the credential cache, if any, by running the klist command line utility. If no ticket is listed, you must obtain a ticket by running the kinit command with either a keytab specified, or by interactively entering a password for the desired principal.
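For example (the principal and keytab path are hypothetical):

klist                                      # show the current ticket cache, if any
kinit user@EXAMPLE.COM                     # obtain a ticket interactively, or:
kinit -kt /etc/security/keytabs/hbase.keytab hbase/host.example.com@EXAMPLE.COM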

Then, consult the Java Security Guide troubleshooting section. The most common problem addressed there is resolved by setting the javax.security.auth.useSubjectCredsOnly system property to false.
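A minimal sketch of applying that property to the HBase JVM through hbase-env.sh:

export HBASE_OPTS="$HBASE_OPTS -Djavax.security.auth.useSubjectCredsOnly=false"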

Because of a change in the format in which MIT Kerberos writes its credentials cache, there is a bug in the Oracle JDK 6 Update 26 and earlier that causes Java to be unable to read the Kerberos credentials cache created by versions of MIT Kerberos 1.8.1 or higher. If you have this problematic combination of components in your environment, to work around this problem, first log in with kinit and then immediately refresh the credential cache with kinit -R. The refresh will rewrite the credential cache without the problematic formatting.
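The workaround as a pair of commands:

kinit       # log in as usual
kinit -R    # refresh immediately; rewrites the cache without the problematic formatting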

Finally, depending on your Kerberos configuration, you may need to install the Java Cryptography Extension, or JCE. Ensure the JCE jars are on the classpath on both server and client systems.

You may also need to download the unlimited strength JCE policy files. Uncompress and extract the downloaded file, and install the policy jars into <java-home>/lib/security.
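A sketch of installing the policy jars (the archive and directory names are hypothetical; the destination is as described above):

unzip jce_policy.zip
cp jce/local_policy.jar jce/US_export_policy.jar <java-home>/lib/security/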

HBase hbck

1. Rebuild the hbase meta table (regenerate meta from the regioninfo files on hdfs):
hbase hbck -fixMeta

2. Re-assign the regions in meta to regionservers (based on meta, hand out its regions to regionservers):
hbase hbck -fixAssignments

When Holes Appear

  • hbase hbck -fixHdfsHoles (create a new region directory)
  • hbase hbck -fixMeta (regenerate meta from the regioninfo files)
  • hbase hbck -fixAssignments (assign the regions to regionservers)
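Before applying any of these fix options, it is worth running hbck in read-only mode first to see what it actually reports:

hbase hbck             # consistency check only, no repairs
hbase hbck -details    # the same check with per-region detail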

Duplicate Regions

Inspect the regions in meta:

scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('INDEX_11')"} 

During a data migration, two duplicate regions turned up:
b0c8f08ffd7a96219f748ef14d7ad4f8, 73ab00eaa7bab7bc83f440549b9749a3

Delete the two duplicate regions:

delete 'hbase:meta','INDEX_11,4380_2431,1429757926776.b0c8f08ffd7a96219f748ef14d7ad4f8.','info:regioninfo'  

delete 'hbase:meta','INDEX_11,5479_0041431700000000040100004815E9,1429757926776.73ab00eaa7bab7bc83f440549b9749a3.','info:regioninfo'  

Delete the two duplicate HDFS directories:

/hbase/data/default/INDEX_11/b0c8f08ffd7a96219f748ef14d7ad4f8
/hbase/data/default/INDEX_11/73ab00eaa7bab7bc83f440549b9749a3
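A sketch of removing those directories with the hdfs CLI:

hdfs dfs -rm -r /hbase/data/default/INDEX_11/b0c8f08ffd7a96219f748ef14d7ad4f8
hdfs dfs -rm -r /hbase/data/default/INDEX_11/73ab00eaa7bab7bc83f440549b9749a3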

Then restart the corresponding regionserver (only to refresh the RIS state reported to the hmaster).

Data loss is certain here: whatever was in the duplicate regions that never came online is gone.

hbck in Newer Versions

(1) The hbase.version file is missing.
Fixed by adding -fixVersionFile.
(2) A region is neither in the META table nor on hdfs, but is in a regionserver's online-region set.
Fixed by adding -fixAssignments.
(3) A region is in the META table and in a regionserver's online-region set, but is missing on hdfs.
Fixed by adding -fixAssignments -fixMeta (-fixAssignments tells the regionserver to close the region; -fixMeta deletes the region's record from META).
(4) A region has no record in META and is not served by any regionserver, but exists on hdfs.
Fixed by adding -fixMeta -fixAssignments (-fixAssignments assigns the region; -fixMeta adds the region's record to META).
(5) A region has no record in META, exists on hdfs, and is being served by a regionserver.
Fixed by adding -fixMeta: the region's record is added to META; the region is first undeployed, then assigned.
(6) A region has a record in META but is missing on hdfs and is not served by any regionserver.
Fixed by adding -fixMeta: the record is deleted from META.
(7) A region has a record in META and exists on hdfs, the table is not disabled, but the region is not being served.
Fixed by adding -fixAssignments: the region is assigned.
(8) A region has a record in META and exists on hdfs, the table is disabled, but the region is being served by some regionserver.
Fixed by adding -fixAssignments: the region is undeployed.
(9) A region has a record in META and exists on hdfs, the table is not disabled, but the region is being served by multiple regionservers.
Fixed by adding -fixAssignments: all the regionservers are told to close the region, and it is then assigned.
(10) A region is in META and on hdfs and should be served, but the regionserver recorded in META does not match the regionserver actually serving it.
Fixed by adding -fixAssignments.
(11) Region holes.
Fixed by adding -fixHdfsHoles: a new empty region is created to fill the hole, but the region is neither assigned nor added to META.
(12) A region has no .regioninfo file on hdfs.
Fixed with -fixHdfsOrphans.
(13) Region overlaps.
Fixed by adding -fixHdfsOverlaps.

Notes:
(1) When fixing region holes, -fixHdfsHoles only creates a new empty region to fill the gap; you still need -fixAssignments -fixMeta to finish the job (-fixAssignments assigns the region, -fixMeta adds the region's record to META). Hence the combined option -repairHoles, which is equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans.
(2) -fixAssignments fixes regions that are unassigned, assigned when they should not be, or assigned more than once.
(3) -fixMeta deletes the record from META when the region is missing on hdfs, and adds the record to META when the region exists on hdfs.
(4) -repair turns on all of the repair options; it is equivalent to -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps.

The newer hbck gathers table and region information from three places, (1) the hdfs directories, (2) META, and (3) the RegionServers, and uses it to diagnose and repair.
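Putting this together, a typical repair session (hedged; run the read-only check first and prefer targeted fixes) looks like:

hbase hbck -details                    # diagnose first
hbase hbck -fixAssignments -fixMeta    # targeted fixes for the cases above
hbase hbck -repair                     # last resort: all repair options at once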
