1. Problem description
When a Spark job updates HBase, the tasks throw an exception.
2. Error stack trace
11-11 11:40:12 [WARN] [scheduler.TaskSetManager(71)] Lost task 26.0 in stage 0.0 (TID 26, 10.0.200.16): java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.Put.addColumn([B[BJ[B)Lorg/apache/hadoop/hbase/client/Put;
at com.blueview.spark.hbase.ClusterUpdateHbJob$1.call(ClusterUpdateHbJob.java:49)
at com.blueview.spark.hbase.ClusterUpdateHbJob$1.call(ClusterUpdateHbJob.java:41)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreach$1.apply(JavaRDDLike.scala:327)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreach$1.apply(JavaRDDLike.scala:327)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:870)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$28.apply(RDD.scala:870)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
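For context, the call at ClusterUpdateHbJob.java:49 corresponds to code roughly like the sketch below, reconstructed from the stack trace; the original source is not shown in this post. The table name, column family, and values are assumptions; only the Put.addColumn call matters here:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.VoidFunction;

public class ClusterUpdateHbJobSketch {
    public static void update(JavaRDD<String> rowKeys) {
        // foreach runs inside executor tasks, which is why the
        // NoSuchMethodError surfaces as "Lost task ..." on a worker node
        rowKeys.foreach(new VoidFunction<String>() {
            @Override
            public void call(String rowKey) throws Exception {
                Configuration conf = HBaseConfiguration.create();
                // A real job would open the connection once per partition
                // (foreachPartition); per-record is kept here for brevity
                try (Connection conn = ConnectionFactory.createConnection(conf);
                     Table table = conn.getTable(TableName.valueOf("demo_table"))) {
                    Put put = new Put(Bytes.toBytes(rowKey));
                    // This is the call the JVM fails to resolve:
                    // Put.addColumn([B[BJ[B)Lorg/apache/hadoop/hbase/client/Put;
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                            System.currentTimeMillis(), Bytes.toBytes("value"));
                    table.put(put);
                }
            }
        });
    }
}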
3. Cause
A jar conflict. It is definitely a jar conflict, definitely.
4. Resolution
a. Confirm that org/apache/hadoop/hbase/client/Put is present in hbase-client-1.2.2.jar
b. Confirm that hbase-client-1.2.2.jar is actually loaded into the JVM
c. Print the jar the class was loaded from, as follows, to see where it really comes from (a reusable helper based on the same idea appears after this list)
ProtectionDomain pd = org.apache.hadoop.hbase.client.Put.class.getProtectionDomain();
LOG.info("where=====================================================================");
LOG.info("where===" + pd);
LOG.info("where=====================================================================");
The output shows that org/apache/hadoop/hbase/client/Put was loaded from /opt/software/spark/lib/spark-examples-1.4.1-hadoop2.6.0.jar
d. Download spark-examples-1.4.1-hadoop2.6.0.jar and decompile it. Sure enough, it contains org/apache/hadoop/hbase/client/Put, and the problem is located.
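The check from step c can be packaged into a small standalone helper that reports the jar any class was loaded from via its CodeSource. This is a sketch of the idea rather than code from the original job; the class name JarLocator and its method are made up for illustration:

import java.security.CodeSource;

public final class JarLocator {
    private JarLocator() {}

    /** Returns the jar (or directory) a class was loaded from. */
    public static String locate(Class<?> clazz) {
        CodeSource cs = clazz.getProtectionDomain().getCodeSource();
        if (cs == null || cs.getLocation() == null) {
            // Classes loaded by the bootstrap classloader
            // (e.g. java.lang.String) have no code source
            return clazz.getName() + " <- bootstrap classloader, no code source";
        }
        return clazz.getName() + " <- " + cs.getLocation();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class.forName by name means this compiles without HBase on the
        // compile classpath; at runtime it reports the winning jar
        System.out.println(locate(Class.forName("org.apache.hadoop.hbase.client.Put")));
    }
}

Running it on a cluster node with the job's classpath immediately shows whether Put resolves to hbase-client-1.2.2.jar or to a stowaway copy in another jar.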
5. Root cause
Why does this happen?
The client side has a script for submitting applications, and the script itself puts every jar the application needs on the classpath, for example hbase-client-1.2.2.jar, zookeeper-3.4.5.jar, and so on.
While the job runs, these jars are shipped to every physical machine in the cluster. At the same time, the jars under the lib directory of the Spark installation are also loaded into the JVM. As it happens, spark-examples-1.4.1-hadoop2.6.0.jar bundles a copy of org/apache/hadoop/hbase/client/Put, and hbase-client-1.2.2.jar contains it as well, so as far as the JVM is concerned the two conflict: the class is resolved from whichever jar comes first on the classpath. Here the copy inside the examples jar wins, and since it was built against an older HBase client whose Put has no addColumn(byte[], byte[], long, byte[]) method, the call fails at runtime with NoSuchMethodError.
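To make the version mismatch concrete: in HBase clients before 1.0 the four-argument mutation on Put was spelled add(...), and addColumn(...) only exists from the 1.x client line onward (this version boundary is stated from memory, not from the original post). A minimal sketch of the two call shapes, compiled against hbase-client-1.2.2, which still carries the deprecated old form:

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutApiSketch {
    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        byte[] cf = Bytes.toBytes("cf");
        byte[] col = Bytes.toBytes("col");
        byte[] val = Bytes.toBytes("value");
        long ts = System.currentTimeMillis();

        Put put = new Put(Bytes.toBytes("row1"));

        // HBase 1.x client (e.g. hbase-client-1.2.2.jar): addColumn is what
        // the job compiles against -- descriptor ([B[BJ[B)Put, exactly the
        // method the NoSuchMethodError names
        put.addColumn(cf, col, ts, val);

        // Pre-1.0 clients, like the copy bundled in
        // spark-examples-1.4.1-hadoop2.6.0.jar, only offer add(...) with the
        // same arguments; when that older Put class wins on the classpath,
        // the addColumn call above cannot be resolved at runtime
        put.add(cf, col, ts, val);
    }
}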
End of analysis.