
Big Data Tutorial (12.4) Hive in Action -- Cascading Sums


In this post I share a cascading-sum reporting case that comes up often in real work. The requirement is as follows:

(1) Given the following visitor visit-count table, t_access_times:

+----------+----------+---------+
| Visitor  | Month    | Visits  |
+----------+----------+---------+
| A        | 2015-01  | 5       |
| A        | 2015-01  | 15      |
| B        | 2015-01  | 5       |
| A        | 2015-01  | 8       |
| B        | 2015-01  | 25      |
| A        | 2015-01  | 5       |
| A        | 2015-02  | 4       |
| A        | 2015-02  | 6       |
| B        | 2015-02  | 10      |
| B        | 2015-02  | 5       |
| ...      | ...      | ...     |
+----------+----------+---------+

(2) The report to produce, t_access_times_accumulate:

+----------+----------+-----------------+--------------------+
| Visitor  | Month    | Monthly visits  | Cumulative visits  |
+----------+----------+-----------------+--------------------+
| A        | 2015-01  | 33              | 33                 |
| A        | 2015-02  | 10              | 43                 |
| ...      | ...      | ...             | ...                |
| B        | 2015-01  | 30              | 30                 |
| B        | 2015-02  | 15              | 45                 |
| ...      | ...      | ...             | ...                |
+----------+----------+-----------------+--------------------+

(3) Implementation steps

First, create the table and load the test data (note that the script names the visit-count column salary; the queries below follow that naming):

create table t_access_times(username string,month string,salary int)
row format delimited fields terminated by ',';

load data local inpath '/home/hadoop/t_access_times.dat' into table t_access_times;

Contents of /home/hadoop/t_access_times.dat:

A,2015-01,5
A,2015-01,15
B,2015-01,5
A,2015-01,8
B,2015-01,25
A,2015-01,5
A,2015-02,4
A,2015-02,6
B,2015-02,10
B,2015-02,5
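
To sanity-check the load, the raw rows can be dumped back out (an optional step, not shown in the session below):

-- Verify that all ten sample rows arrived intact.
select * from t_access_times;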


1. First, compute each user's monthly total:
select username,month,sum(salary) as salary from t_access_times group by username,month;

+-----------+----------+---------+--+
| username  |  month   | salary  |
+-----------+----------+---------+--+
| A         | 2015-01  | 33      |
| A         | 2015-02  | 10      |
| B         | 2015-01  | 30      |
| B         | 2015-02  | 15      |
+-----------+----------+---------+--+

2. Second, join the monthly-totals table to itself (a self join):
select *
from 
(select username,month,sum(salary) as salary from t_access_times group by username,month) A 
inner join 
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username;
+-------------+----------+-----------+-------------+----------+-----------+--+
| a.username  | a.month  | a.salary  | b.username  | b.month  | b.salary  |
+-------------+----------+-----------+-------------+----------+-----------+--+
| A           | 2015-01  | 33        | A           | 2015-01  | 33        |
| A           | 2015-01  | 33        | A           | 2015-02  | 10        |
| A           | 2015-02  | 10        | A           | 2015-01  | 33        |
| A           | 2015-02  | 10        | A           | 2015-02  | 10        |
| B           | 2015-01  | 30        | B           | 2015-01  | 30        |
| B           | 2015-01  | 30        | B           | 2015-02  | 15        |
| B           | 2015-02  | 15        | B           | 2015-01  | 30        |
| B           | 2015-02  | 15        | B           | 2015-02  | 15        |
+-------------+----------+-----------+-------------+----------+-----------+--+
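
Before grouping, it is worth seeing which join rows the running-total condition keeps. As a side exercise (my addition, not part of the original walkthrough), the WHERE filter can be previewed on its own; for a user with N months, the self join yields N × N rows and the filter keeps the N(N+1)/2 rows where the B month does not exceed the A month:

-- The same self join, filtered to B.month <= A.month, before any grouping.
select A.username,A.month,A.salary,B.month as b_month,B.salary as b_salary
from
(select username,month,sum(salary) as salary from t_access_times group by username,month) A
inner join
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username
where B.month <= A.month;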

3. Third, run a grouped query over the result of the previous step, grouping by a.username and a.month.
The monthly running total is the sum of every b.salary where b.month <= a.month; within each group, max(A.salary) simply recovers that month's own total, since all rows in a group share the same A.salary.
select A.username,A.month,max(A.salary) as salary,sum(B.salary) as accumulate
from 
(select username,month,sum(salary) as salary from t_access_times group by username,month) A 
inner join 
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username
where B.month <= A.month
group by A.username,A.month
order by A.username,A.month;
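
For reference, on Hive 0.11 and later the same report can be produced in a single pass with a window function, avoiding the self join entirely. A minimal sketch (my addition, using standard HiveQL windowing, not part of the original walkthrough):

-- Assumes Hive 0.11+ windowing support; produces the same rows as the self-join version.
-- The inner sum(salary) is the per-month total; the outer sum(...) over (...) accumulates
-- it within each user, ordered by month.
select username,
       month,
       sum(salary) as salary,
       sum(sum(salary)) over (partition by username
                              order by month
                              rows between unbounded preceding and current row) as accumulate
from t_access_times
group by username,month;

The self-join approach's cost grows quadratically with the number of months per user, so on long histories the windowed form is also the cheaper one.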

(4) Execution log

0: jdbc:hive2://centos-aaron-h1:10000> create table t_access_times(username string,month string,salary int)
0: jdbc:hive2://centos-aaron-h1:10000> row format delimited fields terminated by ',';
OK
No rows affected (0.908 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> load data local inpath '/home/hadoop/t_access_times.dat' into table t_access_times;
Loading data to table default.t_access_times
Table default.t_access_times stats: [numFiles=1, totalSize=123]
OK
No rows affected (2.88 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> select username,month,sum(salary) as salary from t_access_times group by username,month;
Query ID = hadoop_20190212052316_64866ab3-25a5-4f1e-8ae4-7b2dcfcc1c1f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0001, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0001/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0001
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-02-12 05:23:38,459 Stage-1 map = 0%,  reduce = 0%
2019-02-12 05:23:55,144 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.51 sec
2019-02-12 05:24:01,293 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.37 sec
MapReduce Total cumulative CPU time: 3 seconds 370 msec
Ended Job = job_1549919838832_0001
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.37 sec   HDFS Read: 7681 HDFS Write: 52 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 370 msec
OK
+-----------+----------+---------+--+
| username  |  month   | salary  |
+-----------+----------+---------+--+
| A         | 2015-01  | 33      |
| A         | 2015-02  | 10      |
| B         | 2015-01  | 30      |
| B         | 2015-02  | 15      |
+-----------+----------+---------+--+
4 rows selected (46.143 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> select *
0: jdbc:hive2://centos-aaron-h1:10000> from 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) A 
0: jdbc:hive2://centos-aaron-h1:10000> inner join 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) B
0: jdbc:hive2://centos-aaron-h1:10000> on
0: jdbc:hive2://centos-aaron-h1:10000> A.username=B.username;
Query ID = hadoop_20190212052542_208d2ee5-d122-4a12-a0d6-aa6ec90e031a
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0002, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0002/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0002
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-02-12 05:25:55,359 Stage-1 map = 0%,  reduce = 0%
2019-02-12 05:26:05,614 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.61 sec
2019-02-12 05:26:11,762 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.39 sec
MapReduce Total cumulative CPU time: 3 seconds 390 msec
Ended Job = job_1549919838832_0002
Launching Job 2 out of 5
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0003, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0003/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0003
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0003
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
2019-02-12 05:26:34,772 Stage-3 map = 0%,  reduce = 0%
2019-02-12 05:26:43,958 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.86 sec
2019-02-12 05:26:52,127 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 1.73 sec
MapReduce Total cumulative CPU time: 1 seconds 730 msec
Ended Job = job_1549919838832_0003
Stage-7 is selected by condition resolver.
Stage-8 is filtered out by condition resolver.
Stage-2 is filtered out by condition resolver.
Execution log at: /tmp/hadoop/hadoop_20190212052542_208d2ee5-d122-4a12-a0d6-aa6ec90e031a.log
2019-02-12 05:26:58     Starting to launch local task to process map join;      maximum memory = 518979584
2019-02-12 05:26:59     Dump the side-table for tag: 1 with group count: 2 into file: file:/tmp/hadoop/2d536889-8e64-4ece-91b9-6ae10c4ff631/hive_2019-02-12_05-25-42_429_4168125726438328049-1/-local-10005/HashTable-Stage-4/MapJoin-mapfile01--.hashtable
2019-02-12 05:26:59     Uploaded 1 File to: file:/tmp/hadoop/2d536889-8e64-4ece-91b9-6ae10c4ff631/hive_2019-02-12_05-25-42_429_4168125726438328049-1/-local-10005/HashTable-Stage-4/MapJoin-mapfile01--.hashtable (346 bytes)
2019-02-12 05:26:59     End of local task; Time Taken: 1.103 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 4 out of 5
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1549919838832_0004, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0004/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0004
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0004
Hadoop job information for Stage-4: number of mappers: 1; number of reducers: 0
2019-02-12 05:27:14,062 Stage-4 map = 0%,  reduce = 0%
2019-02-12 05:27:22,362 Stage-4 map = 100%,  reduce = 0%, Cumulative CPU 0.83 sec
MapReduce Total cumulative CPU time: 830 msec
Ended Job = job_1549919838832_0004
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.39 sec   HDFS Read: 7283 HDFS Write: 208 SUCCESS
Stage-Stage-3: Map: 1  Reduce: 1   Cumulative CPU: 1.73 sec   HDFS Read: 7285 HDFS Write: 208 SUCCESS
Stage-Stage-4: Map: 1   Cumulative CPU: 0.83 sec   HDFS Read: 5188 HDFS Write: 208 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 950 msec
OK
+-------------+----------+-----------+-------------+----------+-----------+--+
| a.username  | a.month  | a.salary  | b.username  | b.month  | b.salary  |
+-------------+----------+-----------+-------------+----------+-----------+--+
| A           | 2015-01  | 33        | A           | 2015-01  | 33        |
| A           | 2015-01  | 33        | A           | 2015-02  | 10        |
| A           | 2015-02  | 10        | A           | 2015-01  | 33        |
| A           | 2015-02  | 10        | A           | 2015-02  | 10        |
| B           | 2015-01  | 30        | B           | 2015-01  | 30        |
| B           | 2015-01  | 30        | B           | 2015-02  | 15        |
| B           | 2015-02  | 15        | B           | 2015-01  | 30        |
| B           | 2015-02  | 15        | B           | 2015-02  | 15        |
+-------------+----------+-----------+-------------+----------+-----------+--+
8 rows selected (101.008 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> 

0: jdbc:hive2://centos-aaron-h1:10000> select A.username,A.month,max(A.salary) as salary,sum(B.salary) as accumulate
0: jdbc:hive2://centos-aaron-h1:10000> from 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) A 
0: jdbc:hive2://centos-aaron-h1:10000> inner join 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) B
0: jdbc:hive2://centos-aaron-h1:10000> on
0: jdbc:hive2://centos-aaron-h1:10000> A.username=B.username
0: jdbc:hive2://centos-aaron-h1:10000> where B.month <= A.month
0: jdbc:hive2://centos-aaron-h1:10000> group by A.username,A.month
0: jdbc:hive2://centos-aaron-h1:10000> order by A.username,A.month;
Query ID = hadoop_20190212053047_a2bcb673-b252-4277-85dd-8085248520aa
Total jobs = 7
Launching Job 1 out of 7
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0005, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0005/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0005
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-02-12 05:30:59,370 Stage-1 map = 0%,  reduce = 0%
2019-02-12 05:31:06,540 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.49 sec
2019-02-12 05:31:12,653 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.29 sec
MapReduce Total cumulative CPU time: 2 seconds 290 msec
Ended Job = job_1549919838832_0005
Launching Job 2 out of 7
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0006, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0006/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0006
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0006
Hadoop job information for Stage-5: number of mappers: 1; number of reducers: 1
2019-02-12 05:31:37,597 Stage-5 map = 0%,  reduce = 0%
2019-02-12 05:31:49,323 Stage-5 map = 100%,  reduce = 0%, Cumulative CPU 3.24 sec
2019-02-12 05:31:55,512 Stage-5 map = 100%,  reduce = 100%, Cumulative CPU 4.02 sec
MapReduce Total cumulative CPU time: 4 seconds 20 msec
Ended Job = job_1549919838832_0006
Stage-9 is selected by condition resolver.
Stage-10 is filtered out by condition resolver.
Stage-2 is filtered out by condition resolver.
Execution log at: /tmp/hadoop/hadoop_20190212053047_a2bcb673-b252-4277-85dd-8085248520aa.log
2019-02-12 05:32:00     Starting to launch local task to process map join;      maximum memory = 518979584
2019-02-12 05:32:01     Dump the side-table for tag: 1 with group count: 2 into file: file:/tmp/hadoop/e8520f79-8d60-4b0e-a593-b9cfbad8463e/hive_2019-02-12_05-30-47_300_154987026293528311-4/-local-10007/HashTable-Stage-6/MapJoin-mapfile21--.hashtable
2019-02-12 05:32:01     Uploaded 1 File to: file:/tmp/hadoop/e8520f79-8d60-4b0e-a593-b9cfbad8463e/hive_2019-02-12_05-30-47_300_154987026293528311-4/-local-10007/HashTable-Stage-6/MapJoin-mapfile21--.hashtable (346 bytes)
2019-02-12 05:32:01     End of local task; Time Taken: 0.824 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 4 out of 7
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1549919838832_0007, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0007/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0007
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0007
Hadoop job information for Stage-6: number of mappers: 1; number of reducers: 0
2019-02-12 05:32:16,644 Stage-6 map = 0%,  reduce = 0%
2019-02-12 05:32:27,075 Stage-6 map = 100%,  reduce = 0%, Cumulative CPU 1.06 sec
MapReduce Total cumulative CPU time: 1 seconds 60 msec
Ended Job = job_1549919838832_0007
Launching Job 5 out of 7
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0008, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0008/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0008
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0008
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
2019-02-12 05:32:44,584 Stage-3 map = 0%,  reduce = 0%
2019-02-12 05:32:56,191 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 2.87 sec
2019-02-12 05:33:02,318 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 3.62 sec
MapReduce Total cumulative CPU time: 3 seconds 620 msec
Ended Job = job_1549919838832_0008
Launching Job 6 out of 7
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0009, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0009/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0009
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0009
Hadoop job information for Stage-4: number of mappers: 1; number of reducers: 1
2019-02-12 05:33:15,716 Stage-4 map = 0%,  reduce = 0%
2019-02-12 05:33:22,868 Stage-4 map = 100%,  reduce = 0%, Cumulative CPU 0.75 sec
2019-02-12 05:33:28,985 Stage-4 map = 100%,  reduce = 100%, Cumulative CPU 1.59 sec
MapReduce Total cumulative CPU time: 1 seconds 590 msec
Ended Job = job_1549919838832_0009
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 2.29 sec   HDFS Read: 7284 HDFS Write: 208 SUCCESS
Stage-Stage-5: Map: 1  Reduce: 1   Cumulative CPU: 4.02 sec   HDFS Read: 7284 HDFS Write: 208 SUCCESS
Stage-Stage-6: Map: 1   Cumulative CPU: 1.06 sec   HDFS Read: 5638 HDFS Write: 212 SUCCESS
Stage-Stage-3: Map: 1  Reduce: 1   Cumulative CPU: 3.62 sec   HDFS Read: 5420 HDFS Write: 212 SUCCESS
Stage-Stage-4: Map: 1  Reduce: 1   Cumulative CPU: 1.59 sec   HDFS Read: 5597 HDFS Write: 64 SUCCESS
Total MapReduce CPU Time Spent: 12 seconds 580 msec
OK
+-------------+----------+---------+-------------+--+
| a.username  | a.month  | salary  | accumulate  |
+-------------+----------+---------+-------------+--+
| A           | 2015-01  | 33      | 33          |
| A           | 2015-02  | 10      | 43          |
| B           | 2015-01  | 30      | 30          |
| B           | 2015-02  | 15      | 45          |
+-------------+----------+---------+-------------+--+
4 rows selected (162.792 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> 
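
To persist the result as the t_access_times_accumulate table named in requirement (2), the final query can be wrapped in CREATE TABLE AS. A sketch (my addition, not part of the session above):

-- Materialize the report table from requirement (2).
create table t_access_times_accumulate as
select A.username,A.month,max(A.salary) as salary,sum(B.salary) as accumulate
from
(select username,month,sum(salary) as salary from t_access_times group by username,month) A
inner join
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username
where B.month <= A.month
group by A.username,A.month;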

Final words: that is all for this post. If you found it helpful, please give it a like; if you are interested in my other big data articles, follow this blog, and feel free to reach out and discuss any time.
