Hive Notes: Installation

2017/06/25 16:38

Source: https://my.oschina.net/jackieyeah/blog/735424
Install the JDK and Hadoop first (single-node, unpacked install):
    wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
    tar -zxvf hadoop-2.7.3.tar.gz -C /usr/local/
    mv /usr/local/hadoop-2.7.3/ /usr/local/hadoop
    sudo useradd -m hadoop -s /bin/bash
    sudo passwd hadoop
    sudo adduser hadoop sudo
    sudo chown -R hadoop /usr/local/hadoop
    Passwordless SSH:
    cd ~/.ssh/
    ssh-keygen -t rsa
    cat ./id_rsa.pub >> ./authorized_keys
    ssh localhost

    su hadoop
    Check the version:
    cd /usr/local/hadoop/
    ./bin/hadoop version
    Run an example:
    mkdir ./input
    cp ./etc/hadoop/*.xml ./input
    ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
    cat ./output/*
    Hadoop will not overwrite an existing output directory, so delete ./output before re-running:
    rm -r ./output
    ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
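The `'dfs[a-z.]+'` regex the example job uses can be sanity-checked locally with plain grep before submitting it. A minimal sketch, run entirely on the local filesystem (the stand-in config file and its contents are made up for illustration):

```shell
# Sketch: test the 'dfs[a-z.]+' pattern the MapReduce grep example uses,
# against a stand-in config file on the local filesystem.
workdir=$(mktemp -d)
printf '<name>dfs.replication</name>\n<name>io.file.buffer.size</name>\n' > "$workdir/core-site.xml"
matches=$(grep -ohE 'dfs[a-z.]+' "$workdir"/*.xml)   # same pattern as the job
echo "$matches"                                      # prints: dfs.replication
rm -r "$workdir"
```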
    
Install Hive
    wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/stable-2/apache-hive-2.1.1-bin.tar.gz
    tar -zxvf apache-hive-2.1.1-bin.tar.gz -C /usr/local/
    mv /usr/local/apache-hive-2.1.1-bin/ /usr/local/hive2.1.1
    
Add to /etc/profile:
    export HADOOP_HOME=/usr/local/hadoop
    export HIVE_HOME=/usr/local/hive2.1.1
    export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH

    source /etc/profile

    sudo chown -R hadoop /usr/local/hive2.1.1/

su hadoop
source /etc/profile
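A quick way to confirm the exports took effect is to check that `$HIVE_HOME/bin` actually landed on PATH. A small sketch (it re-creates the exports from above so it is self-contained):

```shell
# Sketch: verify the profile exports put the Hive binaries on PATH
# (values as configured in /etc/profile above).
HADOOP_HOME=/usr/local/hadoop
HIVE_HOME=/usr/local/hive2.1.1
PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive is on PATH" ;;      # prints this branch
  *)                    echo "hive is NOT on PATH" ;;
esac
```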
    
hadoop@ubuntu64:/usr/local/hive2.1.1/conf$ cp hive-env.sh.template hive-env.sh
hadoop@ubuntu64:/usr/local/hive2.1.1/conf$ cp hive-default.xml.template hive-site.xml
hadoop@ubuntu64:/usr/local/hive2.1.1/conf$ cp hive-log4j2.properties.template hive-log4j2.properties
hadoop@ubuntu64:/usr/local/hive2.1.1/conf$ cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
hadoop@ubuntu64:/usr/local/hive2.1.1/conf$ ls
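The four template copies can also be done in one loop. A sketch, shown against a scratch directory with empty stand-in templates so it is self-contained; note that hive-default.xml.template is deliberately renamed to hive-site.xml, unlike the other three:

```shell
# Sketch: copy Hive's config templates into place. On a real install, run
# this in $HIVE_HOME/conf; empty stand-in templates are created here.
confdir=$(mktemp -d)
for pair in hive-env.sh.template:hive-env.sh \
            hive-default.xml.template:hive-site.xml \
            hive-log4j2.properties.template:hive-log4j2.properties \
            hive-exec-log4j2.properties.template:hive-exec-log4j2.properties
do
  src=${pair%%:*}; dst=${pair##*:}
  touch "$confdir/$src"            # stand-in for the shipped template
  cp "$confdir/$src" "$confdir/$dst"
done
ls "$confdir"
```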

Specify the Hadoop installation path (and related paths) in hive-env.sh:
export JAVA_HOME=/usr/local/jdk1.7.0_75
export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive2.1.1
export HIVE_CONF_DIR=/usr/local/hive2.1.1/conf
export HIVE_AUX_JARS_PATH=/usr/local/hive2.1.1/lib

For reference, the complete set of environment exports in /etc/profile:
 
export JAVA_HOME=/usr/local/jdk1.7.0_75
export JRE_HOME=/usr/local/jdk1.7.0_75/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASSPATH PATH
export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive2.1.1
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$HIVE_HOME/conf:$PATH


Create the HDFS directories
    (This appears to require the root/superuser.)
    Before creating tables in Hive, create the following HDFS directories and grant them the appropriate permissions:
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -mkdir -p /user/hive/tmp
hdfs dfs -mkdir -p /user/hive/log
hdfs dfs -chmod g+w /user/hive/warehouse
hdfs dfs -chmod g+w /user/hive/tmp
hdfs dfs -chmod g+w /user/hive/log
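The six hdfs commands above can be collapsed into a loop. Sketched here against a local scratch directory so it runs anywhere; on the real cluster, swap mkdir/chmod for `hdfs dfs -mkdir -p` and `hdfs dfs -chmod g+w`:

```shell
# Sketch: create the warehouse/tmp/log layout with group-write bits in a loop.
# Local-filesystem stand-in; use `hdfs dfs -mkdir -p` / `hdfs dfs -chmod` on HDFS.
root=$(mktemp -d)
for d in warehouse tmp log; do
  mkdir -p "$root/user/hive/$d"
  chmod g+w "$root/user/hive/$d"
done
ls "$root/user/hive"
```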

Edit hive-site.xml
Set the following configuration entries in hive-site.xml to the paths created in the previous step:
<property>
    <name>hive.exec.scratchdir</name>
    <value>/user/hive/tmp</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>/user/hive/log</value>
    <description>Location of Hive run time structured log file</description>
</property>

Hive Metastore

By default, Hive stores its metadata in an embedded Derby database, but production environments generally use MySQL for the Hive metastore.

Create the database and user

Assuming MySQL is already installed, create a hive database to hold the Hive metadata; both the username and the password for it are hive:
mysql> CREATE DATABASE hive;
mysql> USE hive;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
mysql> FLUSH PRIVILEGES;
mysql> quit;

Edit hive-site.xml
Configure the MySQL connection details in hive-site.xml:
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
</property>
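Note that inside hive-site.xml the `&` separators of the JDBC URL must be written as `&amp;`, because the file is XML. A sketch of the escaping using plain shell string manipulation (no Hive or MySQL needed):

```shell
# Sketch: the raw JDBC URL vs. the XML-escaped form required in hive-site.xml.
raw='jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&characterEncoding=UTF-8&useSSL=false'
escaped=$(printf '%s' "$raw" | sed 's/&/\&amp;/g')   # XML-escape the ampersands
echo "$escaped"
```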
 
Running Hive
Before running the hive command, make sure that:
    HDFS is running (it can be started with the start-dfs.sh script);
    the MySQL JDBC connector has been copied into $HIVE_HOME/lib (mysql-connector-java-5.1.41.jar was used here).
Starting with Hive 2.1, the schema must first be initialized with the schematool command:
schematool -dbType mysql -initSchema


root@ubuntu64:/usr/local/hive2.1.1/conf# schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&characterEncoding=UTF-8&useSSL=false
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Initialization script completed
schemaTool completed

The hive command fails with:
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
        at java.net.URI.checkPath(URI.java:1804)
        at java.net.URI.<init>(URI.java:752)
        at org.apache.hadoop.fs.Path.initialize(Path.java:202)
        ... 12 more
Add to hive-site.xml:
<property>
    <name>system:java.io.tmpdir</name>
    <value>/usr/local/hive2.1.1/tmpdir</value>
</property>
<property>
    <name>system:user.name</name>
    <value>root</value>
</property>
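An alternative fix many guides use is to substitute the `${system:...}` placeholders in hive-site.xml directly with sed, instead of defining the two properties. A sketch against a one-line stand-in file (the real target would be $HIVE_HOME/conf/hive-site.xml; GNU sed assumed for `-i`):

```shell
# Sketch: replace the ${system:...} placeholders with concrete paths,
# using a stand-in file instead of the real hive-site.xml.
f=$(mktemp)
echo '<value>${system:java.io.tmpdir}/${system:user.name}</value>' > "$f"
sed -i 's|\${system:java.io.tmpdir}|/usr/local/hive2.1.1/tmpdir|g; s|\${system:user.name}|root|g' "$f"
cat "$f"   # prints: <value>/usr/local/hive2.1.1/tmpdir/root</value>
```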
Create the directory:
root@ubuntu64:/usr/local/hive2.1.1# ls
root@ubuntu64:/usr/local/hive2.1.1# mkdir tmpdir
root@ubuntu64:/usr/local/hive2.1.1# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/usr/local/hive2.1.1/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> show tables;
OK
Time taken: 1.5 seconds

hive> CREATE TABLE IF NOT EXISTS employee_hr( name string, employee_id int, sin_number string, start_date timestamp ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE;
OK
Time taken: 1.094 seconds
hive> LOAD DATA LOCAL INPATH '/home/ken/employee.txt' OVERWRITE INTO TABLE employee_hr;
Loading data to table default.employee_hr
OK
Time taken: 0.524 seconds
hive>

CREATE TABLE employee_bk AS SELECT * FROM employee_hr;
TRUNCATE TABLE employee_hr; -- deletes the data in employee_hr but keeps the table structure

INSERT INTO TABLE employee SELECT * FROM employee_hr;

LOAD DATA LOCAL INPATH '/home/ken/employee_hr.txt' OVERWRITE INTO TABLE employee_hr;

 
