文档章节

hive常用sql语句

 张欢19933
发布于 2017/03/10 19:15
字数 1261
阅读 25
收藏 1

数据库

创建数据库

hive> CREATE DATABASE financials;


hive> CREATE DATABASE IF NOT EXISTS financials;

显示现在有的数据库

hive> SHOW DATABASES;
default
financials
hive> CREATE DATABASE human_resources;
hive> SHOW DATABASES;
default
financials
human_resources

条件查询数据库

hive> SHOW DATABASES LIKE 'h.*';
human_resources
hive> ...

创建指定存放文件位置 数据库

hive> CREATE DATABASE financials
    > LOCATION '/my/preferred/directory';

创建数据库时 添加注释信息

hive> CREATE DATABASE financials
    > COMMENT 'Holds all financial tables';
hive> DESCRIBE DATABASE financials;
financials   Holds all financial tables
  hdfs://master-server/user/hive/warehouse/financials.db

使用数据库

hive> USE financials;

删除数据库

hive> DROP DATABASE IF EXISTS financials;

当数据库存在表时,先要删除表 再能删除数据库

当数据库中存在表时无法直接删除,会提示数据库非空,存在表,这时可以使用CASCADE关键字

hive> DROP DATABASE IF EXISTS financials CASCADE;

数据表

创建表

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY(col_name[ASC|DESC], ...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...) ON ([(col_value, col_value, ...), ...|col_value, col_value, ...]) [STORED AS DIRECTORIES]
  [ [ROW FORMAT row_format] [STORED AS file_format]
   | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)] 
  [AS select_statement] 


例:
CREATE TABLE IF NOT EXISTS mydb.employees (
  name         STRING COMMENT 'Employee name',
  salary       FLOAT  COMMENT 'Employee salary',
  subordinates ARRAY<STRING> COMMENT 'Names of subordinates',
  deductions   MAP<STRING, FLOAT>
               COMMENT 'Keys are deductions names, values are percentages
  address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
               COMMENT 'Home address')
COMMENT 'Description of the table'
TBLPROPERTIES ('creator'='me', 'created_at'='2012-01-02 10:00:00', ...)
LOCATION '/user/hive/warehouse/mydb.db/employees';
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];


例:
CREATE TABLE IF NOT EXISTS mydb.employees2
LIKE mydb.employees;

显示某个数据库中的表

hive> USE mydb;
hive> SHOW TABLES;
employees
table1
table2

显示指定筛选条件 表名

hive> USE mydb;
hive> SHOW TABLES 'empl.*';
employees

显示表扩展信息

hive> DESCRIBE EXTENDED mydb.employees;
name    string  Employee name
salary  float   Employee salary
subordinates    array<string>   Names of subordinates
deductions      map<string,float> Keys are deductions names, values are percentages
address struct<street:string,city:string,state:string,zip:int>  Home address
Detailed Table Information      Table(tableName:employees, dbName:mydb, owner:me,
...
location:hdfs://master-server/user/hive/warehouse/mydb.db/employees,
parameters:{creator=me, created_at='2012-01-02 10:00:00',
            last_modified_user=me, last_modified_time=1337544510,
            comment:Description of the table, ...}, ...)

指定显示某个字段的信息

hive> DESCRIBE mydb.employees.salary;
salary  float   Employee salary

外部表,删除表不删除数据

CREATE EXTERNAL TABLE IF NOT EXISTS stocks (
  exchange        STRING,
  symbol          STRING,
  ymd             STRING,
  price_open      FLOAT,
  price_high      FLOAT,
  price_low       FLOAT,
  price_close     FLOAT,
  volume          INT,
  price_adj_close FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/stocks';

复制表结构仓库外部表

CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees3
LIKE mydb.employees
LOCATION '/path/to/data';

分区表

CREATE TABLE employees (
  name         STRING,
  salary       FLOAT,
  subordinates ARRAY<STRING>,
  deductions   MAP<STRING, FLOAT>,
  address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
)
PARTITIONED BY (country STRING, state STRING);

HIVE 可以将分区表设置成“严格”模式,禁止分区表的查询没有一个WHERE子句

hive> set hive.mapred.mode=strict;
hive> SELECT e.name, e.salary FROM employees e LIMIT 100;
FAILED: Error in semantic analysis: No partition predicate found for
 Alias "e" Table "employees"
hive> set hive.mapred.mode=nonstrict;
hive> SELECT e.name, e.salary FROM employees e LIMIT 100;

查看现有分区

hive> SHOW PARTITIONS employees;
...
Country=CA/state=AB
country=CA/state=BC
...
country=US/state=AL
country=US/state=AK

查看分区详细 分区键

hive> SHOW PARTITIONS employees PARTITION(country='US');
country=US/state=AL
country=US/state=AK
...
hive> SHOW PARTITIONS employees PARTITION(country='US', state='AK');
country=US/state=AK

通过 DESC 显示分区键

hive> DESCRIBE EXTENDED employees;
name         string,
salary       float,
...
address      struct<...>,
country      string,
state        string
Detailed Table Information...
partitionKeys:[FieldSchema(name:country, type:string, comment:null),
FieldSchema(name:state, type:string, comment:null)],
...

查看建表语句

hive> show create table score;
OK
CREATE  TABLE `score`(
  `num` string,
  `name` string,
  `score` float)
PARTITIONED BY (
  `year` string,
  `month` string,
  `day` string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hdpcls1/user/hive/warehouse/score'
TBLPROPERTIES (
  'transient_lastDdlTime'='1489117674')
Time taken: 0.35 seconds, Fetched: 19 row(s)

从表读入数据到表中

INSERT OVERWRITE TABLE employees
PARTITION (country = 'US', state = 'OR')
SELECT * FROM staged_employees se
WHERE se.cnty = 'US' AND se.st = 'OR';

从文件读入 分区表

从本地
LOAD DATA LOCAL INPATH '${env:HOME}/california-employees'
INTO TABLE employees
PARTITION (country = 'US', state = 'CA');

从hdfs
LOAD DATA  INPATH '/california-employees'
INTO TABLE employees
PARTITION (country = 'US', state = 'CA');

为外部表增加指定分区

ALTER TABLE log_messages ADD PARTITION(year = 2012, month = 1, day = 2)
LOCATION 'hdfs://master_server/data/log_messages/2012/01/02';

导出数据

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/ca_employees'
SELECT name, salary, address
FROM employees
WHERE se.state = 'CA';

桶表

CREATE TABLE par_table(viewTime INT, userid BIGINT,
     page_url STRING, referrer_url STRING,
     ip STRING COMMENT 'IP Address of the User')
 COMMENT 'This is the page view table'
 PARTITIONED BY(date STRING, pos STRING)
 CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
 ROW FORMAT DELIMITED ‘\t’
   FIELDS TERMINATED BY '\n'
STORED AS SEQUENCEFILE;

删除表

DROP TABLE IF EXISTS employees;

修改表结构

ALTER TABLE modifies  table metadata  only. The  data  for  the  table  is
untouched. It’s up to you to ensure that any modifications are consistent
with the actual data.

修改表名

ALTER TABLE log_messages RENAME TO logmsgs;

增加,修改,删除 表分区

ALTER TABLE log_messages ADD IF NOT EXISTS
PARTITION (year = 2011, month = 1, day = 1) LOCATION '/logs/2011/01/01'
PARTITION (year = 2011, month = 1, day = 2) LOCATION '/logs/2011/01/02'
PARTITION (year = 2011, month = 1, day = 3) LOCATION '/logs/2011/01/03'
ALTER TABLE log_messages PARTITION(year = 2011, month = 12, day = 2)
SET LOCATION 's3n://ourbucket/logs/2011/01/02';
ALTER TABLE log_messages DROP IF EXISTS PARTITION(year = 2011, month = 12, day = 2);

修改列

ALTER TABLE log_messages
CHANGE COLUMN hms hours_minutes_seconds INT
COMMENT 'The hours, minutes, and seconds part of the timestamp'
AFTER severity;

这个命令可以允许改变列名、数据类型、注释、列位置或者它们的任意组合

增加列

ALTER TABLE log_messages ADD COLUMNS (
 app_name   STRING COMMENT 'Application name',
 session_id LONG   COMMENT 'The current session id');

字段位置在所有列后面(partition列前)

删除 替换列

ALTER TABLE log_messages REPLACE COLUMNS (
 hours_mins_secs INT    COMMENT 'hour, minute, seconds from timestamp',
 severity        STRING COMMENT 'The message severity'
 message         STRING COMMENT 'The rest of the message');

 

© 著作权归作者所有

粉丝 47
博文 533
码字总数 244932
作品 0
海淀
私信 提问
5-Hadoop之旅-Hive(一)

Hive能做什么? Hive是基于Hadoop的一个数据仓库工具,可以将结构化的数据文件映射为一张数据库表,并提供完整的sql查询功能,可以将sql语句转换为MapReduce任务进行运行。其优点是学习成本低...

SET
2016/09/12
85
0
hive(04)、使用dbeaver客户端连接hive数据仓库

我们登陆hive shell 写复杂的长的sql语句不是很方便,没有格式化拷贝粘贴等常用操作,查询结果也不是很直观,时我们可以使用第三方的客户端连接hive进行操作,于是我们使用支持hive的数据库客...

MaxBill
2018/01/15
510
4
Presto介绍与常用查询优化方法

Presto Hive使用MapReduce作为底层计算框架,是专为批处理设计的。但随着数据越来越多,使用Hive进行一个简单的数据查询可能要花费几分到几小时,显然不能满足交互式查询的需求。 2012年秋季...

高广超
2018/10/10
0
0
Impala SQL 语言参考

Impala SQL 语言参考 Cloudera Impala 的查询语言是基于 SQL 的。为了保护用户在技能和查询设计方面的已有投资,Impala 提供与 Hive 查询语言(HiveQL)的高度兼容: 因为使用与 Hive 记录表结...

weiqingbin
2013/12/20
11.1K
1
hive无法执行带where语句的SQL

应用场景 当在伪分布式集群上,搭建部署了hive以后,发现hive无法执行带where语句的sql,那hive将无法使用,下面介绍解决该问题的方案! 操作步骤 hive连接执行sql,可以执行带where语句的s...

wsc449
2017/11/13
0
0

没有更多内容

加载失败,请刷新页面

加载更多

EDI 电子数据交换全解指南

EDI(Electronic Data Interchange,电子数据交换)技术使得企业与企业(B2B)实现通信自动化,帮助交易伙伴和组织更快更好地完成更多工作,并消除了人工操作带来的错误。从零售商到制造商、物...

EDI知行软件
今天
3
0
CentOS7的LVM动态扩容

# 问题 CentOS7上面的磁盘空间有点紧张,需要扩容。 解决 查询当前磁盘状态 [root@xxx ~]# lsblkNAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTfd0 2:0 1 4K ...

亚林瓜子
今天
4
0
Kafka 0.8 Producer (0.9以前版本适用)

Kafka旧版本producer由scala编写,0.9以后已经废除 示例代码如下: import kafka.producer.KeyedMessage;import kafka.javaapi.producer.Producer;import kafka.producer.ProducerConfig;......

实时计算
今天
5
0
Giraph源码分析(八)—— 统计每个SuperStep中参与计算的顶点数目

作者|白松 目的:科研中,需要分析在每次迭代过程中参与计算的顶点数目,来进一步优化系统。比如,在SSSP的compute()方法最后一行,都会把当前顶点voteToHalt,即变为InActive状态。所以每次...

数澜科技
今天
6
0
Navicat 快捷键

操作 结果 ctrl+q 打开查询窗口 ctrl+/ 注释sql语句 ctrl+shift +/ 解除注释 ctrl+r 运行查询窗口的sql语句 ctrl+shift+r 只运行选中的sql语句 F6 打开一个mysql命令行窗口 ctrl+l 删除一行 ...

低至一折起
今天
13
0

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部