HIVE直接读入json
- get_json_object(jsonStr, kn)
返回值: 一个key对应的一个字段
说明:解析json的字符串jsonStr,若输入的json字符串无效,则返回NULL。
样例:表test,字段json_str存储了json字符串
{ "firstName": "Brett", "lastName":"McLaughlin", "email": "aaaa" }
{ "firstName": "jesse", "lastName":"AJ", "email": "bb" }
Hive处理语句:select get_json_object(json_str,'$.firstName') from origin_lefuwx.test t
- json_tuple(jsonStr, k1, k2, ...)
返回值:多个key对应的多个字段
说明:比 get_json_object 高效,可以在一次调用中输入多个键
样例:表test,字段json_str存储了json字符串
{ "firstName": "Brett", "lastName":"McLaughlin", "email": "aaaa" }
{ "firstName": "jesse", "lastName":"AJ", "email": "bb" }
HIVE处理语句:select json_tuple(json_str,'firstName','lastName','email') from origin_lefuwx.test t
HIVE直接读入url
- parse_url(url, partToExtract[, key])
partToExtract的选项包含[HOST,PATH,QUERY,REF,PROTOCOL,FILE,AUTHORITY,USERINFO]
样例:
http://192.168.111.23:8888/filebrowser/view=/DB/lefuwx/test/part-m-00000
Hive处理语句:select parse_url('http://192.168.111.23:8888/filebrowser/view=/DB/lefuwx/test/part-m-00000','HOST')
- parse_url_tuple(url, k1,k2..)
partToExtract的选项包含[HOST,PATH,QUERY,REF,PROTOCOL,FILE,AUTHORITY,USERINFO]
样例:
http://192.168.111.23:8888/filebrowser/view=/DB/lefuwx/test/part-m-00000
Hive处理语句:select parse_url_tuple('http://192.168.111.23:8888/filebrowser/view=/DB/lefuwx/test/part-m-00000','HOST','PATH')