Basic HQL Operations in Hive

阿新 • Published: 2019-01-10

1. Database management

1. Create a database named myhive with a comment and properties

create database myhive comment 'this is myhive db' with dbproperties('author'='ljy','date'='2018-0510');

2. View the database properties

describe database extended myhive;

3. Add a new property on top of the existing ones

alter database myhive set dbproperties('id'='1');

4. Switch to the myhive database

use myhive;

5. List all databases

show databases;

6. Drop the myhive database

drop database myhive;
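By default, drop database refuses to drop a database that still contains tables. A minimal sketch, assuming every table inside myhive can be discarded: cascade drops the tables together with the database, and if exists avoids an error when the database is already gone.

```sql
-- Assumption: all tables in myhive are disposable
drop database if exists myhive cascade;
```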

2. Table management

1. Create a table

create table t_order(id int,name string,rongliang double) row format delimited fields terminated by '\t';

2. List all tables

show tables;

3. View the table structure

-- Option 1
desc t_order;
-- Option 2
desc extended t_order;
-- Option 3
desc formatted t_order;

4. Rename a table

-- Rename table t_order to t_user
alter table t_order rename to t_user;

5. Rename a column

-- Rename column rongliang to age with type int and add a comment (the comment clause is optional)
alter table t_user change column rongliang age int comment 'user age';

6. Add columns

alter table t_user add columns(address string comment 'user address',
                               phone string comment 'user phone');

7. Use the after keyword to place the changed column after another column

-- Rename column phone to email and place it after age
alter table t_user change column phone email string after age;

8. Use the first keyword to move the changed column to the first position

-- Change the type of age to tinyint and move it to the first position
alter table t_user change column age age tinyint first;

9. Drop a column

-- To drop a column, list the columns to keep inside columns(...), separated by commas; this drops address
alter table t_user replace columns(id int,name string,age int,email string);

3. Loading data from files into a table

1. Prepare the data

vim user1.txt
0011    張三  男   20  zhangsan@163.com
0012    李四  男   22  lisi@qq.com
0013    王五  女   25  wangwu@126.com
0014    趙六  男   21  zhaoliu@qq.com

vim user2.txt
0021    小明  男   20  [email protected]
0022    小李  女   22  [email protected]
0023    小紅  女   25  [email protected]
0024    小剛  男   21  [email protected]

2. Create the t_user1 and t_user2 tables

-- Create table t_user1
create table t_user1(id int comment 'user id',
                     name string comment 'user name',
                     sex string comment 'user sex',
                     age int comment 'user age',
                     email string comment 'user email'
                    ) row format delimited fields terminated by '\t';
-- Create table t_user2
create table t_user2(id int comment 'user id',
                     name string comment 'user name',
                     sex string comment 'user sex',
                     age int comment 'user age',
                     email string comment 'user email'
                    ) row format delimited fields terminated by '\t';
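Since both tables share one definition, the second statement can be shortened. A minimal sketch, offered as an alternative to the second create table above rather than as the author's method: create table ... like copies the table definition without copying any rows.

```sql
-- Copies only the definition of t_user1 (columns, comments, row format), not its data
create table if not exists t_user2 like t_user1;
```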

3. Upload the user2.txt created in step 1 to the /hive directory on HDFS

hadoop fs -put user2.txt /hive

4. Load user1.txt from the local Linux filesystem into table t_user1

load data local inpath '/home/hadoop1/test/user1.txt' into table t_user1;

5. Use the overwrite keyword to load user2.txt into t_user1; the existing data in t_user1 is replaced with the contents of user2.txt

load data local inpath '/home/hadoop1/test/user2.txt' overwrite into table t_user1;

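6. Load user2.txt from HDFS into table t_user1

-- Without the overwrite keyword, the new data is appended to the existing data
-- When loading from HDFS, the source file is moved (not copied) into the directory of t_user1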
load data inpath '/hive/user2.txt' into table t_user1;

4. Loading data from one table into another

1. Load user2.txt from Linux into table t_user2

load data local inpath '/home/hadoop1/test/user2.txt' into table t_user2;

2. Insert the data of t_user2 into t_user1 (t_user1 and t_user2 must have the same table structure)

insert into table t_user1 select * from t_user2;

3. Insert the data of t_user2 into t_user1, overwriting the existing data in t_user1 (t_user1 and t_user2 must have the same table structure)

insert overwrite table t_user1 select * from t_user2;

4. Multi-insert form

-- t_user1 and t_user2 must have the same table structure
from t_user2 insert into table t_user1 select *;
-- The result set selected from t_user2 must match the structure of t_user1 (number of columns and their types)
from t_user2 insert into table t_user1 select id,name,sex,age,email;

5. Multi-insert form with overwrite

-- Same as the first statement in step 4, but the existing data in t_user1 is overwritten
from t_user2 insert overwrite table t_user1 select *;
-- Same as the second statement in step 4, but the existing data in t_user1 is overwritten
from t_user2 insert overwrite table t_user1 select id,name,sex,age,email;
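The from-first form is most useful when a single scan of the source table feeds several inserts. A minimal sketch, assuming two hypothetical target tables t_user_male and t_user_female (these names do not appear in the original) with the same structure as t_user2:

```sql
-- Hypothetical target tables, created with the structure of t_user2
create table if not exists t_user_male like t_user2;
create table if not exists t_user_female like t_user2;

-- t_user2 is read once; each insert clause applies its own filter
from t_user2
insert overwrite table t_user_male   select * where sex = '男'
insert overwrite table t_user_female select * where sex = '女';
```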

5. Exporting table data

1. Export to the local Linux filesystem

insert overwrite local directory '/home/hadoop1/test/fromhive' row format delimited fields terminated by '\t' select * from t_user1;

2. Export to a directory on HDFS

insert overwrite directory 'hdfs://192.168.93.111:9000/hive' row format delimited fields terminated by '\t' select * from t_user1;

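6. Internal tables and external tables

Internal table: by default, Hive manages the data of a table it creates, that is, Hive moves the data into its warehouse directory ("warehouse").

External table: the user controls the creation and deletion of the data. The location of the external data must be given when the table is created. With the external keyword, Hive knows that it does not manage the data, so it does not move the data into its own warehouse directory. In fact, Hive does not even check whether the path exists when the table is defined. This is an important property, because it means the data can be created after the table has been created.

Main difference:

When an internal table is dropped, its metadata and its data files are deleted together. When an external table is dropped, only the table's metadata is deleted; the data files themselves are kept.

1. Create an external table over a directory on HDFS; if the data files on HDFS are deleted, the data in the table is gone as well

The data format must match the table format, otherwise the table cannot return correct data. Multiple files in the directory can be read into the table at the same time.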

create external table t_user3(id int comment 'user id',
                     name string comment 'user name',
                     sex string comment 'user sex',
                     age int comment 'user age',
                     email string comment 'user email'
                    ) row format delimited fields terminated by '\t' 
                    stored as textfile
                    location 'hdfs://192.168.93.111:9000/hive';
 -----------------------------------------------------------------------------------
 # Upload user1.txt to the /hive directory; the data can then be read from the table
 hadoop fs -put /home/hadoop1/test/user1.txt /hive
 hive> select * from t_user3;
    OK
    11  張三  男   20  zhangsan@163.com
    12  李四  男   22  lisi@qq.com
    13  王五  女   25  wangwu@126.com
    14  趙六  男   21  zhaoliu@qq.com
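The difference between the two table kinds can be checked from the Hive CLI. A minimal sketch, to be run only if t_user3 is no longer needed: desc formatted reports the table type, and after drop table the files under /hive should still be present because only the metadata of an external table is removed.

```sql
-- The "Table Type" row should read EXTERNAL_TABLE for t_user3
desc formatted t_user3;

-- Removes only the metadata of the external table
drop table t_user3;

-- Hive CLI shortcut for "hadoop fs -ls"; the data files should still be listed
dfs -ls /hive;
```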

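2. Create an external table over a Linux directory (the data file must exist beforehand for the table to have data as soon as it is created; if the data file is created after the table, the data only appears after load data. This is the difference between creating an external table on the local Linux filesystem and on HDFS)

Even after the data file on Linux is deleted, the data can still be read from the table. Note that a location without a scheme, such as the one below, is resolved against the cluster's default filesystem (normally HDFS), so this path most likely refers to an HDFS directory rather than a local one, which would explain both observations.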

create external table t_user4(id int comment 'user id',
                     name string comment 'user name',
                     sex string comment 'user sex',
                     age int comment 'user age',
                     email string comment 'user email'
                    ) row format delimited fields terminated by '\t' 
                    stored as textfile
                    location '/home/hadoop1/test/hivedb';

7. Querying tables

Prepare the test data

student.txt

Fields: sno sname sex age sdept

95001   張三  男   20  電腦科學
95002   李四  男   21  資訊與計算科學
95003   王五  女   19  應用數學
95004   趙六  男   22  軟體工程

Create the student table

create table student(sno int,sname string,sex string,age int,sdept string) row format delimited fields terminated by '\t' stored as textfile;

Load the data into the student table

load data local inpath '/home/hadoop1/test/student.txt' into table student;

course.txt

Fields: cno cname

1   資料庫
2   數學
3   網路工程
4   資訊系統
5   作業系統
6   資料結構

Create the course table

create table course(cno int,cname string) row format delimited fields terminated by '\t' stored as textfile;

Load the data into the course table

load data local inpath '/home/hadoop1/test/course.txt' into table course;

sc.txt

Fields: sno cno grade

95001   1   81
95001   2   85
95001   3   88
95002   4   98
95002   1   97
95002   2   90
95004   4   80

Create the sc table

create table sc(sno int,cno int,grade int) row format delimited fields terminated by '\t' stored as textfile;

Load the data into the sc table

load data local inpath '/home/hadoop1/test/sc.txt' into table sc;

Queries

Find the names of students who have enrolled in a course

-- Inner join; the inner keyword can be omitted

select distinct student.sno,sname from student inner join sc on student.sno=sc.sno;


-- Or

select sno,sname from student where sno in (select distinct sno from sc);
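Older Hive releases do not accept an in subquery in the where clause. A minimal sketch of the same query written with left semi join, which keeps each student row at most once when a matching sc row exists; only columns of the left table may appear in the select list.

```sql
-- Equivalent to the in-subquery form: students with at least one row in sc
select student.sno, student.sname
from student left semi join sc on student.sno = sc.sno;
```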

Count the total number of students
```
select count(sno) from student;
```
Compute the average grade for course 1
```
select avg(grade) from sc where cno=1;
```

Compute the average grade for each course

-- HQL groups the rows first, then computes the average of each group

select cno,avg(grade) from sc group by cno;
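Because grouping happens before aggregation, a condition on the aggregated value belongs in having rather than where. A minimal sketch, with the threshold 88 chosen arbitrarily for illustration:

```sql
-- where filters rows before grouping; having filters the groups afterwards
select cno, avg(grade) as avg_grade
from sc
group by cno
having avg(grade) > 88;
```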

Find the highest grade among students enrolled in course 1
```
select max(grade) from sc where cno=1;
```

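Left outer join

Takes the rows of both tables where student.sno=sc.sno matches, then adds the remaining rows of the left table, with the right table's columns filled with null

select * from student left join sc on student.sno=sc.sno;
-----------------------------------------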

95001  張三  男   20  電腦科學   95001   1   81
95001  張三  男   20  電腦科學   95001   2   85
95001  張三  男   20  電腦科學   95001   3   88
95002  李四  男   21  資訊與計算科學 95002   4   98
95002  李四  男   21  資訊與計算科學 95002   1   97
95002  李四  男   21  資訊與計算科學 95002   2   90
95003  王五  女   19  應用數學    NULL    NULL    NULL
95004  趙六  男   22  軟體工程    95004   4   80

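Right outer join

Takes the matching rows of both tables, then adds the remaining rows of the right table, with the left table's columns filled with null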
```
select * from student right join sc on student.sno=sc.sno;
```
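Full outer join (not supported by MySQL)

Takes the matching rows of both tables, adds the remaining rows of the left table with the right table's columns filled with null, then adds the remaining rows of the right table with the left table's columns filled with null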
```
select * from student full join sc on student.sno=sc.sno;
```

A clearer example of left outer, right outer, and full outer joins:

Table a

id    name
001   張三
002   李四
003   趙六

Table b

id    sex
001   男
002   女
005   男

Left outer join

select * from a left join b on a.id=b.id;
-----------------------------------------

001   張三  001   男
002   李四  002   女
003   趙六  null  null

Right outer join

select * from a right join b on a.id=b.id;
-----------------------------------------

001   張三  001   男
002   李四  002   女
null  null  005   男

Full outer join

select * from a full join b on a.id=b.id;
-----------------------------------------

001   張三  001   男
002   李四  002   女
003   趙六  null  null
null  null  005   男
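The null padding of an outer join can also be used to find rows that have no partner in the other table. A minimal sketch against the illustrative tables a and b above: with that sample data, only the row for id 003 has no match in b.

```sql
-- Rows of a without a match in b: keep left rows whose right side was padded with null
select a.id, a.name
from a left join b on a.id = b.id
where b.id is null;
```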

All of the HQL statements above have been tested. The list is not exhaustive and will be updated later…
