Dropping a Partition from a Hive Table

2022-05-07 14:27:56

Hive itself has limited support for updates, so deleting a single row from a table is awkward.

However, Hive does provide a mechanism for dropping partitions. As long as the record in question lives in some partition, you can take a "detour": first drop the partition, then manually remove that record from the data, and finally load the data back into the partition. A sketch of this flow follows below.
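As a rough illustration, the detour might look like the following HiveQL sketch. Only the table name and partition spec come from this article; the file path and the way the record is removed from the data file are assumptions.

-- 1. Drop the partition that contains the unwanted record (this removes all of its rows).
ALTER TABLE shphonefeature
  DROP IF EXISTS PARTITION (year = 2015, month = 10, day = 1);

-- 2. Outside of Hive, edit the partition's source data file and delete the line
--    for the unwanted record (for example with grep -v on an exported copy).

-- 3. Load the cleaned file back into the same partition.
--    The path below is hypothetical.
LOAD DATA LOCAL INPATH '/tmp/shphonefeature_2015_10_01_cleaned.txt'
  INTO TABLE shphonefeature
  PARTITION (year = 2015, month = 10, day = 1);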

For example, a count query first shows that the table holds 7,910 records.

Then drop the target partition with the command: ALTER TABLE shphonefeature DROP IF EXISTS PARTITION(year = 2015, month = 10, day = 1);

Querying the table again afterwards returns no data (in this case all of the table's rows happened to be in that one partition).

hive> select count(*) from shphonefeature;
Query ID = ndscbigdata_20160331105618_575ad188-25b8-4de8-9f79-0aa306908193
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1459079550905_0023, Tracking URL = http://ubuntu-bigdata-5:8088/proxy/application_1459079550905_0023/
Kill Command = /usr/hadoop/bin/hadoop job -kill job_1459079550905_0023
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2016-03-31 10:56:24,782 Stage-1 map = 0%,  reduce = 0%
2016-03-31 10:56:30,096 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 3.27 sec
2016-03-31 10:56:31,135 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.53 sec
2016-03-31 10:56:34,234 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.88 sec
MapReduce Total cumulative CPU time: 6 seconds 880 msec
Ended Job = job_1459079550905_0023
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 1   Cumulative CPU: 6.88 sec   HDFS Read: 516060 HDFS Write: 5 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 880 msec
OK
7910
Time taken: 17.572 seconds, Fetched: 1 row(s)
hive> ALTER TABLE shphonefeature DROP IF EXISTS PARTITION(year = 2015, month = 10, day = 1);
Dropped the partition year=2015/month=10/day=1
OK
Time taken: 0.238 seconds
hive> select * from shphonefeature limit 10;
OK
Time taken: 0.117 seconds
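The session above stops after the partition is dropped. Once the cleaned file has been loaded back in (step 3 of the sketch earlier), a quick recount can confirm that the partition now holds everything except the removed record; the queries below are just the obvious checks, with no output shown here.

-- Re-check the totals after reloading the cleaned data.
SELECT count(*) FROM shphonefeature;
SELECT count(*) FROM shphonefeature
WHERE year = 2015 AND month = 10 AND day = 1;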
