
hbase-0.98.3: A First Test Drive

I've been mulling over how to store the base data, intermediate data, and result data of statistical analysis so that writes, reads, and aggregation all stay efficient. MySQL is the obvious first choice, but updates require a select before an update, which is slow; and if statistics across different dimensions need to be sorted, it gets ugly fast: indexes everywhere, until the whole table is essentially nothing but indexes, which in turn makes writes even slower. There are plenty of dependable ways to speed that up, but I wasn't satisfied and wanted to see whether something better exists. That's where HBase comes in: it lives naturally alongside Hadoop, so once the base dimensions are in place you can derive any number of other dimensions, and those computations can be rerun at will. Statistical data is essentially dimension + value, which maps exactly onto key/value, and if values need to be sortable that can be handled cleanly too. So I figured I should at least give it a try, and that's how the experiment below came about.
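To make the dimension-plus-value idea concrete before diving into the install, here is a minimal HBase shell sketch. The table name 'stats', the column family 'd', and the rowkey scheme (dimension, a separator, then a date) are all invented for illustration:

$ /home/hadoop/hbase-0.98.3-hadoop2/bin/hbase shell
# one small column family 'd' holds the metric values
create 'stats', 'd'
# rowkey = dimension + '|' + date; the cell value is the count for that slice
put 'stats', 'pv|2014-07-01', 'd:v', '12345'
get 'stats', 'pv|2014-07-01'
# a range scan over the rowkey prefix returns one dimension's rows, already key-sorted
scan 'stats', {STARTROW => 'pv|', STOPROW => 'pv|~'}

Because rows come back sorted by rowkey, sorting by dimension is free; sorting by value would take a secondary table keyed by the value, which is one common HBase pattern.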

Hadoop environment:

hadoop2.2.0 + HA (QJM), 4 nodes

HBase environment:

hbase-0.98.3-hadoop2 (install directory: /home/hadoop/hbase-0.98.3-hadoop2), 4 nodes (hadoop25, hadoop28, hadoop201, hadoop224)

ZooKeeper environment:

A standalone 3-node ZK ensemble (zk25, zk28, zk224, clientPort 2181)

What follows are the details of installing hbase-0.98.3.

I already had a stable, working Hadoop 2.2.0 + HA (QJM) environment, so standing HBase up on top of it is actually quite easy (this is only a test, so the bar isn't high).

1. Download hbase-0.98.3-hadoop2-bin.tar.gz

Grab the latest stable release, hbase-0.98.3-hadoop2, directly from http://hbase.apache.org. This build targets hadoop2.2.0 by default, which saves a good deal of trouble.

2. The configuration bits

(1) Modify Hadoop 2.2.0's hdfs-site.xml (enable append support and raise the DataNode's limit on concurrently open files)

     <property>
        <!-- note: the misspelled "xcievers" is the property's actual name -->
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
     </property>
     <property>
        <name>dfs.support.append</name>
        <value>true</value>
     </property>

Once these settings are changed, the Hadoop cluster needs to be restarted.
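A sketch of the restart, assuming the standard Hadoop 2.2.0 sbin scripts and that HADOOP_HOME points at the install; on an HA cluster you may prefer restarting the daemons one at a time:

$HADOOP_HOME/sbin/stop-dfs.sh   # stop the HDFS daemons
$HADOOP_HOME/sbin/start-dfs.sh  # start them again with the new settings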

(2) Modify HBase's conf/hbase-env.sh
#
#/**
# * Copyright 2007 The Apache Software Foundation
# *
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements.  See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership.  The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License.  You may obtain a copy of the License at
# *
# *     http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */


# Set environment variables here.


# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)


# The java implementation to use.  Java 1.6 required.
export JAVA_HOME=/home/hadoop/jdk1.7.0_45


# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=


# The maximum amount of heap to use, in MB. Default is 1000.
export HBASE_HEAPSIZE=1024


# Extra Java runtime options.
# Below are what we set by default.  May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"


# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.


# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"


# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"


# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"


# Uncomment one of the below three options to enable java garbage collection logging for the client processes.


# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"


# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"


# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"


# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.




# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"


# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers


# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"


# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters


# Extra ssh options.  Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"


# Where log files are stored.  $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs


# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers 
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"


# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER


# The scheduling priority for daemon processes.  See 'man nice'.
# export HBASE_NICENESS=10


# The directory where pid files are stored. /tmp by default.
export HBASE_PID_DIR=/home/hadoop/hbase-0.98.3-hadoop2


# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1


# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false


# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the 
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as 
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
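
For reference, only four lines in the file above deviate from the shipped defaults: JAVA_HOME, HBASE_HEAPSIZE, HBASE_PID_DIR, and HBASE_MANAGES_ZK=false (the HBASE_OPTS CMS line is the stock default left active). A quick way to see what the file actually exports:

grep '^export' /home/hadoop/hbase-0.98.3-hadoop2/conf/hbase-env.sh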

(3) Modify HBase's conf/hbase-site.xml (point HBase at the existing ZK ensemble rather than letting it manage its own)
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://mycluster/hbase</value> <!-- must match fs.defaultFS in core-site.xml -->
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.tmp.dir</name>
        <value>/home/hadoop/hbase-0.98.3-hadoop2/tmp</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>zk25,zk28,zk224</value>
    </property>
</configuration>
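Before the first start it's worth confirming the quorum is reachable from the HBase nodes. A minimal check, assuming netcat is installed (ruok is ZooKeeper's built-in health probe; a healthy server answers imok):

for z in zk25 zk28 zk224; do echo ruok | nc $z 2181; echo; done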

(4) Modify HBase's conf/regionservers (the nodes listed here are started as RegionServers)
hadoop25
hadoop28
hadoop201
hadoop224

(5) Copy hadoop/etc/hadoop/hdfs-site.xml into HBase's conf directory

(This one really matters; without it HBase cannot resolve the HA nameservice and startup fails. I ran into it myself, with an "unknown host: mycluster" error.)
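A sketch of the copy, assuming Hadoop is installed under /home/hadoop/hadoop (adjust to your actual layout):

cp /home/hadoop/hadoop/etc/hadoop/hdfs-site.xml /home/hadoop/hbase-0.98.3-hadoop2/conf/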

------------------------------------------------------------------------------------------------------------------------------------------------

=== Repeat the steps above on every node that will run HBase. In practice, configuring one machine and copying the directory to the rest is easier; see the sketch after this divider. ===

------------------------------------------------------------------------------------------------------------------------------------------------
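One way to fan the configured directory out from hadoop25, assuming passwordless ssh is already set up between the nodes:

for h in hadoop28 hadoop201 hadoop224; do
    scp -r /home/hadoop/hbase-0.98.3-hadoop2 $h:/home/hadoop/
done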

(6) Then comes the part everyone loves. Run
/home/hadoop/hbase-0.98.3-hadoop2/bin/start-hbase.sh

and startup is complete.

(7) Did HBase actually start?

http://hmaster:60010 (whichever machine you ran start-hbase.sh on becomes the HMaster)

Of course, the most direct check is still whether the logs contain errors:

hbase-hadoop-regionserver-hadoop25.log

hbase-hadoop-master-hadoop25.log
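
Two more quick sanity checks, as a sketch: jps should show an HMaster on the master node and an HRegionServer on every node listed in conf/regionservers, and the shell's status command should report all four region servers alive:

$ jps        # look for HMaster / HRegionServer among the JVMs
$ /home/hadoop/hbase-0.98.3-hadoop2/bin/hbase shell
status       # expect something like: 4 servers, 0 dead, ... average load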
