
Hive Permissions: Auditing


Hive is used heavily in our production environment, yet its authorization model is fairly weak. If we could record every Hive operation, we would strengthen security and at the same time be able to measure how often each Hive table is used; and if we could also record when each HQL statement starts and finishes, we could find the jobs that consume the most time and optimize them specifically. Tracking how Hive is used therefore improves security while also making it easier to locate problems.

So how do we record what users do? Hive hooks give us a convenient, open interface for exactly this.

We use Hive in two main ways. The first is ad-hoc HQL run directly on the command line, where the entity executing the HQL is the OS login user. The second is scheduled report HQL scripts created by business data analysts through a web application; here the real executor is the report creator, and the system merely runs the script on their behalf. To record user behavior correctly in this second case, we need to replace hive.security.authenticator.manager.

By default Hive uses HadoopDefaultAuthenticator to determine the user running the HQL, and performs authorization checks against the user it returns.

To let Hive run in proxy mode, we have to supply our own authenticator that returns the real HQL executor. The following configuration sets the authenticator:

<property>
  <name>hive.security.authenticator.manager</name>
  <value>com.pplive.bip.hive.auth.Authenticator</value>
  <description>bip user authenticator</description>
</property>

Only administrators may enable proxy mode. The proxy user can be passed in as follows:

hive -d bip.user=xxx  or  hive --define bip.user=xxx
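For example, an administrator could run a scheduled report script on behalf of its creator; the user name and file name below are only illustrative:

hive --define bip.user=report_owner -f daily_report.hql

Inside the session the variable is then visible through SessionState.get().getHiveVariables(), which is exactly where the custom authenticator below reads it.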

A sample implementation of the custom authenticator:

import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.ql.security.HiveAuthenticationProvider;
import org.apache.hadoop.hive.ql.session.SessionState;
import org.apache.hadoop.hive.shims.ShimLoader;
import org.apache.hadoop.security.UserGroupInformation;

public class Authenticator implements HiveAuthenticationProvider {

    private final static String BIP_USER = "bip.user";

    private String userName;        // OS user taken from the UGI
    private String bipUser;         // proxied user passed via -d bip.user=xxx
    private List<String> groupNames;
    private Configuration conf;

    @Override
    public List<String> getGroupNames() {
        return groupNames;
    }

    @Override
    public String getUserName() {
        // If bip.user is set, only an admin may impersonate that user.
        this.bipUser = SessionState.get().getHiveVariables().get(BIP_USER);
        if (this.bipUser != null && !this.bipUser.isEmpty()) {
            // AdminManager is a site-specific helper (a sketch follows below).
            if (AdminManager.isAdmin(this.userName)) {
                return this.bipUser;
            } else {
                throw new RuntimeException("bip.user is set while you are not admin");
            }
        } else {
            return this.userName;
        }
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
        UserGroupInformation ugi = null;
        try {
            ugi = ShimLoader.getHadoopShims().getUGIForConf(conf);
            // UserGroupInformation.createProxyUser(user, realUser);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        if (ugi == null) {
            throw new RuntimeException("Can not initialize PPLive Authenticator.");
        }
        this.userName = ugi.getUserName();
        if (ugi.getGroupNames() != null) {
            this.groupNames = Arrays.asList(ugi.getGroupNames());
        }
    }

    // Required because HiveAuthenticationProvider extends Configurable.
    @Override
    public Configuration getConf() {
        return this.conf;
    }

    // Returns the real OS user, i.e. the account doing the proxying.
    public String getProxy() {
        return this.userName;
    }
}
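The AdminManager used above is not part of Hive; it is a site-specific helper that decides who may proxy other users, and its implementation is not shown in this article. A minimal sketch, with the admin accounts hard-coded for illustration (in practice they could be loaded from configuration or a database):

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the Authenticator calls AdminManager.isAdmin(),
// but the real implementation is site-specific.
public class AdminManager {

    private static final Set<String> ADMINS =
            new HashSet<String>(Arrays.asList("hadoop", "hive"));

    public static boolean isAdmin(String userName) {
        return userName != null && ADMINS.contains(userName);
    }
}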

Hive's SemanticHook lets us record state before and after semantic analysis of an HQL statement; the ExecuteHook records state before and after the HQL is translated into jobs and submitted for execution; and the DriverRunHook records state around the entire compile-and-execute cycle.
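These hooks are wired in through hive-site.xml, in the same <property> form used above. A sketch of the registration; the fully qualified class names (com.pplive.bip.hive.audit.*) are assumptions, since the article does not give the package of the hook classes:

<property>
  <name>hive.semantic.analyzer.hook</name>
  <value>com.pplive.bip.hive.audit.SemanticHook</value>
</property>
<property>
  <name>hive.exec.pre.hooks</name>
  <value>com.pplive.bip.hive.audit.ExecuteHook</value>
</property>
<property>
  <name>hive.exec.post.hooks</name>
  <value>com.pplive.bip.hive.audit.ExecuteHook</value>
</property>
<property>
  <name>hive.exec.failure.hooks</name>
  <value>com.pplive.bip.hive.audit.ExecuteHook</value>
</property>
<property>
  <name>hive.exec.driver.run.hooks</name>
  <value>com.pplive.bip.hive.audit.DriverRunHook</value>
</property>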

The SemanticHook records what the statement does after semantic analysis:

public void postAnalyze(HiveSemanticAnalyzerHookContext context,
        List<Task<? extends Serializable>> rootTasks) throws SemanticException {
    Hive hive = null;
    try {
        hive = context.getHive();
    } catch (HiveException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }

    // Collect the tables this statement reads and writes.
    Set<ReadEntity> inputs = context.getInputs();
    Set<WriteEntity> outputs = context.getOutputs();

    Set<String> readTables = new HashSet<String>();
    for (ReadEntity input : inputs) {
        Table table = input.getT();
        if (table != null) {
            readTables.add(table.getTableName());
        }
    }

    Set<String> writeTables = new HashSet<String>();
    for (WriteEntity output : outputs) {
        Table table = output.getT();
        if (table != null) {
            writeTables.add(table.getTableName());
        }
    }

    // Log the audit record with both the effective user and the proxying OS user.
    HiveAuthenticationProvider authenticationProvider = SessionState.get().getAuthenticator();
    if (authenticationProvider instanceof Authenticator) {
        Authenticator authenticator = (Authenticator) authenticationProvider;
        this.logger.info(String.format(
                "phase=SA&executor=%s&proxy=%s&db=%s&cmd=%s&readTables=%s&writeTables=%s",
                authenticator.getUserName(), authenticator.getProxy(),
                hive.getCurrentDatabase(), context.getCommand(),
                readTables.toString(), writeTables.toString()));
    }

    String userName = SessionState.get().getAuthenticator().getUserName();
    logger.debug(String.format("%s execute %s, read tables:%s, write tables:%s",
            userName, context.getCommand(), readTables, writeTables));
}
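The postAnalyze method above has to live in a class Hive can load as a semantic-analyzer hook. A minimal skeleton, assuming we extend Hive's AbstractSemanticAnalyzerHook and leave preAnalyze as a pass-through (the class name and logger are illustrative, matching the registration sketch above):

import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.apache.hadoop.hive.ql.parse.AbstractSemanticAnalyzerHook;
import org.apache.hadoop.hive.ql.parse.HiveSemanticAnalyzerHookContext;
import org.apache.hadoop.hive.ql.parse.SemanticException;
import org.apache.log4j.Logger;

public class SemanticHook extends AbstractSemanticAnalyzerHook {

    private final Logger logger = Logger.getLogger(SemanticHook.class);

    @Override
    public ASTNode preAnalyze(HiveSemanticAnalyzerHookContext context, ASTNode ast)
            throws SemanticException {
        // Nothing to audit before analysis; return the AST unchanged.
        return ast;
    }

    // postAnalyze(...) as shown above
}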

The ExecuteHook records job status:

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.QueryPlan;
import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
import org.apache.hadoop.hive.ql.hooks.HookContext;
import org.apache.hadoop.hive.ql.metadata.Hive;
import org.apache.hadoop.hive.ql.security.HiveAuthenticationProvider;
import org.apache.hadoop.hive.ql.session.SessionState;
import org.apache.log4j.Logger;

public class ExecuteHook implements ExecuteWithHookContext {

    Logger logger = Logger.getLogger(ExecuteHook.class);

    private HiveAuthenticationProvider authenticationProvider = null;

    // Conf key used to carry the job start time from the pre hook to the post/failure hooks.
    private static final String JOB_START_TIME = "PRE_EXEC_HOOK";

    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    @Override
    public void run(HookContext hookContext) throws Exception {
        QueryPlan queryPlan = hookContext.getQueryPlan();
        String queryId = queryPlan.getQueryId();
        String queryStr = queryPlan.getQueryStr();

        if (authenticationProvider == null) {
            authenticationProvider = SessionState.get().getAuthenticator();
        }

        String result = null;
        switch (hookContext.getHookType()) {
        // hive.exec.pre.hooks
        case PRE_EXEC_HOOK:
            hookContext.getConf().setLong(JOB_START_TIME, System.currentTimeMillis());
            break;
        // hive.exec.post.hooks
        case POST_EXEC_HOOK:
            result = "Success";
            break;
        // hive.exec.failure.hooks
        case ON_FAILURE_HOOK:
            result = "Failure";
            break;
        default:
            break;
        }

        // Only log after execution (success or failure), and only when our authenticator is in use.
        if (hookContext.getHookType() != HookContext.HookType.PRE_EXEC_HOOK
                && authenticationProvider instanceof Authenticator) {
            long jobEndTime = System.currentTimeMillis();
            HiveConf conf = hookContext.getConf();
            long jobStartTime = conf.getLong(JOB_START_TIME, jobEndTime);
            long timeTaken = (jobEndTime - jobStartTime) / 1000;  // seconds
            Authenticator authenticator = (Authenticator) authenticationProvider;
            this.logger.info(String.format(
                    "phase=EXEC&result=%s&executor=%s&proxy=%s&db=%s&queryId=%s&queryStr=%s&jobName=%s&jobStartTime=%s&jobEndTime=%s&timeTaken=%d",
                    result, authenticator.getUserName(), authenticator.getProxy(),
                    Hive.get().getCurrentDatabase(), queryId, queryStr,
                    conf.getVar(HiveConf.ConfVars.HADOOPJOBNAME),
                    dateFormat.format(new Date(jobStartTime)),
                    dateFormat.format(new Date(jobEndTime)), timeTaken));
        }
    }
}

The DriverRunHook records the running time of the whole process:

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.hive.ql.HiveDriverRunHook;
import org.apache.hadoop.hive.ql.HiveDriverRunHookContext;
import org.apache.hadoop.hive.ql.metadata.Hive;
import org.apache.hadoop.hive.ql.security.HiveAuthenticationProvider;
import org.apache.hadoop.hive.ql.session.SessionState;
import org.apache.log4j.Logger;

public class DriverRunHook implements HiveDriverRunHook {

    Logger logger = Logger.getLogger(DriverRunHook.class);

    private HiveAuthenticationProvider authenticationProvider = null;

    private static SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    private long startTime = 0;

    @Override
    public void preDriverRun(HiveDriverRunHookContext hookContext) throws Exception {
        // Resolve the authenticator once and remember when the compile/run cycle started.
        if (authenticationProvider == null) {
            authenticationProvider = SessionState.get().getAuthenticator();
        }
        startTime = System.currentTimeMillis();
    }

    @Override
    public void postDriverRun(HiveDriverRunHookContext hookContext) throws Exception {
        if (authenticationProvider instanceof Authenticator) {
            long endTime = System.currentTimeMillis();
            long timeTaken = (endTime - startTime) / 1000;  // seconds
            Authenticator authenticator = (Authenticator) authenticationProvider;
            this.logger.info(String.format(
                    "phase=DriverRun&executor=%s&proxy=%s&db=%s&cmd=%s&startTime=%s&endTime=%s&timeTaken=%d",
                    authenticator.getUserName(), authenticator.getProxy(),
                    Hive.get().getCurrentDatabase(), hookContext.getCommand(),
                    dateFormat.format(new Date(startTime)),
                    dateFormat.format(new Date(endTime)), timeTaken));
        }
    }
}
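To turn these log statements into a standalone audit trail, the hook loggers can be routed to their own file. A log4j.properties sketch, assuming the hook classes live under the com.pplive.bip.hive package (the file path and appender name here are arbitrary choices, not from the original setup):

# Route the audit hook loggers to a dedicated daily-rolling file.
log4j.logger.com.pplive.bip.hive=INFO, HIVEAUDIT
log4j.additivity.com.pplive.bip.hive=false

log4j.appender.HIVEAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.HIVEAUDIT.File=/var/log/hive/hive-audit.log
log4j.appender.HIVEAUDIT.DatePattern='.'yyyy-MM-dd
log4j.appender.HIVEAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.HIVEAUDIT.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %m%n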
