Hive/Impala/Hbase/Spark Kerberos
There are a few things to watch out for when using Hadoop Kerberos, to avoid recurring problems in a real environment:
1. I used to build clusters with IP addresses (even though DNS resolution was also configured). Sometimes you build the cluster purely on hostnames plus DNS resolution, and from my actual testing the two behave differently: with IPs there is no problem at all, but with DNS resolution there are always small issues once Kerberos is enabled. So if you rely on DNS resolution, I recommend writing the cluster's IPs and hostnames into /etc/hosts on every client host that connects through Kerberos. As for why, it is hard to explain.
2. If your Kerberos uses strong encryption (AES-256 counts as strong), remember to replace the JCE JARs under your JDK, including on every client you use. Otherwise you will keep failing to obtain credentials, because the supported encryption strength does not even match; see the sketch after this list for a quick way to check.
3. If you run Spark standalone and the program needs HDFS with Kerberos, make sure your Spark build supports your Hadoop/YARN version. I have not tested this in detail, but different versions do cause some problems.
4. If you are not using LDAP together with Kerberos, create an OS user with the same name as the keytab principal you created. Check the actual error your program reports: if it complains that the user cannot be found, create it; otherwise it should not be needed.
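A quick way to verify whether the unlimited-strength JCE policy is actually in effect on a client is to ask the JVM for the maximum allowed AES key length. This is only a minimal sketch using standard JDK APIs; a value of 256 or more means the unlimited-strength policy files are installed:

import javax.crypto.Cipher;

public class JceCheck {
    public static void main(String[] args) throws Exception {
        // 128 = default limited policy; anything >= 256 means the unlimited-strength policy is in place
        int maxKeyLen = Cipher.getMaxAllowedKeyLength("AES");
        System.out.println("Max allowed AES key length: " + maxKeyLen);
    }
}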
After Hadoop Kerberos is enabled, connecting to a component takes two more steps than without Kerberos: tell the program where the KDC is, and tell the program which user it is. In short, a few extra lines of code:
System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
Configuration conf = new Configuration();
conf.set("hadoop.security.authentication", "Kerberos");
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
The above was tested on Windows, which is why krb5.conf is given as a path on the C: drive. On a Linux application server you do not need to specify it; just put it under /etc, since /etc/krb5.conf is looked up by default. The keytab, however, must be specified; you can put it on the program's classpath (classpath:keytab).
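If the keytab ships on the classpath, its location can be resolved at runtime before the login call. A minimal sketch (the resource name hive.keytab is just an example; note that getResource().getPath() only yields a usable path when the resource is a plain file on disk, not packed inside a jar):

// Resolve a keytab bundled on the classpath, then log in with it
String keytabPath = App.class.getClassLoader().getResource("hive.keytab").getPath();
UserGroupInformation.loginUserFromKeytab("[email protected]", keytabPath);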
Below is the Java code for connecting to HIVE, HBASE, SPARK, and IMPALA with Kerberos:
HIVE:
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
/**
 * Hive JDBC connection over Kerberos.
 */
public class App
{
    public static void main(String[] args) throws ClassNotFoundException, SQLException, IOException
    {
        //String driver = "com.cloudera.impala.jdbc41.Driver";
        //String driver = "com.cloudera.hive.jdbc41.HS2Driver";
        String driver = "org.apache.hive.jdbc.HiveDriver";
        //String url = "jdbc:impala://10.40.2.103:21050/default;UseSasl=0;AuthMech=3;UID=impala;PWD=";
        String url = "jdbc:hive2://xxxx.xxxx.com:10000/default;principal=hive/[email protected]";
        //String username = "hive";   // not used: the Kerberos login below authenticates the connection
        //String password = "hive";
        Connection connection = null;
        Class.forName(driver);
        // Tell the JVM where the KDC configuration is and log in from the keytab
        System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
        connection = DriverManager.getConnection(url);
        String sql = "select count(*) from test.test";
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery(sql);
        while (resultSet.next()) {
            System.out.println(resultSet.getInt(1));
        }
        resultSet.close();
        statement.close();
        connection.close();
    }
}
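A variant that is sometimes useful (not part of the example above): keep the UGI instance returned by the keytab login and run the JDBC handshake explicitly as that principal with doAs. A sketch reusing the url variable from the code above, assuming Java 8 for the lambda:

import java.security.PrivilegedExceptionAction;

UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
        "[email protected]", "c:\\hive.keytab");
// Open the connection as the logged-in principal (doAs also declares InterruptedException)
Connection connection = ugi.doAs(
        (PrivilegedExceptionAction<Connection>) () -> DriverManager.getConnection(url));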
IMPALA:
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
/**
 * Impala JDBC connection over Kerberos.
 */
public class App
{
    public static void main(String[] args) throws ClassNotFoundException, SQLException, IOException
    {
        String driver = "com.cloudera.impala.jdbc41.Driver";
        //String driver = "com.cloudera.hive.jdbc41.HS2Driver";
        //String driver = "org.apache.hive.jdbc.HiveDriver";
        //String url = "jdbc:impala://10.40.2.103:21050/default;UseSasl=0;AuthMech=3;UID=impala;PWD=";
        String url = "jdbc:impala://10.40.2.103:21050/test;UseSasl=0;AuthMech=3;UID=impala;PWD=;principal=hive/[email protected]";
        //String username = "hive";   // not used: the Kerberos login below authenticates the connection
        //String password = "hive";
        Connection connection = null;
        System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
        Class.forName(driver);
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "Kerberos");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
        connection = DriverManager.getConnection(url);
        String sql = "select count(*) from hbase_test";
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery(sql);
        while (resultSet.next()) {
            System.out.println(resultSet.getInt(1));
        }
        resultSet.close();
        statement.close();
        connection.close();
    }
}
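Note that in the Cloudera JDBC driver AuthMech=3 is the user-name/password mechanism; the driver also documents a Kerberos-only mechanism, AuthMech=1, configured with Krb* properties instead of a principal parameter. A sketch of that style (host, realm and service name here are placeholders; check the documentation of your driver version):

// Kerberos-only connection string for the Cloudera Impala JDBC driver
String krbUrl = "jdbc:impala://10.40.2.103:21050/test;AuthMech=1;"
        + "KrbRealm=TEST.COM;KrbHostFQDN=xxxx.xxxx.com;KrbServiceName=impala";
Connection krbConnection = DriverManager.getConnection(krbUrl);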
HBASE:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.security.UserGroupInformation;
public class App {
    public static void main(String[] args) throws IOException {
        String table = "mes:test";
        Configuration conf = HBaseConfiguration.create();
        // HADOOP_USER_NAME is ignored once Kerberos is enabled; kept here from the original test
        System.setProperty("HADOOP_USER_NAME", "hbase");
        conf.set("hbase.zookeeper.quorum", "tsczbddndev1.trinasolar.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        //conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        //conf.setInt("hbase.client.operation.timeout", 10000);
        //conf.setInt("hbase.rpc.timeout", 6000);
        conf.setInt("hbase.client.retries.number", 3);
        // If hbase-site.xml/core-site.xml on the classpath do not already declare Kerberos,
        // set it explicitly so the keytab login below actually uses Kerberos
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("hbase.security.authentication", "kerberos");
        System.setProperty("java.security.krb5.conf", "resource/krb5.conf");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "resource/hive.keytab");
        HTable myTable = new HTable(conf, TableName.valueOf(table));
        Put put = new Put(Bytes.toBytes("CDH5.10.21"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("this is a test"));
        myTable.put(put);
        myTable.flushCommits();
        System.out.println("put successful");
        myTable.close();
    }
}
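On newer HBase client versions the HTable constructor and flushCommits() are deprecated; the same put can be written against the Connection/Table API (HBase 1.0+). A sketch reusing conf and table from the example above:

import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;

// Same write as above, via ConnectionFactory/Table instead of the deprecated HTable
Connection hbaseConnection = ConnectionFactory.createConnection(conf);
Table t = hbaseConnection.getTable(TableName.valueOf(table));
Put put = new Put(Bytes.toBytes("CDH5.10.21"));
put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("this is a test"));
t.put(put);
t.close();
hbaseConnection.close();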
Spark on YARN:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
/**
 * Spark on YARN reading a file from Kerberized HDFS.
 */
public class App {
    public static void main(String[] args) throws IOException {
        System.setProperty("HADOOP_USER_NAME", "hdfs");
        SparkConf sparkConf = new SparkConf().setAppName("JavaWordCount");
        sparkConf.setMaster("yarn-client");
        //sparkConf.set("spark.submit.deployMode", "client");
        //sparkConf.set("spark.yarn.jar", "hdfs:///tmp/spark-assembly_2.10-1.6.0-cdh5.10.2.jar");
        sparkConf.set("spark.yarn.appMasterEnv.CLASSPATH",
                "$CLASSPATH:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/*");
        //System.setProperty("sun.security.krb5.debug", "true");
        System.setProperty("java.security.krb5.conf", "c:\\krb5.conf");
        Configuration conf = new Configuration();
        //conf.set("hadoop.security.authentication", "Kerberos");
        //sparkConf.set("spark.security.credentials.hdfs.enabled", "true");
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("[email protected]", "c:\\hive.keytab");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        JavaRDD<String> lines = ctx.textFile("/tmp/a.sql");
        System.out.println(lines.count());
        ctx.close();
    }
}
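For long-running Spark jobs on YARN there is also the option of handing the keytab to Spark itself, so the ApplicationMaster can re-login and renew delegation tokens. A minimal sketch reusing sparkConf from the example above, assuming Spark 1.5+/2.x on YARN (in Spark 3 these keys were renamed to spark.kerberos.principal / spark.kerberos.keytab):

// Let Spark log in from the keytab and renew tokens for the lifetime of the job
sparkConf.set("spark.yarn.principal", "[email protected]");
sparkConf.set("spark.yarn.keytab", "c:\\hive.keytab");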