Summary of Kylin Errors and Solutions
1. Error in the Build Cube step: Value not exists!
The MR log for this step reports `Not a valid value: 2017-05-31`. There are two possible causes:
1. The dimension table data referenced during the build changed, so looking up this value in the dimension table no longer finds a matching row.
2. The OLAP (fact) table joins a dimension table but only uses the join column; if a code in the fact table does not exist in the dimension table, the build fails.
Resolution:
1. Check whether the value exists in the dimension table.
2. Determine why the dimension table is missing it.
3. Check whether the value is reasonable in the OLAP table.
4. For the second cause (the fact table joins a dimension table, uses only the join column, and the fact-table code is missing from the dimension table), you can set the relevant property (its name is omitted in the original note) to true. This forces Kylin to join the dimension table and filter out fact-table values that do not exist in it.
However, if the dimension table itself is faulty (incomplete or empty), OLAP data will be silently filtered out as well, so apply this setting according to your scenario.
After changing the setting, check that the generated SQL matches expectations (see the screenshot in the original note).
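The checks in steps 1-3 can be sketched as a set difference between the fact table's join codes and the dimension table's keys. The code lists below are hypothetical stand-ins for the real tables; in practice you would run a LEFT JOIN ... WHERE dim.key IS NULL query in Hive.

```python
# Sketch: find fact-table join codes that have no row in the dimension table.
# Any hit here is a value that would trigger "Not a valid value" in the build.

def missing_dim_values(fact_codes, dim_codes):
    """Return distinct fact-table codes absent from the dimension table."""
    return sorted(set(fact_codes) - set(dim_codes))

fact = ["A01", "B02", "C03", "B02"]  # join codes in the fact (OLAP) table
dim = ["A01", "B02"]                 # keys present in the dimension table

print(missing_dim_values(fact, dim))  # → ['C03']
```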
2. The third build step, Extract Fact Table Distinct, fails with `ArrayIndexOutOfBoundsException: -1` (array index out of bounds)
Cause: the OLAP table has a column with the same name as a dimension-table column, or two dimension tables share a column name. (Alternatively, the model is wrong: for join columns, dimensions should be selected from the fact table. This case has been resolved.)
Resolution: remove or rename the duplicated columns.
3. Build step four, Build Dimension Dictionary, fails:
Failed to create dictionary on OLAP.OLAP_CUSTOMER_STOCK_2_DA.CUST_ID_STOCK
java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_CUSTOMER_STOCK_2_DA.CUST_ID_STOCK
at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325) at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41) at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File does not exist: /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/OLAP.OLAP_CUSTOMER_STOCK_2_DA/CUST_ID_STOCK/.index at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1828) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712) at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:127) at org.apache.kylin.dict.DictionaryManager.getDictionary(DictionaryManager.java:114) at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:65) at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81) at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323) ... 14 more
A similar log may also appear.
Cause: the column is configured with a global dictionary. If multiple segment build jobs are submitted at once and reach the Build Dimension Dictionary step at the same time, several jobs operate on the same dictionary file, which fails the step.
Avoid building global-dictionary jobs in parallel. When the problem occurs, resume the job or resubmit the build. (A distributed lock has since been added, so parallel builds should no longer hit this error.)
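The race described above can be illustrated with a lock serializing updates to a shared structure, standing in for the distributed lock that was added. The in-memory dict below is an illustrative substitute for the on-disk global dictionary file; the names are not from Kylin's code.

```python
import threading

# Sketch: why parallel segment builds corrupt a shared global dictionary,
# and how a lock serializes the updates.

dictionary = {}          # stands in for the on-disk global dictionary
lock = threading.Lock()

def add_values(values):
    for v in values:
        # Without the lock, two builds could both see v as missing and
        # assign conflicting ids (or clobber each other's file writes).
        with lock:
            if v not in dictionary:
                dictionary[v] = len(dictionary) + 1

builds = [threading.Thread(target=add_values, args=(range(1000),)) for _ in range(4)]
for b in builds:
    b.start()
for b in builds:
    b.join()

print(len(dictionary))  # → 1000: each value got exactly one id
```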
4. With a global dictionary in use, build step four, Build Dimension Dictionary, fails.
The error is shown below; OLAP.OLAP_LOG_WEB_TS_DI.ORIGINAL_SESSION_ID is the column configured with a global dictionary.
java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_LOG_WEB_TS_DI.ORIGINAL_SESSION_ID at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325) at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41) at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException at org.apache.kylin.dict.CachedTreeMap.writeValue(CachedTreeMap.java:240) at org.apache.kylin.dict.CachedTreeMap.write(CachedTreeMap.java:374) at org.apache.kylin.dict.AppendTrieDictionary.flushIndex(AppendTrieDictionary.java:1043) at org.apache.kylin.dict.AppendTrieDictionary$Builder.build(AppendTrieDictionary.java:954) at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:82) at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81) at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323) ... 
14 more result code:2
Cause: the global dictionary has a capacity limit; the string length of a count-distinct measure column must not exceed 255 characters. Check the raw data length of the failing column.
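A pre-check against the 255-character limit noted above can be sketched as follows; the sample values are hypothetical.

```python
# Sketch: flag count-distinct column values that exceed the
# global dictionary's 255-character limit before the build runs.

MAX_DICT_VALUE_LEN = 255

def over_limit(values, limit=MAX_DICT_VALUE_LEN):
    """Return the values whose string form exceeds the dictionary limit."""
    return [v for v in values if len(str(v)) > limit]

sample = ["session_abc123", "x" * 300]  # the second value would fail the build
print(over_limit(sample))  # flags only the 300-character value
```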
4-2. With a global dictionary in use, build step four, Build Dimension Dictionary, fails.
Currently a sporadic error; discard the job and resubmit.
4-3. Build step four, Build Dimension Dictionary, fails.
java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_MKT_LOG_ACCESS_PAGE_INDICATOR_DI.USER_ID at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325) at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41) at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:185) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.RuntimeException: java.io.EOFException at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:127) at org.apache.kylin.dict.DictionaryManager.getDictionary(DictionaryManager.java:114) at org.apache.kylin.dict.AppendTrieDictionary$Builder.createNewBuilder(AppendTrieDictionary.java:884) at org.apache.kylin.dict.AppendTrieDictionary$Builder.getInstance(AppendTrieDictionary.java:844) at org.apache.kylin.dict.AppendTrieDictionary$Builder.getInstance(AppendTrieDictionary.java:838) at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:65) at 
org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81) at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323) ... 14 more Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.kylin.dict.AppendTrieDictionary.readFields(AppendTrieDictionary.java:1238) at org.apache.kylin.dict.DictionaryInfoSerializer.deserialize(DictionaryInfoSerializer.java:74) at org.apache.kylin.dict.DictionaryInfoSerializer.deserialize(DictionaryInfoSerializer.java:34) at org.apache.kylin.common.persistence.ResourceStore.getResource(ResourceStore.java:146) at org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:421) at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:103) at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:100) at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379) at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342) at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257) at com.google.common.cache.LocalCache.get(LocalCache.java:4000) at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004) at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874) at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:120) ... 21 more
(Contact an administrator to handle this.)
1. Search the global-dictionary path for directories whose .index file has size 0, then delete or move them.
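Step 1 above can be sketched as a directory walk listing zero-byte .index files. The real dictionaries live on HDFS (under the GlobalDict path shown in the stack trace), where you would use `hdfs dfs -ls -R` or an HDFS client instead; a throwaway local directory stands in here.

```python
import os
import tempfile

def empty_index_files(root):
    """Return paths of zero-byte .index files under root."""
    hits = []
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".index"):
                path = os.path.join(dirpath, name)
                if os.path.getsize(path) == 0:
                    hits.append(path)
    return sorted(hits)

# Demo on a temporary directory: one empty index file, one healthy one.
demo = tempfile.mkdtemp()
open(os.path.join(demo, "broken.index"), "w").close()
with open(os.path.join(demo, "healthy.index"), "w") as f:
    f.write("index data")

print(empty_index_files(demo))  # only broken.index is listed
```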
5. Error: GlobalDict /dict/OLAP.OLAP_CUSTOMER_NEW_3_DI/CUST_PHONE1_ENCRYPTED should have 0 or 1 append dict but 2
(Contact an administrator to handle this.)
1. From the cube desc, find the global dictionaries that differ.
2. Delete those dictionaries and their corresponding segments.
Additional note:
1. The steps above may leave the dictionary in an inconsistent state and make count distinct inaccurate. Ideally, purge the entire cube, run metastore.sh clean, and rebuild the data.
6. Caused by an HBase problem; discard the job and retry.
7. Refreshing the job list fails
The exception is:
java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
at org.apache.kylin.rest.service.JobService.parseToJobStep(JobService.java:309)
at org.apache.kylin.rest.service.JobService.parseToJobInstance(JobService.java:303)
at org.apache.kylin.rest.service.JobService.access$000(JobService.java:73)
at org.apache.kylin.rest.service.JobService$1.apply(JobService.java:134)
at org.apache.kylin.rest.service.JobService$1.apply(JobService.java:131)
at com.google.common.collect.Iterators$8.transform(Iterators.java:860)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.Lists.newArrayList(Lists.java:145)
at com.google.common.collect.Lists.newArrayList(Lists.java:125)
at org.apache.kylin.rest.service.JobService.listCubeJobInstance(JobService.java:131)
at org.apache.kylin.rest.service.JobService.listAllJobs(JobService.java:103)
at org.apache.kylin.rest.service.JobService.listAllJobs(JobService.java:84)
at org.apache.kylin.rest.service.JobService$$FastClassBySpringCGLIB$$83a44b2a.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:629)
at org.apache.kylin.rest.service.JobService$$EnhancerBySpringCGLIB$$29ce7197.listAllJobs(<generated>)
at org.apache.kylin.rest.controller.JobController.list(JobController.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
Resolution:
On server05, refresh the job list and check kylin.out for the last record it stops at.
Run ./metastore.sh remove "/execute/${id}" to delete that record.
8. HBase table-creation conflict
1) Discard the job.
2) Call the delete-segment API.
3) Delete the table in HBase.
9. A global dictionary is configured and the Build Base Cuboid Data step takes too long
The MR counters show that GC time is high.
1) Use an ad-hoc query to check the distinct cardinality of the columns configured with a global dictionary.
2) Add cube-level overrides: kylin.job.mr.config.override.mapred.map.child.java.opts=-Xmx8g and kylin.job.mr.config.override.mapreduce.map.memory.mb=8500.
If memory is still insufficient, raise them to the values below (adjust the map settings, i.e. the first two, first; if that is not enough, set all four):
kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx15g
kylin.job.mr.config.override.mapreduce.map.memory.mb=16000
kylin.job.mr.config.override.mapreduce.reduce.java.opts=-Xmx15g
kylin.job.mr.config.override.mapreduce.reduce.memory.mb=16000
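A consistency check for these overrides can be sketched as follows: the JVM heap (-Xmx) must fit inside the YARN container (mapreduce.*.memory.mb) with some headroom for non-heap memory, which the pairs above satisfy (8g heap in an 8500 MB container, 15g heap in a 16000 MB container). The headroom rule is a common Hadoop convention, not a Kylin-specific requirement.

```python
# Sketch: verify that an MR heap size fits inside its YARN container.

def heap_fits_container(xmx_gb, container_mb):
    """The JVM heap must leave room for non-heap memory inside the container."""
    return xmx_gb * 1024 < container_mb

print(heap_fits_container(8, 8500))    # → True:  8192 MB heap < 8500 MB container
print(heap_fits_container(15, 16000))  # → True: 15360 MB heap < 16000 MB container
print(heap_fits_container(16, 16000))  # → False: no headroom left
```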
10. Build step three fails
Problem:
java.lang.IllegalStateException at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:98) at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92) at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)
Resolution: reload the metadata.
11. Query fails with Not a valid ID
Possible cause 1: adding a dimension by editing a cube leaves the cube metadata out of sync.
After a new dimension column is added to an existing cube, the cube build completes successfully, but any query that includes the new dimension fails with `Not a valid ID`, because the dimension is not included in the cube's metadata. Avoid modifying an existing cube; instead, create a new cube, or clone the cube and modify the clone.
Possible cause 2: auto-merge is enabled on the cube.
Kylin's auto-merge has a bug; we recommend disabling it. Time ranges whose queries are broken can be fixed by rebuilding them.
12. Build Dimension Dictionary fails
java.lang.IllegalStateException: Dup key found, key=[0], value1=[0,未知], value2=[0,null] at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:85) at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:68) at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79) at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:56) at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65) at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:674) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:60) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41) at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748)
Cause: the dimension table's primary key has duplicates; the key 0 appears twice, as [0, 未知] ("unknown") and [0, null].
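Detecting such duplicates before the build can be sketched as follows, reproducing the "Dup key found" case above. The rows are hypothetical (key, value) pairs standing in for the real lookup table.

```python
from collections import defaultdict

# Sketch: map each dimension-table primary key that occurs more than once
# to all of its conflicting values.

def duplicate_keys(rows):
    """Return {key: [values]} for every key appearing in more than one row."""
    seen = defaultdict(list)
    for key, value in rows:
        seen[key].append(value)
    return {k: v for k, v in seen.items() if len(v) > 1}

rows = [("0", "unknown"), ("0", None), ("1", "retail")]
print(duplicate_keys(rows))  # key "0" appears twice, as in the error above
```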
13. Loading an OLAP table that already exists in Kylin fails
Cause: a cube exists in DESCBROKEN state; dropping it fixes the problem.
14. Build step three fails
Cause: configuration is not synchronized across the build machines; for example, the machines have different values for
kylin.cube.aggrgroup.max.combination.
15. Symptom: after a measure configured with a global dictionary is built, its results do not match the same query against Hive.
Resolution: a global-dictionary problem. Purge the data, delete the dictionary, and rebuild.
16. The MR job fails with Error: GC overhead limit exceeded, because the MR tasks run out of memory.
Add cube-level overrides: kylin.job.mr.config.override.mapred.map.child.java.opts=-Xmx8g and kylin.job.mr.config.override.mapreduce.map.memory.mb=8500.
If memory is still insufficient, raise them to the values below (adjust the map settings, i.e. the first two, first; if that is not enough, set all four):
kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx15g
kylin.job.mr.config.override.mapreduce.map.memory.mb=16000
kylin.job.mr.config.override.mapreduce.reduce.java.opts=-Xmx15g
kylin.job.mr.config.override.mapreduce.reduce.memory.mb=16000