Flink in Practice (72): Monitoring (4) — Custom Metrics (Part 2)
Tags: Flink basics
Note: this blog series is compiled from SGG's video tutorials and is well suited for beginners.
Project implementation, by example:
We add custom monitoring metrics, using Flink 1.5's Kafka source and sink as the example, and register metrics such as rps and dirtyData. The key point for both the Kafka source and the sink is to first obtain the RuntimeContext, initialize the metrics from it, and pass them to the (de)serialization schema in use; the metrics are then updated by overriding the serialize and deserialize methods.
First, a Kafka read/write demo without any metrics:
```java
public class FlinkEtlTest {

    private static final Logger logger = LoggerFactory.getLogger(FlinkEtlTest.class);

    public static void main(String[] args) throws Exception {
        final ParameterTool params = ParameterTool.fromArgs(args);
        String jobName = params.get("jobName");

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        /** Kafka configuration */
        String topic = "myTest01";
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("zookeeper.quorum", "localhost:2181/kafka");

        // read from Kafka with FlinkKafkaConsumer09 and the SimpleStringSchema deserializer
        FlinkKafkaConsumer09<String> consumer09 = new FlinkKafkaConsumer09(topic, new SimpleStringSchema(), props);
        consumer09.setStartFromEarliest();

        // write to Kafka with FlinkKafkaProducer09 and the SimpleStringSchema serializer
        String sinkBrokers = "localhost:9092";
        FlinkKafkaProducer09<String> myProducer = new FlinkKafkaProducer09<>(sinkBrokers, "myTest01", new SimpleStringSchema());

        DataStream<String> kafkaDataStream = env.addSource(consumer09);
        kafkaDataStream = kafkaDataStream.map(str -> {
            logger.info("map receive {}", str);
            return str.toUpperCase();
        });
        kafkaDataStream.addSink(myProducer);

        env.execute(jobName);
    }
}
```
Next we subclass Flink's FlinkKafkaConsumer09 and FlinkKafkaProducer09 and add metrics monitoring.
Adding metrics to the Kafka source
- Subclass FlinkKafkaConsumer09, obtain its RuntimeContext, and use the current MetricGroup to initialize the metrics.
```java
public class CustomerFlinkKafkaConsumer09<T> extends FlinkKafkaConsumer09<T> {

    CustomerSimpleStringSchema customerSimpleStringSchema;

    // the parent class has several constructors; only this one is overridden here
    public CustomerFlinkKafkaConsumer09(String topic, DeserializationSchema valueDeserializer, Properties props) {
        super(topic, valueDeserializer, props);
        this.customerSimpleStringSchema = (CustomerSimpleStringSchema) valueDeserializer;
    }

    @Override
    public void run(SourceContext sourceContext) throws Exception {
        // pass the RuntimeContext on to customerSimpleStringSchema
        customerSimpleStringSchema.setRuntimeContext(getRuntimeContext());
        // initialize the metrics
        customerSimpleStringSchema.initMetric();
        super.run(sourceContext);
    }
}
```
Then override SimpleStringSchema's deserialize method so the metrics are updated as data flows in.
```java
public class CustomerSimpleStringSchema extends SimpleStringSchema {

    private static final Logger logger = LoggerFactory.getLogger(CustomerSimpleStringSchema.class);

    public static final String DT_NUM_RECORDS_RESOVED_IN_COUNTER = "dtNumRecordsInResolve";
    public static final String DT_NUM_RECORDS_RESOVED_IN_RATE = "dtNumRecordsInResolveRate";
    public static final String DT_DIRTY_DATA_COUNTER = "dtDirtyData";
    public static final String DT_NUM_BYTES_IN_COUNTER = "dtNumBytesIn";
    public static final String DT_NUM_RECORDS_IN_RATE = "dtNumRecordsInRate";
    public static final String DT_NUM_BYTES_IN_RATE = "dtNumBytesInRate";
    public static final String DT_NUM_RECORDS_IN_COUNTER = "dtNumRecordsIn";

    protected transient Counter numInResolveRecord;
    // source RPS
    protected transient Meter numInResolveRate;
    // source dirty data
    protected transient Counter dirtyDataCounter;
    // tps
    protected transient Meter numInRate;
    protected transient Counter numInRecord;
    // bps
    protected transient Counter numInBytes;
    protected transient Meter numInBytesRate;

    private transient RuntimeContext runtimeContext;

    public void initMetric() {
        numInResolveRecord = runtimeContext.getMetricGroup().counter(DT_NUM_RECORDS_RESOVED_IN_COUNTER);
        numInResolveRate = runtimeContext.getMetricGroup().meter(DT_NUM_RECORDS_RESOVED_IN_RATE, new MeterView(numInResolveRecord, 20));
        dirtyDataCounter = runtimeContext.getMetricGroup().counter(DT_DIRTY_DATA_COUNTER);
        numInBytes = runtimeContext.getMetricGroup().counter(DT_NUM_BYTES_IN_COUNTER);
        numInRecord = runtimeContext.getMetricGroup().counter(DT_NUM_RECORDS_IN_COUNTER);
        numInRate = runtimeContext.getMetricGroup().meter(DT_NUM_RECORDS_IN_RATE, new MeterView(numInRecord, 20));
        numInBytesRate = runtimeContext.getMetricGroup().meter(DT_NUM_BYTES_IN_RATE, new MeterView(numInBytes, 20));
    }

    // override deserialize for the source table
    @Override
    public String deserialize(byte[] value) {
        // update the metrics
        numInBytes.inc(value.length);
        numInResolveRecord.inc();
        numInRecord.inc();
        try {
            return super.deserialize(value);
        } catch (Exception e) {
            dirtyDataCounter.inc();
        }
        return "";
    }

    public void setRuntimeContext(RuntimeContext runtimeContext) {
        this.runtimeContext = runtimeContext;
    }
}
```
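The second argument of `new MeterView(counter, 20)` is the time span in seconds over which the rate is averaged. As a rough standalone sketch of the idea (the class `SimpleMeterView` below is mine, not Flink's actual implementation; the 5-second update tick is an assumption matching Flink's default view updater), the per-second rate is derived from periodic snapshots of a counter:

```java
// Simplified sketch of how a MeterView-style rate is computed from a counter.
// A ring buffer holds one counter snapshot per update tick; the rate is the
// difference between the newest and oldest snapshot divided by the time span.
public class SimpleMeterView {

    // Flink's view updater ticks every 5 seconds; we model the same tick length.
    private static final int UPDATE_INTERVAL_SECONDS = 5;

    private final long[] values;          // counter snapshots, one per tick
    private final int timeSpanInSeconds;  // averaging window, e.g. 20
    private int idx = 0;
    private double currentRate = 0.0;

    public SimpleMeterView(int timeSpanInSeconds) {
        this.timeSpanInSeconds = timeSpanInSeconds;
        this.values = new long[timeSpanInSeconds / UPDATE_INTERVAL_SECONDS + 1];
    }

    // Called once per tick with the counter's current total.
    public void update(long counterValue) {
        idx = (idx + 1) % values.length;
        values[idx] = counterValue;
        // oldest snapshot is the slot we will overwrite next
        long oldest = values[(idx + 1) % values.length];
        currentRate = ((double) (counterValue - oldest)) / timeSpanInSeconds;
    }

    public double getRate() {
        return currentRate;
    }
}
```

With a 20-second span, a counter that grows by 100 per 5-second tick yields a steady rate of 20 records per second.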
In the job, use the custom consumer instead:

```java
CustomerFlinkKafkaConsumer09<String> consumer09 = new CustomerFlinkKafkaConsumer09(topic, new CustomerSimpleStringSchema(), props);
```
Adding metrics to the Kafka sink
- Subclass FlinkKafkaProducer09 and override its open method to obtain the RuntimeContext, initialize the metrics, and pass them to CustomerSinkStringSchema.
```java
public class CustomerFlinkKafkaProducer09<T> extends FlinkKafkaProducer09<T> {

    public static final String DT_NUM_RECORDS_OUT = "dtNumRecordsOut";
    public static final String DT_NUM_RECORDS_OUT_RATE = "dtNumRecordsOutRate";

    CustomerSinkStringSchema schema;

    public CustomerFlinkKafkaProducer09(String brokerList, String topicId, SerializationSchema serializationSchema) {
        super(brokerList, topicId, serializationSchema);
        this.schema = (CustomerSinkStringSchema) serializationSchema;
    }

    @Override
    public void open(Configuration configuration) {
        producer = getKafkaProducer(this.producerConfig);
        RuntimeContext ctx = getRuntimeContext();
        Counter counter = ctx.getMetricGroup().counter(DT_NUM_RECORDS_OUT);
        // RPS for the sink
        MeterView meter = ctx.getMetricGroup().meter(DT_NUM_RECORDS_OUT_RATE, new MeterView(counter, 20));
        // hand the counter over to CustomerSinkStringSchema
        schema.setCounter(counter);
        super.open(configuration);
    }
}
```
Then override SimpleStringSchema's serialize method:
```java
public class CustomerSinkStringSchema extends SimpleStringSchema {

    private static final Logger logger = LoggerFactory.getLogger(CustomerSinkStringSchema.class);

    private Counter sinkCounter;

    @Override
    public byte[] serialize(String element) {
        logger.info("sink data {}", element);
        sinkCounter.inc();
        // delegate the actual serialization to the parent class
        return super.serialize(element);
    }

    public void setCounter(Counter counter) {
        this.sinkCounter = counter;
    }
}
```
And use the new Kafka sink:

```java
CustomerFlinkKafkaProducer09<String> myProducer = new CustomerFlinkKafkaProducer09<>(sinkBrokers, "mqTest01", new CustomerSinkStringSchema());
```
Retrieving the Metrics
With this in place, the collected metrics appear in your monitoring framework.
For example, in the metric flink_taskmanager_job_task_operator_dtDirtyData, dtDirtyData is the custom metric we added, and the preceding string is the default metric group used by the operator.
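That prefix is produced by Flink's operator scope format, which can be customized in flink-conf.yaml via the `metrics.scope.operator` key. As an illustration (the value shown is, to the best of my knowledge, Flink's documented default):

```yaml
# Default scope format for operator metrics; each <variable> is filled in at
# runtime and prepended to custom metric names such as dtDirtyData.
metrics.scope.operator: <host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>
```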
There are three ways to retrieve metrics. First, they can be viewed in the WebUI. Second, they can be fetched through the RESTful API, which is program-friendly: for automation scripts, automated operations, and testing, parsing the returned JSON is convenient. Third, they can be exported through a Metric Reporter, which is the mechanism mainly used for monitoring.
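As a sketch of the Metric Reporter route (assuming the flink-metrics-prometheus dependency is on the classpath; other reporters such as JMX or Graphite are configured analogously), a reporter is registered in flink-conf.yaml:

```yaml
# Register a Prometheus reporter; each TaskManager then exposes its metrics,
# including the custom dt* metrics above, on the configured port for scraping.
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249
```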
Analysis:
Why is a job sometimes especially slow?
Once a particular Task has been identified as slow, analyze the contributing factors in priority order, working from the top down, from the business level toward the underlying system, because most problems surface at the business level. Business-level checks include: is the parallelism reasonable, are there data peaks and troughs, is the data skewed. Next, analyze in turn Garbage Collection, Checkpoint alignment, and State Backend performance. Finally, analyze system-level performance: CPU, memory, swap, disk IO, throughput, capacity, network IO, bandwidth, and so on.