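awk and grep: practical log-analysis exercises
Using shell to count log requests per five-minute interval
The log format is as follows: client IP, time, response status code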
XX.XX.XX.XX - - [09/Sep/2015:10:30:00 +0800] 200
XX.XX.XX.XX - - [09/Sep/2015:10:34:00 +0800] 206
XX.XX.XX.XX - - [09/Sep/2015:10:37:00 +0800] 302
XX.XX.XX.XX - - [09/Sep/2015:10:32:00 +0800] 303
XX.XX.XX.XX - - [09/Sep/2015:11:30:00 +0800] 200
XX.XX.XX.XX - - [09/Sep/2015:11:39:00 +0800] 200
XX.XX.XX.XX - - [09/Sep/2015:12:29:00 +0800] 200
XX.XX.XX.XX - - [09/Sep/2015:12:30:00 +0800] 200
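Requirement:
Count the requests in each five-minute interval. Intervals with no requests may print 0 (or may be omitted), in the following format: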
10:00-10:04 0
10:05-10:09 0
$ awk -F: '{a[$2":"($3-$3%5)]++} END{for(i in a){split(i,t); printf "%s:%02d %s:%02d %d\n", t[1], t[2], t[1], t[2]+4, a[i] | "sort -t: -k1,1n -k2,2n"}}' a
10:30 10:34 3
10:35 10:39 1
11:30 11:34 1
11:35 11:39 1
12:25 12:29 1
12:30 12:34 1
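The command above only prints intervals that contain at least one request. If the zero rows from the requirement are wanted, one possible sketch is to number the five-minute slots since midnight and loop over every slot between the first and last one seen. The file name access.log, the function name count_per_5min, and the extra 10:47 line in the sample are assumptions made for illustration; the sketch also assumes all lines belong to the same day.

```shell
# Shortened sample; the 10:47 line is invented so that one empty slot appears.
cat > access.log <<'EOF'
XX.XX.XX.XX - - [09/Sep/2015:10:30:00 +0800] 200
XX.XX.XX.XX - - [09/Sep/2015:10:34:00 +0800] 206
XX.XX.XX.XX - - [09/Sep/2015:10:37:00 +0800] 302
XX.XX.XX.XX - - [09/Sep/2015:10:32:00 +0800] 303
XX.XX.XX.XX - - [09/Sep/2015:10:47:00 +0800] 200
EOF

# Hypothetical helper: prints "HH:MM-HH:MM count" for every slot in range.
count_per_5min() {
    awk -F: '
    {
        slot = int(($2 * 60 + $3) / 5)      # five-minute slot index since midnight
        count[slot]++
        if (NR == 1 || slot < first) first = slot
        if (NR == 1 || slot > last)  last  = slot
    }
    END {
        if (NR == 0) exit                   # empty input: print nothing
        for (s = first; s <= last; s++) {
            start = s * 5                   # slot start, in minutes since midnight
            stop  = start + 4               # slot end, in minutes since midnight
            # An unset count[s] is printed as 0 by %d.
            printf "%02d:%02d-%02d:%02d %d\n",
                   int(start / 60), start % 60, int(stop / 60), stop % 60, count[s]
        }
    }' "$1"
}

count_per_5min access.log
```

For this sample the rows are 10:30-10:34 3, 10:35-10:39 1, 10:40-10:44 0 and 10:45-10:49 1. Because the loop runs in slot order, no external sort is needed.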
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Linux command: count how many times each IP address appears in a log file and show the six most frequent. Solution:
grep -o -E "([0-9]{1,3}\.){3}[0-9]{1,3}" test1.txt | sort | uniq -c | sort -n -r | head -6
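Note that the pattern above also accepts dotted strings whose parts exceed 255, such as 999.12.1.300. A minimal sketch of one way to drop those, assuming it is acceptable to filter the grep output through awk before counting, is shown here. The file sample_ips.txt, its contents, and the function name top_ips are invented for illustration.

```shell
# Invented sample: one line contains a version-like string that is not an IP.
cat > sample_ips.txt <<'EOF'
10.0.0.1 GET /index.html
10.0.0.2 GET /a
10.0.0.1 GET /b
version 999.12.1.300 started
10.0.0.1 GET /c
10.0.0.2 GET /d
EOF

# Hypothetical helper: extract candidates, keep only octets <= 255, count, rank.
top_ips() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" |
        awk -F. '$1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255' |
        sort | uniq -c | sort -rn | head -6
}

top_ips sample_ips.txt
```

Here 10.0.0.1 ranks first with 3 occurrences, 10.0.0.2 follows with 2, and the 999.12.1.300 string is filtered out.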
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Using awk to find the top 10 IPs in an access log. The log format is as follows:
2013-10-2910:26:09,INFO, send [email protected]
[email protected],ip=10.3.22.134,mailType=4,emailId=526f1bd8c8f2a90213662a67
The shell command is:
awk -F ',' '{print $8}' mail-2013-10-28.log | sort | uniq -c | sort -k1,1nr | head -10
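The command above depends on the ip= entry always being the eighth comma-separated field. A hedged alternative, assuming each line carries at most one field that starts with ip=, is to look the field up by its key instead of its position. The sample file mail-sample.log, its lines (including the retry= field), and the function name top_mail_ips are made up for illustration and are not the real log format.

```shell
# Invented sample in which the ip= field does not sit at a fixed position.
cat > mail-sample.log <<'EOF'
2013-10-29 10:26:09,INFO,send mail,ip=10.3.22.134,mailType=4
2013-10-29 10:26:10,INFO,send mail,retry=1,ip=10.3.22.135,mailType=4
2013-10-29 10:26:11,INFO,send mail,ip=10.3.22.134,mailType=2
EOF

# Hypothetical helper: count addresses found in the first ip=... field per line.
top_mail_ips() {
    awk -F, '
    {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^ip=/) {              # field looks like ip=<address>
                count[substr($i, 4)]++      # drop the 3-character "ip=" prefix
                break
            }
    }
    END { for (ip in count) print count[ip], ip }' "$1" |
        sort -k1,1nr | head -10
}

top_mail_ips mail-sample.log
```

Counting inside awk also replaces the sort | uniq -c stage, so only the final ranking needs sort.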
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Using awk to count the lines in a file that satisfy specific conditions
Example: file tt.txt
sdfasf,I,55,56,asdfadf223
sdfasf,I,55,56,asdfadf2230k
1313,I,55,56,asdfad
xvxzv,I,55,56,asdf
adfa,a,d,afasd
vafasf,fff,aw,aaa
fasf,a,55,56,asdf
asdcc,I,55,fasdf,33
asdfasdf,I,fa,56,adsf
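Count the lines whose comma-separated second column is "I", third column is "55", and fourth column is "56". Group the matching lines by their first column and write each first-column value with its number of occurrences to the text file sum.txt: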
awk 'BEGIN{FS=",";}{if($2=="I" && $3=="55" && $4=="56") a[$1]++} END{for (i in a) print i,a[i];}' tt.txt >sum.txt
Or count the matching lines across all files in the ./diraa directory:
awk 'BEGIN{FS=",";}{if($2=="I" && $3=="55" && $4=="56") a[$1]++} END{for (i in a) print i,a[i];}' ./diraa/* >sum.txt
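When several files are given, the command above merges the counts from all of them. If a per-file breakdown is wanted, one option is to include awk's built-in FILENAME variable in the array key. The directory diraa_demo, its two files, and the function name count_by_file are assumptions made so the sketch can run on its own.

```shell
# Hypothetical demo directory holding a few of the tt.txt lines split across files.
mkdir -p diraa_demo
printf '%s\n' 'sdfasf,I,55,56,asdfadf223' \
              'sdfasf,I,55,56,asdfadf2230k' \
              'adfa,a,d,afasd' > diraa_demo/a.txt
printf '%s\n' '1313,I,55,56,asdfad' \
              'asdcc,I,55,fasdf,33' > diraa_demo/b.txt

# Hypothetical helper: count matching lines per (file, first column) pair.
count_by_file() {
    awk -F, '
    $2 == "I" && $3 == "55" && $4 == "56" { n[FILENAME " " $1]++ }
    END { for (k in n) print k, n[k] }' "$@" | sort
}

count_by_file diraa_demo/*
```

The final sort is there because awk does not guarantee any particular order for a for (k in n) loop.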