20+ Command-Line Tools That Boost Development Efficiency N-Fold (with Demos)

阿新 • Published: 2021-10-03

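Background

The cover image for this post is a photo of the keyboard ⌨️ I actually use (not a recommendation; whatever suits you is best).

This article grew out of a short talk I gave to my team at a previous company, tidied up and posted here. The talk was titled "Shell for Boosting Development Efficiency", though "the command line for boosting development efficiency" would be more accurate: it does not cover shell programming, and instead introduces common basic command-line tools on Linux or Mac that help with everyday tasks.

After reading it, you should have a first impression of these commands and know which command can do what. There is no need to memorize the specific options; when you need one, run cmd --help or man cmd. With enough use, the common ones stick naturally.

The article first introduces some common command-line tools on Linux/Mac, then walks through typical usage with concrete examples, and finally uses a couple of case studies to show how powerful these tools are:

Given an nginx log file, which 10 paths get the most HTTP 404 responses? Which 10 requests take the longest?
How many "PV" (page views) are there per hour? Given an article, which 10 words occur most often?
Batch-renaming the files in a folder, batch-resizing the images in a folder to a fixed size, and so on.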

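Mac environment

zsh
oh-my-zsh
plugin

git
autojump
osx (man-preview / quick-look / pfd (print Finder directory) / cdf (cd Finder))

Common shortcuts (bindkey)

Demo: highlighting / git / smart completion / jumping (j, d)...

Here is a small demo: a short video I previously shared on my WeChat Channels account (程式猿石頭, follows welcome) showing how to jump quickly between directories.

For more tips on improving productivity as a programmer on Mac, see these three articles:

To Do Good Work, First Sharpen Your Tools: Mac Software Recommendations (Preface)
With These Tools, Your Setup Instantly Looks More Impressive
How Do Good Programmers Use Tools to Improve Their Productivity?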

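Shell basic commands

which/whereis, plus the frequently used whatis, man, --help

➜ .oh-my-zsh git:(master) $ whereis ls
/bin/ls
➜ .oh-my-zsh git:(master) $ which ls
ls: aliased to ls -G

Basic file and directory operations

rm, mkdir, mv, cp, cd, ls, ln, file, stat, wc (-l/-w/-c), head, more, tail, cat...

The power tool, the pipe: |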

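Shell text processing

This section uses examples to cover the general usage and options of 12 commands. You can jump straight to the command you want via the table of contents on the right (my blog has one; the WeChat official account version does not).

find, grep, xargs, cut, paste, comm
join, sort, uniq, tr, sed, awk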

find

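Common options

File name -name, file type -type, maximum search depth -maxdepth
Time filters (status change/access/modify) -[cam]time
Run an action -exec

Examples

find ./ -name "*.json"
find . -maxdepth 7 -name "*.json" -type f
# -ctime can be swapped for -atime or -mtime
find . -name "*.log.gz" -ctime +7 -size +1M -delete
find . -name "*.scala" -atime -7 -exec du -h {} \;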

grep

Common options

-v (invert-match),
-c (count),
-n (line-number),
-i (ignore-case),
-l, -L, -R (-r, --recursive), -e

Examples

grep 'partner' ./*.scala -l
# multiple -e patterns are ORed together
grep -e 'World' -e 'first' -i -R ./

Related commands: grep -z / zgrep / zcat xx | grep
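To make the -c, -n, -v and -i options above concrete, here is a minimal sketch. The log contents are made up purely for illustration.

```shell
# Hypothetical sample log, created in a temp file so the example is self-contained.
log=$(mktemp)
printf 'INFO start\nerror: disk full\nINFO done\nERROR: timeout\n' > "$log"

# -c counts matching lines; -i makes the match case-insensitive.
error_count=$(grep -c -i 'error' "$log")

# -n prefixes each match with its line number.
first_error=$(grep -n -i 'error' "$log" | head -n 1)

# -v inverts the match: lines that do NOT contain "error".
non_errors=$(grep -v -i 'error' "$log" | wc -l | tr -d ' ')

echo "errors=$error_count first=$first_error others=$non_errors"
rm -f "$log"
```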

xargs

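Common options

-n (number of arguments per command invocation),
-I (placeholder substitution)
-d (delimiter): not supported on Mac; mind the differences from the GNU version

Examples

find . -type f -name "*.jpg" | xargs -n1 -I{} du -sh {}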
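One thing the pipeline above does not cover is file names containing spaces. A common way to handle that, sketched here with a throwaway directory and made-up file names, is to pair find -print0 with xargs -0, which both the GNU and BSD versions support.

```shell
# Throwaway directory with an awkward file name (hypothetical data).
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.jpg" "$demo_dir/with space.jpg" "$demo_dir/notes.txt"

# -print0 and -0 separate names with NUL bytes, so "with space.jpg"
# is passed to basename as ONE argument instead of being split in two.
jpg_count=$(find "$demo_dir" -type f -name "*.jpg" -print0 \
  | xargs -0 -n 1 basename | wc -l | tr -d ' ')

echo "jpg files: $jpg_count"
rm -rf "$demo_dir"
```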

cut

Common options

-b (bytes)
-c (characters)
-f (field numbers), -d (delimiter); field ranges: n, n-, -m, n-m

Examples

echo "helloworldhellp" | cut -c1-10
cut -d, -f2-8 csu.db.export.csv
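The four range forms for -f (n, n-, -m, n-m) are easiest to see side by side. A small sketch on a made-up comma-separated line:

```shell
# Hypothetical CSV header line used only for illustration.
line='id,name,email,city,zip'

second=$(echo "$line" | cut -d, -f2)      # n   : just field 2
middle=$(echo "$line" | cut -d, -f2-4)    # n-m : fields 2 through 4
head2=$(echo "$line" | cut -d, -f-2)      # -m  : from the start through field 2
tail4=$(echo "$line" | cut -d, -f4-)      # n-  : field 4 through the end

printf '%s\n' "$second" "$middle" "$head2" "$tail4"
```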

paste

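Common options

-d delimiter
-s serialize: join all lines of each file into a single line (columns to a row)

Examples

➜ Documents $ cat file1
111
222
333
444
➜ Documents $ cat file2
one 1
two 2
three 3
one1 4

➜ Documents $ paste -d, file1 file2
111,one 1
222,two 2
333,three 3
444,one1 4
➜ Documents $ paste -s -d: file1 file2
111:222:333:444
one 1:two 2:three 3:one1 4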

join

Similar to ... inner join ... on ... in SQL. -t sets the delimiter; the default is space or tab.

➜ Documents $ cat j1
1 11
2 22
3 33
4 44
5 55
➜ Documents $ cat j2
one 1 0
one 2 1
two 4 2
three 5 3
one1 5 4
➜ Documents $ join -1 1 -2 3 j1 j2
1 11 one 2
2 22 two 4
3 33 three 5
4 44 one1 5
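The example files above happen to be ordered on their join fields already. join expects both inputs sorted on the join field, so with real data it is usually safer to sort first. A minimal sketch with made-up users and orders files:

```shell
# Hypothetical input files, deliberately NOT sorted by id.
users=$(mktemp); orders=$(mktemp)
printf '2 bob\n1 alice\n3 carol\n' > "$users"
printf '3 book\n1 pen\n' > "$orders"

# Sort both sides on field 1, then join on that field (the default).
sort -k1,1 "$users"  > "$users.sorted"
sort -k1,1 "$orders" > "$orders.sorted"
joined=$(join "$users.sorted" "$orders.sorted")

echo "$joined"
rm -f "$users" "$orders" "$users.sorted" "$orders.sorted"
```

Only ids present in both files appear in the result, which is what makes it behave like an inner join.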

comm

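Common options

Usage: comm [-123i] file1 file2
Inputs must be in lexical order; output has 3 columns: lines only in file1 / only in file2 / in both
-1/-2/-3 suppress the corresponding column; -i ignores case

Examples

➜ Documents $ seq 1 5 > file11
➜ Documents $ seq 2 6 > file22
➜ Documents $ cat file11
1
2
3
4
5
➜ Documents $ cat file22
2
3
4
5
6
➜ Documents $ comm file11 file22
1
                2
                3
                4
                5
        6
➜ Documents $ comm -1 file11 file22
        2
        3
        4
        5
6
➜ Documents $ comm -2 file11 file22
1
        2
        3
        4
        5
➜ Documents $ comm -23 file11 file22
1

Related command: diff (similar to git diff)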
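Combining the column switches gives simple set operations. A short sketch that rebuilds the same two files in temp locations:

```shell
a=$(mktemp); b=$(mktemp)
seq 1 5 > "$a"
seq 2 6 > "$b"

# Suppress columns 1 and 2: keep only lines present in BOTH files.
both=$(comm -12 "$a" "$b" | xargs)
# Suppress columns 2 and 3: lines only in the first file.
only_a=$(comm -23 "$a" "$b" | xargs)
# Suppress columns 1 and 3: lines only in the second file.
only_b=$(comm -13 "$a" "$b" | xargs)

echo "both: $both | only a: $only_a | only b: $only_b"
rm -f "$a" "$b"
```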

sort

Common options

-d, --dictionary-order
-n, --numeric-sort
-r, --reverse
-b, --ignore-leading-blanks
-k, --key

Examples

➜ Documents $ cat file2
one 1
two 2
three 3
one1 4
➜ Documents $ sort file2
one 1
one1 4
three 3
two 2
➜ Documents $ sort -b -k2 -r file2
one1 4
three 3
two 2
one 1
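The -k2 example above compares the key as text, which only works because every value is a single digit. Once values have different widths you need -n as well. A small sketch with made-up data (LC_ALL=C is set so the ordering does not depend on locale):

```shell
# Hypothetical data where the second column has mixed widths.
data=$(mktemp)
printf 'one 1\ntwo 10\nthree 9\n' > "$data"

# Text comparison on field 2: "10" sorts before "9".
lexical=$(LC_ALL=C sort -k2 "$data" | xargs)
# Numeric comparison restricted to field 2 only (-k2,2n).
numeric=$(LC_ALL=C sort -k2,2n "$data" | xargs)

echo "lexical: $lexical"
echo "numeric: $numeric"
rm -f "$data"
```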

uniq

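Common options

-c prefix each line with its number of occurrences
-d print only duplicated lines
-u print only lines that are not repeated
-f skip the first N fields when comparing

Examples

➜ Documents $ cat file4
11
22
33
11
11
➜ Documents $ sort file4 | uniq -c
   3 11
   1 22
   1 33
➜ Documents $ sort file4 | uniq -d
11
➜ Documents $ sort file4 | uniq -u
22
33
➜ Documents $ cat file3
one 1
two 1
three 3
one1 4
➜ Documents $ uniq -c -f 1 file3
   2 one 1
   1 three 3
   1 one1 4

Note: uniq only compares adjacent lines, so it is usually combined with sort.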
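Because of that adjacency rule, the usual "top N most frequent" idiom is sort | uniq -c | sort -nr | head. A sketch wrapped in a helper function (count_top is a name made up for this example):

```shell
# count_top N: read lines on stdin, print the N most frequent with their counts.
count_top() {
  sort | uniq -c | sort -nr | head -n "${1:-10}"
}

# Made-up input: b appears 3 times, a twice, c once.
top=$(printf 'b\na\nb\nc\nb\na\n' | count_top 1 | awk '{print $1, $2}')
echo "$top"
```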

tr

Common options

-c complement of the set
-d delete
-s squeeze adjacent repeated characters

Examples

➜ Documents $ echo '1111234444533hello' | tr '[1-3]' '[a-c]'
aaaabc44445cchello
➜ Documents $ echo '1111234444533hello' | tr -d '[1-3]'
44445hello
➜ Documents $ echo '1111234444533hello' | tr -dc '[1-3]'
11112333
➜ Documents $ echo '1111234444533hello' | tr -s '[0-9]'
123453hello
➜ Documents $ echo 'helloworld' | tr '[:lower:]' '[:upper:]'
HELLOWORLD

sed

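Common commands and options

d delete (a sed command)
s substitute (a sed command); the g flag makes it global
-e chain multiple commands
-i edit the file in place (on Mac it requires an argument: "" for no backup, or a backup suffix)

Examples

➜ Documents $ cat file2
one 1
two 2
three 3
one1 4
➜ Documents $ sed "2,3d" file2
one 1
one1 4
➜ Documents $ sed '/one/d' file2
two 2
three 3
➜ Documents $ sed 's/one/111/g' file2
111 1
two 2
three 3
1111 4
# replace one with 111 and delete lines containing two
➜ Documents $ sed -e 's/one/111/g' -e '/two/d' file2
111 1
three 3
1111 4
# \( \) mark a group (escaped parentheses); \1 refers back to it
➜ Documents $ sed 's/\([0-9]\)/\1.html/g' file2
one 1.html
two 2.html
three 3.html
one1.html 4.html
# same as above; & stands for the matched text
➜ Documents $ sed 's/[0-9]/&.html/g' file2
one 1.html
two 2.html
three 3.html
one1.html 4.html
➜ Documents $ cat mobile.csv
"13090246026"
"18020278026"
"18520261021"
"13110221022"
➜ Documents $ sed 's/\([0-9]\{3\}\)[0-9]\{4\}/\1xxxx/g' mobile.csv
"130xxxx6026"
"180xxxx8026"
"185xxxx1021"
"131xxxx1022"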

awk

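Basic parameters and syntax

NR: current line number; NF: number of fields
$1: the 1st column, then $2, $3...
-F fs: field separator fs, a string or a regex
Syntax: awk 'BEGIN{ commands } pattern{ commands } END{ commands }'. The flow is:

Run the BEGIN block
For each input line, run pattern{ commands }; the pattern can be a regex /reg exp/, a relational expression, and so on
After all input is processed, run the END block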
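The three phases above can be sketched in one small program. The input rows and the threshold of 20 are made up for illustration:

```shell
result=$(printf 'a 10\nb 25\nc 40\n' | awk '
  BEGIN { total = 0; n = 0 }       # runs once, before any input is read
  $2 >= 20 { total += $2; n++ }    # runs for each line whose 2nd column is >= 20
  END { print n, total }           # runs once, after the last line
')
echo "$result"
```

Only rows b and c pass the pattern, so the END block reports two matching rows and their sum.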

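Examples

➜ Documents $ cat file5
11 11 aa cc
22 22 bb
33 33 d
11 11
11 11
# line number, number of fields, 3rd column
➜ Documents $ awk '{print NR"("NF"):", $3}' file5
1(4): aa
2(3): bb
3(3): d
4(2):
5(2):
# split on a string and print columns 1 and 2
# (the output below implies mobile.csv already holds the masked numbers from the sed example)
➜ Documents $ awk -F"xxxx" '{print $1, $2}' mobile.csv
"130 6026"
"180 8026"
"185 1021"
"131 1022"
# add a condition expression
➜ Documents $ awk '$1>=22 {print NR":", $3}' file5
2: bb
3: d
# sum 1 to 36: all numbers, odd numbers, even numbers
➜ Documents $ seq 36 | awk 'BEGIN{sum=0; print "question:"} {print $1" +"; sum+=$1} END{print "="; print sum}' | xargs | sed 's/+ =/=/'
question: 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20 + 21 + 22 + 23 + 24 + 25 + 26 + 27 + 28 + 29 + 30 + 31 + 32 + 33 + 34 + 35 + 36 = 666
➜ Documents $ seq 36 | awk 'BEGIN{sum=0; print "question:"} $1 % 2 == 1 {print $1" +"; sum+=$1} END{print "="; print sum}' | xargs | sed 's/+ =/=/'
question: 1 + 3 + 5 + 7 + 9 + 11 + 13 + 15 + 17 + 19 + 21 + 23 + 25 + 27 + 29 + 31 + 33 + 35 = 324
➜ Documents $ seq 36 | awk 'BEGIN{sum=0; print "question:"} $1 % 2 != 1 {print $1" +"; sum+=$1} END{print "="; print sum}' | xargs | sed 's/+ =/=/'
question: 2 + 4 + 6 + 8 + 10 + 12 + 14 + 16 + 18 + 20 + 22 + 24 + 26 + 28 + 30 + 32 + 34 + 36 = 342

Other advanced syntax: for, while, various built-in functions, and so on. awk is a powerful language in its own right, and it is worth learning at least its basic usage.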

Practical applications

Log statistics and analysis

Given an nginx log file, for example, you can do a lot: see which requests take the longest so you can optimize them, see the "PV" count per hour, and so on.

➜ Documents $ head -n 5 std.nginx.log
106.38.187.225 - - [20/Feb/2017:03:31:01 +0800] www.tanglei.name "GET /baike/208344.html HTTP/1.0" 301 486 "-" "Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322) 360JK yunjiankong 975382" "106.38.187.225, 106.38.187.225" - 0.000
106.38.187.225 - - [20/Feb/2017:03:31:02 +0800] www.tanglei.name "GET /baike/208344.html HTTP/1.0" 301 486 "-" "Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322) 360JK yunjiankong 975382" "106.38.187.225, 106.38.187.225" - 0.000
10.130.64.143 - - [20/Feb/2017:03:31:02 +0800] stdbaike.bdp.cc "POST /baike/wp-cron.php?doing_wp_cron=1487532662.2058920860290527343750 HTTP/1.1" 200 182 "-" "WordPress/4.5.6; http://www.tanglei.name/baike" "10.130.64.143" 0.205 0.205
10.130.64.143 - - [20/Feb/2017:03:31:02 +0800] www.tanglei.name "GET /external/api/login-status HTTP/1.0" 200 478 "-" "-" "10.130.64.143" 0.003 0.004
10.130.64.143 - - [20/Feb/2017:03:31:02 +0800] www.tanglei.name "GET /content_util/authorcontents?count=5&offset=0&israndom=1&author=9 HTTP/1.0" 200 11972 "-" "-" "10.130.64.143" 0.013 0.013

Above is a sample of the nginx log. For example, to find the top 10 request paths that returned 404:

head -n 10000 std.nginx.log | awk '{print $8 "," $10}' | grep ',404' | sort | uniq -c | sort -nr -k1 | head -n 10
# or
head -n 10000 std.nginx.log | awk '$10==404 {print $8}' | sort | uniq -c | sort -nr -k1 | head -n 10

Of course, you may not get it right on the first try. Usually you take a small slice of the data first to check that the logic is correct, or you cache some intermediate results.

cat std.nginx.log | awk '{print $8 "," $10}' | grep ',404' > 404.log
sort 404.log | uniq -c | sort -nr -k1 | head -n 10

The same approach works for requests per hour, request latency, and so on. In the output below, the first column is the request count and the second is the year followed by the hour.

➜ Documents $ head -n 100000 std.nginx.log | awk -F: '{print $1 $2}' | cut -f3 -d/ | uniq -c
 8237 201703
15051 201704
16083 201705
18561 201706
22723 201707
19345 201708
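The slowest-request question can be answered the same way. The sketch below runs on three made-up lines in the same layout as the sample log, and it assumes the last column is the request time; check that against your own log_format before relying on it.

```shell
# Synthetic log lines (hypothetical host and paths), same layout as the sample above.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [20/Feb/2017:03:31:02 +0800] example.test "GET /a HTTP/1.0" 200 478 "-" "-" "10.0.0.1" 0.003 0.004
10.0.0.1 - - [20/Feb/2017:03:31:03 +0800] example.test "GET /b HTTP/1.0" 200 182 "-" "-" "10.0.0.1" 0.205 0.205
10.0.0.1 - - [20/Feb/2017:04:00:00 +0800] example.test "GET /c HTTP/1.0" 404 10 "-" "-" "10.0.0.1" 0.013 0.013
EOF

# $NF is the last field (ASSUMED to be the request time); $8 is the path.
# Printing the time first lets sort -nr order by it directly.
slowest=$(awk '{print $NF, $8}' "$log" | LC_ALL=C sort -nr | head -n 10)

echo "$slowest"
rm -f "$log"
```

Using $NF instead of a fixed field number matters here because the user-agent string contains spaces, so the field count after it varies from line to line.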

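Another real-world use: IP blocking

Case study: correcting data in a DB

Background: because of a bug in a service, the image paths inserted into the DB were wrong. Paths of the form (sensitive data already replaced for security reasons) https://www.tanglei.name/upload/photos/129630//internal-public/shangtongdai/2017-02-19-abcdefg-eb85-4c24-883e-hijklmn.jpg had to be replaced with http://www.tanglei.me/internal-public/shangtongdai/2017-02-19-abcdefg-eb85-4c24-883e-hijklmn.jpg. MySQL and similar databases did not seem to support regex replacement directly, so this could not easily be done in plain SQL (and even where it is supported, modifying the data in place is risky; it is better to back up first and keep a way to undo).

Exporting the data and processing it with a Python script or similar would also work, but with the command-line tools above it takes only a few tens of seconds.

Steps:

Prepare the data

select id, photo_url_1, photo_url_2, photo_url_3 from somedb.sometable where
photo_url_1 like 'https://www.tanglei.name/upload/photos/%//internal-public/%' or
photo_url_2 like 'https://www.tanglei.name/upload/photos/%//internal-public/%' or
photo_url_3 like 'https://www.tanglei.name/upload/photos/%//internal-public/%';

Replace in the exported file. When substituting with sed, first test that the replacement behaves as expected.

# test whether it works
head -n 5 customers.csv | sed 's|https://www.tanglei.name/upload/photos/[0-9]\{1,\}/|http://www.tanglei.me|g'
# replace in place (macOS/BSD sed syntax); use sed -i ".bak" to keep a backup of the original file
sed -i "" 's|https://www.tanglei.name/upload/photos/[0-9]\{1,\}/|http://www.tanglei.me|g' customers.csv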
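As a quick sanity check, the same substitution can be run on the single sample URL from the background description, without touching any file:

```shell
# The (already anonymized) sample URL from the background description.
url='https://www.tanglei.name/upload/photos/129630//internal-public/shangtongdai/2017-02-19-abcdefg-eb85-4c24-883e-hijklmn.jpg'

# The pattern consumes the host, the /upload/photos/ prefix, the numeric id
# and ONE slash, leaving the second slash as the start of the new path.
fixed=$(printf '%s\n' "$url" \
  | sed 's|https://www.tanglei.name/upload/photos/[0-9]\{1,\}/|http://www.tanglei.me|g')

echo "$fixed"
```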

Build the SQL, then execute it

awk -F, '{print "update sometable set photo_url_1 = " $2, ", photo_url_2 = " $3, ", photo_url_3 = " $4, " where id = " $1 ";"}' customers.csv > customer.sql
# then just execute the SQL
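The command above prints the CSV fields as-is, so it relies on the export already wrapping the URLs in quotes. If your export is unquoted, one option is to add the quotes in awk. This is a sketch on a made-up row, and it does not escape quotes inside values:

```shell
# Hypothetical unquoted export row: id, then three URLs.
csv=$(mktemp)
printf '42,http://www.tanglei.me/a.jpg,http://www.tanglei.me/b.jpg,http://www.tanglei.me/c.jpg\n' > "$csv"

# q holds a single quote so it can be used inside the single-quoted awk program.
sql=$(awk -F, -v q="'" '{
  printf "update sometable set photo_url_1 = %s%s%s, photo_url_2 = %s%s%s, photo_url_3 = %s%s%s where id = %s;\n",
         q, $2, q, q, $3, q, q, $4, q, $1
}' "$csv")

echo "$sql"
rm -f "$csv"
```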

Others

play framework session

The old way requires starting the Play environment, which is slow. The new way does it directly on the command line.

sbt "project site" consoleQuick
import play.api.libs._
val sec = "secret...secret"
var uid = "10086"
Crypto.sign(s"uid=$uid", sec.getBytes("UTF-8")) + s"-uid=$uid"

➜ Documents $ ~/stdcookie.sh 97522
918xxxxdf64abcfcxxxxc465xx7554dxxxx21e-uid=97522
➜ Documents $ cat ~/stdcookie.sh
#!/bin/bash
## cannot remove this line
uid=$1
hash=`echo -n "uid=$uid" | openssl dgst -sha1 -hmac "secret...secret"`
echo "$hash-uid=$uid"

Word frequency in an article: the example below counts the 20 most frequent words in the text of Trump's inaugural address.

➜ Documents $ head -n 3 chuanpu.txt
Chief Justice Roberts, President Carter, President Clinton, President Bush, President Obama, fellow Americans and people of the world, thank you.

We, the citizens of America, are now joined in a great national effort to rebuild our country and restore its promise for all of our people. Together we will determine the course of America and the world for many, many years to come.
➜ Documents $ cat chuanpu.txt | tr -dc 'a-zA-Z ' | xargs -n 1 | sort | uniq -c | sort -nr -k 1 | head -n 20
  65 the
  63 and
  48 of
  46 our
  42 will
  37 to
  21 We
  20 is
  18 we
  17 America
  15 a
  14 all
  13 in
  13 for
  13 be
  13 are
  10 your
  10 not
  10 And
  10 American
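In the result above, "We" and "we" (and "And" and "and") are counted separately. If you would rather merge them, one variation is to split on any non-letter and lowercase everything before counting. A sketch, with word_freq as a made-up helper name and a made-up sentence as input:

```shell
# word_freq N: print the N most frequent words on stdin, case-insensitively.
word_freq() {
  tr -cs 'a-zA-Z' '\n' |   # turn every run of non-letters into one newline
    tr 'A-Z' 'a-z' |       # lowercase so "We" and "we" count together
    sort | uniq -c | sort -nr | head -n "${1:-10}"
}

top=$(printf 'We will win. And we will build; will we? Will.\n' \
  | word_freq 2 | awk '{print $2 ":" $1}' | xargs)
echo "$top"
```

Splitting with tr -cs also keeps words at line ends separate, since newlines become word boundaries rather than being deleted.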

Random strings: for example, when registering on a new site you often want a randomly generated password.

➜ Documents $ cat /dev/urandom | LC_CTYPE=C tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 5
cpBnvC0niwTybSSJhUUiZwIz6ykJxBvu
VDP56NlHnugAt2yDySAB9HU2Nd0LlYCW
0WEDzpjPop32T5STvR6K6SfZMyT6KvAI
a9xBwBat7tJVaad279fOPdA9fEuDEqUd
hTLrOiTH5FNP2nU3uflsjPUXJmfleI5c
➜ Documents $ cat /dev/urandom | head -c 32 | base64
WoCqUye9mSXI/WhHODHDjzLaSb09xrOtbrJagG7Kfqc=

Image processing: compression, batch resizing and so on, with sips

➜ linux-shell-more-effiency $ sips -g all which-whereis.png
/Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
  pixelWidth: 280
  pixelHeight: 81
  typeIdentifier: public.png
  format: png
  formatOptions: default
  dpiWidth: 72.000
  dpiHeight: 72.000
  samplesPerPixel: 4
  bitsPerSample: 8
  hasAlpha: yes
  space: RGB
  profile: DELL U2412M
➜ linux-shell-more-effiency $ sips -Z 250 which-whereis.png
/Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
  /Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
➜ linux-shell-more-effiency $ sips -g all which-whereis.png
/Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
  pixelWidth: 250
  pixelHeight: 72
  typeIdentifier: public.png
  format: png
  formatOptions: default
  dpiWidth: 72.000
  dpiHeight: 72.000
  samplesPerPixel: 4
  bitsPerSample: 8
  hasAlpha: yes
  space: RGB
  profile: DELL U2412M
➜ linux-shell-more-effiency $ sips -z 100 30 which-whereis.png
/Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
  /Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
➜ linux-shell-more-effiency $ sips -g pixelWidth -g pixelHeight which-whereis.png
/Users/tanglei/Documents/linux-shell-more-effiency/which-whereis.png
  pixelWidth: 30
  pixelHeight: 100

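A great tool for handling JSON on the command line: now that JSON is everywhere, you often need to process JSON data. For that I recommend jq, which "is a lightweight and flexible command-line JSON processor" [1].
For another combined example, see: Who Knew Shell Commands Could Do This? | Big Data Analysis with Shell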