Software Quality and Testing, Week 4 Group Assignment: WordCount Optimization

阿新 • Published: 2018-04-08


GitHub repository

https://github.com/Guchencc/WordCounter

Team leader:

　　陳佳文: word frequency counting module and the remaining modules

Team members:

　　屈佳燁: sorting module

　　苑子尚: output module

　　李一凡: input module

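PSP table

PSP2.1	PSP stage	Estimated time (minutes)	Actual time (minutes)
Planning	Planning
· Estimate	· Estimate how long the task will take	20	20
Development	Development
· Analysis	· Requirements analysis (including learning new technologies)	180	200
· Design Spec	· Write the design document	30	30
· Design Review	· Design review (review the design document with colleagues)	30	30
· Coding Standard	· Coding standard (define suitable conventions for this development)	30	20
· Design	· Detailed design	120	130
· Coding	· Implementation	120	150
· Code Review	· Code review	60	60
· Test	· Testing (self-testing, fixing code, committing changes)	60	60
Reporting	Reporting
· Test Report	· Test report	60	60
· Size Measurement	· Size measurement	20	10
· Postmortem & Process Improvement Plan	· Postmortem and process improvement plan	20	10
	Total	750	780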

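Design and implementation of the word frequency module

The module consists of three functions:

　　ArrayList<WordInfo> countFrequency(String filename)

　　Purpose: takes the name of a text file, reads its contents, counts word frequencies, and returns the result in a dynamic array (ArrayList).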

public static ArrayList<WordInfo> countFrequency(String filename) {
        ArrayList<WordInfo> wordInfos=new ArrayList<>();
        Pattern pattern=Pattern.compile("[a-zA-Z]+-?[a-zA-Z]*");    // regular expression that defines a word
        String text=Main.readFile(filename);    // read the file contents into text via readFile()
        if (text==null){    // readFile() returns null when the file cannot be read
            return null;
        }
        Matcher matcher=pattern.matcher(text);  // match words in text with the pattern defined above
        String word;
        int index;
        WordInfo wordInfo;
        while(matcher.find()) {     // enter the loop body for every word that is matched
            word=matcher.group().toLowerCase();     // store the matched word, lowercased, in word
            if (word.endsWith("-"))     // for a match of the form "word-", drop the trailing "-"
                word=word.substring(0,word.length()-1);
            if ((index=Main.hasWord(wordInfos,word))!=-1) {     // word is already in wordInfos: increment its frequency
                wordInfos.get(index).setFrequency(wordInfos.get(index).getFrequency()+1);
            }else{      // word is not yet in wordInfos: add it with frequency 1
                wordInfo=new WordInfo(word, 1);
                wordInfos.add(wordInfo);
            }
        }
        return wordInfos;
    }

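　　How it works:

　　The Pattern class represents a compiled regular expression, in other words a matching pattern. Its constructor is private, so it cannot be instantiated directly; instead, the static factory method Pattern.compile(String regex) creates one. Calling matcher(CharSequence input) on that Pattern instance then returns a Matcher object.

　　find() searches the input for the next match, which may start at any position. group() returns the substring found by the most recent match.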
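To make the behaviour of this pattern concrete, here is a minimal sketch that runs the same find()/group() loop over a made-up sample string. The class name WordRegexDemo, the helper rawMatches, and the sample text are assumptions for illustration only; they are not part of the project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordRegexDemo {
    // Same pattern as in countFrequency: letters, an optional hyphen, then more letters.
    static final Pattern WORD = Pattern.compile("[a-zA-Z]+-?[a-zA-Z]*");

    // Hypothetical helper: collects every raw match in order,
    // without lowercasing or stripping trailing hyphens.
    static List<String> rawMatches(String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {          // find() moves on to the next match anywhere in the input
            matches.add(matcher.group()); // group() returns the substring that was just matched
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(rawMatches("Hello, night-time world- 42 a-b-c"));
    }
}
```

Running it prints [Hello, night-time, world-, a-b, c]. Two things stand out: "world-" is matched with its trailing hyphen, which is why countFrequency strips it afterwards, and the pattern allows only one internal hyphen, so "a-b-c" is split into "a-b" and "c".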

　　A Pattern holding the regular expression that defines a word is created first. In this program:

Pattern pattern=Pattern.compile("[a-zA-Z]+-?[a-zA-Z]*");

　　readFile(String filename) is then called to read the text file into the string text, and the Pattern is used to create a Matcher instance:

String text=Main.readFile(filename);

Matcher matcher=pattern.matcher(text);

　　matcher.find() searches the string. Each time it finds a word that matches the regular expression, it returns true and the loop body runs.

while(matcher.find()) {   
           ......
}

　　If the match has the form "word-", the trailing "-" is removed.

if (word.endsWith("-"))    
   word=word.substring(0,word.length()-1);

　　If the matched word already exists in the dynamic array wordInfos, its frequency is incremented by one; otherwise the word is added to the array.

if ((index=Main.hasWord(wordInfos,word))!=-1) {     
                wordInfos.get(index).setFrequency(wordInfos.get(index).getFrequency()+1);
            }else{    
                wordInfo=new WordInfo(word, 1);
                wordInfos.add(wordInfo);
            }
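The steps above can be put together into one self-contained sketch that counts words in a string already held in memory. Note the assumptions: the post does not show the WordInfo class, so the nested WordInfo below is a hypothetical stand-in with only the constructor and accessors that the snippets use, and FrequencySketch and countText are illustrative names rather than project code.

```java
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FrequencySketch {
    // Hypothetical stand-in for the project's WordInfo class, whose source is not shown in this post.
    static class WordInfo {
        private final String word;
        private int frequency;

        WordInfo(String word, int frequency) {
            this.word = word;
            this.frequency = frequency;
        }

        String getWord() { return word; }
        int getFrequency() { return frequency; }
        void setFrequency(int frequency) { this.frequency = frequency; }
    }

    // Linear search: index of word in the list, or -1 when it is absent.
    static int hasWord(ArrayList<WordInfo> wordInfos, String word) {
        for (int i = 0; i < wordInfos.size(); i++) {
            if (wordInfos.get(i).getWord().equals(word)) {
                return i;
            }
        }
        return -1;
    }

    // Same loop as countFrequency, but over a string instead of a file.
    static ArrayList<WordInfo> countText(String text) {
        ArrayList<WordInfo> wordInfos = new ArrayList<>();
        Matcher matcher = Pattern.compile("[a-zA-Z]+-?[a-zA-Z]*").matcher(text);
        while (matcher.find()) {
            String word = matcher.group().toLowerCase();
            if (word.endsWith("-")) {                       // "word-" becomes "word"
                word = word.substring(0, word.length() - 1);
            }
            int index = hasWord(wordInfos, word);
            if (index != -1) {                              // seen before: increment the count
                WordInfo existing = wordInfos.get(index);
                existing.setFrequency(existing.getFrequency() + 1);
            } else {                                        // first occurrence: add with count 1
                wordInfos.add(new WordInfo(word, 1));
            }
        }
        return wordInfos;
    }

    public static void main(String[] args) {
        for (WordInfo info : countText("Apple apple-pie APPLE pie- pie")) {
            System.out.println(info.getWord() + " " + info.getFrequency());
        }
    }
}
```

For the sample input this prints "apple 2", "apple-pie 1" and "pie 2", in first-seen order: "Apple" and "APPLE" collapse into one entry after lowercasing, and "pie-" is counted together with "pie".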

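　　int hasWord(ArrayList<WordInfo> wordInfos, String word)
　　Purpose: takes the dynamic array wordInfos and the string word, and checks whether word is present in wordInfos. It returns the position of the word if present, otherwise -1.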

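 public static int hasWord(ArrayList<WordInfo> wordInfos, String word) {     // returns the index of word in wordInfos, or -1 if it is absent
        String key=word.trim().toLowerCase();   // normalize once instead of on every iteration
        for (int i=0;i<wordInfos.size();i++){
            if (wordInfos.get(i).getWord().equals(key))
                return i;   // the loop index is the position, so no second indexOf() scan is needed
        }
        return -1;
    }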

　　How it works: the dynamic array is traversed in search of word. If the word is found, its index is returned; otherwise -1 is returned.
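Because hasWord scans the list from the start on every call, the counting loop does a linear search per matched word. As a point of comparison only, and not the approach the project uses, the sketch below swaps the list plus hasWord for a LinkedHashMap, which looks words up by hash while still keeping first-seen order. MapCountSketch and countText are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MapCountSketch {
    // Alternative technique (an assumption, not the project's code):
    // a LinkedHashMap from word to count replaces the ArrayList<WordInfo> and hasWord.
    static Map<String, Integer> countText(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher matcher = Pattern.compile("[a-zA-Z]+-?[a-zA-Z]*").matcher(text);
        while (matcher.find()) {
            String word = matcher.group().toLowerCase();
            if (word.endsWith("-")) {                    // "word-" becomes "word"
                word = word.substring(0, word.length() - 1);
            }
            counts.merge(word, 1, Integer::sum);         // store 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countText("Apple apple-pie APPLE pie- pie"));
    }
}
```

This prints {apple=2, apple-pie=1, pie=2}. The trade-off is that the result is a map rather than a list of WordInfo objects, so a sorting or output module written against ArrayList<WordInfo> would need a conversion step.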

　　String readFile(String filename)
　　Purpose: takes the name of a text file, reads its contents, and returns them as a string.

public static String readFile(String filename) {    // reads the text file named filename
        File file=new File(filename);
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails midway
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String str;
            while ((str = br.readLine()) != null) { // read line by line; readLine() strips only the line terminator
                sb.append(str).append("\n");
            }
            return sb.toString();
        }catch (IOException e){
            System.out.println("Failed to read the file!");
        }
        return null;    // signals to the caller that the file could not be read
    }

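　　How it works: the file is read line by line. readLine() drops the line terminator of each line (trailing spaces are kept), and the lines are joined with "\n" into a single string.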
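The line-reading behaviour can be checked without a file on disk by running the same loop over a StringReader. This is a minimal sketch: ReadLinesDemo and readAll are hypothetical names, and wrapping the IOException in an UncheckedIOException is a choice made for this example, not what readFile does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadLinesDemo {
    // Same loop as readFile, but over any Reader so it can run on an in-memory string.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) { // readLine() returns the line without its terminator
                sb.append(line).append('\n');        // every line gets a single "\n" back
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);       // assumption: rethrow instead of returning null
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String result = readAll(new StringReader("first line  \r\nsecond line"));
        System.out.println(result.equals("first line  \nsecond line\n"));
    }
}
```

This prints true: the two trailing spaces after "first line" survive, the Windows-style "\r\n" is normalized to "\n", and a "\n" is appended even though the input did not end with a line break.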

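Test case design

The test cases should cover at least every executable statement in the functions. They focus mainly on inputs that contain special characters, digits, hyphens, and a mix of upper- and lower-case letters.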

[Image: test case design]
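Since the original test case table is only available as an image, the following sketch shows what a small table of cases along those lines might look like. The inputs and expected results are assumptions chosen to cover special characters, digits, hyphens and letter case; they are not the team's actual test cases. WordCaseTable, extractWords and check are hypothetical names, and plain Java checks are used instead of a test framework so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCaseTable {
    static final Pattern WORD = Pattern.compile("[a-zA-Z]+-?[a-zA-Z]*");

    // Hypothetical helper: the matching and normalization steps of countFrequency, without the counting.
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            String word = matcher.group().toLowerCase();
            if (word.endsWith("-")) {
                word = word.substring(0, word.length() - 1);
            }
            words.add(word);
        }
        return words;
    }

    // Fails loudly when the extracted words differ from the expectation.
    static void check(String input, String... expected) {
        List<String> actual = extractWords(input);
        if (!actual.equals(Arrays.asList(expected))) {
            throw new AssertionError("input \"" + input + "\" gave " + actual);
        }
    }

    public static void main(String[] args) {
        check("Hello, WORLD!", "hello", "world");   // punctuation and mixed case
        check("abc123def", "abc", "def");           // digits split a word in two
        check("well-known", "well-known");          // a single internal hyphen is kept
        check("end- start", "end", "start");        // a trailing hyphen is stripped
        check("a--b", "a", "b");                    // a double hyphen separates two words
        check("12 + 34 = 46");                      // no letters, so no words
        check("");                                  // empty input
        System.out.println("all cases passed");
    }
}
```

Each row pairs one input category with the words it should produce, which keeps it easy to see which branch of the loop (match, hyphen stripping, no match) a case is meant to exercise.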

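Unit test results

The screenshot below shows the unit test run. The module passes every test case and the run time is short, which indicates that it handles all of the designed cases correctly.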

[Image: unit test results screenshot]

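Team contribution

As team leader for this project, I was responsible for managing the team's development work and the GitHub repository, and I did most of the coding. I therefore give myself a contribution score of 0.4.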
