Counting the occurrences of each word in a text


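import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

	/**
	 * Counts the occurrences of each word in a text.
	 *
	 * @param text
	 *            the text to scan
	 * @param ignoreCase
	 *            whether to ignore case (words are lower-cased before counting)
	 * @return a map from each word to its count, in order of first appearance
	 */
	public static Map<String, Integer> countEachWorld(String text,
			boolean ignoreCase) {
		// \w+ matches runs of ASCII letters, digits and underscores
		Matcher m = Pattern.compile("\\w+").matcher(text);
		// LinkedHashMap keeps the words in the order they first appear
		Map<String, Integer> map = new LinkedHashMap<>();
		while (m.find()) {
			String word = m.group();
			if (ignoreCase) {
				word = word.toLowerCase();
			}
			// Store 1 for a new word, otherwise add 1 to the existing count
			map.merge(word, 1, Integer::sum);
		}
		return map;
	}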
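
One caveat about the pattern: by default, \w in java.util.regex matches only ASCII letters, digits and the underscore, so a word containing accented or non-Latin letters is split or skipped. The sketch below is one possible variant, not part of the original method: it compiles the same pattern with the Pattern.UNICODE_CHARACTER_CLASS flag. The class name UnicodeWordCount and the method name count are hypothetical, chosen only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class UnicodeWordCount {

	// With UNICODE_CHARACTER_CLASS, \w also matches non-ASCII letters and digits.
	private static final Pattern WORD =
			Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

	static Map<String, Integer> count(String text) {
		Map<String, Integer> map = new LinkedHashMap<>();
		Matcher m = WORD.matcher(text);
		while (m.find()) {
			// Store 1 for a new word, otherwise add 1 to the existing count
			map.merge(m.group(), 1, Integer::sum);
		}
		return map;
	}

	public static void main(String[] args) {
		// "caf\u00e9" is the word "cafe" with an accented final letter;
		// the default \w would stop matching after "caf".
		System.out.println(count("caf\u00e9 au lait, caf\u00e9 noir"));
	}
}
```

With the flag set, the accented word is counted as one word occurring twice; without it, the same input would yield the fragment "caf" instead.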


Test text (the value of text in the calls below):

Java provides the java.util.regex package for pattern matching with regular expressions. Java regular expressions are very similar to the Perl programming language and very easy to learn.

A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. They can be used to search, edit, or manipulate text and data.



countEachWorld(text, true);

{java=3, provides=1, the=2, util=1, regex=1, package=1, for=1, pattern=2, matching=1, with=1, regular=3, expressions=2, are=1, very=2, similar=1, to=3, perl=1, programming=1, language=1, and=2, easy=1, learn=1, a=4, expression=1, is=1, special=1, sequence=1, of=2, characters=1, that=1, helps=1, you=1, match=1, or=3, find=1, other=1, strings=2, sets=1, using=1, specialized=1, syntax=1, held=1, in=1, they=1, can=1, be=1, used=1, search=1, edit=1, manipulate=1, text=1, data=1}

countEachWorld(text, false);

{Java=2, provides=1, the=2, java=1, util=1, regex=1, package=1, for=1, pattern=2, matching=1, with=1, regular=3, expressions=2, are=1, very=2, similar=1, to=3, Perl=1, programming=1, language=1, and=2, easy=1, learn=1, A=1, expression=1, is=1, a=3, special=1, sequence=1, of=2, characters=1, that=1, helps=1, you=1, match=1, or=3, find=1, other=1, strings=2, sets=1, using=1, specialized=1, syntax=1, held=1, in=1, They=1, can=1, be=1, used=1, search=1, edit=1, manipulate=1, text=1, data=1}
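
To check the behaviour by hand on a smaller input, here is a minimal self-contained sketch. The class name WordCountDemo and the short sample string are assumptions made for this example; the counting logic follows the method shown earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, named only for this example.
class WordCountDemo {

	static Map<String, Integer> countEachWorld(String text, boolean ignoreCase) {
		Matcher m = Pattern.compile("\\w+").matcher(text);
		Map<String, Integer> map = new LinkedHashMap<>();
		while (m.find()) {
			String word = ignoreCase ? m.group().toLowerCase() : m.group();
			map.merge(word, 1, Integer::sum);
		}
		return map;
	}

	public static void main(String[] args) {
		// Short sample chosen so the counts are easy to verify by eye
		String text = "Java regex: java.util.regex is the Java regex package";
		// prints {java=3, regex=3, util=1, is=1, the=1, package=1}
		System.out.println(countEachWorld(text, true));
		// prints {Java=2, regex=3, java=1, util=1, is=1, the=1, package=1}
		System.out.println(countEachWorld(text, false));
	}
}
```

As in the longer run above, ignoring case merges "Java" and "java" into a single key, and the keys keep the order in which each word first appears.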



