Postgresql資料庫的一些字串操作函式
今天做專案遇到客戶反映了一個麻煩的事情,有一些資料存在,但就是在程式中搜索不出來,後來分析,發現問題為資料前面有幾個空白字元,後來用SQL查詢了一下,發現八九個資料表中,數千萬條資料中有將近三百萬條資料存在相同的問題,本想著在查詢時新增匹配符'%',後來試運行了一下,發現不可行,因為尚有很多其它頁面存在類似的搜尋問題,並且這樣會極大地影響到查詢的速度,再加上客戶迫切需要解決這個問題,由於在白天程式需不間斷訪問,並且不能對其執行速度產生較大的影響,所以排除了JDBC進行修改的方案,也排除了修改程式搜尋程式碼的方案,頭痛不己
後來仔細想了一下,並嘗試去查詢相關sql字串操作函式,確認有沒有快捷的方式去解決空白字串問題,很快,我們找到了合適的解決方案,對其中一個測試資料庫進行了測試,結果非常令人滿意,三百多萬的資料只用了數分種,便去掉了所有的無用空格問題,為之興奮不己。。。用法如下
update property set memorial_no = btrim(memorial_no, ' ') where memorial_no like ' %'
或
update property set memorial_no = trim(both ' ' from memorial_no) where memorial_no like ' %'
btrim()方法為刪除字串兩邊的某類字元,可以同時指定多個,在上面為' ',意為空格
trim()方法可以實現所有btrim()能實現的功能,事實上btrim()即為刪除兩邊的某類字串,trim()可以只指定某一邊,當然也可以為兩邊
現把Postgresql的字串操作函式羅列在以,以便日方使用
函式:string || string 說明:String concatenation 字串連線操作 例子:'Post' || 'greSQL' = PostgreSQL
函式:string || non-string or non-string || string 說明:String concatenation with one non-string input 字串與非字串型別進行連線操作 例子:'Value: ' || 42 = Value: 42
函式:bit_length(string) 說明:Number of bits in string 計算字串的位數 例子:bit_length('jose') = 32
函式:char_length(string) or character_length(string) 說明:Number of characters in string 計算字串中字元個數 例子:char_length('jose') = 4
函式:lower(string) 說明:Convert string to lower case 轉換字串為小寫 例子:bit_length('jose') = 32
函式:octet_length(string) 說明:Number of bytes in string 計算字串的位元組數 例子:octet_length('jose') = 4
函式:overlay(string placing string from int [for int]) 說明:Replace substring 替換字串中任意長度的子字串為新字串 例子:overlay('Txxxxas' placing 'hom' from 2 for 4) = 4
函式:position(substring in string) 說明:Location of specified substring 子串在一字串中的位置 例子:position('om' in 'Thomas') = 3
函式:substring(string [from int] [for int]) 說明:Extract substring 擷取任意長度的子字串 例子:substring('Thomas' from 2 for 3) = hom
函式:substring(string from pattern) 說明:Extract substring matching POSIX regular expression. See Section 9.7 for more information on pattern matching. 利用正則表示式對一字串進行任意長度的字串的擷取 例子:substring('Thomas' from '...$') = mas
函式:substring(string from pattern for escape) 說明:Extract substring matching SQL regular expression. See Section 9.7 for more information on pattern matching. 利於正則表示式對某類字元進行刪除,以得到子字串 例子:trim(both 'x' from 'xTomxx') = Tom
函式:trim([leading | trailing | both] [characters] from string) 說明:Remove the longest string containing only the characters (a space by default) from the start/end/both ends of the string 去除儘可能長開始,結束或者兩邊的某類字元,預設為去除空白字元,當然可以自己指定,可同時指定多個要刪除的字串 例子:trim(both 'x' from 'xTomxx') = Tom
函式:upper(string) 說明:Convert string to uppercase 將字串轉換為大寫 例子:upper('tom') = TOM
函式:ascii(string) 說明:ASCII code of the first character of the argument. For UTF8 returns the Unicode code point of the character. For other multibyte encodings. the argument must be a strictly ASCII character. 得到某一個字元的Assii值 例子:ascii('x') = 120
函式:btrim(string text [, characters text]) 說明:Remove the longest string consisting only of characters in characters (a space by default) from the start and end of string 去除字串兩邊的所有指定的字元,可同時指定多個字元 例子:btrim('xyxtrimyyx', 'xy') = trim
函式:chr(int) 說明:Character with the given code. For UTF8 the argument is treated as a Unicode code point. For other multibyte encodings the argument must designate a strictly ASCII character. The NULL (0) character is not allowed because text data types cannot store such bytes. 得到某ACSII值對應的字元 例子:chr(65) = A
函式:convert(string bytea, src_encoding name, dest_encoding name) 說明:Convert string to dest_encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. Conversions can be defined by CREATE CONVERSION. Also there are some predefined conversions. See Table 9-7 for available conversions. 轉換字串編碼,指定源編碼與目標編碼 例子:convert('text_in_utf8', 'UTF8', 'LATIN1') = text_in_utf8 represented in ISO 8859-1 encoding
函式:convert_from(string bytea, src_encoding name) 說明:Convert string to the database encoding. The original encoding is specified by src_encoding. The string must be valid in this encoding. 轉換字串編碼,自己要指定源編碼,目標編碼預設為資料庫指定編碼, 例子:convert_from('text_in_utf8', 'UTF8') = text_in_utf8 represented in the current database encoding
函式:convert_to(string text, dest_encoding name) 說明:Convert string to dest_encoding.轉換字串編碼,源編碼預設為資料庫指定編碼,自己要指定目標編碼, 例子:convert_to('some text', 'UTF8') = some text represented in the UTF8 encoding
函式:decode(string text, type text) 說明:Decode binary data from string previously encoded with encode. Parameter type is same as in encode. 對字串按指定的型別進行解碼 例子:decode('MTIzAAE=', 'base64') = 123\000\001
函式:encode(data bytea, type text) 說明:Encode binary data to different representation. Supported types are: base64, hex, escape. Escape merely outputs null bytes as \000 and doubles backslashes. 與decode相反,對字串按指定型別進行編碼 例子:encode(E'123\\000\\001', 'base64') = MTIzAAE=
函式:initcap(string) 說明:Convert the first letter of each word to uppercase and the rest to lowercase. Words are sequences of alphanumeric characters separated by non-alphanumeric characters. 將字串所有的單詞進行格式化,首字母大寫,其它為小寫 例子:initcap('hi THOMAS') = Hi Thomas
函式:length(string) 說明:Number of characters in string 講算字串長度 例子:length('jose') = 4
函式:length(stringbytea, encoding name ) 說明:Number of characters in string in the given encoding. The string must be valid in this encoding. 計算字串長度,指定字串使用的編碼 例子:length('jose', 'UTF8') = 4
函式:lpad(string text, length int [, fill text]) 說明:Fill up the string to length length by prepending the characters fill (a space by default). If the string is already longer than length then it is truncated (on the right). 對字串左邊進行某類字元自動填充,即不足某一長度,則在左邊自動補上指定的字串,直至達到指定長度,可同時指定多個自動填充的字元 例子:lpad('hi', 5, 'xy') = xyxhi
函式:ltrim(string text [, characters text]) 說明:Remove the longest string containing only characters from characters (a space by default) from the start of string 刪除字串左邊某一些的字元,可以時指定多個要刪除的字元 例子:trim
函式:md5(string) 說明:Calculates the MD5 hash of string, returning the result in hexadecimal 將字串進行md5編碼 例子:md5('abc') = 900150983cd24fb0 d6963f7d28e17f72
函式:pg_client_encoding() 說明:Current client encoding name 得到pg客戶端編碼 例子:pg_client_encoding() = SQL_ASCII
函式:quote_ident(string text) 說明:Return the given string suitably quoted to be used as an identifier in an SQL statement string. Quotes are added only if necessary (i.e., if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled. 對某一字串加上兩引號 例子:quote_ident('Foo bar') = "Foo bar"
函式:quote_literal(string text) 說明:Return the given string suitably quoted to be used as a string literal in an SQL statement string. Embedded single-quotes and backslashes are properly doubled. 對字串裡兩邊加上單引號,如果字串裡面出現sql編碼的單個單引號,則會被表達成兩個單引號 例子:quote_literal('O\'Reilly') = 'O''Reilly'
函式:quote_literal(value anyelement) 說明:Coerce the given value to text and then quote it as a literal. Embedded single-quotes and backslashes are properly doubled. 將一數值轉換為字串,併為其兩邊加上單引號,如果數值中間出現了單引號,也會被表示成兩個單引號 例子:quote_literal(42.5) = '42.5'
函式:regexp_matches(string text, pattern text [, flags text]) 說明:Return all captured substrings resulting from matching a POSIX regular expression against the string. See Section 9.7.3 for more information. 對字串按正則表示式進行匹配,如果存在則會在結果陣列中表示出來 例子:regexp_matches('foobarbequebaz', '(bar)(beque)') = {bar,beque}
函式:regexp_replace(string text, pattern text, replacement text [, flags text]) 說明:Replace substring(s) matching a POSIX regular expression. See Section 9.7.3 for more information. 利用正則表示式對字串進行替換 例子:regexp_replace('Thomas', '.[mN]a.', 'M') = ThM
函式:regexp_split_to_array(string text, pattern text [, flags text ]) 說明:Split string using a POSIX regular expression as the delimiter. See Section 9.7.3 for more information. 利用正則表示式將字串分割成陣列 例子:regexp_split_to_array('hello world', E'\\s+') = {hello,world}
函式:regexp_split_to_table(string text, pattern text [, flags text]) 說明:Split string using a POSIX regular expression as the delimiter. See Section 9.7.3 for more information. 利用正則表示式將字串分割成表格 例子:regexp_split_to_table('hello world', E'\\s+') = hello world (2 rows)
函式:repeat(string text, number int) 說明:Repeat string the specified number of times 重複字串一指定次數 例子:repeat('Pg', 4) = PgPgPgPg
函式:replace(string text, from text, to text) 說明:Replace all occurrences in string of substring from with substring to 將字元的某一子串替換成另一子串 例子:('abcdefabcdef', 'cd', 'XX') = abXXefabXXef
函式:rpad(string text, length int [, fill text]) 說明:Fill up the string to length length by appending the characters fill (a space by default). If the string is already longer than length then it is truncated. 對字串進行填充,填充內容為指定的字串 例子:rpad('hi', 5, 'xy') = hixyx
函式:rtrim(string text [, characters text]) 說明:Remove the longest string containing only characters from characters (a space by default) from the end of string 去除字串右邊指定的字元 例子:rtrim('trimxxxx', 'x') = trim
函式:split_part(string text, delimiter text, field int) 說明:Split string on delimiter and return the given field (counting from one) 對字串按指定子串進行分割,並返回指定的數值位置的值 例子:split_part(mailto:'[email protected][email protected]~ghi', mailto:'[email protected]~', 2) = def
函式:strpos(string, substring) 說明:Location of specified substring (same as position(substring in string), but note the reversed argument order) 指定字串在目標字串的位置 例子:strpos('high', 'ig') = 2
函式:substr(string, from [, count]) 說明:Extract substring (same as substring(string from from for count)) 擷取子串 例子:substr('alphabet', 3, 2) = ph
函式:to_ascii(string text [, encoding text]) 說明:Convert string to ASCII from another encoding (only supports conversion from LATIN1, LATIN2, LATIN9, and WIN1250 encodings) 將字串轉換成ascii編碼字串 例子:to_ascii('Karel') = Karel
函式:to_hex(number int or bigint) 說明:Convert number to its equivalent hexadecimal representation 對數值進行十六進位制編碼 例子:to_hex(2147483647) = 7fffffff
函式:translate(string text, from text, to text) 說明:Any character in string that matches a character in the from set is replaced by the corresponding character in the to set 將字串中某些匹配的字元替換成指定字串,目標字元與源字元都可以同時指定多個 例子:translate('12345', '14', 'ax') = a23x5