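A Hand-Written String-Splitting Utility That Doubles Performance
Contents
Is the split() we use every day actually efficient?
StringTokenizer: the string-splitting utility class provided by the JDK
Building a more efficient string-splitting utility, step by step
Summary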
Today's topic is a small one, but very practical: when we split strings in everyday Java code, a few simple tricks can make the job noticeably faster, about 2x in the tests below. No more preamble, let's get straight to it!
Is the split() we use every day actually efficient?
First, the code below builds a very long comma-separated string containing every number from 0 to 9999, which we will use as input for the experiments.
public class StringSplitTest {
    public static void main(String[] args) {
        String string = null;
        StringBuffer stringBuffer = new StringBuffer();
        int max = 10000;
        // Append 0..9999, with a comma after every number except the last
        for(int i = 0; i < max; i++) {
            stringBuffer.append(i);
            if(i < max - 1) {
                stringBuffer.append(",");
            }
        }
        string = stringBuffer.toString();
    }
}
With the string built, we time 10,000 calls to split():
public class StringSplitTest {
    public static void main(String[] args) {
        String string = null;
        StringBuffer stringBuffer = new StringBuffer();
        int max = 10000;
        // Append 0..9999, with a comma after every number except the last
        for(int i = 0; i < max; i++) {
            stringBuffer.append(i);
            if(i < max - 1) {
                stringBuffer.append(",");
            }
        }
        string = stringBuffer.toString();
        // Time 10,000 calls to String.split
        long start = System.currentTimeMillis();
        for(int i = 0; i < 10000; i++) {
            string.split(",");
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
In this test, splitting the string on commas with split() 10,000 times took a little over 2,000 ms. The number varies from run to run, but it was around 2,300 ms.
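Because the number moves around between runs, it can help to warm up the JIT before timing and to measure with System.nanoTime(). The sketch below is only one way to do that, not the harness used for the figures in this article. The class name SplitTiming and the helpers buildInput and timeMillis are made up for illustration.

```java
import java.util.function.IntSupplier;

public class SplitTiming {
    // Hypothetical helper: builds "0,1,...,max-1", like the test input above.
    static String buildInput(int max) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < max; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    // Hypothetical helper: runs the task `rounds` times and returns elapsed milliseconds.
    // Each result is added to `sink` so the work is not trivially unused.
    static long timeMillis(IntSupplier task, int rounds) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += task.getAsInt();
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // never expected to print; keeps `sink` observable
        }
        return elapsedMillis;
    }

    public static void main(String[] args) {
        String input = buildInput(10000);
        // Warm-up rounds: give the JIT a chance to compile the hot path first
        timeMillis(() -> input.split(",").length, 1000);
        // Measured rounds
        System.out.println(timeMillis(() -> input.split(",").length, 10000) + " ms");
    }
}
```

The absolute numbers will still depend on the machine and the JDK version, so treat them as relative comparisons only.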
StringTokenizer: the string-splitting utility class provided by the JDK
Next, let's look at another class built specifically for splitting strings: StringTokenizer. It ships with the JDK, and in this test it performs somewhat better than split().
import java.util.StringTokenizer;

public class StringSplitTest {
    public static void main(String[] args) {
        String string = null;
        StringBuffer stringBuffer = new StringBuffer();
        int max = 10000;
        // Append 0..9999, with a comma after every number except the last
        for(int i = 0; i < max; i++) {
            stringBuffer.append(i);
            if(i < max - 1) {
                stringBuffer.append(",");
            }
        }
        string = stringBuffer.toString();
        // Time 10,000 calls to String.split
        long start = System.currentTimeMillis();
        for(int i = 0; i < 10000; i++) {
            string.split(",");
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        // Time 10,000 full passes with StringTokenizer
        start = System.currentTimeMillis();
        StringTokenizer stringTokenizer =
                new StringTokenizer(string, ",");
        for(int i = 0; i < 10000; i++) {
            while(stringTokenizer.hasMoreTokens()) {
                stringTokenizer.nextToken();
            }
            // A tokenizer cannot be rewound, so create a fresh one for the next pass
            stringTokenizer = new StringTokenizer(string, ",");
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
As the code above shows, StringTokenizer's hasMoreTokens() method tells us whether another token is available, and nextToken() returns it. Once one pass has consumed every token, we create a new StringTokenizer object for the next pass.
Splitting the string 10,000 times this way took about 1,900 ms in testing.
How does that look? See the gap? Just by switching the way we split the string, we cut 400 to 500 ms, a performance improvement of close to 20%.
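One thing to keep in mind before swapping one for the other: the two do not treat empty fields the same way. split(",") keeps an empty string between two adjacent commas, while StringTokenizer skips it. The test input in this article has no empty fields, so it does not matter here. The small sketch below shows the difference. The class and method names are made up for illustration.

```java
import java.util.StringTokenizer;

public class EmptyTokenDemo {
    // Number of pieces String.split produces
    static int countWithSplit(String s) {
        return s.split(",").length;
    }

    // Number of tokens StringTokenizer produces
    static int countWithTokenizer(String s) {
        return new StringTokenizer(s, ",").countTokens();
    }

    public static void main(String[] args) {
        String s = "a,,b";
        System.out.println(countWithSplit(s));     // prints 3: "a", "", "b"
        System.out.println(countWithTokenizer(s)); // prints 2: "a", "b"
    }
}
```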
Building a more efficient string-splitting utility, step by step
Now let's write our own string-splitting function and run the same test with it.
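private static void split(String string) {
  String remainString = string;
  int startIndex = 0;
  int endIndex = 0;
  while(true) {
    // indexOf returns -1 once there are no more commas
    endIndex = remainString.indexOf(",", startIndex);
    if(endIndex < 0) {
      break;
    }
    // The substring is discarded: this test only measures the cost of splitting
    remainString.substring(startIndex, endIndex);
    startIndex = endIndex + 1;
  }
  // The last element has no trailing comma, so take the rest of the string
  remainString.substring(startIndex);
}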
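The code above is our custom splitting function. Each call runs a while loop with startIndex starting at 0. Every iteration finds the index of the next comma at or after startIndex, which becomes endIndex, and then extracts the substring between startIndex and endIndex.
After that, startIndex advances to endIndex + 1, so the next iteration extracts the substring before the following comma. The loop ends when indexOf finds no more commas.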
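The function above throws its substrings away, which is fine for timing but not much use in real code. Here is a minimal sketch of the same indexOf/substring loop that collects the pieces into a list. The class name FastSplit, the char delimiter parameter, and the List return type are my own assumptions, not part of the test program in this article.

```java
import java.util.ArrayList;
import java.util.List;

public class FastSplit {
    // Splits `text` on a single-character delimiter using indexOf and substring.
    // Empty fields are kept, including a trailing one.
    static List<String> split(String text, char delimiter) {
        List<String> parts = new ArrayList<>();
        int startIndex = 0;
        int endIndex;
        // indexOf returns -1 once there are no more delimiters
        while ((endIndex = text.indexOf(delimiter, startIndex)) >= 0) {
            parts.add(text.substring(startIndex, endIndex));
            startIndex = endIndex + 1;
        }
        // Whatever follows the last delimiter is the final element
        parts.add(text.substring(startIndex));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("0,1,2", ',')); // prints [0, 1, 2]
    }
}
```

Collecting the results allocates a list on top of the substrings, so this version should not be expected to match the timings of the void function.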
Here is the complete test program with all three approaches side by side:
import java.util.StringTokenizer;

public class StringSplitTest {
    public static void main(String[] args) {
        String string = null;
        StringBuffer stringBuffer = new StringBuffer();
        int max = 10000;
        // Append 0..9999, with a comma after every number except the last
        for(int i = 0; i < max; i++) {
            stringBuffer.append(i);
            if(i < max - 1) {
                stringBuffer.append(",");
            }
        }
        string = stringBuffer.toString();
        // 1) Time 10,000 calls to String.split
        long start = System.currentTimeMillis();
        for(int i = 0; i < 10000; i++) {
            string.split(",");
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        // 2) Time 10,000 full passes with StringTokenizer
        start = System.currentTimeMillis();
        StringTokenizer stringTokenizer =
                new StringTokenizer(string, ",");
        for(int i = 0; i < 10000; i++) {
            while(stringTokenizer.hasMoreTokens()) {
                stringTokenizer.nextToken();
            }
            // A tokenizer cannot be rewound, so create a fresh one for the next pass
            stringTokenizer = new StringTokenizer(string, ",");
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);
        // 3) Time 10,000 calls to the custom split function
        start = System.currentTimeMillis();
        for(int i = 0; i < 10000; i++) {
            split(string);
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);
    }

    private static void split(String string) {
        String remainString = string;
        int startIndex = 0;
        int endIndex = 0;
        while(true) {
            // indexOf returns -1 once there are no more commas
            endIndex = remainString.indexOf(",", startIndex);
            if(endIndex < 0) {
                break;
            }
            // The substring is discarded: this test only measures the cost of splitting
            remainString.substring(startIndex, endIndex);
            startIndex = endIndex + 1;
        }
        // The last element has no trailing comma, so take the rest of the string
        remainString.substring(startIndex);
    }
}
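Summary
In this test, our hand-written splitting function took about 1,000 ms. That is more than 2x faster than String.split (about 2,300 ms) and nearly 2x faster than StringTokenizer (about 1,900 ms). And what if the string were even larger?
In fact, the larger the string, the wider the gap tends to become, so the speedup may grow by an even bigger multiple!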

