I found a bug in the JDK thread pool!
Editor: 業余草
segmentfault.com/a/1190000021109130
Problem description
A few days ago I was helping a colleague investigate an intermittent thread pool error in production.
The logic is simple: a thread pool runs an asynchronous task that returns a result. Recently, however, this error started to show up intermittently:
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@a5acd19 rejected from java.util.concurrent.ThreadPoolExecutor@30890a38[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
All simulation code and all problems in this article were reproduced on HotSpot Java 8 (1.8.0_221).
Below is the simulation code. It creates a single-thread pool through Executors.newSingleThreadExecutor, and the caller then fetches the result from the Future:
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class ThreadPoolTest {
    public static void main(String[] args) {
        final ThreadPoolTest threadPoolTest = new ThreadPoolTest();
        for (int i = 0; i < 8; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        Future<String> future = threadPoolTest.submit();
                        try {
                            String s = future.get();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        } catch (ExecutionException e) {
                            e.printStackTrace();
                        } catch (Error e) {
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }

        // A background thread calls gc continuously to simulate occasional GCs
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    System.gc();
                }
            }
        }).start();
    }

    /**
     * Runs a task asynchronously.
     * @return the Future of the submitted task
     */
    public Future<String> submit() {
        // Key point: create a single-thread pool via Executors.newSingleThreadExecutor
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        FutureTask<String> futureTask = new FutureTask<String>(new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(50);
                return System.currentTimeMillis() + "";
            }
        });
        executorService.execute(futureTask);
        return futureTask;
    }
}
Analysis and questions
The first question to consider: why was the thread pool shut down, when nothing in the code shuts it down explicitly? Let's look at the source of Executors.newSingleThreadExecutor:
public static ExecutorService newSingleThreadExecutor() {
    return new FinalizableDelegatedExecutorService
            (new ThreadPoolExecutor(1, 1,
                    0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<Runnable>()));
}
What is actually created here is a FinalizableDelegatedExecutorService. This wrapper class overrides finalize, which means that before an instance is reclaimed by GC, the thread pool's shutdown method is called first.
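To see why a shutdown triggered from finalize produces the kind of error shown earlier, the following minimal sketch calls shutdown() by hand and then submits a task. The class name RejectAfterShutdownDemo and the helper isRejectedAfterShutdown are made up for illustration; only the java.util.concurrent calls are real API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

class RejectAfterShutdownDemo {

    // Hypothetical helper: returns true if a task submitted after shutdown() is rejected.
    static boolean isRejectedAfterShutdown() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        // The same call that the wrapper's finalize() would make on our behalf.
        executorService.shutdown();
        try {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    // Not expected to run: the pool no longer accepts tasks.
                }
            });
            return false;
        } catch (RejectedExecutionException e) {
            // The message describes the rejected task and the state of the pool.
            System.out.println(e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + isRejectedAfterShutdown());
    }
}
```

In the production case nobody calls shutdown() explicitly; the sketch only shows that an early finalize would be enough to explain the RejectedExecutionException.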
Here is the puzzle: 「GC only reclaims unreachable objects」, and until the stack frame of submit has finished and been popped, executorService should still be reachable.
To answer this, here is the conclusion up front:
「finalize may run even while the object is still in scope (in a live stack frame)」
The Java Language Specification in the Oracle JDK documentation describes finalize here: https://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.6.1.
A reachable object is any object that can be accessed in any potential continuing computation from any live thread.
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
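「Roughly: a reachable object is any object that can be accessed in any potential continuing computation from any live thread; a Java compiler or code generator may set a variable or parameter that will no longer be used to null ahead of time, so that the object can be reclaimed sooner」
In other words, under JVM optimizations, an object that is no longer used may be nulled out and reclaimed early.
Here is an example to verify this (taken from https://stackoverflow.com/questions/24376768/can-java-finalize-an-object-when-it-is-still-in-scope):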
class A {
    @Override protected void finalize() {
        System.out.println(this + " was finalized!");
    }
    public static void main(String[] args) throws InterruptedException {
        A a = new A();
        System.out.println("Created " + a);
        for (int i = 0; i < 1_000_000_000; i++) {
            if (i % 1_000_00 == 0)
                System.gc();
        }
        System.out.println("done.");
    }
}
// Output
Created A@1be6f5c3
A@1be6f5c3 was finalized! // printed by finalize
done.
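The example shows that if a is no longer used once the loop has finished, finalize may run early. In terms of scope the method has not finished and its stack frame has not been popped, yet finalize still runs ahead of time.
Now add one line that prints a at the very end, so that the compiler/code generator sees a later reference to a: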
...
System.out.println(a);
// Output
Created A@1be6f5c3
done.
A@1be6f5c3
This time finalize does not run at all (the process exits right after main returns), so there is certainly no early finalization.
Building on that result, test one more case: set a to null before the loop, while still printing a at the end to keep a reference:
A a = new A();
System.out.println("Created " + a);
a = null; // set to null by hand
for (int i = 0; i < 1_000_000_000; i++) {
    if (i % 1_000_00 == 0)
        System.gc();
}
System.out.println("done.");
System.out.println(a);
// Output
Created A@1be6f5c3
A@1be6f5c3 was finalized!
done.
null
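The result shows that setting the reference to null by hand also lets the object be reclaimed early. The variable a is still referenced at the end, but by then it refers to null.
Now back to the thread pool problem. By the mechanism described above, once analysis finds no further references, the object can be finalized early.
But in the code above, executorService.execute(futureTask) clearly references the pool right before the return, so why is it still finalized early?
My guess is that execute calls into ThreadPoolExecutor, which creates and starts a new thread; this causes an active thread switch, after which the object is unreachable from the live threads.
Combined with the description quoted above ("a reachable object is any object that can be accessed in any potential continuing computation from any live thread"), it seems plausible that an explicit thread switch made the object count as unreachable, so the pool was finalized early.
Let's verify the guess: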
// Entry point
// Note: TExecutorService (a base class declaring execute() and shutdown()) and
// TFutureTask (the result type) are not shown in the original article.
public class FinalizedTest {
    public static void main(String[] args) {
        final FinalizedTest finalizedTest = new FinalizedTest();
        for (int i = 0; i < 8; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        TFutureTask future = finalizedTest.submit();
                    }
                }
            }).start();
        }
        // A background thread calls gc continuously
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    System.gc();
                }
            }
        }).start();
    }
    public TFutureTask submit() {
        TExecutorService executorService = Executors.create();
        executorService.execute();
        return null;
    }
}
// Executors.java, simulating the Executors class of java.util.concurrent
public class Executors {
    /**
     * Simulates Executors.newSingleThreadExecutor.
     * @return a finalizable wrapper around a simulated thread pool
     */
    public static TExecutorService create() {
        return new FinalizableDelegatedTExecutorService(new TThreadPoolExecutor());
    }
    static class FinalizableDelegatedTExecutorService extends DelegatedTExecutorService {
        FinalizableDelegatedTExecutorService(TExecutorService executor) {
            super(executor);
        }

        /**
         * Calls shutdown from the finalizer, which changes the pool state.
         * @throws Throwable
         */
        @Override
        protected void finalize() throws Throwable {
            super.shutdown();
        }
    }
    static class DelegatedTExecutorService extends TExecutorService {
        protected TExecutorService e;
        public DelegatedTExecutorService(TExecutorService executor) {
            this.e = executor;
        }
        @Override
        public void execute() {
            e.execute();
        }
        @Override
        public void shutdown() {
            e.shutdown();
        }
    }
}
// TThreadPoolExecutor.java, simulating ThreadPoolExecutor of java.util.concurrent
import java.util.concurrent.atomic.AtomicBoolean;

public class TThreadPoolExecutor extends TExecutorService {
    /**
     * Pool state: false = running, true = shut down
     */
    private AtomicBoolean ctl = new AtomicBoolean();
    @Override
    public void execute() {
        // Start a new thread to simulate ThreadPoolExecutor.execute
        new Thread(new Runnable() {
            @Override
            public void run() {
            }
        }).start();
        // As ThreadPoolExecutor would after starting the new thread, poll the pool state
        // in a loop to check whether finalize has already called shutdown.
        // If the pool was shut down early, throw an exception.
        for (int i = 0; i < 1_000_000; i++) {
            if (ctl.get()) {
                throw new RuntimeException("reject!!![" + ctl.get() + "]");
            }
        }
    }
    @Override
    public void shutdown() {
        ctl.compareAndSet(false, true);
    }
}
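After running for a while, it fails with:
Exception in thread "Thread-1" java.lang.RuntimeException: reject!!![true]
The error shows that the "thread pool" was again shut down early. But is creating a new thread necessarily the cause?
Let's replace the thread creation with a sleep and test again: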
// TThreadPoolExecutor.java, modified execute method
// (also requires: import java.util.concurrent.TimeUnit;)
public void execute() {
    try {
        // Explicitly sleep for 1 ns to yield the thread
        TimeUnit.NANOSECONDS.sleep(1);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    // Poll the pool state in a loop to check whether finalize has already called shutdown.
    // If the pool was shut down early, throw an exception.
    for (int i = 0; i < 1_000_000; i++) {
        if (ctl.get()) {
            throw new RuntimeException("reject!!![" + ctl.get() + "]");
        }
    }
}
The result is the same error:
Exception in thread "Thread-3" java.lang.RuntimeException: reject!!![true]
「From this we can conclude that if an explicit thread switch happens during execution, the compiler/code generator may treat the outer wrapper object as unreachable」
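Summary
Although GC only reclaims objects that are unreachable from the GC roots, optimizations by the compiler (the specification does not say which one; it may be the JIT) or the code generator can null out a reference early, and a thread switch can likewise make an object "unreachable early".
So if you rely on finalize to do some work, make sure to reference the object explicitly at the very end (toString or hashCode both work) to keep it reachable.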
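The claim that a thread switch makes the object unreachable is not backed by any official documentation. It is only my own test result, so corrections are welcome. A reading that fits JLS 12.6.1 without involving thread switches is that the wrapper's execute no longer uses this once it has read the field e, so the wrapper may already count as unreachable; starting a thread or sleeping would then merely give GC a chance to run at that point.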
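「To sum up, this reclamation behavior is not a JDK bug but an optimization strategy, nothing more than early reclamation. However, relying on finalize to shut the pool down automatically, as the implementation of Executors.newSingleThreadExecutor does, is buggy: after optimization the pool may be shut down early, which leads to the exception.」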
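One way to avoid depending on finalize on Java 8 is to stop creating a pool per call, keep a single pool strongly reachable, and shut it down explicitly. The sketch below is only an illustration under that assumption; SharedExecutorDemo, its submit helper, and the "done:" prefix are hypothetical names, not code from the production system.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedExecutorDemo {

    // The static field keeps the pool strongly reachable, so it is not a candidate
    // for early finalization while tasks are being submitted.
    private static final ExecutorService EXECUTOR = Executors.newSingleThreadExecutor();

    // Hypothetical task: returns the payload with a "done:" prefix.
    static Future<String> submit(final String payload) {
        return EXECUTOR.submit(new Callable<String>() {
            @Override
            public String call() {
                return "done:" + payload;
            }
        });
    }

    // Shut the pool down explicitly instead of relying on finalize.
    static void shutdown() {
        EXECUTOR.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try {
            System.out.println(submit("a").get()); // prints "done:a"
        } finally {
            shutdown();
        }
    }
}
```

The trade-off is that the owner of the pool now has to call shutdown() at the right time, for example when the application stops.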
This thread pool problem is also recorded in the JDK bug tracker as a public but unresolved issue (https://bugs.openjdk.java.net/browse/JDK-8145304).
In JDK 11, however, the problem has been fixed:
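JUC Executors.FinalizableDelegatedExecutorService (execute is inherited from DelegatedExecutorService)
// reachabilityFence is java.lang.ref.Reference.reachabilityFence (Java 9+)
public void execute(Runnable command) {
    try {
        e.execute(command);
    } finally { reachabilityFence(this); }
}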
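The same pattern can be applied to a wrapper of your own. The following is a minimal sketch, assuming Java 9 or later; FencedExecutor is a hypothetical class written for this article, not JDK code, and the direct executor in main simply runs the task on the calling thread.

```java
import java.lang.ref.Reference;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

class FencedExecutor implements Executor {
    private final Executor delegate;

    FencedExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable command) {
        try {
            delegate.execute(command);
        } finally {
            // Keeps this wrapper strongly reachable until the delegate call has returned.
            Reference.reachabilityFence(this);
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        // Hypothetical direct executor: runs the task on the calling thread.
        Executor direct = Runnable::run;
        new FencedExecutor(direct).execute(counter::incrementAndGet);
        System.out.println(counter.get()); // prints 1
    }
}
```

The fence has no visible effect on the result; it only constrains when the wrapper may be treated as unreachable.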
References
Can java finalize an object when it is still in scope? https://stackoverflow.com/questions/24376768/can-java-finalize-an-object-when-it-is-still-in-scope
Executors.newSingleThreadExecutor().submit(runnable) throws RejectedExecutionException https://bugs.openjdk.java.net/browse/JDK-8145304
Implementing Finalization https://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.6.1
Scope of a Declaration https://docs.oracle.com/javase/specs/jls/se8/html/jls-6.html#jls-6.3
RejectedExecutionException inside single executor service https://stackoverflow.com/questions/58714980/rejectedexecutionexception-inside-single-executor-service
