Exclusive | Multimodal Neurons, Like Those in the Human Brain, Discovered in Artificial Neural Networks (with links)
Authors: Gabriel Goh, Chelsea Voss, Daniela Amodei, Shan Carter, Michael Petrov, Justin Jay Wang, Nick Cammarata, and Chris Olah  Translated by: Ouyang Jin
Proofread by: Wang Kehan
This article is about 4,000 words; recommended reading time is 12 minutes.
This article covers OpenAI's discovery of multimodal neurons, like those found in the human brain, in the CLIP model, and takes a closer look at what this finding means.
Tags: neural networks, artificial general intelligence, language models
Contents
Multimodal neurons in CLIP
Absent concepts
How multimodal neurons are composed
The paradox of abstraction
Attacks in the wild
Bias and overgeneralization
Conclusion
Scientific American
https://www.scientificamerican.com/article/one-face-one-neuron/
The New York Times
https://www.nytimes.com/2005/07/05/science/a-neuron-with-halle-berrys-name-on-it.html
CLIP
https://openai.com/blog/clip/

Multimodal neurons in CLIP

Different effects exhibited by individual neurons

The neurons shown are selected from the final layer of four CLIP models. Each neuron is represented by a feature visualization together with a human-chosen concept label that helps give a quick sense of what the neuron responds to; the labels were chosen after examining hundreds of stimuli that activate each neuron, not just the feature visualizations. The examples here illustrate the model's tendency toward stereotyped depictions of regions, emotions, and other concepts. We also see differences in the level of neuronal resolution: while some countries, such as the United States and India, are associated with well-defined neurons, this is not the case for African countries, where neurons tend to activate for entire regions. These biases and their implications are discussed in a later section.
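The labeling workflow described in this caption (looking at the stimuli that most strongly activate a unit) can be approximated with a short PyTorch script. The sketch below is a minimal illustration rather than the authors' pipeline: it assumes the open-source clip package from the code link at the end of this article, and the choice of layer (`model.visual.layer4`), the channel index, and the image folder are all placeholder assumptions made for illustration.

```python
# Minimal sketch (not the authors' exact pipeline): rank local images by how
# strongly they activate one channel of a CLIP image-encoder layer.
import glob
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("RN50", device=device)

activations = {}

def hook(module, inputs, output):
    # Store the spatial mean activation per channel: shape (batch, channels).
    activations["feat"] = output.mean(dim=(2, 3)).detach()

# Assumption: probe the last residual stage of the RN50 visual backbone.
handle = model.visual.layer4.register_forward_hook(hook)

CHANNEL = 100  # arbitrary unit index chosen for illustration
scores = []
for path in glob.glob("images/*.jpg"):  # placeholder: any local image folder
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        model.encode_image(image)  # triggers the hook
    scores.append((activations["feat"][0, CHANNEL].item(), path))

handle.remove()
# The top of this list is the kind of "activating stimuli" the caption refers to.
for score, path in sorted(scores, reverse=True)[:10]:
    print(f"{score:8.3f}  {path}")
```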
Absent concepts
How multimodal neurons are composed

As shown above, the piggy bank class appears to be a combination of a "finance" neuron and a porcelain neuron. The "Spider-Man" neuron mentioned earlier is also a spider detector, and it plays an important role in classifying the "barn spider" class.
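One way to see this kind of composition for yourself is to fit a linear probe on top of frozen CLIP features and read off which units carry the largest weight for a given class. The sketch below is an illustrative assumption, not the paper's exact setup: the feature and label files, the class index, and the use of scikit-learn's logistic regression are all placeholders.

```python
# Sketch: fit a linear probe on frozen CLIP features, then inspect which
# units contribute most to a single class (e.g. "piggy bank").
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholders: features has shape (n_images, n_units), e.g. pooled activations
# from the CLIP image encoder; labels has shape (n_images,) with class indices.
features = np.load("clip_features.npy")
labels = np.load("labels.npy")

probe = LogisticRegression(max_iter=1000)
probe.fit(features, labels)

PIGGY_BANK = 42  # hypothetical class index for "piggy bank"
class_row = int(np.where(probe.classes_ == PIGGY_BANK)[0][0])
weights = probe.coef_[class_row]          # one weight per feature unit
top_units = np.argsort(weights)[::-1][:5]  # heaviest positive contributors
for unit in top_units:
    print(f"unit {unit}: weight {weights[unit]:+.3f}")
# In the article's analysis, units resembling "finance" and "porcelain"
# detectors show up among the heaviest contributors to this class.
```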

The paradox of abstraction
As shown above, by rendering text onto the image, the researchers artificially stimulated neuron 1330, which carries a high weight for the "piggy bank" class in the linear probe. This causes the classifier to misclassify a poodle as a piggy bank.
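The "render text on the image" trick described in this caption is easy to reproduce with zero-shot CLIP, even without the linear probe. The sketch below is a hedged illustration, not the authors' experiment: the image path is a placeholder, the prompt set is simplified, and the text is drawn with PIL's default font.

```python
# Sketch of a typographic attack: write a class name onto a photo and watch
# CLIP's zero-shot prediction flip.
import torch
import clip
from PIL import Image, ImageDraw

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("RN50", device=device)

labels = ["a poodle", "a piggy bank"]
text = clip.tokenize(labels).to(device)

def classify(pil_image):
    image = preprocess(pil_image).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)
        probs = logits_per_image.softmax(dim=-1)[0]
    return {label: p.item() for label, p in zip(labels, probs)}

original = Image.open("poodle.jpg").convert("RGB")  # placeholder image path
print("before:", classify(original))

# Scrawl the target class name across the photo (default PIL font for simplicity).
attacked = original.copy()
draw = ImageDraw.Draw(attacked)
draw.text((10, 10), "piggy bank", fill="white")
print("after: ", classify(attacked))
```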
Attacks in the wild

Bias and overgeneralization
Conclusion
Paper link:
https://distill.pub/2021/multimodal-neurons/
Code link:
https://github.com/openai/CLIP-featurevis
Footnotes
https://github.com/openai/CLIP/blob/main/model-card.md
References
1. Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., & Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature, 435(7045), 1102-1107.
https://www.nature.com/articles/nature03687
2. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf
3. Erhan, D., Bengio, Y., Courville, A., & Vincent, P. (2009). Visualizing higher-layer features of a deep network. University of Montreal, 1341(3), 1.
https://www.researchgate.net/profile/Aaron_Courville/publication/265022827_Visualizing_Higher-Layer_Features_of_a_Deep_Network/links/53ff82b00cf24c81027da530.pdf
4. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
https://arxiv.org/abs/1312.6199
5. Mahendran, A., & Vedaldi, A. (2014). Understanding Deep Image Representations by Inverting Them. arXiv preprint arXiv:1412.0035.
https://arxiv.org/abs/1412.0035
6. Nguyen, A., Yosinski, J., & Clune, J. (2015). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 427-436).
https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Nguyen_Deep_Neural_Networks_2015_CVPR_paper.html
7. Øygard, A. (2015). Visualizing GoogLeNet Classes.
https://www.auduno.com/2015/07/29/visualizing-googlenet-classes/
8. Mordvintsev, A., Olah, C., & Tyka, M. (2015). Inceptionism: Going deeper into neural networks.
https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
9. Nguyen, A., Dosovitskiy, A., Yosinski, J., Brox, T., & Clune, J. (2016). Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. arXiv preprint arXiv:1605.09304.
https://arxiv.org/abs/1605.09304
10. Nguyen, A., Clune, J., Bengio, Y., Dosovitskiy, A., & Yosinski, J. (2017). Plug & play generative networks: Conditional iterative generation of images in latent space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4467-4477).
http://openaccess.thecvf.com/content_cvpr_2017/html/Nguyen_Plug__Play_CVPR_2017_paper.html
11. Nguyen, A., Yosinski, J., & Clune, J. (2016). Multifaceted feature visualization: Uncovering the different types of features learned by each neuron in deep neural networks. arXiv preprint arXiv:1602.03616.
https://arxiv.org/abs/1602.03616
12. Olah, C., Mordvintsev, A., & Schubert, L. (2017). Feature visualization. Distill, 2(11), e7.
https://distill.pub/2017/feature-visualization
13. Goh, G., et al. (2021). Multimodal Neurons in Artificial Neural Networks. Distill.
https://distill.pub/2021/multimodal-neurons/
14. Miller, G. A. (1995). WordNet: a lexical database for English. Communications of the ACM, 38(11), 39-41.
https://dl.acm.org/doi/abs/10.1145/219717.219748
15. Crawford, K. & Paglen, T. (2019). Excavating AI: the politics of images in machine learning training sets. Excavating AI.
https://excavating.ai/
16. Hanna, A., Denton, E., Amironesei, R., Smart, A., & Nicole, H. Lines of Sight. Logic Magazine.
https://logicmag.io/commons/lines-of-sight/
17. Fried, I., MacDonald, K. A., & Wilson, C. L. (1997). Single neuron activity in human hippocampus and amygdala during recognition of faces and objects. Neuron, 18(5), 753-765.
https://www.sciencedirect.com/science/article/pii/S0896627300803153
18. Kreiman, G., Koch, C., & Fried, I. (2000). Category-specific visual responses of single neurons in the human medial temporal lobe. Nature neuroscience, 3(9), 946-953.
https://www.nature.com/articles/nn0900_946
19. Radford, A., Jozefowicz, R., & Sutskever, I. (2017). Learning to generate reviews and discovering sentiment. arXiv preprint arXiv:1704.01444.
https://arxiv.org/abs/1704.01444
20. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
https://arxiv.org/abs/1301.3781
21. Brown, T. B., Mané, D., Roy, A., Abadi, M., & Gilmer, J. (2017). Adversarial patch. arXiv preprint arXiv:1712.09665.
https://arxiv.org/abs/1712.09665
22. Crawford, K. & Paglen, T. (2019). Excavating AI: the politics of images in machine learning training sets. Excavating AI.
https://excavating.ai/
About the translator
Ouyang Jin: I am a master's student about to continue studying data science at Eindhoven University of Technology. I completed my undergraduate degree at North China Electric Power University, and my favorite research direction is data science algorithms for privacy and security. I have many hobbies and interests (photography, sports, music) and am curious about everything in life; I am an inquisitive, cheerful, and optimistic person. To learn more about the field I love, I hope to be exposed to more related topics and broaden my horizons and thinking.
Translation team recruitment
What you'll do: with a careful eye, translate selected foreign-language articles into fluent Chinese. If you are an overseas student in data science, statistics, or computer science, work abroad in a related field, or are confident in your foreign-language skills, you are welcome to join the translation team.
What you'll get: regular translation training to improve volunteers' translation skills, a better grasp of the frontiers of data science, and, for friends overseas, a way to stay in touch with technology developments back home; THU Datapi's industry-academia-research background also brings good development opportunities for volunteers.
Other benefits: data science practitioners from well-known companies and students from Peking University, Tsinghua University, and top universities overseas will become your partners in the translation team.
Click "Read the original" at the end of this article to join the Datapi team~
Reprint notice
To reprint this article, please credit the author and source prominently at the beginning (reprinted from: Datapi, ID: DatapiTHU) and place the Datapi QR code prominently at the end of the article. For articles marked as original, please send [article title - name and ID of the account requesting authorization] to the contact email to apply for whitelist authorization, and edit the article as required.
After publishing, please send the link back to the contact email (see below). Unauthorized reprints and adaptations will be pursued to the full extent of the law.
Click "Read the original" to join us


