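Visualizing a social network with Python: how far does your professional network reach?
Using third-party Python libraries to visualize a social network
Data source
The data is read with the pandas module
Reading and cleaning the data
First, import the modules we need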
import pandas as pd
import janitor  # pyjanitor: adds clean_names() and to_datetime() to DataFrames
import datetime
from IPython.display import display, HTML
from pyvis import network as net
import networkx as nx
Read the dataset
df_ori = pd.read_csv("Connections.csv", skiprows=3)
df_ori.head()
df = (
    df_ori
    .clean_names()  # lowercase the column names and replace spaces with underscores
    .drop(columns=['first_name', 'last_name', 'email_address'])  # drop these three columns
    .dropna(subset=['company', 'position'])  # drop rows where company or position is missing
    .to_datetime('connected_on', format='%d %b %Y')  # parse dates such as "15 Aug 2021"
)
output
                     company            position connected_on
0                 xxxxxxxxxx  Talent Acquisition   2021-08-15
1               xxxxxxxxxxxx   Associate Partner   2021-08-14
2                      xxxxx                獵頭顧問   2021-08-14
3  xxxxxxxxxxxxxxxxxxxxxxxxx          Consultant   2021-07-26
4     xxxxxxxxxxxxxxxxxxxxxx     Account Manager   2021-07-19
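If pyjanitor is not available, the same cleaning steps can be written with plain pandas. The following is only a sketch: the three sample rows are made up to stand in for Connections.csv, and the helper name clean_connections is hypothetical, not something from the original code.

```python
import io
import pandas as pd

# Made-up sample rows standing in for Connections.csv (assumption, not real data)
raw = io.StringIO(
    "First Name,Last Name,Email Address,Company,Position,Connected On\n"
    "Ann,Lee,,Acme,Data Analyst,15 Aug 2021\n"
    "Bob,Ray,,,Consultant,14 Aug 2021\n"
    "Cy,Fox,,Initech,,26 Jul 2021\n"
)
df_ori = pd.read_csv(raw)


def clean_connections(frame):
    """Hypothetical helper mirroring the pyjanitor chain with plain pandas."""
    out = frame.copy()
    # Lowercase the column names and replace spaces with underscores
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    # Drop the personal columns
    out = out.drop(columns=["first_name", "last_name", "email_address"])
    # Drop rows where company or position is missing
    out = out.dropna(subset=["company", "position"])
    # Parse dates such as "15 Aug 2021"
    out["connected_on"] = pd.to_datetime(out["connected_on"], format="%d %b %Y")
    return out


df = clean_connections(df_ori)
print(df)
```

Only the first sample row survives here, because the other two are missing either the company or the position.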
Data analysis and visualization
First, let's see which companies my connections work for
df['company'].value_counts().head(10).plot(kind="barh").invert_yaxis()
output

Next, the most common job titles among them
df['position'].value_counts().head(10).plot(kind="barh").invert_yaxis()
output

Node: each participant in the social network
Edge: the relationship between two participants and how close it is
We build the graph with the networkx module and render it with the pyvis module. A minimal example:
g = nx.Graph()
g.add_node(0, label="root")  # initialize yourself as the central node
g.add_node(1, label="Company 1", size=10, title="info1")
g.add_node(2, label="Company 2", size=40, title="info2")
g.add_node(3, label="Company 3", size=60, title="info3")
size sets how large each node is drawn. Then we connect the nodes:
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(0, 3)
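Before handing a graph to pyvis, it can help to check that the nodes, edges and attributes are what you expect. The sketch below rebuilds the same toy graph with a loop and inspects it using standard networkx calls only; nothing here is specific to the original article beyond the node values.

```python
import networkx as nx

g = nx.Graph()
g.add_node(0, label="root")  # the central node

# Same three company nodes as above, built in a loop
for node_id, size in [(1, 10), (2, 40), (3, 60)]:
    g.add_node(node_id, label=f"Company {node_id}", size=size, title=f"info{node_id}")
    g.add_edge(0, node_id)  # connect each company to the center

print(g.number_of_nodes(), g.number_of_edges())  # 4 3
print(dict(g.degree()))                          # {0: 3, 1: 1, 2: 1, 3: 1}
print(nx.get_node_attributes(g, "size"))         # {1: 10, 2: 40, 3: 60}
```

The central node has degree 3 and every company node has degree 1, which is the star shape used in the rest of the post.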

Count how many connections work at each company
df_company = df['company'].value_counts().reset_index()
df_company.columns = ['company', 'count']
df_company = df_company.sort_values(by="count", ascending=False)
df_company.head(10)
output
                            company  count
0                            Amazon     xx
1                            Google     xx
2                          Facebook     xx
3   Stevens Institute of Technology     xx
4                         Microsoft     xx
5              JPMorgan Chase & Co.     xx
6         Amazon Web Services (AWS)     xx
9                             Apple      x
10                    Goldman Sachs      x
8                            Oracle      x
Then we draw the social network graph
# Keep only the larger companies so the graph stays readable.
# NOTE: the definition of df_company_reduced is missing from the text;
# the threshold of 5 is an assumption.
df_company_reduced = df_company.loc[df_company['count'] >= 5]
# instantiate the graph
g = nx.Graph()
g.add_node('myself')  # put yourself at the center of the network
# iterate over each row of the dataset
for _, row in df_company_reduced.iterrows():
    # company name and number of connections
    company = row['company']
    count = int(row['count'])
    title = f"{company} – {count}"
    # distinct positions held by connections at this company
    positions = set(df.loc[df['company'] == company, 'position'])
    position_list = ', '.join(sorted(positions))
    hover_info = f"{title}\n{position_list}"
    g.add_node(company, size=count*2, title=hover_info, color='#3449eb')
    g.add_edge('myself', company, color='grey')  # connect to the central node
# generate the network chart
nt = net.Network(height='700px', width='700px', bgcolor="black", font_color='white')
nt.from_nx(g)
nt.hrepulsion()
nt.show('company_graph.html')
display(HTML(filename='company_graph.html'))
output

Do the same count for job titles
df_position = df['position'].value_counts().reset_index()
df_position.columns = ['position', 'count']
df_position = df_position.sort_values(by="count", ascending=False)
df_position.head(10)
output
                           position  count
0                 Software Engineer     xx
1                    Data Scientist     xx
2          Senior Software Engineer     xx
3                      Data Analyst     xx
4             Senior Data Scientist     xx
5     Software Development Engineer     xx
6  Software Development Engineer II     xx
7                           Founder     xx
8                     Data Engineer     xx
9                  Business Analyst     xx
Then draw the network graph
# NOTE: the definition of df_position_reduced is missing from the text;
# the threshold of 5 is an assumption.
df_position_reduced = df_position.loc[df_position['count'] >= 5]
g = nx.Graph()
g.add_node('myself')  # put yourself at the center of the network
for _, row in df_position_reduced.iterrows():
    # job title and number of connections
    position = row['position']
    count = int(row['count'])
    title = f"{position} – {count}"
    # distinct 'position' values among the matching rows (here just the title itself)
    positions = set(df.loc[df['position'] == position, 'position'])
    position_list = ', '.join(sorted(positions))
    hover_info = f"{title}\n{position_list}"
    g.add_node(position, size=count*2, title=hover_info, color='#3449eb')
    g.add_edge('myself', position, color='grey')  # connect to the central node
# generate the network chart
nt = net.Network(height='700px', width='700px', bgcolor="black", font_color='white')
nt.from_nx(g)
nt.hrepulsion()
nt.show('position_graph.html')
output
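The company graph and the position graph are built with nearly identical loops, so the shared part could be pulled into one function. This is a rough sketch under some assumptions: build_star_graph, its min_count parameter and the demo table are all hypothetical names and values, not part of the original post, and the hover text is simplified to the label and count.

```python
import networkx as nx
import pandas as pd


def build_star_graph(counts, label_col, center="myself", min_count=1):
    """Hypothetical helper: one central node linked to one node per row of `counts`.

    `counts` needs a label column (e.g. 'company' or 'position') and a 'count' column.
    """
    g = nx.Graph()
    g.add_node(center)  # the person whose network this is
    kept = counts[counts["count"] >= min_count]  # min_count is an assumed cutoff
    for label, count in zip(kept[label_col], kept["count"]):
        count = int(count)  # plain int, so the attribute serializes cleanly
        g.add_node(label, size=count * 2, title=f"{label}: {count}", color="#3449eb")
        g.add_edge(center, label, color="grey")
    return g


# Made-up counts for illustration only
demo = pd.DataFrame({"company": ["Acme", "Initech", "Globex"], "count": [6, 5, 1]})
g = build_star_graph(demo, "company", min_count=5)
print(sorted(g.nodes()))        # ['Acme', 'Initech', 'myself']
print(g.nodes["Acme"]["size"])  # 12
```

The returned graph can then be passed to nt.from_nx(g) exactly as in the two sections above, once with the company counts and once with the position counts.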


