Python 实现社交网络可视化,看看你的人脉影响力如何

Python的第三方库来进行社交网络的可视化
数据来源
pandas模块读取
数据的读取和清洗
import pandas as pd
import janitor
import datetime
from IPython.core.display import display, HTML
from pyvis import network as net
import networkx as nx
df_ori = pd.read_csv("Connections.csv", skiprows=3)
df_ori.head()
df = (
df_ori
.clean_names() # 去除掉字符串中的空格以及大写变成小写
.drop(columns=['first_name', 'last_name', 'email_address']) # 去除掉这三列
.dropna(subset=['company', 'position']) # 去除掉company和position这两列当中的空值
.to_datetime('connected_on', format='%d %b %Y')
)
company position connected_on
0 xxxxxxxxxx Talent Acquisition 2021-08-15
1 xxxxxxxxxxxx Associate Partner 2021-08-14
2 xxxxx 猎头顾问 2021-08-14
3 xxxxxxxxxxxxxxxxxxxxxxxxx Consultant 2021-07-26
4 xxxxxxxxxxxxxxxxxxxxxx Account Manager 2021-07-19
数据的分析与可视化
df['company'].value_counts().head(10).plot(kind="barh").invert_yaxis()

df['position'].value_counts().head(10).plot(kind="barh").invert_yaxis()

节点:社交网络当中的每个参与者 边缘:代表着每一个参与者的关系以及关系的紧密程度
networkx模块以及pyvis模块,g = nx.Graph()
g.add_node(0, label = "root") # intialize yourself as central node
g.add_node(1, label = "Company 1", size=10, title="info1")
g.add_node(2, label = "Company 2", size=40, title="info2")
g.add_node(3, label = "Company 3", size=60, title="info3")
size代表着节点的大小,然后我们将这些个节点相连接g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(0, 3)

df_company = df['company'].value_counts().reset_index()
df_company.columns = ['company', 'count']
df_company = df_company.sort_values(by="count", ascending=False)
df_company.head(10)
company count
0 Amazon xx
1 Google xx
2 Facebook xx
3 Stevens Institute of Technology xx
4 Microsoft xx
5 JPMorgan Chase & Co. xx
6 Amazon Web Services (AWS) xx
9 Apple x
10 Goldman Sachs x
8 Oracle x
# 实例化网络
g = nx.Graph()
g.add_node('myself') # 将自己放置在网络的中心
# 遍历数据集当中的每一行
for _, row in df_company_reduced.iterrows():
# 将公司名和统计结果赋值给新的变量
company = row['company']
count = row['count']
title = f"<b>{company}</b> – {count}"
positions = set([x for x in df[company == df['company']]['position']])
positions = ''.join('<li>{}</li>'.format(x) for x in positions)
position_list = f"<ul>{positions}</ul>"
hover_info = title + position_list
g.add_node(company, size=count*2, title=hover_info, color='#3449eb')
g.add_edge('root', company, color='grey')
# 生成网络图表
nt = net.Network(height='700px', width='700px', bgcolor="black", font_color='white')
nt.from_nx(g)
nt.hrepulsion()
nt.show('company_graph.html')
display(HTML('company_graph.html'))

df_position = df['position'].value_counts().reset_index()
df_position.columns = ['position', 'count']
df_position = df_position.sort_values(by="count", ascending=False)
df_position.head(10)
position count
0 Software Engineer xx
1 Data Scientist xx
2 Senior Software Engineer xx
3 Data Analyst xx
4 Senior Data Scientist xx
5 Software Development Engineer xx
6 Software Development Engineer II xx
7 Founder xx
8 Data Engineer xx
9 Business Analyst xx
g = nx.Graph()
g.add_node('myself') # 将自己放置在网络的中心
for _, row in df_position_reduced.iterrows():
# 将岗位名和统计结果赋值给新的变量
position = row['position']
count = row['count']
title = f"<b>{position}</b> – {count}"
positions = set([x for x in df[position == df['position']]['position']])
positions = ''.join('<li>{}</li>'.format(x) for x in positions)
position_list = f"<ul>{positions}</ul>"
hover_info = title + position_list
g.add_node(position, size=count*2, title=hover_info, color='#3449eb')
g.add_edge('root', position, color='grey')
# 生成网络图表
nt = net.Network(height='700px', width='700px', bgcolor="black", font_color='white')
nt.from_nx(g)
nt.hrepulsion()
nt.show('position_graph.html')




分享

点收藏

点点赞

点在看
关注公众号:拾黑(shiheibook)了解更多
[广告]赞助链接:
四季很好,只要有你,文娱排行榜:https://www.yaopaiming.com/
让资讯触达的更精准有趣:https://www.0xu.cn/
关注网络尖刀微信公众号随时掌握互联网精彩
赞助链接
排名
热点
搜索指数
- 1 从“水之道”感悟“国之交” 7904322
- 2 日方挑衅中国收割民意非常危险 7809144
- 3 泰国总理:做好一切准备维护国家主权 7712847
- 4 全国冰雪季玩法大盘点 7617512
- 5 日本记者街头采访找不到中国游客 7521394
- 6 周星驰《鹿鼎记》重映首日票房仅18万 7428197
- 7 净网:网民造谣汽车造成8杀被查处 7333429
- 8 原国务委员王丙乾逝世 7238570
- 9 退学北大考上清华小伙被欠家教费 7136166
- 10 流感自救抓住“黄金48小时” 7048191







AI100
