Last updated: early morning, April 1, 2022
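I recently saw a few articles recommended by WeChat official accounts about using Python to scrape your own WeChat friend list and then run some analysis on it. I had the same idea a while ago but never got around to it. Today happens to be New Year's Day, so I went back to the office and wrote this small project.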
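Getting your WeChat friends is actually very simple, because there is a ready-made module for it: `itchat`. Its documentation is at https://itchat.readthedocs.io/zh/latest/ . First install it with `pip3` (`pip3 install itchat`), then import the `itchat` module and call `get_friends()` to fetch all of your WeChat friends: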
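```python
import itchat

# Log in by scanning the QR code. hotReload=True caches the session so that
# restarting the script shortly afterwards does not require another scan.
itchat.auto_login(hotReload=True)

# Fetch the full friend list.
friends = itchat.get_friends()
```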
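Each entry returned by `get_friends()` is read like a dictionary in this post, with keys such as `NickName`, `Sex`, and `City`. As a minimal sketch, assuming some of those keys may be absent for some contacts, a small helper can flatten one entry into a tuple with safe defaults. The names `FIELDS` and `friend_to_row`, and the sample entry, are hypothetical and not part of itchat.

```python
# Keys read from each friend entry; these are the keys this post uses.
FIELDS = ("UserName", "NickName", "RemarkName", "Sex",
          "HeadImgUrl", "Province", "City", "Signature")


def friend_to_row(friend):
    """Flatten one friend entry (a dict-like object) into a tuple.

    Missing text fields become empty strings, and a missing Sex becomes 0,
    which this post counts as "other".
    """
    row = []
    for key in FIELDS:
        default = 0 if key == "Sex" else ""
        row.append(friend.get(key, default))
    return tuple(row)


# Hypothetical sample entry, for illustration only.
sample = {"UserName": "@abc", "NickName": "Alice", "Sex": 2, "City": "Hefei"}
print(friend_to_row(sample))
```

Keeping the field order in one place makes it harder for the tuple and a SQL column list to drift apart.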
To make the later analysis easier, I stored the friend data in a database. First, create the table:
```sql
create table t_friends (
    id           int auto_increment primary key,
    user_name    varchar(255) null,
    nick_name    varchar(20)  null,
    remark_name  varchar(20)  null,
    sex          int          null,
    head_img_url varchar(255) null,
    province     varchar(20)  null,
    city         varchar(20)  null,
    signature    varchar(255) null
);
```
Then insert the fetched friends into the database:
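```python
import pymysql

# Connection settings for the local MySQL instance used in this post.
connect = pymysql.connect(host='localhost', user='root', password='root1234',
                          db='itchat_db', charset='utf8mb4')
cursor = connect.cursor()

# Parameterized INSERT: pymysql escapes the values bound to each %s.
sql = ("INSERT INTO t_friends (`user_name`, `nick_name`, `remark_name`, `sex`, "
       "`head_img_url`, `province`, `city`, `signature`) "
       "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)")

# `friends` is the list returned by itchat.get_friends().
for friend in friends:
    cursor.execute(sql, (friend['UserName'], friend['NickName'],
                         friend['RemarkName'], friend['Sex'],
                         friend['HeadImgUrl'], friend['Province'],
                         friend['City'], friend['Signature']))

# Commit once after all rows are inserted, then release the connection.
connect.commit()
connect.close()
```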
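The row-by-row insert can also be written as a single batched call. The following is only a sketch under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in database so that it runs without a MySQL server, and the helper names `build_insert_sql` and `insert_friends`, as well as the two sample rows, are hypothetical. With pymysql the placeholder would stay `%s`; `sqlite3` expects `?`.

```python
import sqlite3

COLUMNS = ("user_name", "nick_name", "remark_name", "sex",
           "head_img_url", "province", "city", "signature")


def build_insert_sql(placeholder="%s"):
    """Build the INSERT statement for t_friends with one placeholder per column."""
    marks = ", ".join([placeholder] * len(COLUMNS))
    return "INSERT INTO t_friends ({}) VALUES ({})".format(
        ", ".join(COLUMNS), marks)


def insert_friends(connection, rows, placeholder="%s"):
    """Insert all rows with one executemany call, then commit once."""
    cursor = connection.cursor()
    cursor.executemany(build_insert_sql(placeholder), rows)
    connection.commit()


# Stand-in database: an in-memory SQLite table with the same column names.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE t_friends (id INTEGER PRIMARY KEY, user_name TEXT, "
    "nick_name TEXT, remark_name TEXT, sex INTEGER, head_img_url TEXT, "
    "province TEXT, city TEXT, signature TEXT)")

# Hypothetical rows, for illustration only.
rows = [
    ("@a", "Alice", "", 2, "", "Anhui", "Hefei", ""),
    ("@b", "Bob", "", 1, "", "Henan", "Zhengzhou", ""),
]
insert_friends(demo, rows, placeholder="?")
stored = demo.execute("SELECT COUNT(*) FROM t_friends").fetchone()[0]
print(stored)
```

Batching mainly saves round trips; for a friend list of a few hundred entries either approach should be fine.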
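With the data in place, the analysis can begin. I used matplotlib, a Python plotting library, through its `pyplot` interface. It is also installed with `pip3` (`pip3 install matplotlib`).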
First, let's look at the gender ratio of my friends:
```python
import numpy as np
import pymysql
import matplotlib.pyplot as plt

connect = pymysql.connect(host='localhost', user='root', password='root1234',
                          db='itchat_db', charset='utf8mb4')
cursor = connect.cursor()

# Map the numeric sex code to a label and count the friends in each group.
sql = ("select case when sex = 1 then 'male' when sex = 2 then 'female' "
       "else 'other' end as 'gender', count(sex) from t_friends group by sex;")
cursor.execute(sql)
results = cursor.fetchall()
connect.close()

fig, ax = plt.subplots(figsize=(15, 8), subplot_kw=dict(aspect="equal"))

data = [row[1] for row in results]  # number of friends per group
sex = [row[0] for row in results]   # label of each group


def func(pct, allvals):
    # Convert the slice percentage back into a head count for the label.
    absolute = int(round(pct / 100. * np.sum(allvals)))
    return "{:.1f}%\n({:d} people)".format(pct, absolute)


wedges, texts, autotexts = ax.pie(data, autopct=lambda pct: func(pct, data),
                                  textprops=dict(color="w"))

ax.legend(wedges, sex, title="Gender", loc="center left",
          bbox_to_anchor=(1, 0, 0.5, 1))
plt.setp(autotexts, size=8, weight="bold")
ax.set_title("Gender distribution of WeChat friends")

plt.show()
```
The result:
![Gender ratio of WeChat friends](/images/Screen Shot 2019-01-01 at 19.35.46.png)
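The same grouping can be done in plain Python, which is handy for a quick check before any data reaches the database. This is a minimal sketch, not the method used for the chart: it assumes each friend entry is a dict with an optional `Sex` key (1 for male, 2 for female, anything else counted as other, mirroring the SQL `CASE` expression), and the names `count_by_sex` and `slice_label` are hypothetical.

```python
from collections import Counter

# Mirrors the SQL CASE expression: 1 is male, 2 is female, the rest is "other".
SEX_LABELS = {1: "male", 2: "female"}


def count_by_sex(friends):
    """Count friend entries per gender label."""
    counts = Counter()
    for friend in friends:
        counts[SEX_LABELS.get(friend.get("Sex"), "other")] += 1
    return counts


def slice_label(pct, values):
    """Turn a pie-slice percentage back into 'xx.x%' plus a head count."""
    absolute = int(round(pct / 100.0 * sum(values)))
    return "{:.1f}%\n({:d} people)".format(pct, absolute)


# Hypothetical sample data, for illustration only.
sample_friends = [{"Sex": 1}, {"Sex": 1}, {"Sex": 2}, {"Sex": 0}, {}]
counts = count_by_sex(sample_friends)
print(counts)
print(slice_label(40.0, list(counts.values())))
```

Rounding rather than truncating the head count is a deliberate choice here: the percentage arrives as a float, so plain truncation can land one below the true count.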
Next, let's see which provinces and cities my WeChat friends are in:
```python
import pymysql
import matplotlib.pyplot as plt

connect = pymysql.connect(host='localhost', user='root', password='root1234',
                          db='itchat_db', charset='utf8mb4')
cursor = connect.cursor()

# Top 20 provinces by number of friends, ignoring empty values.
sql = ("select province, count(1) counts from t_friends "
       "where province != '' group by province order by counts desc limit 20;")
cursor.execute(sql)
results = cursor.fetchall()
connect.close()

provinces = [row[0] for row in results]
counts = [row[1] for row in results]

fig, axs = plt.subplots(1, 1, figsize=(15, 8), sharey=True)
axs.bar(provinces, counts)

# Print the count just above each bar.
for x, y in zip(provinces, counts):
    plt.text(x, y + 0.05, '%.0f' % y, ha='center', va='bottom', fontsize=11)

axs.set_title('Top 20 provinces of WeChat friends')
plt.show()
```
The result:
![Top 20 provinces of WeChat friends](/images/Screen Shot 2019-01-01 at 19.36.11.png)
```python
import pymysql
import matplotlib.pyplot as plt

connect = pymysql.connect(host='localhost', user='root', password='root1234',
                          db='itchat_db', charset='utf8mb4')
cursor = connect.cursor()

# Top 25 cities by number of friends, ignoring empty values.
sql1 = ("select city, count(1) counts from t_friends "
        "where city != '' group by province, city order by counts desc limit 25;")
cursor.execute(sql1)
results1 = cursor.fetchall()
connect.close()

cities1 = [row[0] for row in results1]
counts1 = [row[1] for row in results1]

fig, axs = plt.subplots(1, 1, figsize=(15, 8), sharey=True)
# Plot the city data (cities1/counts1), not the province data.
axs.bar(cities1, counts1)

# Print the count just above each bar.
for x, y in zip(cities1, counts1):
    plt.text(x, y + 0.05, '%.0f' % y, ha='center', va='bottom', fontsize=11)

axs.set_title('Top 25 cities of WeChat friends')
plt.show()
```
The result:
![Top 25 cities of WeChat friends](/images/Screen Shot 2019-01-01 at 22.07.24.png)
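The two "top N" queries can be approximated without SQL by counting directly on the friend entries. This is a rough sketch rather than an equivalent of the queries: it assumes each entry is a dict with optional `Province` and `City` keys, and it groups cities by name only, whereas the SQL groups by province and city together. The function name `top_locations` and the sample data are hypothetical.

```python
from collections import Counter


def top_locations(friends, key, limit):
    """Return the `limit` most common non-empty values of `key`.

    The result is a list of (value, count) pairs, highest count first.
    """
    values = [friend[key] for friend in friends if friend.get(key)]
    return Counter(values).most_common(limit)


# Hypothetical sample data, for illustration only.
sample_friends = [
    {"Province": "Anhui", "City": "Hefei"},
    {"Province": "Anhui", "City": "Hefei"},
    {"Province": "Henan", "City": "Zhengzhou"},
    {"Province": "", "City": ""},
    {},
]
print(top_locations(sample_friends, "Province", 20))
print(top_locations(sample_friends, "City", 25))
```

The returned pairs can be split into labels and heights for a bar chart in the same way the query results are.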
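Judging from the pie chart and bar charts above, my WeChat friends are mostly male, and some are of unknown gender, ahahaha (evil 😈). I am from Anhui, so it is no surprise that most of them are from Anhui too: mainly classmates from primary school through university, plus friends and family. Henan takes second place, which also makes sense, since I spent a year in Zhengzhou for work after graduating. Sigh, I do miss my friends in Zhengzhou a little. The rest, such as Jiangsu, Zhejiang, and Shanghai, are places where many people dream of going to build a career. Beyond that, there are some friends I met on Facebook and Twitter, which I won't go into here.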
Life is short, so keep working toward your dreams!
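`itchat` is an open-source API project for personal WeChat accounts. It supports both Python 2 and Python 3, and it makes it easy to extend a personal WeChat account and make everyday life more convenient. If you are interested, go explore the official site.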