Analyzing Android Apps with Python, Part 2

Published: 2023/12/29 · python · 46 · 豆豆

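This article continues from the previous one, "Analyzing Android Apps with Python, Part 1" (《用python对android APP进行分析1》).

Converting the data types of the other columns

# astype() has no inplace option, and the np.int alias was removed in NumPy 1.24, so assign the result back and name the width explicitly
data['Reviews']=data['Reviews'].astype(np.int32)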

data.Reviews.head()

0 159

1 967

2 87510

3 215644

4 967

Name: Reviews, dtype: int32
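
The cast above only succeeds when every value in Reviews is already a clean integer string. As a minimal sketch of a more defensive conversion, assuming a hypothetical sample in which one entry is malformed, `pd.to_numeric` with `errors="coerce"` can surface the bad rows first:

```python
import pandas as pd

# Hypothetical sample: review counts stored as strings, one of them malformed.
reviews = pd.Series(["159", "967", "3.0M", "87510"], name="Reviews")

# errors="coerce" turns unparseable entries into NaN instead of raising.
parsed = pd.to_numeric(reviews, errors="coerce")

# Inspect the rows that failed to parse before deciding how to handle them.
bad_rows = reviews[parsed.isna()]
print(bad_rows)

# One possible policy: drop the bad rows, then cast to a fixed-width integer.
clean = parsed.dropna().astype("int32")
print(clean.dtype)
```

Whether to drop, repair, or keep the malformed rows is a judgment call that depends on the data; dropping is used here only for illustration.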

print(data[~data.Size.str.contains('M')].head())

App Category Rating Reviews \

37 Floor Plan Creator ART_AND_DESIGN 4.1 36639

42 Textgram - write on photos ART_AND_DESIGN 4.4 295221

52 Used Cars and Trucks for Sale AUTO_AND_VEHICLES 4.6 17057

58 Restart Navigator AUTO_AND_VEHICLES 4.0 1403

67 Ulysse Speedometer AUTO_AND_VEHICLES 4.3 40211

Size Installs Type Price Content Rating Genres \

37 Varies with device 5000000 Free 0 Everyone Art & Design

42 Varies with device 10000000 Free 0 Everyone Art & Design

52 Varies with device 1000000 Free 0 Everyone Auto & Vehicles

58 201k 100000 Free 0 Everyone Auto & Vehicles

67 Varies with device 5000000 Free 0 Everyone Auto & Vehicles

Last Updated Current Ver Android Ver installs_range

37 July 14, 2018 Varies with device 2.3.3 and up 百万+

42 July 30, 2018 Varies with device Varies with device 百万+

52 July 30, 2018 Varies with device Varies with device 十万+

58 August 26, 2014 1.0.1 2.2 and up 万+

67 July 30, 2018 Varies with device Varies with device 百万+

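Broadly, Size takes three forms: kilobyte values (k), megabyte values (M), and undetermined ("Varies with device").

# Normalize every size to a single unit (k), taking 1M = 1000k
def size_normal(x):
    if 'M' in x.upper():
        # strip the unit from the upper-cased string so a lowercase 'm' is handled too
        return float(x.upper().replace('M',''))*1000
    elif 'k' in x.lower():
        return float(x.lower().replace('k',''))
    else:
        # "Varies with device" and anything else unparseable becomes NaN
        return np.nan

data.Size.map(size_normal)[[1,146,10595]]  # check that the conversion worked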

1 14000.0

146 NaN

10595 470.0

Name: Size, dtype: float64
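
The same normalization can also be written without a Python-level function. The following is a sketch on a hypothetical four-row sample, not the author's code; it uses `Series.str.extract` to split the number from the unit and keeps the article's convention of 1M = 1000k:

```python
import numpy as np
import pandas as pd

# Hypothetical sample covering the three kinds of Size value.
size = pd.Series(["14M", "Varies with device", "470k", "8.7M"])

# Split each value into a number and a unit letter; rows that do not match become NaN.
parts = size.str.extract(r"^(?P<number>[\d.]+)(?P<unit>[kKmM])$")

# Map the unit to a multiplier in k, following the article's 1M = 1000k convention.
factor = parts["unit"].str.upper().map({"K": 1.0, "M": 1000.0})

size_k = parts["number"].astype(float) * factor
print(size_k.tolist())
```

Because the pattern is anchored at both ends, any value that is not a plain number followed by k or M ends up as NaN rather than being partially parsed.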

data['size_k']=data.Size.map(size_normal)

print(data.head())

App Category Rating \

0 Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN 4.1

1 Coloring book moana ART_AND_DESIGN 3.9

2 U Launcher Lite – FREE Live Cool Themes, Hide ... ART_AND_DESIGN 4.7

3 Sketch - Draw & Paint ART_AND_DESIGN 4.5

4 Pixel Draw - Number Art Coloring Book ART_AND_DESIGN 4.3

Reviews Size Installs Type Price Content Rating \

0 159 19M 10000 Free 0 Everyone

1 967 14M 500000 Free 0 Everyone

2 87510 8.7M 5000000 Free 0 Everyone

3 215644 25M 50000000 Free 0 Teen

4 967 2.8M 100000 Free 0 Everyone

Genres Last Updated Current Ver \

0 Art & Design January 7, 2018 1.0.0

1 Art & Design;Pretend Play January 15, 2018 2.0.0

2 Art & Design August 1, 2018 1.2.4

3 Art & Design June 8, 2018 Varies with device

4 Art & Design;Creativity June 20, 2018 1.1

Android Ver installs_range size_k

0 4.0.3 and up 千+ 19000.0

1 4.0.3 and up 十万+ 14000.0

2 4.0.3 and up 百万+ 8700.0

3 4.2 and up 千万+ 25000.0

4 4.4 and up 万+ 2800.0

Converting the last-updated time

from dateutil.parser import parse

def time_normal(time):
    return parse(time)

data['Last Updated']=data['Last Updated'].map(time_normal)

print(data.head())

App Category Rating \

0 Photo Editor & Candy Camera & Grid & ScrapBook ART_AND_DESIGN 4.1

1 Coloring book moana ART_AND_DESIGN 3.9

2 U Launcher Lite – FREE Live Cool Themes, Hide ... ART_AND_DESIGN 4.7

3 Sketch - Draw & Paint ART_AND_DESIGN 4.5

4 Pixel Draw - Number Art Coloring Book ART_AND_DESIGN 4.3

Reviews Size Installs Type Price Content Rating \

0 159 19M 10000 Free 0 Everyone

1 967 14M 500000 Free 0 Everyone

2 87510 8.7M 5000000 Free 0 Everyone

3 215644 25M 50000000 Free 0 Teen

4 967 2.8M 100000 Free 0 Everyone

Genres Last Updated Current Ver Android Ver \

0 Art & Design 2018-01-07 1.0.0 4.0.3 and up

1 Art & Design;Pretend Play 2018-01-15 2.0.0 4.0.3 and up

2 Art & Design 2018-08-01 1.2.4 4.0.3 and up

3 Art & Design 2018-06-08 Varies with device 4.2 and up

4 Art & Design;Creativity 2018-06-20 1.1 4.4 and up

installs_range size_k

0 千+ 19000.0

1 十万+ 14000.0

2 百万+ 8700.0

3 千万+ 25000.0

4 万+ 2800.0

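Last Updated is now a datetime column. It could also be set as the index so that time-series methods can be used, but that is outside the scope of this analysis.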
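
For readers curious what the time-series route might look like, here is a small sketch on hypothetical data. It parses the dates with `pd.to_datetime` and an explicit format (an alternative to `dateutil`, not the method used above), then counts apps by the month of their last update:

```python
import pandas as pd

# Hypothetical sample in the same "Month day, year" format as Last Updated.
df = pd.DataFrame({
    "App": ["a", "b", "c", "d"],
    "Last Updated": ["January 7, 2018", "January 15, 2018",
                     "June 8, 2018", "August 1, 2018"],
})

# An explicit format avoids guessing the layout for each row.
df["Last Updated"] = pd.to_datetime(df["Last Updated"], format="%B %d, %Y")

# With a DatetimeIndex, resample() groups rows into calendar months
# ("MS" labels each bin by the month start); months with no apps count as 0.
updates_per_month = df.set_index("Last Updated")["App"].resample("MS").count()
print(updates_per_month)
```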

Checking for outliers

print(data.describe())

Rating Reviews Installs size_k

count 10841.000000 1.084100e+04 1.084100e+04 9146.000000

mean 4.190739 4.441119e+05 1.546291e+07 21514.504975

std 0.479738 2.927629e+06 8.502557e+07 22588.342683

min 1.000000 0.000000e+00 0.000000e+00 8.500000

25% 4.100000 3.800000e+01 1.000000e+03 4900.000000

50% 4.200000 2.094000e+03 1.000000e+05 13000.000000

75% 4.500000 5.476800e+04 5.000000e+06 30000.000000

max 5.000000 7.815831e+07 1.000000e+09 100000.000000

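The numeric columns show no outliers: Rating stays within 1 to 5, and the other minimums are non-negative. Price will be converted later on.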
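
Reading `describe()` by eye works for a handful of columns. The same check can be made explicit with boolean masks, sketched here on a hypothetical sample whose last row carries an impossible rating; the allowed ranges are assumptions taken from the min and max values shown above:

```python
import pandas as pd

# Hypothetical sample; the last row has a rating outside the 1-5 scale.
df = pd.DataFrame({
    "Rating": [4.1, 3.9, 4.7, 19.0],
    "Reviews": [159, 967, 87510, 3],
})

# Domain rule (assumed): ratings must lie between 1 and 5 inclusive.
bad_rating = ~df["Rating"].between(1, 5)

# Domain rule (assumed): review counts cannot be negative.
bad_reviews = df["Reviews"] < 0

print(df[bad_rating | bad_reviews])
```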

Removing duplicates

data.duplicated().sum()

483

data.drop_duplicates(inplace=True)

data.info()

Int64Index: 10358 entries, 0 to 10840

Data columns (total 15 columns):

App 10358 non-null object

Category 10358 non-null object

Rating 10358 non-null float64

Reviews 10358 non-null int32

Size 10358 non-null object

Installs 10358 non-null int32

Type 10358 non-null object

Price 10358 non-null object

Content Rating 10358 non-null object

Genres 10358 non-null object

Last Updated 10358 non-null datetime64[ns]

Current Ver 10350 non-null object

Android Ver 10356 non-null object

installs_range 10358 non-null category

size_k 8832 non-null float64

dtypes: category(1), datetime64[ns](1), float64(2), int32(2), object(9)

memory usage: 1.1+ MB
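
Note that `duplicated()` with no arguments only flags rows that match in every column. The sketch below, on hypothetical data, contrasts that with matching on the app name alone via the `subset` parameter; which definition is appropriate depends on whether repeated listings of one app with different values should count as duplicates:

```python
import pandas as pd

# Hypothetical sample: "A" is repeated exactly, "B" is repeated with a different review count.
df = pd.DataFrame({
    "App": ["A", "A", "B", "B"],
    "Reviews": [10, 10, 5, 7],
})

# Rows identical in every column to an earlier row.
full_dupes = df.duplicated().sum()

# Rows whose App name already appeared, whatever the other columns hold.
name_dupes = df.duplicated(subset=["App"]).sum()

# Dropping whole-row duplicates keeps both versions of "B".
deduped = df.drop_duplicates().reset_index(drop=True)
print(full_dupes, name_dupes, len(deduped))
```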

data.to_csv(r'C:\Users\19078\Desktop\中级\第三关\android_data.csv',sep=',',encoding='utf_8_sig')  # save the data as CSV

Data analysis

Effect of category on review count

a=pd.pivot_table(data,columns='Type',index='Category',values='Reviews',aggfunc='mean').sort_values(by='Free',ascending=False)[:10]

b=pd.pivot_table(data,columns='Type',index='Category',values='Reviews',aggfunc='mean').sort_values(by='Paid',ascending=False)[:10]

a['Free'].plot(kind='bar',rot=60)

b['Paid'].plot(kind='bar',rot=60)

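Comparing the two charts, the average review count differs widely between app categories. Among free apps, games, social, and chat apps receive the most reviews; among paid apps, family, game, and weather apps do. Both the category and the free/paid type therefore have some influence on review count.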
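
To make the shape of the pivot table concrete, here is a toy version of the same call on hypothetical data. Categories become rows, Type becomes columns, and each cell holds the mean review count; a category with no free apps gets NaN in the Free column:

```python
import pandas as pd

# Hypothetical sample with three categories and both app types.
df = pd.DataFrame({
    "Category": ["GAME", "GAME", "GAME", "SOCIAL", "SOCIAL", "WEATHER"],
    "Type": ["Free", "Free", "Paid", "Free", "Paid", "Paid"],
    "Reviews": [100, 300, 50, 400, 20, 80],
})

# Same arguments as in the article: mean Reviews per Category and Type.
table = pd.pivot_table(df, index="Category", columns="Type",
                       values="Reviews", aggfunc="mean")

# Rank categories by their free-app average, ignoring categories with no free apps.
top_free = table["Free"].dropna().sort_values(ascending=False)
print(table)
print(top_free)
```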

Relationship between category and app size

a=pd.pivot_table(data,index='Category',values='size_k',aggfunc='mean').sort_values(by='size_k',ascending=False)[:15]

print(a)

size_k

Category

GAME 44126.850000

FAMILY 27930.435770

TRAVEL_AND_LOCAL 24515.994413

SPORTS 24181.192568

ENTERTAINMENT 22638.805970

PARENTING 22512.962963

FOOD_AND_DRINK 22056.122449

HEALTH_AND_FITNESS 21643.216667

EDUCATION 20076.895833

AUTO_AND_VEHICLES 20037.146667

MEDICAL 19383.681579

FINANCE 17937.730263

SOCIAL 16875.827586

PHOTOGRAPHY 16832.045267

MAPS_AND_NAVIGATION 16614.712963

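App size clearly varies by category, with games the largest at about 44 MB on average. The other category averages fall between roughly 16 MB and 28 MB, which suggests that ten to a few tens of MB is a reasonable target size for an app.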
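
Category means alone do not show how individual apps are distributed. One way to check the "tens of MB" impression would be to bucket `size_k` directly, as in this sketch; the sample values and the bucket edges are hypothetical choices, not taken from the dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical sizes in k, including one unknown value.
size_k = pd.Series([2800, 8700, 14000, 19000, 25000, 44000, 98000, np.nan])

# Assumed bucket edges: under 10 MB, 10 to 50 MB, over 50 MB.
labels = ["<10 MB", "10-50 MB", ">50 MB"]
buckets = pd.cut(size_k, bins=[0, 10_000, 50_000, np.inf], labels=labels)

# Share of apps per bucket; NaN sizes are excluded from the total.
share = buckets.value_counts(normalize=True).sort_index()
print(share)
```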

Which categories of paid apps are priced higher

data_paid=data[data.Type.isin(['Paid'])]

print(data_paid.head())

App Category Rating \

234 TurboScan: scan documents and receipts in PDF BUSINESS 4.7

235 Tiny Scanner Pro: PDF Doc Scan BUSINESS 4.8

427 Puffin Browser Pro COMMUNICATION 4.0

476 Moco+ - Chat, Meet People DATING 4.2

477 Calculator DATING 2.6

Reviews Size Installs Type Price Content Rating \

234 11442 6.8M 100000 Paid $4.99 Everyone

235 10295 39M 100000 Paid $4.99 Everyone

427 18247 Varies with device 100000 Paid $3.99 Everyone

476 1545 Varies with device 10000 Paid $3.99 Mature 17+

477 57 6.2M 1000 Paid $6.99 Everyone

Genres Last Updated Current Ver Android Ver installs_range \

234 Business 2018-03-25 1.5.2 4.0 and up 万+

235 Business 2017-04-11 3.4.6 3.0 and up 万+

427 Communication 2018-07-05 7.5.3.20547 4.1 and up 万+

476 Dating 2018-06-19 2.6.139 4.1 and up 千+

477 Dating 2017-10-25 1.1.6 4.0 and up 百+

size_k

234 6800.0

235 39000.0

427 NaN

476 NaN

477 6200.0

# assign() returns a new DataFrame, which avoids the SettingWithCopyWarning raised when writing to a slice of data; regex=False treats '$' as a literal character
data_paid=data_paid.assign(Price=data_paid.Price.str.replace('$','',regex=False).astype('float'))

a=data_paid.groupby('Category')['Price'].agg(['mean','count']).sort_values(by='mean',ascending=False)[:15]

print(a)

mean count

Category

FINANCE 170.637059 17

LIFESTYLE 124.256316 19

EVENTS 109.990000 1

BUSINESS 14.607500 12

FAMILY 12.945561 187

MEDICAL 12.151071 84

PRODUCTIVITY 8.961786 28

PHOTOGRAPHY 6.111500 20

MAPS_AND_NAVIGATION 5.390000 5

SOCIAL 5.323333 3

PARENTING 4.790000 2

DATING 4.490000 7

EDUCATION 4.490000 4

AUTO_AND_VEHICLES 4.490000 3

HEALTH_AND_FITNESS 4.290000 15

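From these results, finance, lifestyle, and events apps have the highest average prices, although the EVENTS figure rests on a single app.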
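
With so few paid apps in some categories, a mean can be pulled far upward by one or two very expensive apps. As a sketch on hypothetical prices (not the dataset's values), reporting the median alongside the mean shows how different the two rankings can be:

```python
import pandas as pd

# Hypothetical prices: FINANCE contains one extreme value, MEDICAL does not.
paid = pd.DataFrame({
    "Category": ["FINANCE"] * 4 + ["MEDICAL"] * 3,
    "Price": [0.99, 2.99, 4.99, 399.99, 9.99, 12.99, 14.99],
})

# The median is far less sensitive to a single extreme price than the mean.
summary = paid.groupby("Category")["Price"].agg(["mean", "median", "count"])
print(summary)
```

In this toy sample FINANCE ranks first by mean but last by median, so it is worth checking both before concluding that a category is expensive.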

Paid-to-free ratio by category

data.size

155370

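def p_f_rate(group):
    # count rows of each type; DataFrame.size would count rows*columns instead
    paid=(group['Type']=='Paid').sum()
    free=(group['Type']=='Free').sum()
    # ratio of paid apps to free apps within the category
    return round(paid/free,2)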

data.groupby('Category').apply(p_f_rate).sort_values(ascending=False)[:15]

Category

PERSONALIZATION 0.27

MEDICAL 0.26

BOOKS_AND_REFERENCE 0.14

WEATHER 0.11

FAMILY 0.11

TOOLS 0.10

COMMUNICATION 0.08

GAME 0.08

SPORTS 0.07

PRODUCTIVITY 0.07

PHOTOGRAPHY 0.07

LIFESTYLE 0.05

FINANCE 0.05

HEALTH_AND_FITNESS 0.05

ART_AND_DESIGN 0.05

dtype: float64

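Personalization and medical apps have the highest paid-to-free ratios. Across all categories, though, the large majority of apps are free, so a free-first mindset matters a great deal when operating an app.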
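
The figures above are ratios of paid apps to free apps. If the share of paid apps among all apps in a category is wanted instead, `pd.crosstab` gives the counts in one step. This is a sketch on hypothetical data, and `paid_share` is a name introduced here for illustration:

```python
import pandas as pd

# Hypothetical sample: MEDICAL has one paid app out of five, GAME has none out of four.
df = pd.DataFrame({
    "Category": ["MEDICAL"] * 5 + ["GAME"] * 4,
    "Type": ["Paid", "Free", "Free", "Free", "Free",
             "Free", "Free", "Free", "Free"],
})

# Count apps per Category and Type; reindex guards against a missing Free or Paid column.
counts = pd.crosstab(df["Category"], df["Type"])
counts = counts.reindex(columns=["Free", "Paid"], fill_value=0)

# Paid share = paid / (paid + free), which stays defined even when a category has no free apps.
paid_share = (counts["Paid"] / counts.sum(axis=1)).round(2)
print(paid_share.sort_values(ascending=False))
```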
