[Data Analysis & Visualization] Sorting Series and DataFrame

2020-07-07 19:57:55

import numpy as np
import pandas as pd
from pandas import Series, DataFrame

Sorting a Series

s1 = Series(np.random.rand(10))
s1

0    0.324583
1    0.528829
2    0.922022
3    0.050265
4    0.069271
5    0.447179
6    0.595703
7    0.518557
8    0.695466
9    0.685736
dtype: float64

s1.values

array([0.32458288, 0.52882927, 0.92202246, 0.05026548, 0.06927059,
       0.44717888, 0.59570299, 0.51855686, 0.69546586, 0.68573564])

s1.index

RangeIndex(start=0, stop=10, step=1)

# Sort by value; the `ascending` parameter sets the direction (ascending by default)
s2 = s1.sort_values()
s2

3    0.050265
4    0.069271
0    0.324583
5    0.447179
7    0.518557
1    0.528829
6    0.595703
9    0.685736
8    0.695466
2    0.922022
dtype: float64

# Sort by index
s2.sort_index()

0    0.324583
1    0.528829
2    0.922022
3    0.050265
4    0.069271
5    0.447179
6    0.595703
7    0.518557
8    0.695466
9    0.685736
dtype: float64
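
The two calls above rely on each value keeping its index label when the Series is reordered. A minimal sketch of that behaviour, using fixed, hypothetical values instead of the article's random data so the result is reproducible, and also showing `ascending=False`:

```python
import pandas as pd

# Hypothetical fixed values (the article uses np.random.rand, which changes on every run).
s = pd.Series([0.32, 0.53, 0.92, 0.05], index=[0, 1, 2, 3])

# Descending sort by value: every value keeps its original index label.
desc = s.sort_values(ascending=False)

# Sorting by index afterwards brings back the original order,
# because the labels travelled with the values.
restored = desc.sort_index()

print(desc.index.tolist())   # [2, 1, 0, 3]
print(restored.equals(s))    # True
```

Both methods return a new Series here; `s` itself is not modified.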

Sorting a DataFrame

df1 = DataFrame(np.random.randn(40).reshape(8,5),columns=['A','B','C','D','E'])
df1

          A         B         C         D         E
0  1.069063  0.266594 -0.129437 -0.361949 -1.491594
1  1.520675  1.673761  0.310567 -1.535689  0.388416
2  1.828228  0.221382 -0.092250 -0.111522 -1.187931
3 -1.049244 -0.093515  0.175138  0.627553 -0.357136
4  0.572511 -0.871314  1.142248 -0.489059  0.677733
5  0.088234 -0.786141 -0.222611  0.087407 -0.221874
6  2.199338  0.191928  0.278917 -0.388502  0.611719
7  1.260192 -0.001860  0.144536 -0.312155  1.664181

# Sort a single column: the result is a Series holding only that column, not the full DataFrame
df1['A'].sort_values()

3   -1.049244
5    0.088234
4    0.572511
0    1.069063
7    1.260192
1    1.520675
2    1.828228
6    2.199338
Name: A, dtype: float64

# Sort the DataFrame by a given column: every column is kept in the result
df2 = df1.sort_values('A')
df2

          A         B         C         D         E
3 -1.049244 -0.093515  0.175138  0.627553 -0.357136
5  0.088234 -0.786141 -0.222611  0.087407 -0.221874
4  0.572511 -0.871314  1.142248 -0.489059  0.677733
0  1.069063  0.266594 -0.129437 -0.361949 -1.491594
7  1.260192 -0.001860  0.144536 -0.312155  1.664181
1  1.520675  1.673761  0.310567 -1.535689  0.388416
2  1.828228  0.221382 -0.092250 -0.111522 -1.187931
6  2.199338  0.191928  0.278917 -0.388502  0.611719
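
As a rough illustration of the difference between the two calls above (sorting one column versus sorting the frame by that column), the sketch below uses a small hypothetical frame. It suggests that reindexing the frame with the sorted column's index gives the same rows as `sort_values('A')`, at least when the column has no ties:

```python
import pandas as pd

# Hypothetical toy frame; names and values are illustrative only.
df = pd.DataFrame({"A": [3.0, 1.0, 2.0], "B": ["x", "y", "z"]})

# Sorting the column alone returns a Series: column B is not part of the result.
col_sorted = df["A"].sort_values()

# Sorting the frame by column A returns a DataFrame with every column.
frame_sorted = df.sort_values("A")

# The sorted column's index can be used to reorder the whole frame.
same = df.loc[col_sorted.index].equals(frame_sorted)

print(col_sorted.index.tolist())   # [1, 2, 0]
print(frame_sorted["B"].tolist())  # ['y', 'z', 'x']
print(same)                        # True
```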

df2.sort_index()

          A         B         C         D         E
0  1.069063  0.266594 -0.129437 -0.361949 -1.491594
1  1.520675  1.673761  0.310567 -1.535689  0.388416
2  1.828228  0.221382 -0.092250 -0.111522 -1.187931
3 -1.049244 -0.093515  0.175138  0.627553 -0.357136
4  0.572511 -0.871314  1.142248 -0.489059  0.677733
5  0.088234 -0.786141 -0.222611  0.087407 -0.221874
6  2.199338  0.191928  0.278917 -0.388502  0.611719
7  1.260192 -0.001860  0.144536 -0.312155  1.664181
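
The examples above sort by a single column. `sort_values` also accepts a list of columns with a matching list of directions; the sketch below assumes a hypothetical toy frame (not the article's random data) and sorts by `A` ascending, breaking ties by `B` descending:

```python
import pandas as pd

# Hypothetical toy frame with repeated values in A so that ties occur.
df = pd.DataFrame({"A": [2, 1, 2, 1], "B": [0.5, 0.1, 0.3, 0.9]})

# Primary key A ascending, secondary key B descending.
out = df.sort_values(by=["A", "B"], ascending=[True, False])

print(out.index.tolist())  # [3, 1, 0, 2]
```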

Read a CSV file, sort by movie rating in descending order, and write a new CSV

# Load the data
csv_input = '/Users/bennyrhys/Desktop/数据分析可视化-数据集/homework/movie_metadata.csv'
pd.read_csv(csv_input).head()

   color      director_name  num_critic_for_reviews  duration  director_facebook_likes  \
0  Color      James Cameron                   723.0     178.0                      0.0
1  Color     Gore Verbinski                   302.0     169.0                    563.0
2  Color         Sam Mendes                   602.0     148.0                      0.0
3  Color  Christopher Nolan                   813.0     164.0                  22000.0
4    NaN        Doug Walker                     NaN       NaN                    131.0

   actor_3_facebook_likes      actor_2_name  actor_1_facebook_likes        gross                           genres  \
0                   855.0  Joel David Moore                  1000.0  760505847.0  Action|Adventure|Fantasy|Sci-Fi
1                  1000.0     Orlando Bloom                 40000.0  309404152.0         Action|Adventure|Fantasy
2                   161.0      Rory Kinnear                 11000.0  200074175.0        Action|Adventure|Thriller
3                 23000.0    Christian Bale                 27000.0  448130642.0                  Action|Thriller
4                     NaN        Rob Walker                   131.0          NaN                      Documentary

   ...  num_user_for_reviews  language  country  content_rating       budget  \
0  ...                3054.0   English      USA           PG-13  237000000.0
1  ...                1238.0   English      USA           PG-13  300000000.0
2  ...                 994.0   English       UK           PG-13  245000000.0
3  ...                2701.0   English      USA           PG-13  250000000.0
4  ...                   NaN       NaN      NaN             NaN          NaN

   title_year  actor_2_facebook_likes  imdb_score  aspect_ratio  movie_facebook_likes
0      2009.0                   936.0         7.9          1.78                 33000
1      2007.0                  5000.0         7.1          2.35                     0
2      2015.0                   393.0         6.8          2.35                 85000
3      2012.0                 23000.0         8.5          2.35                164000
4         NaN                    12.0         7.1           NaN                     0

5 rows × 28 columns

pd.read_csv(csv_input)[['movie_title','imdb_score']].sort_values('imdb_score',ascending=False).head()

                   movie_title  imdb_score
2765          Towering Inferno         9.5
1937  The Shawshank Redemption         9.3
3466             The Godfather         9.2
4409      Kickboxer: Vengeance         9.1
2824                   Dekalog         9.1
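
The preview of the dataset above contains `NaN` in several columns, so it may help to know where missing values land in a sort. A minimal sketch on hypothetical data (the titles and scores below are made up): by default `sort_values` puts `NaN` last, even with `ascending=False`, and `na_position='first'` moves it to the top.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; one score is missing.
scores = pd.DataFrame({
    "movie_title": ["a", "b", "c"],
    "imdb_score": [7.1, np.nan, 9.3],
})

# Descending sort: the missing score still ends up last (na_position defaults to 'last').
top = scores.sort_values("imdb_score", ascending=False)

# Same sort, but with missing scores listed first.
nan_first = scores.sort_values("imdb_score", ascending=False, na_position="first")

print(top.index.tolist())        # [2, 0, 1]
print(nan_first.index.tolist())  # [1, 2, 0]
```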

# Sort and write the new CSV in a single line
pd.read_csv(csv_input)[['movie_title','imdb_score']].sort_values('imdb_score',ascending=False).to_csv('imdb.csv')

!ls

02file.ipynb
4-1 DataFrame的简单数学计算.ipynb
4-2 Series和DataFrame的排序.ipynb
4-3 重命名Dataframe的index.ipynb
7B4349AB-7282-428F-A780-CB538E0517A3.dmp
Applications
Creative Cloud Files
Desktop
Documents
Downloads
Hadoop_VM
Java.gitignore
Library
Movies
Music
NumPy-排序.ipynb
Numpy-3.4数组读写.ipynb
Numpy1.ipynb
Pandas.ipynb
Pictures
Postman
PromotionRes
Public
Untitled Folder
Untitled Folder 1
Untitled.ipynb
Untitled1.ipynb
Virtual Machines.localized
WeChatProjects
ap.plist
apps.plist
bt.plist
eclipse-workspace
history.plist
iCloud 云盘(归档)
imdb.csv
install
nadarray.ipynb
opt
sell
vue-demo01
vue-sell-cube
vue-selll
输出1.spv
数据分析-分组 聚合 可视化.ipynb
班级成绩.ipynb

!more imdb.csv

,movie_title,imdb_score
2765,Towering Inferno             ,9.5
1937,The Shawshank Redemption ,9.3
3466,The Godfather ,9.2
4409,Kickboxer: Vengeance ,9.1
2824,Dekalog             ,9.1
3207,Dekalog             ,9.1
66,The Dark Knight ,9.0
2837,The Godfather: Part II ,9.0
3481,Fargo             ,9.0
339,The Lord of the Rings: The Return of the King ,8.9
4822,12 Angry Men ,8.9
4498,"The Good, the Bad and the Ugly ",8.9
3355,Pulp Fiction ,8.9
1874,Schindler's List ,8.9
683,Fight Club ,8.8
836,Forrest Gump ,8.8
270,The Lord of the Rings: The Fellowship of the Ring ,8.8
2051,Star Wars: Episode V - The Empire Strikes Back ,8.8
97,Inception ,8.8
1842,It's Always Sunny in Philadelphia             ,8.8
459,Daredevil             ,8.8
1620,Friday Night Lights             ,8.7
imdb.csv
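
The output above starts with an unnamed index column, and the titles carry trailing whitespace. One possible cleanup, sketched on a hypothetical three-row stand-in for movie_metadata.csv (the `duration` values are made up and an in-memory buffer replaces the real files), reads the data once, strips the titles, and passes `index=False` to `to_csv`:

```python
import io
import pandas as pd

# Hypothetical stand-in for movie_metadata.csv; note the trailing spaces in the titles.
raw = io.StringIO(
    "movie_title,imdb_score,duration\n"
    "Fargo   ,9.0,98\n"
    "The Godfather ,9.2,175\n"
    "Inception ,8.8,148\n"
)

movies = pd.read_csv(raw)  # read once and reuse the DataFrame

ranked = (
    movies[["movie_title", "imdb_score"]]
    .assign(movie_title=lambda d: d["movie_title"].str.strip())  # drop trailing whitespace
    .sort_values("imdb_score", ascending=False)
)

# index=False leaves out the original row labels, so no unnamed first column is written.
buf = io.StringIO()
ranked.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

With a real file, the buffer argument would be replaced by a path such as `'imdb.csv'`, as in the article.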
