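pandas has two main data structures:
Series: a one-dimensional labeled array holding values of a single (homogeneous) type;
DataFrame: a two-dimensional, size-mutable labeled table whose columns can hold different (heterogeneous) types.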
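To make the distinction concrete, here is a minimal sketch. The values and the names `s`, `frame`, and `col` are illustrative and not part of the original notebook. It shows that a Series carries one dtype, while each DataFrame column carries its own.

```python
import pandas as pd

# A Series holds values of one dtype, each paired with an index label.
s = pd.Series([1.5, 2.5, 3.5])
print(s.dtype)        # a single dtype for the whole Series

# A DataFrame is a table whose columns may have different dtypes.
frame = pd.DataFrame({'label': ['x', 'y', 'z'], 'value': [1.5, 2.5, 3.5]})
print(frame.dtypes)   # one dtype per column

# Selecting a single column of a DataFrame returns a Series.
col = frame['value']
print(type(col).__name__)
```

Seen this way, a DataFrame behaves like a collection of Series that share the same index.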
In [2]:
# Create a Series
import pandas as pd
import numpy as np
series1 = pd.Series([1, 2, 3, 4])
series1
Out[2]:
0    1
1    2
2    3
3    4
dtype: int64
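The last line of the output is the dtype of the data in the Series; here every value is int64. The values appear in the second column, and the first column holds their labels, which pandas calls the Index.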
In [3]:
series1.index
Out[3]:
RangeIndex(start=0, stop=4, step=1)
In [4]:
series1.values
Out[4]:
array([1, 2, 3, 4], dtype=int64)
By default, the index runs from 0 to n-1. We can also supply our own index labels, which can be of any hashable type.
In [5]:
series2 = pd.Series([1, 2, 3, 4],
                    index=['a', 'b', 'c', 'd'])
series2
Out[5]:
a    1
b    2
c    3
d    4
dtype: int64
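A custom index is mainly useful for looking values up by label. The sketch below rebuilds `series2` from the cell above and shows label-based and position-based access side by side.

```python
import pandas as pd

series2 = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# Look up a single value by its label.
print(series2['b'])          # 2

# Select several labels at once; the result is another Series.
print(series2[['a', 'd']])

# .loc is label-based and .iloc is position-based.
print(series2.loc['c'])      # 3
print(series2.iloc[0])       # 1
```

Using `.loc` and `.iloc` explicitly avoids ambiguity when the labels themselves are integers.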
In [6]:
# Create a DataFrame from a dictionary using the default constructor
studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}
In [8]:
df = pd.DataFrame(studentData)
df
Out[8]:
| | name | age | city |
|---|---|---|---|
| 0 | jack | 34 | Sydney |
| 1 | Riti | 30 | Delhi |
| 2 | Aadi | 16 | New york |
In [9]:
# Specify a custom index at creation time
df = pd.DataFrame(studentData, index=['a', 'b', 'c'])
df
Out[9]:
| | name | age | city |
|---|---|---|---|
| a | jack | 34 | Sydney |
| b | Riti | 30 | Delhi |
| c | Aadi | 16 | New york |
In [15]:
# Create a DataFrame from a dictionary of scalar values (a single column of data)
studentAgeData = {
    'Jack': 12,
    'Roma': 13,
    'Ritika': 10,
    'Aadi': 11
}
# pd.DataFrame(studentAgeData) raises a ValueError: all values are scalars and no index is given
# Convert the dictionary to a list of (key, value) pairs instead
df = pd.DataFrame(list(studentAgeData.items()), index=['a', 'b', 'c', 'd'])
df
Out[15]:
| | 0 | 1 |
|---|---|---|
| a | Jack | 12 |
| b | Roma | 13 |
| c | Ritika | 10 |
| d | Aadi | 11 |
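The commented-out `pd.DataFrame(studentAgeData)` call fails because pandas cannot infer a row index when every dictionary value is a scalar. The sketch below shows that failure and two possible ways around it. The column names `'name'` and `'age'` and the variables `df_items`, `ages`, and `df_series` are assumptions for illustration, not names from the original notebook.

```python
import pandas as pd

studentAgeData = {'Jack': 12, 'Roma': 13, 'Ritika': 10, 'Aadi': 11}

# Passing a dict of scalars straight to the constructor raises ValueError,
# because pandas has no way to infer the row index.
try:
    pd.DataFrame(studentAgeData)
except ValueError as exc:
    print('ValueError:', exc)

# Option 1: convert to (key, value) pairs and name the two columns.
df_items = pd.DataFrame(list(studentAgeData.items()), columns=['name', 'age'])
print(df_items)

# Option 2: build a Series first, so the dict keys become the index,
# then turn it into a one-column DataFrame.
ages = pd.Series(studentAgeData, name='age')
df_series = ages.to_frame()
print(df_series)
```

Option 1 keeps the names as ordinary data, while option 2 makes them row labels, which suits lookups such as `df_series.loc['Roma', 'age']`.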
In [16]:
# Create a DataFrame from a dictionary while skipping some of its data
studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}
In [19]:
# Create a DataFrame from the dictionary, skipping its second item ('age')
dfObj = pd.DataFrame(studentData, columns=['name', 'city'])
dfObj
Out[19]:
| | name | city |
|---|---|---|
| 0 | jack | Sydney |
| 1 | Riti | Delhi |
| 2 | Aadi | New york |
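Besides dropping keys, the `columns` argument also fixes the column order, and a name that is not a key in the dictionary yields a column of missing values. The following is a small sketch of both behaviors. The `'country'` column and the variables `df_sel` and `df_extra` are hypothetical and used only for illustration.

```python
import pandas as pd

studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}

# columns= selects which keys to keep and in what order they appear.
df_sel = pd.DataFrame(studentData, columns=['city', 'name'])
print(df_sel)

# A requested column that is not a key in the dict is filled with NaN.
df_extra = pd.DataFrame(studentData, columns=['name', 'country'])
print(df_extra)
print(df_extra['country'].isna().all())   # True
```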
In [20]:
# Create a DataFrame from a dictionary with a different orientation
studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}
In [21]:
# Create a DataFrame from the dictionary, using its keys as the index
dfObj = pd.DataFrame.from_dict(studentData, orient='index')
dfObj
Out[21]:
| | 0 | 1 | 2 |
|---|---|---|---|
| name | jack | Riti | Aadi |
| age | 34 | 30 | 16 |
| city | Sydney | Delhi | New york |
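With `orient='index'` the columns default to 0, 1, 2. `DataFrame.from_dict` also accepts a `columns` argument in this orientation, so the columns can be named at creation time. The labels `'s1'`, `'s2'`, and `'s3'` below are placeholders chosen for this sketch.

```python
import pandas as pd

studentData = {
    'name': ['jack', 'Riti', 'Aadi'],
    'age': [34, 30, 16],
    'city': ['Sydney', 'Delhi', 'New york']
}

# Keys become row labels; columns= names the positional columns.
dfObj = pd.DataFrame.from_dict(studentData, orient='index',
                               columns=['s1', 's2', 's3'])
print(dfObj)

# Each row mixes nothing, but each column now mixes strings and integers.
print(dfObj.loc['age', 's2'])   # 30
```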
In [24]:
# Create a DataFrame from a nested dictionary
studentData = {
    0: {
        'name': 'Aadi',
        'age': 16,
        'city': 'New york'
    },
    1: {
        'name': 'Jack',
        'age': 34,
        'city': 'Sydney'
    },
    2: {
        'name': 'Riti',
        'age': 30,
        'city': 'Delhi'
    }
}
In [25]:
# Create a DataFrame from the nested dictionary: outer keys become columns, inner keys become rows
dfObj = pd.DataFrame(studentData)
dfObj
Out[25]:
| | 0 | 1 | 2 |
|---|---|---|---|
| age | 16 | 34 | 30 |
| city | New york | Sydney | Delhi |
| name | Aadi | Jack | Riti |
In [26]:
# Transpose the DataFrame (swap rows and columns)
dfObj = dfObj.transpose()
dfObj
Out[26]:
| | age | city | name |
|---|---|---|---|
| 0 | 16 | New york | Aadi |
| 1 | 34 | Sydney | Jack |
| 2 | 30 | Delhi | Riti |
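One side effect of building the frame column-wise and then transposing is that every column ends up with the generic `object` dtype, because each original column mixed strings and integers. The sketch below shows one possible way to handle this: either build the frame row-wise with `from_dict(orient='index')`, or call `infer_objects()` after transposing. The variable names `df_rows`, `df_t`, and `fixed` are illustrative.

```python
import pandas as pd

studentData = {
    0: {'name': 'Aadi', 'age': 16, 'city': 'New york'},
    1: {'name': 'Jack', 'age': 34, 'city': 'Sydney'},
    2: {'name': 'Riti', 'age': 30, 'city': 'Delhi'},
}

# Route 1: treat each outer key as a row directly, with no transpose needed.
df_rows = pd.DataFrame.from_dict(studentData, orient='index')
print(df_rows.dtypes)

# Route 2: build column-wise, then transpose as in the cell above.
df_t = pd.DataFrame(studentData).transpose()
print(df_t['age'].dtype)    # object

# infer_objects() re-detects a better dtype for object columns.
fixed = df_t.infer_objects()
print(fixed.dtypes)
```

Having `age` as an integer column matters as soon as numeric operations such as `mean()` or comparisons are applied to it.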