Reshaping data frames in R

2020-09-14 14:21:12

Converting wide data to long data

1. Build the data frame df

x is a factor variable; 2010 and 2011 are numeric variables.
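The original post does not show the code that builds df, so the following is only a minimal sketch of a data frame with that shape. The values and the levels of x are made up for illustration; only the column names x, 2010 and 2011 come from the text.

```r
# Hypothetical example data: the values and factor levels are assumptions,
# only the column layout (x, 2010, 2011) follows the post.
df <- data.frame(
  x = factor(c("A", "B", "C")),
  `2010` = c(1, 2, 3),
  `2011` = c(4, 5, 6),
  check.names = FALSE  # keep "2010"/"2011" as names instead of "X2010"/"X2011"
)

# Inspect the structure: x should be a factor, the year columns numeric.
str(df)
```

Setting check.names = FALSE matters here because column names that start with a digit are not syntactic R names and would otherwise be rewritten.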

2. Use reshape2::melt to convert the two-dimensional (wide) data into one-dimensional (long) data

df_melt<-reshape2::melt(df,id.vars="x",variable.name="year",value.name="value")

Arguments

data

data frame to melt

id.vars

vector of id variables. Can be integer (variable position) or string (variable name). If blank, will use all non-measured variables. This names the grouping variable(s); in the call above it is x, the factor column of the original df.

measure.vars

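vector of measured variables. Can be integer (variable position) or string (variable name). If blank, will use all non id.vars. This names the columns that hold measured values; if it is not given, every column other than id.vars is treated as a measure. The call above does not set measure.vars, so both 2010 and 2011 are taken as measured values.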

variable.name

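name of variable used to store measured variable names. This sets the name of the new column that holds the former column names (2010 and 2011); in the call above it is "year".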

...

further arguments passed to or from other methods.

na.rm

Should NA values be removed from the data set? This will convert explicit missings to implicit missings. In other words, rows whose value is NA are dropped.

value.name

name of variable used to store values. This sets the name of the value column; in the call above it is "value".

factorsAsStrings

Control whether factors are converted to character when melted as measure variables. When FALSE, coercion is forced if levels are not identical across the measure.vars.

df_melt after the melt transformation

Converting long data to wide data

Convert the df_melt above back into a wide data frame:
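To make the effect of measure.vars and na.rm concrete, here is a small sketch. It assumes reshape2 is installed and uses made-up values, including one NA that is not in the original post.

```r
library(reshape2)

# Hypothetical data with one missing value in the 2010 column.
df <- data.frame(
  x = factor(c("A", "B", "C")),
  `2010` = c(1, 2, NA),
  `2011` = c(4, 5, 6),
  check.names = FALSE
)

# Naming measure.vars explicitly gives the same result here as leaving it
# blank, because 2010 and 2011 are the only columns other than id.vars.
df_melt <- melt(df, id.vars = "x", measure.vars = c("2010", "2011"),
                variable.name = "year", value.name = "value")

# With na.rm = TRUE the row whose value is NA is dropped.
df_melt_no_na <- melt(df, id.vars = "x", variable.name = "year",
                      value.name = "value", na.rm = TRUE)

print(df_melt)
print(df_melt_no_na)
```

Each row of the long result is one (x, year) pair, so a 3 x 2 block of measured values becomes 6 rows, or 5 once the NA row is removed.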

df_cast<-reshape2::dcast(df_melt,x~year,value.var="value")
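As a quick check that dcast undoes melt, the round trip can be sketched as follows. The data values are hypothetical and reshape2 is assumed to be installed.

```r
library(reshape2)

# Hypothetical wide data, same layout as df in the post.
df <- data.frame(
  x = factor(c("A", "B", "C")),
  `2010` = c(1, 2, 3),
  `2011` = c(4, 5, 6),
  check.names = FALSE
)

# Wide -> long: one row per (x, year) pair.
df_melt <- melt(df, id.vars = "x", variable.name = "year", value.name = "value")

# Long -> wide: x stays as rows, each level of year becomes a column,
# and the cells are filled from the value column.
df_cast <- dcast(df_melt, x ~ year, value.var = "value")

print(df_cast)
```

In the formula x ~ year, the left-hand side names the variables kept as rows and the right-hand side names the variable whose levels are spread into columns.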

dcast arguments

Arguments

formula

casting formula, see details for specifics.

fun.aggregate

aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified.

...

further arguments are passed to aggregating function

margins

vector of variable names (can include "grand_col" and "grand_row") to compute margins for, or TRUE to compute all margins. Any variables that cannot be margined over will be silently dropped.

subset

quoted expression used to subset data prior to reshaping, e.g. subset = .(variable=="length").

fill

value with which to fill in structural missings, defaults to value from applying fun.aggregate to 0 length vector

drop

should missing combinations be dropped or kept?

value.var

name of column which stores values; in the call above it is "value".
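The fun.aggregate and fill arguments only matter when the long data has duplicate or missing combinations, which the example in this post does not. A minimal sketch with made-up data, assuming reshape2 is installed:

```r
library(reshape2)

# Hypothetical long data: ("A", "2010") appears twice, while ("A", "2011")
# and ("B", "2010") do not appear at all.
long <- data.frame(
  x = c("A", "A", "B"),
  year = c("2010", "2010", "2011"),
  value = c(1, 2, 5)
)

# sum() collapses the duplicated cell into a single number.
# fill = 0 is the value used for combinations with no rows; for sum this
# matches the default, since sum over a zero-length vector is 0.
wide_sum <- dcast(long, x ~ year, value.var = "value",
                  fun.aggregate = sum, fill = 0)

print(wide_sum)
```

Without fun.aggregate, dcast would fall back to length for the duplicated cell and report counts instead of values, so it is usually worth naming the aggregation explicitly.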
