R - Subset variables in a data frame based on column type
I need to subset a data frame based on column type. For example, from a data frame with 100 columns I need to keep only the columns of type factor or integer. I've written a short function to do this, but is there a simpler solution, a built-in function, or a package on CRAN?
My current solution returns the names of the variables that have the requested types:
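varlist <- function(df = NULL, vartypes = NULL) {
  # Map each supported type name to the name of its predicate function
  type_function <- c("is.factor", "is.integer", "is.numeric", "is.character", "is.double", "is.logical")
  names(type_function) <- c("factor", "integer", "numeric", "character", "double", "logical")
  # Per column, count how many requested predicates return TRUE,
  # then keep the names of the columns with a non-zero count
  names(df)[as.logical(sapply(lapply(names(df), function(y)
    sapply(type_function[names(type_function) %in% vartypes],
           function(x) do.call(x, list(df[[y]])))), sum))]
}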
The function varlist works as follows:
- For every requested type and every column in the data frame, call the corresponding "is.type" function
- Sum the test results for every variable (booleans are cast to integers automatically)
- Cast the result back to a logical vector
- Use it to subset the names of the data frame
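The steps above can be sketched in a more spread-out form. This is only an illustration, not the original function: the `toy` data frame and the `tests` list are hypothetical names made up for the example.

```r
# Toy data frame (hypothetical, for illustration only)
toy <- data.frame(
  id    = 1:3,                       # integer
  score = c(0.5, 1.5, 2.5),          # double
  grp   = factor(c("a", "b", "a")),  # factor
  flag  = c(TRUE, FALSE, TRUE)       # logical
)

# The requested types, mapped to their predicate functions
tests <- list(integer = is.integer, logical = is.logical)

# Steps 1 and 2: for every column, run every requested "is.type" test
# and sum the results (TRUE counts as 1, FALSE as 0)
counts <- vapply(
  toy,
  function(col) sum(vapply(tests, function(f) f(col), logical(1))),
  integer(1)
)

# Step 3: cast the counts back to a logical vector
keep <- as.logical(counts)

# Step 4: subset the names of the data frame
selected <- names(toy)[keep]
print(selected)
```

With this toy data the selected names should be `id` and `flag`, since `is.integer` returns FALSE for factors.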
And some data to test it:
df <- read.table(file = "http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data",
                 sep = " ", header = FALSE, stringsAsFactors = TRUE)
names(df) <- c('ca_status', 'duration', 'credit_history', 'purpose', 'credit_amount', 'savings',
               'present_employment_since', 'installment_rate_income', 'status_sex', 'other_debtors',
               'present_residence_since', 'property', 'age', 'other_installment', 'housing',
               'existing_credits', 'job', 'liable_maintenance_people', 'telephone', 'foreign_worker', 'gb')
# Recode the target as logical and turn one factor into character to get more column types
df$gb <- ifelse(df$gb == 2, FALSE, TRUE)
df$property <- as.character(df$property)
varlist(df, c("integer", "logical"))
I'm asking because my code looks cryptic and is hard to understand (even for me, and I finished the function 10 minutes ago).
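subset_colclasses <- function(df, colclasses = "numeric") {
  # any() handles columns with more than one class, e.g. ordered factors
  keep <- sapply(df, function(vec, test) any(class(vec) %in% test), test = colclasses)
  # drop = FALSE keeps a data frame even when only one column matches
  df[, keep, drop = FALSE]
}
str(subset_colclasses(df, c("factor", "integer")))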
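As for built-in options, base R's `Filter` can select the columns of a single type directly, and `inherits` accepts a vector of class names. A minimal sketch, assuming a small made-up data frame (`toy` and `wanted` are hypothetical names, not part of the question's data):

```r
# Toy data frame (hypothetical, for illustration only)
toy <- data.frame(
  id    = 1:3,
  score = c(0.5, 1.5, 2.5),
  grp   = factor(c("a", "b", "a")),
  flag  = c(TRUE, FALSE, TRUE)
)

# One type: Filter applies the predicate to each column and keeps the matches
only_factors <- Filter(is.factor, toy)

# Several classes: inherits() is TRUE if the column has any of the given classes
wanted <- c("factor", "integer")
keep <- vapply(toy, function(col) inherits(col, wanted), logical(1))
factors_and_integers <- toy[, keep, drop = FALSE]

str(only_factors)
str(factors_and_integers)
```

Both results are data frames; to get only the variable names, as `varlist` does, wrap either one in `names()`.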