data.table is making a copy of the table in R
I am doing this:
myfun <- function(inputvar_vec) {
  # inputvar_vec is the input vector
  #
  # result = output vector
  return(result)
}
dt[, result := lapply(.SD, myfun), by = byvar, .SDcols = inputvar]
I am getting the following warning:
Warning message:
In `[.data.table`(df1, , `:=`(prop, lapply(.SD, propeventinlastk)), :
  Invalid .internal.selfref detected and fixed by taking a copy of the whole table, so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or been created manually using structure() or similar). (and more stuff) ....
My guess is that, because I am stacking the result vectors (after the by operation), a copy is being made?
Can anyone suggest a method to remove this warning? I have done this using the apply functions and thought it should be extendable here too.
My other question is: can I pass a chunk of rows from the data frame (subsetted using the by statement) and call the function myfun to operate on that?
Adding an example, as requested:
library(data.table)
library(fts)  # fts() and moving.sum() are assumed to come from the fts package

# generate data
n = 10000
default = NA
value = 1
df = data.table(id = sample(1:5000, n, replace = TRUE),
                trial = sample(c(0, 1, 2), n, replace = TRUE),
                ts = sample(1:200, n, replace = TRUE))
# set keys
setkeyv(df, c("id", "ts"))
df[["trial"]] = as.numeric(df[["trial"]] == value)
# note: k (the window length) is not defined in this snippet
testfun <- function(x) {
  l = length(x)
  x = x[l:1]
  x = fts(data = x)
  y = rep(default, l)
  if (l >= k) {
    y1 = as.numeric(moving.sum(x, k))
    y = c(y1, rep(default, l - length(y1)))
  }
  return(y[l:1] / k)
}
df[, prop := lapply(.SD, testfun), by = id, .SDcols = "trial"]
I am still getting the same warning message:
Warning message:
In `[.data.table`(df, , `:=`(prop, lapply(.SD, testfun)), by = id, :
  Invalid .internal.selfref detected and fixed by taking a copy of the whole table, so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or been created manually using structure() or similar). Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: setkey(), setnames() and setattr(). Also, list(DT1,DT2) will copy the entire DT1 and DT2 (R's list() copies named objects), use reflist() instead if needed (to be implemented). If this message doesn't help, please report to datatable-help so the root cause can be fixed.
The issue arises in
df[["trial"]] = as.numeric(df[["trial"]]==value)
which is not a data.table approach.
A data.table approach would be to use :=
df[, trial := as.numeric(trial == value)]
which should avoid the issue.
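As a rough illustration of the := route, here is a minimal sketch on smaller data. It assumes a simple stand-in function, running_prop (hypothetical, not the asker's testfun), so that it runs without the fts package, and it uses data.table's address() to check that the table is still the same object after both updates.

```r
library(data.table)

set.seed(42)
n <- 1000
value <- 1

dt <- data.table(id    = sample(1:50, n, replace = TRUE),
                 trial = sample(c(0, 1, 2), n, replace = TRUE),
                 ts    = sample(1:200, n, replace = TRUE))
setkeyv(dt, c("id", "ts"))

# Memory address of the table before any update
addr_before <- address(dt)

# Update the existing column by reference instead of using [[<-
dt[, trial := as.numeric(trial == value)]

# Hypothetical stand-in for testfun: running proportion of 1s within each group
running_prop <- function(x) cumsum(x) / seq_along(x)

# Add the new column by reference, one group at a time
dt[, prop := lapply(.SD, running_prop), by = id, .SDcols = "trial"]

# Compare the address after the updates with the one recorded earlier
addr_after <- address(dt)
print(identical(addr_before, addr_after))
```

Because every modification goes through :=, nothing in this sketch hands the table to a base R replacement function, which is the step the warning complains about.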
To understand why copies are made (and why internal self-references may be voided), see "Understanding exactly when a data.table is a reference to (vs a copy of) another data.table".
It is important to realize that there is no [[<- method for data.tables, so [[<-.data.frame is called. This copies the entire object and does not do any of the careful things that a data.table method (such as [<-.data.table) would do, such as returning a valid data.table.
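One way to look at this yourself is to compare the table's memory address before and after each kind of assignment. The sketch below is only an illustration on toy data (dt2 and dt3 are made-up names), and what it prints may depend on your R and data.table versions, so no particular output is claimed.

```r
library(data.table)

value <- 1

# Base R style: the replacement dispatches to `[[<-.data.frame`
dt2 <- data.table(id = 1:6, trial = c(0, 1, 2, 1, 0, 1))
addr2_start <- address(dt2)
dt2[["trial"]] <- as.numeric(dt2[["trial"]] == value)
# Is dt2 still the same object in memory?
print(identical(addr2_start, address(dt2)))

# data.table style: := updates the column by reference
dt3 <- data.table(id = 1:6, trial = c(0, 1, 2, 1, 0, 1))
addr3_start <- address(dt3)
dt3[, trial := as.numeric(trial == value)]
# Is dt3 still the same object in memory?
print(identical(addr3_start, address(dt3)))
```

Both routes end up with the same values in trial; the difference is in how the object is handled on the way there.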