indexing - Detecting jumps on pandas index dates

August 15, 2014

I managed to load historical data series for a large set of financial instruments, indexed by date.

I am plotting volume and price information without any issue.

What I want to achieve now is to determine whether there are any big jumps in the dates, to see if I am missing large chunks of data.

The idea I had in mind was to somehow plot the difference between two consecutive dates in the index. If that number is greater than 3 or 4 days (which is longer than a weekend plus a bank holiday on a Friday or Monday), then there is an issue.

The problem is that I can't figure out how to simply compute df[next day] - df[day], where df is indexed by day.

You can use the shift Series method (note that the DatetimeIndex shift method shifts by freq instead):

    In [11]: rng = pd.DatetimeIndex(['20120101', '20120102', '20120106'])  # a DatetimeIndex, like df.index

    In [12]: s = pd.Series(rng)  # use df.index instead of rng

    In [13]: s - s.shift()
    Out[13]:
    0                NaT
    1   1 days, 00:00:00
    2   4 days, 00:00:00
    dtype: timedelta64[ns]

    In [14]: s - s.shift() > pd.offsets.Day(3).nanos
    Out[14]:
    0    False
    1    False
    2     True
    dtype: bool
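On recent pandas versions, comparing a timedelta Series against a raw nanosecond count may be rejected, so here is a minimal sketch of the same check using `diff()` and a `pd.Timedelta` threshold. The frame `df` and its `price` column are made up for illustration; only the index matters.

```python
import pandas as pd

# Hypothetical frame standing in for the real data, indexed by date.
df = pd.DataFrame(
    {"price": [10.0, 10.5, 11.0]},
    index=pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"]),
)

# Gap between each date and the one before it; the first entry is NaT.
gaps = df.index.to_series().diff()

# Compare against a Timedelta rather than a raw nanosecond count.
# NaT compares as False, so the first row is never flagged.
is_jump = gaps > pd.Timedelta(days=3)
```

Here `gaps` is labelled by the dates themselves, so a flagged row directly names the date on which the jump ends.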

Depending on what you want, perhaps you could either use any, or find the problematic values...

    In [15]: (s - s.shift() > pd.offsets.Day(3).nanos).any()
    Out[15]: True

    In [16]: s[s - s.shift() > pd.offsets.Day(3).nanos]
    Out[16]:
    2   2012-01-06 00:00:00
    dtype: datetime64[ns]
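If you want to see both ends of each gap rather than only the date where it ends, something along these lines might help. `find_gaps` is a hypothetical helper, not a pandas function, and the 3-day default threshold is just the figure from the question.

```python
import pandas as pd

def find_gaps(index, threshold=pd.Timedelta(days=3)):
    """Return one row per gap longer than `threshold` (hypothetical helper)."""
    # Work on a plain 0..n-1 positional index so rows line up by position.
    dates = index.to_series().reset_index(drop=True)
    report = pd.DataFrame({
        "previous": dates.shift(),  # last date seen before the gap
        "current": dates,           # first date seen after the gap
        "gap": dates.diff(),        # length of the gap
    })
    return report[report["gap"] > threshold]

rng = pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"])
report = find_gaps(rng)  # pass df.index instead of rng
```

An empty result means no gap exceeded the threshold, which is the same information `.any()` gives.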

Or perhaps you could find the maximum jump (and where it is):

    In [17]: (s - s.shift()).max()  # it's weird that this returns a Series...
    Out[17]:
    0   4 days, 00:00:00
    dtype: timedelta64[ns]

    In [18]: (s - s.shift()).idxmax()
    Out[18]: 2
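As a rough sketch of the same idea with the gaps labelled by date instead of by position (assuming a reasonably recent pandas, where `max()` gives back a single `Timedelta`):

```python
import pandas as pd

rng = pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"])

# Gaps labelled by the date they end on; drop the leading NaT.
gaps = rng.to_series().diff().dropna()

largest = gaps.max()     # length of the biggest jump
ends_on = gaps.idxmax()  # date label at which that jump ends
```

Since the labels are dates here, `idxmax()` gives the date after the largest gap rather than its integer position.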

If you wanted to plot this, simply plotting the difference would work:

    (s - s.shift()).plot()
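Converting the gaps to whole days first tends to give a more readable axis. The sketch below assumes matplotlib is installed; `plot_gaps` and the dashed threshold line are illustrative additions, not part of the original answer.

```python
import pandas as pd

rng = pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"])

# Gap lengths in whole days, labelled by the date each gap ends on.
gap_days = rng.to_series().diff().dropna().dt.days

def plot_gaps(gap_days, threshold_days=3):
    """Plot gap lengths with a reference line (hypothetical helper)."""
    ax = gap_days.plot(marker="o", title="Days since previous date")
    ax.axhline(threshold_days, linestyle="--")  # anything above is suspect
    ax.set_ylabel("days")
    return ax
```

Calling `plot_gaps(gap_days)` draws the chart; points above the dashed line are the dates worth checking.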
