indexing - Detecting jumps on pandas index dates


I managed to load historical data as data series for a large set of financial instruments, indexed by date.

I am plotting volume and price information without any issue.

What I want to achieve is to determine whether there is any big jump in the dates, to see if I am missing large chunks of data.

The idea I had in mind was to somehow plot the difference between two consecutive dates in the index; if that number is greater than 3 or 4 (which is bigger than a weekend plus a bank holiday on a Friday or Monday), then there is an issue.

The problem is that I can't figure out how to compute df[next day] - df[day], where df is indexed by day.

You can use the shift Series method (note that the DatetimeIndex shift method shifts by freq instead):

In [11]: rng = pd.DatetimeIndex(['20120101', '20120102', '20120106'])  # a DatetimeIndex like df.index

In [12]: s = pd.Series(rng)  # df.index instead of rng

In [13]: s - s.shift()
Out[13]:
0                NaT
1   1 days, 00:00:00
2   4 days, 00:00:00
dtype: timedelta64[ns]

In [14]: s - s.shift() > pd.offsets.Day(3).nanos
Out[14]:
0    False
1    False
2     True
dtype: bool
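The session above reflects the pandas version in use when the answer was written. On recent pandas releases, comparing a timedelta Series with a plain integer count of nanoseconds may be rejected, so here is a minimal sketch of the same check using `pd.Timedelta` as the threshold. The frame `df` and its `price` column are hypothetical stand-ins for the real data.

```python
import pandas as pd

# Hypothetical stand-in for the real price/volume frame, indexed by date.
df = pd.DataFrame(
    {"price": [10.0, 10.5, 11.0]},
    index=pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"]),
)

# Wrap the index in a Series so that shift() moves the values by one row
# instead of shifting the dates by a frequency.
dates = pd.Series(df.index)

# Difference between each date and the previous one; the first entry is NaT.
gaps = dates - dates.shift()

# Compare against a Timedelta rather than an integer number of nanoseconds.
is_jump = gaps > pd.Timedelta(days=3)
print(is_jump)
```

The NaT in the first row compares as False, so it never shows up as a jump.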

Depending on what you want, perhaps you could either use any, or find the problematic values...

In [15]: (s - s.shift() > pd.offsets.Day(3).nanos).any()
Out[15]: True

In [16]: s[s - s.shift() > pd.offsets.Day(3).nanos]
Out[16]:
2   2012-01-06 00:00:00
dtype: datetime64[ns]
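Selecting the problematic values only gives the date on which each jump ends. If the start of each gap is also useful, something along these lines could report both. The helper name `find_date_gaps`, its `max_gap` parameter and the column names are illustrative choices, not part of pandas.

```python
import pandas as pd

def find_date_gaps(index, max_gap=pd.Timedelta(days=3)):
    """Return one row per jump larger than max_gap between consecutive dates.

    Hypothetical helper: `index` is any date-like index such as df.index.
    """
    # Sort first so that "consecutive" means consecutive in time.
    dates = pd.Series(index).sort_values().reset_index(drop=True)
    gaps = dates.diff()  # same result as dates - dates.shift()
    mask = gaps > max_gap
    report = pd.DataFrame(
        {
            "previous_date": dates.shift()[mask],
            "next_date": dates[mask],
            "gap": gaps[mask],
        }
    )
    return report.reset_index(drop=True)

rng = pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"])
report = find_date_gaps(rng)
print(report)
```

An empty result means no jump exceeded the threshold, which plays the same role as the `any()` check above returning False.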

Or perhaps you could find the maximum jump (and where it is):

In [17]: (s - s.shift()).max()  # it's weird that this returns a Series...
Out[17]:
0   4 days, 00:00:00
dtype: timedelta64[ns]

In [18]: (s - s.shift()).idxmax()
Out[18]: 2
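As a rough sketch of how the position returned by `idxmax` maps back to actual dates, assuming `s` keeps its default 0..n-1 integer index as in the session above (the variable names here are illustrative):

```python
import pandas as pd

s = pd.Series(pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"]))
gaps = s - s.shift()

largest = gaps.max()        # size of the biggest jump
position = gaps.idxmax()    # label of the row where that jump ends

# These lookups rely on s having a default integer index, so that
# position - 1 is the label of the previous row.
gap_ends_on = s.loc[position]
gap_starts_on = s.loc[position - 1]

print(largest, gap_starts_on, gap_ends_on)
```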

If you wanted to plot this, simply plotting the difference would work:

(s - s.shift()).plot() 
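For a chart that is easier to read, one option is to convert the differences to a number of days and index them by the date on which each gap ends, so the x-axis shows dates and the y-axis shows days. This is only a sketch; the name `gap_days` is illustrative, and the plotting call is left commented out because it assumes matplotlib is installed.

```python
import pandas as pd

s = pd.Series(pd.DatetimeIndex(["2012-01-01", "2012-01-02", "2012-01-06"]))
gaps = s - s.shift()

# Express each gap as a float number of days; the leading NaT becomes NaN.
gap_days = pd.Series(
    (gaps.dt.total_seconds() / 86400).to_numpy(),
    index=pd.DatetimeIndex(s),
    name="gap_days",
)
print(gap_days)

# gap_days.plot(marker="o")  # assumes matplotlib is available
```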
