python - MultiIndex and DateTime
I have a sorted CSV dataset with 4 columns that I want to use as a MultiIndex, including 2 datetime columns:
    alex,beta,2011-03-01 00:00:00,2011-03-03 00:00:00,a,8,11.4
    alex,beta,2011-03-03 00:00:00,2011-03-05 00:00:00,b,10,17.2
    alex,beta,2011-03-05 00:00:00,2011-03-07 00:00:00,a,3,11.4
    alex,beta,2011-03-07 00:00:00,2011-03-09 00:00:00,b,7,17.2
    alex,orion,2011-03-02 00:00:00,2011-03-04 00:00:00,a,4,11.4
    alex,orion,2011-03-03 00:00:00,2011-03-05 00:00:00,b,6,17.2
    alex,orion,2011-03-04 00:00:00,2011-03-06 00:00:00,a,3,11.4
    alex,orion,2011-03-05 00:00:00,2011-03-07 00:00:00,b,11,17.2
    alex,zzyzx,2011-03-02 00:00:00,2011-03-05 00:00:00,a,10,11.4
    alex,zzyzx,2011-03-04 00:00:00,2011-03-07 00:00:00,a,15,11.4
    alex,zzyzx,2011-03-06 00:00:00,2011-03-09 00:00:00,b,20,17.2
    alex,zzyzx,2011-03-08 00:00:00,2011-03-11 00:00:00,b,5,17.2

I can load it with read_csv and display the DataFrame hierarchically, but indexing it is another matter. As near as I can tell, pandas doesn't like using the datetime levels as indexes here. If I comment out the datetime labels in index_col and the corresponding entries in the indexing statement (df.loc), it works fine.

Any ideas?
    #!/usr/bin/env python
    import numpy as np
    import pandas as pd

    pd.set_option('display.width', 400)
    pd.set_option('display.max_rows', 1000)
    pd.set_option('display.max_columns', 30)

    try:
        df = pd.read_csv(
            './sales.csv',
            header = None,
            na_values = ['null'],
            names = [
                'salesperson',
                'customer',
                'invoice_date',
                'ship_date',
                'product',
                'quantity',
                'price',
            ],
            index_col = [
                'salesperson',
                'customer',
                'invoice_date',
                'ship_date',
            ],
            parse_dates = [
                'invoice_date',
                'ship_date',
            ],
        )
    except Exception as e:
        print(e)

    try:
        print(df)
        print(df.loc[(
            'alex',                 # salesperson
            'zzyzx',                # customer
            '2011-03-02 00:00:00',  # invoice_date
            '2011-03-05 00:00:00',  # ship_date
        )])
    except Exception as e:
        print(e)
The loading part seems to work fine; I'm getting a proper DataFrame. Although I would try to avoid the empty entries in every list.

If you use parse_dates, you should access those index levels with proper datetime objects rather than strings:
    df.loc[('alex', 'zzyzx', pd.Timestamp(2011, 3, 2), pd.Timestamp(2011, 3, 5))]

    product        a
    quantity      10
    price       11.4
    Name: (alex, zzyzx, 2011-03-02 00:00:00, 2011-03-05 00:00:00), dtype: object
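
For anyone who wants to try this without a sales.csv on disk, here is a minimal self-contained sketch of the same lookup. It makes a few assumptions that are not in the question: the data is read from an in-memory string (three of the zzyzx rows only), the index is built with set_index after loading instead of index_col, and the keys are pd.Timestamp objects, which are a datetime subclass, so the idea is the same as above.

```python
import io

import pandas as pd

# Three rows copied from the question's sample data (assumption: a subset is
# enough to demonstrate the lookup).
CSV_TEXT = """\
alex,zzyzx,2011-03-02 00:00:00,2011-03-05 00:00:00,a,10,11.4
alex,zzyzx,2011-03-04 00:00:00,2011-03-07 00:00:00,a,15,11.4
alex,zzyzx,2011-03-06 00:00:00,2011-03-09 00:00:00,b,20,17.2
"""

names = ['salesperson', 'customer', 'invoice_date', 'ship_date',
         'product', 'quantity', 'price']

# Parse the two date columns while reading, then move the four key columns
# into a MultiIndex.
df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    header=None,
    names=names,
    parse_dates=['invoice_date', 'ship_date'],
)
df = df.set_index(['salesperson', 'customer', 'invoice_date', 'ship_date'])

# The date levels hold datetime values, so the key uses Timestamps, not strings.
date_level_dtype = df.index.get_level_values('invoice_date').dtype
print(date_level_dtype)

key = ('alex', 'zzyzx', pd.Timestamp(2011, 3, 2), pd.Timestamp(2011, 3, 5))
row = df.loc[key]
print(row)
```

Because the full four-part key matches exactly one row, row comes back as a Series holding product, quantity and price for that invoice.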