getsizeof always 40 in Python for each field in a CSV
So I'm reading a CSV file row by row. Inside each row I go field by field and try to get the size of each field in bytes using sys.getsizeof. Code below:
    # (reader is a csv.reader and sys has been imported earlier in the script;
    #  IntType comes from the types module)
    for row in reader:
        temp1 = []
        temp2 = []
        if type(row[0]) is IntType:
            feed = feed + 1
            print feed
        # total number of columns in feed should be 61; the 61st column
        # accounts for the last ',' after the 60th column, which is blank
        #if len(row) == 61:
        for field in row:
            if type(field) == 'int':
                field.encode('ascii', 'ignore')
                temp1.append(sys.getsizeof(field))
                temp2.append(str(field))
            else:
                field = [unicode(field)]
                #field = field.encode('ascii', 'ignore')
                temp1.append(sys.getsizeof(field))
                temp2.append(str(field))
For some reason the size of every field, across all rows, is coming out as 40. Any idea why?
First, this:

    field.encode('ascii', 'ignore')

… doesn't do anything useful. It doesn't change field; it returns a new bytes object holding the ASCII-encoded version of field, which you then don't store anywhere.
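A quick interpreter session (Python 2, like the question's code; the sample string is just for illustration) shows that:

    >>> field = u'h\xe9llo'
    >>> field.encode('ascii', 'ignore')   # returns a new, encoded string...
    'hllo'
    >>> field                             # ...while field itself is untouched
    u'h\xe9llo'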
On top of that, you only call it when field is an int, in which case you'll get an AttributeError, because int objects don't have an encode method.
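If that branch ever did run with an int, it would fail like this:

    >>> (1).encode('ascii', 'ignore')
    Traceback (most recent call last):
      ...
    AttributeError: 'int' object has no attribute 'encode'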
So, clearly, you're hitting the else case every time. (You can tell because you aren't seeing that AttributeError; besides, type(field) == 'int' compares a type object to a string, which is never true.) And what does the else case do?
Well, it makes a 1-element list. And you're asking not for the size of the element, but for the size of the list. Every 1-element list has the same fixed size, regardless of what it contains, so they'll all come out the same.
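You can check this directly. The exact number depends on the platform and Python build (40 matches a 1-element list on the asker's build), but it's the same for every 1-element list:

    import sys

    print sys.getsizeof([u'0'])
    print sys.getsizeof([u'a much, much longer field value'])
    # both print the same number, because getsizeof counts only the
    # list object itself, never the element it refers to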
As the docs explain:

    Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
So, what if you want to know the size of the list, plus that of its elements? A couple of paragraphs further down, the docs point to a general-purpose solution:

    See the recursive sizeof recipe for an example of using getsizeof() recursively to find the size of containers and all of their contents.
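The full recipe handles more cases; a minimal sketch of the same idea (the helper name total_size comes from that recipe, but the body here is simplified to the common container types) looks like this:

    import sys

    def total_size(obj, seen=None):
        """Sketch of a recursive getsizeof: the object plus everything it refers to."""
        if seen is None:
            seen = set()
        if id(obj) in seen:        # don't count shared objects twice
            return 0
        seen.add(id(obj))
        size = sys.getsizeof(obj)
        if isinstance(obj, dict):
            for k, v in obj.items():
                size += total_size(k, seen) + total_size(v, seen)
        elif isinstance(obj, (list, tuple, set, frozenset)):
            for item in obj:
                size += total_size(item, seen)
        return size

    print total_size([u'0'])  # list overhead plus the unicode element inside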
However, in your case, there's a simpler solution:

    sys.getsizeof(field) + sys.getsizeof(field[0])

will do.
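Alternatively, skip the wrapper list entirely and measure each field directly; a sketch of the loop rewritten that way (keeping the question's temp1/temp2 lists):

    for row in reader:
        temp1 = []
        temp2 = []
        for field in row:
            value = unicode(field)
            temp1.append(sys.getsizeof(value))  # size of the string itself
            temp2.append(str(field))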
But note that this may not be what you want anyway. For example, if you have 1000 rows, and 900 of them have the value '0', you're going to count u'0' 900 times… when really you may be storing just 1 copy of u'0' and 900 references to it.
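Here's a toy illustration of that distinction (whether your real CSV fields actually share objects depends on interning and caching details of your Python build):

    import sys

    s = u'0'
    fields = [s] * 900              # 900 references to a single unicode object

    # summing per-field sizes counts u'0' 900 times...
    print sum(sys.getsizeof(f) for f in fields)
    # ...but the real footprint is one unicode object plus a list of references
    print sys.getsizeof(s) + sys.getsizeof(fields)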