algorithm - Python recursive setattr()-like function for working with nested dictionaries


There are a lot of getattr()-like functions for parsing nested dictionary structures.

I would like to make a parallel setattr(). Essentially, given:

cmd = 'f[0].a'
val = 'whatever'
x = {"a":"stuff"}

I'd like to produce a function such that I can assign:

x['f'][0]['a'] = val 

More or less, this would work the same way as:

setattr(x,'f[0].a',val) 

to yield:

>>> x
{"a":"stuff","f":[{"a":"whatever"}]}

I'm calling this setbydot():

setbydot(x,'f[0].a',val) 

One problem is that if a key in the middle doesn't exist, you need to check for it and make an intermediate key if it doesn't exist, i.e., for the above:

>>> x = {"a":"stuff"}
>>> x['f'][0]['a'] = val
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'f'

So, you first have to make:

>>> x['f']=[{}]
>>> x
{'a': 'stuff', 'f': [{}]}
>>> x['f'][0]['a']=val
>>> x
{'a': 'stuff', 'f': [{'a': 'whatever'}]}

Another problem is that the keying when the next item is a list is different from the keying when the next item is a string, i.e.:

>>> x = {"a":"stuff"}
>>> x['f']=['']
>>> x['f'][0]['a']=val
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

...fails because the placeholder was a null string instead of a null dict. The null dict is the right placeholder for every non-list in the dict until the very last one, which may be a list or a value.
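The look-ahead described above can be sketched as follows. This is only an illustration: `placeholder_for` is a hypothetical helper name, and the path is assumed to have already been split into keys, with list indexes converted to ints.

```python
def placeholder_for(next_key):
    """Return an empty container suited to the key that will be applied next.

    Hypothetical helper: an int key means the next step indexes a list,
    anything else means it looks up a dict key.
    """
    return [] if isinstance(next_key, int) else {}


x = {"a": "stuff"}
path = ["f", 0, "a"]  # 'f[0].a', assumed to be split into keys already

x["f"] = placeholder_for(path[1])        # next key is 0, so make a list
x["f"].append(placeholder_for(path[2]))  # next key is 'a', so make a dict
x["f"][0]["a"] = "whatever"              # last key takes the value itself
```

The point is that the placeholder at each level is chosen from the key that follows it, never from the key being created.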

A second problem, pointed out in the comments below by @tokenmacguy, is that when you have to create a list that does not exist, you may have to create an awful lot of blank values. So,

setattr(x,'f[10].a',val) 

...may mean the algorithm has to make an intermediate like:

>>> x['f']=[{},{},{},{},{},{},{},{},{},{},{}]
>>> x['f'][10]['a']=val

to yield

>>> x
{"a":"stuff","f":[{},{},{},{},{},{},{},{},{},{},{"a":"whatever"}]}

such that the setter can be paired with a getter...

>>> getbydot(x,"f[10].a")
"whatever"

More importantly, the intermediates should /not/ overwrite values that already exist.
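One way to pad a list up to a required index without touching what is already there might look like this. `ensure_index` and its `factory` parameter are names made up for this sketch, not part of any library.

```python
def ensure_index(lst, index, factory=dict):
    """Grow lst until lst[index] exists, leaving current items untouched.

    Hypothetical helper: factory builds each blank filler value.
    """
    missing = index + 1 - len(lst)
    if missing > 0:
        # Only append; nothing that is already in the list is replaced.
        lst.extend(factory() for _ in range(missing))
    return lst[index]


existing = [{"keep": "me"}]
slot = ensure_index(existing, 3)  # pads indexes 1..3 with empty dicts
slot["a"] = "whatever"
```

Because the helper only ever appends, the value at index 0 survives the padding.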

Below is the junky idea I have so far. I can identify the lists versus dicts and other data types, and create them where they do not exist. However, I don't see (a) where to put the recursive call, (b) how to 'build' the deep object as I iterate through the list, or (c) how to distinguish the /probing/ I'm doing as I construct the deep object from the /setting/ I have to do when I reach the end of the stack.

def setbydot(obj,ref,newval):
    ref = ref.replace("[",".[")
    cmd = ref.split('.')
    numkeys = len(cmd)
    count = 0
    for c in cmd:
        count = count+1
        while count < numkeys:
            if c.find("["):
                idstart = c.find("[")
                numend = c.find("]")
                try:
                    deep = obj[int(idstart+1:numend-1)]
                except:
                    obj[int(idstart+1:numend-1)] = []
                    deep = obj[int(idstart+1:numend-1)]
            else:
                try:
                    deep = obj[c]
                except:
                    if obj[c] isinstance(dict):
                        obj[c] = {}
                    else:
                        obj[c] = ''
                    deep = obj[c]
        setbydot(deep,c,newval)

This seems very tricky because you have to look ahead to check the type of the /next/ object if you're making place-holders, and you have to look behind to build the path up as you go.
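For what it's worth, the probing/setting split can be done without recursion. The following is a rough sketch under my own assumptions, not a finished solution: `parse_path` and `setbydot_sketch` are hypothetical names, and paths are assumed to use only `name`, `.name` and `[digits]`.

```python
import re

# One token per path step: digits inside square brackets, or a bare name.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def parse_path(ref):
    """Split 'f[0].a' into ['f', 0, 'a']; list indexes become ints."""
    return [int(idx) if idx else name for idx, name in _TOKEN.findall(ref)]


def setbydot_sketch(obj, ref, newval):
    """Walk ref iteratively, creating missing containers, then assign newval."""
    keys = parse_path(ref)
    # Probing phase: every key except the last only descends or creates.
    for key, next_key in zip(keys, keys[1:]):
        # Look ahead: an int next key needs a list, anything else a dict.
        make_empty = list if isinstance(next_key, int) else dict
        if isinstance(key, int):
            # Pad the list until the index exists; existing items are kept.
            while len(obj) <= key:
                obj.append(make_empty())
        elif key not in obj:
            obj[key] = make_empty()
        obj = obj[key]
    # Setting phase: only the last key receives the value.
    last = keys[-1]
    if isinstance(last, int):
        while len(obj) <= last:
            obj.append(None)
    obj[last] = newval


x = {"a": "stuff"}
setbydot_sketch(x, "f[0].a", "whatever")
setbydot_sketch(x, "f[2].b", "more")  # pads index 1, keeps index 0
```

Pairing each key with its successor via `zip(keys, keys[1:])` gives the look-ahead, and keeping the last key out of the loop separates probing from setting. The sketch does not handle a path that asks for a dict key on an existing list, or the reverse.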

Update

I had this question answered, too, which might be relevant or helpful.

I have separated this out into two steps. In the first step, the query string is broken down into a series of instructions. This way the problem is decoupled, we can view the instructions before running them, and there is no need for recursive calls.

def build_instructions(obj, q):
    """
    Breaks down a query string into a series of actionable instructions.

    Each instruction is a (_type, arg) tuple.
    arg -- The key used for the __getitem__ or __setitem__ call on
           the current object.
    _type -- Used to determine the data type for the value of
             obj.__getitem__(arg)

    If a key/index is missing, _type is used to initialize an empty value.
    """
    arg = []
    _type = None
    instructions = []
    for i, ch in enumerate(q):
        if ch == "[":
            # Begin list query
            if _type is not None:
                arg = "".join(arg)
                if _type == list and arg.isalpha():
                    _type = dict
                instructions.append((_type, arg))
                _type, arg = None, []
            _type = list
        elif ch == ".":
            # Begin dict query
            if _type is not None:
                arg = "".join(arg)
                if _type == list and arg.isalpha():
                    _type = dict
                instructions.append((_type, arg))
                _type, arg = None, []

            _type = dict
        elif ch.isalnum():
            if i == 0:
                # Query begins with alphanum; the first access is on obj itself
                _type = type(obj)

            # Fill out args
            arg.append(ch)
        elif ch == "]":
            # A closing bracket carries no information; skip it
            continue
        else:
            raise TypeError("Unrecognized character: {}".format(ch))

    if _type is not None:
        # Finish up the last query
        instructions.append((_type, "".join(arg)))

    return instructions

For example:

>>> x = {"a": "stuff"}
>>> print(build_instructions(x, "f[0].a"))
[(<class 'dict'>, 'f'), (<class 'list'>, '0'), (<class 'dict'>, 'a')]

The expected return value is simply the _type (first item) of the next tuple in the instructions. This is very important because it allows us to correctly initialize/reconstruct missing keys.

This means that our first instruction operates on a dict, either sets or gets the key 'f', and is expected to return a list. Similarly, our second instruction operates on a list, either sets or gets the index 0, and is expected to return a dict.
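To make the "next tuple" relationship concrete, here is a small sketch that pairs each instruction with the one after it. The instruction list is written out by hand to match the example output above, and `steps` is just an illustrative name.

```python
# Instructions for "f[0].a" on a dict, written out by hand.
instructions = [(dict, "f"), (list, "0"), (dict, "a")]

# Pair each instruction with its successor: the successor's _type is
# the type the current lookup is expected to return.
steps = [
    (operates_on.__name__, arg, returns.__name__)
    for (operates_on, arg), (returns, _) in zip(instructions, instructions[1:])
]

for operates_on, arg, returns in steps:
    print("on a {}: key {!r} should hold a {}".format(operates_on, arg, returns))
```

The last instruction has no successor, which is why it is the one that receives the actual value instead of a placeholder.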

Now let's create our _setattr function. This gets the proper instructions and goes through them, creating key-value pairs as necessary. Finally, it also sets the val we give it.

def _setattr(obj, query, val):
    """
    A special setattr function that will take in a string query,
    interpret it, add the appropriate data structure to obj, and set val.

    We define two actions that are available in our query string:
    .x -- dict.__setitem__(x, ...)
    [x] -- list.__setitem__(x, ...) or dict.__setitem__(x, ...)
           the calling context determines how this is interpreted.
    """
    instructions = build_instructions(obj, query)
    for i, (_, arg) in enumerate(instructions[:-1]):
        _type = instructions[i + 1][0]
        obj = _set(obj, _type, arg)

    _type, arg = instructions[-1]
    _set(obj, _type, arg, val)


def _set(obj, _type, arg, val=None):
    """
    Helper function for calling obj.__setitem__(arg, val or _type()).
    """
    if val is not None:
        # Time to set our value
        _type = type(val)

    if isinstance(obj, dict):
        if val is not None or arg not in obj:
            # If key isn't in obj, initialize it with _type();
            # a given val is always stored, an existing intermediate is kept
            obj[arg] = (_type() if val is None else val)
        obj = obj[arg]
    elif isinstance(obj, list):
        n = len(obj)
        arg = int(arg)
        if n <= arg:
            # Need to amplify our list, initialize empty values with _type()
            obj.extend([_type() for x in range(arg - n + 1)])
        if val is not None:
            # Store the value; existing intermediates are not overwritten
            obj[arg] = val
        obj = obj[arg]
    return obj

And just because we can, here's a _getattr function.

def _getattr(obj, query):
    """
    Very similar to _setattr. Instead of setting attributes they will be
    returned. As expected, an error will be raised if a __getitem__ call
    fails.
    """
    instructions = build_instructions(obj, query)
    for i, (_, arg) in enumerate(instructions[:-1]):
        _type = instructions[i + 1][0]
        obj = _get(obj, _type, arg)

    _type, arg = instructions[-1]
    return _get(obj, _type, arg)


def _get(obj, _type, arg):
    """
    Helper function for calling obj.__getitem__(arg).
    """
    if isinstance(obj, dict):
        obj = obj[arg]
    elif isinstance(obj, list):
        arg = int(arg)
        obj = obj[arg]
    return obj

In action:

>>> x = {"a": "stuff"}
>>> _setattr(x, "f[0].a", "test")
>>> print(x)
{'a': 'stuff', 'f': [{'a': 'test'}]}
>>> print(_getattr(x, "f[0].a"))
test

>>> x = ["one", "two"]
>>> _setattr(x, "3[0].a", "test")
>>> print(x)
['one', 'two', [], [{'a': 'test'}]]
>>> print(_getattr(x, "3[0].a"))
test

Now for some cool stuff. Unlike Python, our _setattr function can set unhashable dict keys.

>>> x = []
>>> _setattr(x, "1.4", "asdf")
>>> print(x)
[{}, {'4': 'asdf'}]  # A list, which isn't hashable

>>> y = {"a": "stuff"}
>>> _setattr(y, "f[1.4]", "test")  # We're indexing f with 1.4, which is a list!
>>> print(y)
{'a': 'stuff', 'f': [{}, {'4': 'test'}]}
>>> print(_getattr(y, "f[1.4]"))  # Works for _getattr too
test

We aren't really using unhashable dict keys, but it looks like we are in our query language, so who cares, right!

Finally, you can run multiple _setattr calls on the same object; just give it a try yourself.

