python - Efficiently select rows that match one of several values in Pandas DataFrame


Problem

Given data in a pandas DataFrame like the following:

    name     amount
    ---------------
    alice       100
    bob          50
    charlie     200
    alice        30
    charlie      10

I want to select the rows where name is one of several values in a collection, {alice, bob}:

    name     amount
    ---------------
    alice       100
    bob          50
    alice        30

Question

What is an efficient way to do this in pandas?

Options as I see them:

  1. Loop through the rows, handling the logic in Python
  2. Select and merge many statements like the following

    pd.concat(df[df.name == specific_name] for specific_name in names)  # one selection per name, then combined
  3. Perform some sort of join
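
To make the three options concrete, here is a rough sketch of what each could look like on the example data. The variable names (`looped`, `combined`, `joined`, `wanted`) are illustrative only, and the choice of `pd.concat` for option 2 and an inner `merge` for option 3 is an assumption about what the options mean, not a recommendation.

```python
import pandas as pd

# Example data from the problem statement.
df = pd.DataFrame({
    "name": ["alice", "bob", "charlie", "alice", "charlie"],
    "amount": [100, 50, 200, 30, 10],
})
names = {"alice", "bob"}

# Option 1: loop through the rows and apply the logic in plain Python.
keep = [idx for idx, row in df.iterrows() if row["name"] in names]
looped = df.loc[keep]

# Option 2: one boolean selection per value, then combine the pieces.
# sorted() only makes the iteration order deterministic;
# sort_index() restores the original row order afterwards.
combined = pd.concat([df[df.name == n] for n in sorted(names)]).sort_index()

# Option 3: inner join against a one-column frame of the wanted values.
# Note that merge() returns a fresh 0..n-1 index rather than the original labels.
wanted = pd.DataFrame({"name": sorted(names)})
joined = df.merge(wanted, on="name", how="inner")

print(looped)
print(combined)
print(joined)
```

All three select the same rows here. Option 1 does its work row by row in the Python interpreter, option 2 scans the column once per wanted value, and option 3 would duplicate rows if the wanted values themselves contained duplicates.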

What are the performance trade-offs here? When is one solution better than the others? What solutions am I missing?

While the example above uses strings, my actual job matches on 10-100 integers across millions of rows, so fast NumPy operations may be relevant.

You can use the isin Series method:

    In [11]: df['name'].isin(['alice', 'bob'])
    Out[11]:
    0     True
    1     True
    2    False
    3     True
    4    False
    Name: name, dtype: bool

    In [12]: df[df.name.isin(['alice', 'bob'])]
    Out[12]:
        name  amount
    0  alice     100
    1    bob      50
    3  alice      30
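
For anyone who wants to run this outside an interactive session, the following is a self-contained sketch of the same approach. The second half applies it to the integer case mentioned in the question; the `ids` column, its size, and the `wanted` values are made-up stand-ins, not the asker's real data. `np.isin` on the underlying array is shown only as a comparable NumPy-level alternative.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "name": ["alice", "bob", "charlie", "alice", "charlie"],
    "amount": [100, 50, 200, 30, 10],
})

# Boolean mask: True where the name is one of the wanted values.
mask = df["name"].isin(["alice", "bob"])
selected = df[mask]
print(selected)

# Integer variant with hypothetical data: one million ids, 50 wanted values.
rng = np.random.default_rng(0)
ids = pd.Series(rng.integers(0, 1000, size=1_000_000), name="id")
wanted = np.arange(0, 1000, 20)

# Series.isin accepts any list-like, including a NumPy array.
mask_pandas = ids.isin(wanted)

# The same membership test done directly on the NumPy array.
mask_numpy = np.isin(ids.to_numpy(), wanted)

print(mask_pandas.sum(), mask_numpy.sum())
```

Both masks mark the same rows. Which of the two is faster may depend on the pandas and NumPy versions and on the data, so it is worth timing them on a realistic sample before committing to either.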
