Posted By: Anonymous
I would like to create views or dataframes from an existing dataframe based on column selections.
For example, I would like to create a dataframe
df2 from a dataframe
df1 that holds all columns from it except two of them. I tried doing the following, but it didn’t work:
import numpy as np import pandas as pd # Create a dataframe with columns A,B,C and D df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD')) # Try to create a second dataframe df2 from df with all columns except 'B' and D my_cols = set(df.columns) my_cols.remove('B').remove('D') # This returns an error ("unhashable type: set") df2 = df[my_cols]
What am I doing wrong? Perhaps more generally, what mechanisms does pandas have to support the picking and exclusions of arbitrary sets of columns from a dataframe?
You can either Drop the columns you do not need OR Select the ones you need
# Using DataFrame.drop df.drop(df.columns[[1, 2]], axis=1, inplace=True) # drop by Name df1 = df1.drop(['B', 'C'], axis=1) # Select the ones you want df1 = df[['a','d']]