Skip to content
Fix Code Error

python pandas – add unique Ids in column from master df back in to processed dfs stored in list of dataframes

June 13, 2021 by Code Error
Posted By: Anonymous

I have a single df that includes multiple json strings per row that need reading and normalizing.

I can read out the json info and normalize the columns by storing each row as a new dataframe in a list – which i have done with the code below.

However I need to append the original unique Id in the original df (i.e. ‘id’: [‘9clpa’,’g659am’]) – which is lost in my current code.

The expected output is a list of dataframes per Id that include the exploded json info, with an additional column including Id (which will be repeated for each row of the final df).

I hope that makes sense, any suggestions are very welcome. thanks so much

dataframe

df = pd.DataFrame(data={'id': ['9clpa','g659am'],'i2': [('{"t":"unique678","q":[{"qi":"01","answers":[{"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"G","value":"3"},{"answer":"V","value":"4"}]},{"qi":"02","answers":[{"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"A","value":"3"},{"answer":"B","value":"4"},{"answer":"G","value":"5"},{"answer":"NC","value":"6"},{"answer":"O","value":"7"} ]}]}'),('{"t":"unique428","q":[{"qi":"01","answers":[{"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"G","value":"3"},{"answer":"V","value":"4"}]},{"qi":"02","answers":[{"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"A","value":"3"},{"answer":"B","value":"4"},{"answer":"G","value":"5"},{"answer":"NC","value":"6"},{"answer":"O","value":"7"} ]}]}')]})

current code

out={}
for i in range(len(df)):
    out[i] = pd.read_json(df.i2[i])
    out[i] = pd.json_normalize(out[i].q)

expected output

pd.DataFrame(data={'id': ['9clpa','9clpa'],'qi': ['01','02'], 'answers': ['{"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"G","value":"3"},{"answer":"V","value":"4"}', '"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"A","value":"3"},{"answer":"B","value":"4"},{"answer":"G","value":"5"},{"answer":"NC","value":"6"},{"answer":"O","value":"7"']})
pd.DataFrame(data={'id': ['g659am','g659am'],'qi': ['01','02'], 'answers': ['{"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"G","value":"3"},{"answer":"V","value":"4"}', '"answer":"M","value":"1"},{"answer":"F","value":"2"},{"answer":"A","value":"3"},{"answer":"B","value":"4"},{"answer":"G","value":"5"},{"answer":"NC","value":"6"},{"answer":"O","value":"7"']})

Solution

You are just missing on assigning the id to your dataframe after your normalize columns:

out={}
for i in range(len(df)):
    out[i] = pd.read_json(df.i2[i])
    out[i] = pd.json_normalize(out[i].q)
    out[i]['id'] = df.id[i]
    out[i] = out[i].loc[:, ['id','qi','answers']]

Output:

>>> out[0]
      id  qi                                                                                                                                                                                                                     answers
0  9clpa  01                                                                                                [{'answer': 'M', 'value': '1'}, {'answer': 'F', 'value': '2'}, {'answer': 'G', 'value': '3'}, {'answer': 'V', 'value': '4'}]
1  9clpa  02  [{'answer': 'M', 'value': '1'}, {'answer': 'F', 'value': '2'}, {'answer': 'A', 'value': '3'}, {'answer': 'B', 'value': '4'}, {'answer': 'G', 'value': '5'}, {'answer': 'NC', 'value': '6'}, {'answer': 'O', 'value': '7'}]
Answered By: Anonymous

Related Articles

  • Change column type in pandas
  • Can't install via pip because of egg_info error
  • Pandas Merging 101
  • What's the difference between eval, exec, and compile?
  • "Large data" workflows using pandas
  • How to combine the data from two different dataframes with…
  • What are the undocumented features and limitations of the…
  • Ukkonen's suffix tree algorithm in plain English
  • how to split dataframe cells using delimiter into different…
  • How to return a list from a pos tag column?

Disclaimer: This content is shared under creative common license cc-by-sa 3.0. It is generated from StackExchange Website Network.

Post navigation

Previous Post:

ES6 way to sort an array by value descending, skip undefined values

Next Post:

How do I check the list elements satisfy a given condition?

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Get code errors & solutions at akashmittal.com
© 2022 Fix Code Error