I have a very large sample dataframe (~640,000 rows), and I'm currently testing to see if a parser I've built can detect specific phrases. This dataframe is full of text strings.

I'm trying to find a way to insert a specific number of rows into random places within the dataframe.

This is the basic structure of the dataframe:

                                            Comments            code  
0  The stupidity of it is that gamed to total def...            NaN  
1  called poker face she s actually...                          WP  
2  Example not identifying the fundamental scarci...            NaN  
3  No tol is bait That s the point...                           NaN 

The rows to be inserted have the same structure as the rows already in the dataframe.

1 Answer

Suppose your input has the following structure:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Comments':['Text1','Text2','Text3','Text4'], 'code':['WP', np.nan, np.nan, np.nan]})
newrow = pd.DataFrame({"Comments":'Text_new', 'code':np.nan}, index=[0])

The initial dataframe:

  Comments code
0    Text1   WP
1    Text2  NaN
2    Text3  NaN
3    Text4  NaN

The new row to be added:

   Comments  code
0  Text_new   NaN

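You can use the following code to insert the new row at a random position in the dataframe:

from numpy.random import randint

# Pick an insertion point from 0 to len(df) inclusive, so the row can also land at the very end
random_row = randint(len(df) + 1)

# Split the dataframe at that point, place the new row in between, and renumber the index
df = pd.concat([df.iloc[:random_row], newrow, df.iloc[random_row:]]).reset_index(drop=True)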

Example output (the position of the new row varies between runs):

   Comments code
0     Text1   WP
1     Text2  NaN
2     Text3  NaN
3  Text_new  NaN
4     Text4  NaN
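
The question asks about inserting several rows rather than one, and repeating the split-and-concat step copies the whole dataframe on every insertion, which may be slow at ~640,000 rows. A minimal sketch of a single-pass alternative follows. The function name `insert_rows_at_random` and its `seed` parameter are hypothetical, introduced only for illustration. It assumes the new rows share the dataframe's columns and that the original row order should be preserved.

```python
import numpy as np
import pandas as pd


def insert_rows_at_random(df, new_rows, seed=None):
    """Return a new dataframe with each row of new_rows placed at a random position.

    Hypothetical helper: the relative order of the original rows is preserved.
    """
    rng = np.random.default_rng(seed)
    n, k = len(df), len(new_rows)

    # Original rows get integer sort keys 0..n-1. Each new row gets a key just
    # before a randomly chosen slot in 0..n (slot n means "after the last row").
    keys = np.concatenate([np.arange(n), rng.integers(0, n + 1, size=k) - 0.5])

    # A stable sort keeps new rows that share a slot in their original order.
    order = np.argsort(keys, kind="stable")

    combined = pd.concat([df, new_rows], ignore_index=True)
    return combined.iloc[order].reset_index(drop=True)


df = pd.DataFrame({"Comments": ["Text1", "Text2", "Text3", "Text4"],
                   "code": ["WP", np.nan, np.nan, np.nan]})
new_rows = pd.DataFrame({"Comments": ["New1", "New2", "New3"],
                         "code": ["WP", np.nan, np.nan]})

result = insert_rows_at_random(df, new_rows, seed=0)
print(result)
```

Passing a fixed `seed` makes the placement reproducible, which can help when checking whether the parser finds the inserted phrases. Leave it as `None` for a different placement on each run.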