I want to do the same as in the question "pandas dataframe and multi line values", except with multiple columns of multi-line text:

import pandas as pd

data = [
       {'id': 1, 'col_one': 'very long text\ntext line 2\ntext line 3', 'col_two': 'very long text\ntext line 4\ntext line 5'},
       {'id': 2, 'col_one': 'short text', 'col_two': 'very long text\ntext line 6\ntext line 7'}
       ]
df = pd.DataFrame(data)
df.set_index('id', inplace=True)
print(df)

This prints as:

                                     col_one                                   col_two
id
1   very long text\ntext line 2\ntext line 3  very long text\ntext line 4\ntext line 5
2                                 short text  very long text\ntext line 6\ntext line 7

... and my desired output is:

id            col_one          col_two
1      very long text   very long text
       text line 2      text line 4
       text line 3      text line 5
2      short text       very long text
                        text line 6
                        text line 7

However, two of the answers there mention .stack(), which adds extra 1s to the id column, which I do not want. That said, this:

print(df.col_one.str.split("\n", expand=True).stack())

# prints:
id
1   0    very long text
    1       text line 2
    2       text line 3
2   0        short text
dtype: object

... might sort of work (I would have to suppress the printout of the new row index somehow), but it's one column only, and I want the entire table.

And, the remaining answer mentions this:

from IPython.display import display, HTML

def pretty_print(df):
    return display(HTML(df.to_html().replace("\\n","<br>")))

... which would seemingly do what I want. The problem is that display apparently requires an interactive environment (such as a Jupyter notebook). I want to use this in a PyQt5 application, and when I try the above function, I simply get:

<IPython.core.display.HTML object>

... printed in the terminal from which I run the PyQt5 application, and the plainTextEdit that was supposed to contain this text shows nothing.
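
As a side note, the HTML itself can be obtained as an ordinary string by dropping the display(HTML(...)) wrapper, since to_html() returns a str when it is not given a buffer. A minimal sketch (df_to_html_string is a made-up name, and the data is the same as above):

```python
import pandas as pd


def df_to_html_string(df):
    # to_html() returns a str when no buffer is passed. Newlines inside
    # cells come out as a literal backslash followed by "n", which is why
    # the search string is "\\n" rather than "\n".
    return df.to_html().replace("\\n", "<br>")


data = [
    {'id': 1, 'col_one': 'very long text\ntext line 2\ntext line 3', 'col_two': 'very long text\ntext line 4\ntext line 5'},
    {'id': 2, 'col_one': 'short text', 'col_two': 'very long text\ntext line 6\ntext line 7'},
]
df = pd.DataFrame(data).set_index('id')

html = df_to_html_string(df)
print(html)
```

That is still HTML markup rather than plain text, though, so it does not help with the plainTextEdit.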

So, how can I do the same as the above pretty_print function, but get a plain, multi-line, formatted string as output that I can use elsewhere?

1 Answer

Well, I went the hard way and coded a function for this. The caveat is that it loses the index, so the column names are no longer printed on the row above the index name, but that is good enough for me, I guess.

import pandas as pd

data = [
       {'id': 1, 'col_one': 'very long text\ntext line 2\ntext line 3', 'col_two': 'very long text\ntext line 4'},
       {'id': 2, 'col_one': 'short text', 'col_two': 'very long text\ntext line 6\ntext line 7'}
       ]
df = pd.DataFrame(data)
df.set_index('id', inplace=True)

def get_df_multiline_printstring(indf_in):
    broken_dfs = []
    # Turn the index back into a regular column so it is printed too. Its
    # integer values must then be converted to str below; otherwise pd.concat
    # fails with "TypeError: object of type 'int' has no len()".
    indf = indf_in.reset_index()
    # Iterate over all columns by position
    for icol in range(indf.shape[1]):
        # Select the column with iloc[]; note that the dtype is 'object'
        # for the string columns here
        columnSeriesObj = indf.iloc[:, icol]
        # Casting with .astype(object) does not help; converting every
        # element to str does. Without it, .str raises
        # "AttributeError: Can only use .str accessor with string values!"
        columnSeriesObj = columnSeriesObj.apply(str)
        # Split each cell on newlines and stack the pieces into one row per line
        broken_dfs.append(columnSeriesObj.str.split("\n", expand=True).stack())
    # Without keys=, the column names in the concatenated frame become 0, 1, ...
    df_concat = pd.concat(broken_dfs, axis=1, keys=indf.columns)
    # "Breaking" the shorter texts leaves NaNs behind; clear them
    df_concat = df_concat.fillna("")
    # index=False keeps the helper index out of the printout
    return df_concat.to_string(index=False)

print( get_df_multiline_printstring(df) )

This prints:

id         col_one         col_two
 1  very long text  very long text
       text line 2     text line 4
       text line 3
 2      short text  very long text
                       text line 6
                       text line 7
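
If the left-aligned layout from the desired output in the question matters, the same splitting can also be done without .stack(), by padding the columns by hand. This is only a sketch under my own assumptions: format_multiline_df and its sep parameter are names made up for illustration, and the column widths are simply taken from the longest line in each column.

```python
from itertools import zip_longest

import pandas as pd


def format_multiline_df(df, sep="  "):
    """Return df as left-aligned plain text, one printed row per line of cell text."""
    flat = df.reset_index()  # treat the index as an ordinary first column
    header = [str(name) for name in flat.columns]
    printed = [header]
    for record in flat.itertuples(index=False):
        # Each cell becomes a list of its text lines
        cells = [str(value).split("\n") for value in record]
        # zip_longest pads cells that have fewer lines with empty strings
        printed.extend(list(line) for line in zip_longest(*cells, fillvalue=""))
    # Width of each column is the length of its longest line (or its name)
    widths = [max(len(row[i]) for row in printed) for i in range(len(header))]
    return "\n".join(
        sep.join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in printed
    )


data = [
    {'id': 1, 'col_one': 'very long text\ntext line 2\ntext line 3', 'col_two': 'very long text\ntext line 4'},
    {'id': 2, 'col_one': 'short text', 'col_two': 'very long text\ntext line 6\ntext line 7'},
]
df = pd.DataFrame(data).set_index('id')

text = format_multiline_df(df)
print(text)
```

The result is an ordinary str, so it can be passed to anything that expects plain text; the columns only line up when it is shown in a monospaced font.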