Missing spaces between words when parsing with python-docx

python-docx has a very simple object model: Document -* Paragraph -* Run, and is very easy to work with.

However, there's one showstopper issue: in some cases consecutive runs (e.g. single words) do not contain any whitespace, yet the paragraph.text attribute contains whitespace between those runs.
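To make the mismatch concrete, here is a minimal diagnostic sketch (not code from the original post). It only relies on the `runs` and `text` attributes of a paragraph and the `text` attribute of a run. The helper name `compare_runs_to_paragraph` and the stand-in objects are hypothetical; with python-docx you would pass real paragraphs from `docx.Document(path).paragraphs`. One possible cause, which is an assumption here, is that the extra text lives outside the paragraph's direct runs, for example inside a hyperlink.

```python
from types import SimpleNamespace


def compare_runs_to_paragraph(paragraph):
    """Return (joined_run_text, paragraph_text, texts_match) for one paragraph."""
    joined = "".join(run.text for run in paragraph.runs)
    return joined, paragraph.text, joined == paragraph.text


# Stand-in objects so the sketch runs without a .docx file (hypothetical data).
demo = SimpleNamespace(
    runs=[SimpleNamespace(text="Hello"), SimpleNamespace(text="world")],
    text="Hello world",
)

joined, full, same = compare_runs_to_paragraph(demo)
print(repr(joined), repr(full), same)
```

Paragraphs for which the last value is `False` are the ones worth inspecting further, since their text is not fully accounted for by `paragraph.runs`.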

This is a major headache, because I have to concatenate a subset of runs based on their color and style properties, and filter out the rest. Because of the issue, some words get crammed together.
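The filtering step described above could be sketched roughly as follows. This is an illustration under a stated assumption, not the asker's code: it assumes the separating whitespace sits in runs that get filtered out, which may not match the documents in question. The function name `join_selected_runs` and the `keep` predicate are hypothetical; the runs only need a `text` attribute, as python-docx runs have.

```python
from types import SimpleNamespace


def join_selected_runs(runs, keep):
    """Join the text of runs accepted by keep(run).

    If whitespace appeared in the dropped runs between two kept runs,
    a single space is inserted so the kept words do not get fused.
    """
    parts = []
    gap_has_space = False  # whitespace seen in dropped runs since the last kept run
    for run in runs:
        if keep(run):
            if gap_has_space and parts:
                parts.append(" ")
            parts.append(run.text)
            gap_has_space = False
        elif any(ch.isspace() for ch in run.text):
            gap_has_space = True
    # Collapse any repeated whitespace left over from the joins.
    return " ".join("".join(parts).split())


# Stand-in runs (hypothetical data): keep only the bold ones.
runs = [
    SimpleNamespace(text="Hello", bold=True),
    SimpleNamespace(text=" big ", bold=None),
    SimpleNamespace(text="world", bold=True),
]
print(join_selected_runs(runs, keep=lambda r: r.bold))
```

Dropped runs without whitespace do not introduce a separator, so a word split across several runs is not broken apart.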

I tried inserting spaces between runs and then removing the redundant whitespace at the end, but it proved to be a very error-prone approach.

Has anyone stumbled upon this? Any suggestions would be much appreciated.

asked Aug 23 at 11:49

andy


Please follow the guidelines for a minimal reproducible example and update your question: show example input, actual output, and expected output, along with the code doing the work.

– ticktalk

Commented 2025-08-24 11:22:15 +00:00
