Convert a Cell's Value/Formats to an HTML String in Excel with VBA

Question

I have run into this issue in multiple applications now, and I am wondering whether anyone has come up with a more efficient solution than mine. Essentially, my goal is to convert the content of a cell to an HTML string that preserves all of its formatting. My workaround so far has been to loop through each character in the string to determine the font size, weight, and style. However, this can be extremely slow when converting a lot of data at once.

You haven't provided any specific examples of the data you're working with, but Excel has the ability to save as HTML. If time really is a bottleneck, it could well be worthwhile to save as HTML, then analyse the resulting file to extract the relevant information. I'd recommend you first save your spreadsheet as HTML and look at the output source yourself, to see if it might help.
– mkingston, Commented Oct 16, 2012 at 23:42
I think if you want all your style info to be inline and you need precise control over what gets output, then what you're already doing is going to give you the best result (that said, I haven't seen any of your code...)
– Tim Williams, Commented Oct 17, 2012 at 3:55

Gary McGill · Accepted Answer · 2012-10-21 20:26:00Z

Going through each character in turn will be very slow, but should only be necessary in extreme cases. I've tackled this same problem quite successfully using the following method.

For each relevant property (bold, italic, etc.) I build up an array that stores the position of each change in the value of that property. Then when generating the HTML, I can spit out all the text up until the next change (in any property). Where changes are infrequent, this is clearly faster.
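
To make the output step concrete, here is a minimal sketch in Python (chosen for brevity; it is not the answerer's VBA code). It assumes the change positions for each property are already known as 0-based character offsets. The names split_at_changes and changes are hypothetical.

```python
from typing import Dict, List, Tuple


def split_at_changes(text: str, changes: Dict[str, List[int]]) -> List[Tuple[int, str]]:
    """Cut text at every offset where any property changes.

    `changes` maps a property name (e.g. "bold") to the 0-based offsets at
    which that property's value changes. Returns (start_offset, segment) pairs.
    """
    # Union of all change positions, plus the start and end of the text.
    cuts = {0, len(text)}
    for positions in changes.values():
        cuts.update(p for p in positions if 0 < p < len(text))
    ordered = sorted(cuts)
    # Each pair of neighbouring cut points delimits one uniformly formatted run.
    return [(start, text[start:end]) for start, end in zip(ordered, ordered[1:])]


# "World" is bold and "World!" is italic (assumed example data).
print(split_at_changes("Hello World!", {"bold": [6, 11], "italic": [6]}))
# -> [(0, 'Hello '), (6, 'World'), (11, '!')]
```

Each segment can then be written out in one go, with the formatting looked up once per segment instead of once per character.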

Now, to arrive at the position of the changes in each property, I first test whether there are in fact any changes, and this is easy. For example, Font.Bold will return True if all the text is bold, False if it's all non-bold, and Null (or some other value, I can't remember) if there are both bold and non-bold parts.

So, if there's no change in the value at all, we're done already. If there is a change in the value, then I do a binary sub-division of the text into two halves and start again. Again, I might find that one half is all the same, and the other half contains a change, so I do another sub-division of the second half as before, and so on.

Since very few cells tend to have lots of changes, and many have none at all, this ends up being quite efficient. Or at least much more efficient than the character-by-character method.
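
The subdivision described above can be sketched as follows, again in Python rather than VBA. The probe function is a hypothetical stand-in for a call such as Characters(Start, Length).Font.Bold, which in Excel VBA returns Null for a mixed span. Here None plays the role of Null, and offsets are 0-based, whereas Excel's Characters is 1-based. make_probe only simulates a cell for the demonstration.

```python
from typing import Callable, List, Optional, Tuple

# probe(start, length) -> True/False if the span is uniform, None if mixed.
Probe = Callable[[int, int], Optional[bool]]
Run = Tuple[int, int, Optional[bool]]  # (start, length, value)


def uniform_runs(probe: Probe, start: int, length: int) -> List[Run]:
    """Return uniform runs covering the span, halving only mixed spans."""
    if length <= 0:
        return []
    value = probe(start, length)
    if value is not None or length == 1:
        return [(start, length, value)]  # no change inside this span
    half = length // 2
    runs = uniform_runs(probe, start, half) + uniform_runs(probe, start + half, length - half)
    # Merge neighbours that turned out to share the same value.
    merged = [runs[0]]
    for s, n, v in runs[1:]:
        prev_start, prev_len, prev_value = merged[-1]
        if v == prev_value:
            merged[-1] = (prev_start, prev_len + n, v)
        else:
            merged.append((s, n, v))
    return merged


def make_probe(flags: List[bool]) -> Probe:
    """Hypothetical simulation: flags[i] is the bold state of character i."""
    def probe(start: int, length: int) -> Optional[bool]:
        values = set(flags[start:start + length])
        return values.pop() if len(values) == 1 else None
    return probe


# "Hello World!" with only "World" in bold (assumed example data).
bold = [False] * 6 + [True] * 5 + [False]
print(uniform_runs(make_probe(bold), 0, 12))
# -> [(0, 6, False), (6, 5, True), (11, 1, False)]
```

The start offsets of every run after the first (6 and 11 here) are the change positions for that property. A cell with uniform formatting costs a single probe.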

3 Comments

Rob Over a year ago

@Gary_McGill - great theory on how it should work, but how do you ensure the HTML doesn't have mismatched tags? For example, you wouldn't want <font face="Arial">Hello <b>World</font></b>

Gary McGill Over a year ago

@Rob: if you have a list of all the points of change, and if you keep track of which tags are active as you output the HTML, then it's not at all hard to deal with that. If it's HTML4 you're outputting, then I wouldn't bother though - browsers are very forgiving of that sort of thing. HTML5, maybe less so.

Rob Over a year ago

We have a strange use case where the in-house proprietary browser only accepts strict HTML - thanks for the perspective - would love it if you had an example :)
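
One possible way to do the tag tracking discussed in these comments is sketched below in Python. It is an illustration, not code from the answerer. It assumes the runs have already been computed and that each run carries an ordered list of bare tag names with no attributes. The function name render is hypothetical. At each run boundary it closes tags from the innermost outward until the open stack is a prefix of what the next run needs, then reopens the rest, so the output never interleaves tags.

```python
from html import escape
from typing import List, Sequence, Tuple


def render(runs: Sequence[Tuple[str, Sequence[str]]]) -> str:
    """Render (text, tags) runs as properly nested HTML."""
    out: List[str] = []
    open_tags: List[str] = []  # stack of currently open tags, outermost first
    for text, wanted in runs:
        # Longest prefix of the stack that the next run still wants, in order.
        keep = 0
        while (keep < len(open_tags) and keep < len(wanted)
               and open_tags[keep] == wanted[keep]):
            keep += 1
        # Close everything above that prefix, innermost first.
        while len(open_tags) > keep:
            out.append(f"</{open_tags.pop()}>")
        # Open whatever the next run needs beyond the prefix.
        for tag in wanted[keep:]:
            out.append(f"<{tag}>")
            open_tags.append(tag)
        out.append(escape(text))  # escape &, <, > and quotes in cell text
    # Close whatever is still open at the end of the cell.
    while open_tags:
        out.append(f"</{open_tags.pop()}>")
    return "".join(out)


# Italic starts before bold and ends before bold does (assumed example data).
print(render([("Hello ", ["i"]), ("World", ["i", "b"]), ("!", ["b"])]))
# -> <i>Hello <b>World</b></i><b>!</b>
```

The trade-off is that a tag is sometimes closed and reopened, as with the b tag above, in exchange for output that is always well nested.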
