
I generate an HTML file programmatically. As you might imagine, it's quite ugly, but it works perfectly. Is there a GitHub Action, or a workflow I could write, that converts the file into a nicely formatted HTML file?

A workflow that uses Python is fine too. However, BeautifulSoup fails to indent my file correctly: the output is missing some tags, perhaps because the generated HTML is untidy (stray line breaks and so on). It also indents with a single space, and I need 4 spaces.

Some other tools I looked into:

  • html5print: doesn't seem to be maintained (idle for 5 years)
  • HTML Tidy: doesn't seem to work with Python 3.x

Don't know if I will be able to run the following in a workflow file via actions -

I haven't explored other languages, but I am open to them, especially Go and Ruby.

1 Answer

You can install tidy via apt and run it directly in the CI pipeline. Assuming you have a Python script that generates page.html, here is a configuration that runs tidy on the generated file:

page.html ("generated" file)

<!DOCTYPE html><html><head><script src="bundle.js"></script><title>My Page</title></head><body><div id="app"><p>Paragraph 1</p></div></body></html>

.github/workflows/test.yml

---
name: Test

on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # pin whichever checkout release you use; v4 is assumed here
      - uses: actions/checkout@v4
      # here is some step that generates the HTML file with Python
      - run: sudo apt-get update && sudo apt-get install -y tidy
      - run: tidy page.html > page2.html
      - name: Print generated page
        run: cat page2.html

Result

[Screenshot: output of the "Print generated page" step]

tidy is highly customizable, so you can configure it to suit your needs. Run man tidy or consult the official documentation to see the available options.
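If the page is already generated by a Python script, one option is to call the tidy command-line tool from that script instead of relying on Python bindings. The following is a minimal sketch, not a definitive implementation: the function names build_tidy_command and prettify are hypothetical, the 4-space default is an assumption taken from the question, and it assumes tidy is installed and on PATH (for example via the apt step above).

```python
import shutil
import subprocess


def build_tidy_command(src, dst, indent_spaces=4):
    """Return the tidy argument list that writes an indented copy of src to dst."""
    return [
        "tidy",
        "--indent", "auto",                      # indentation is off by default
        "--indent-spaces", str(indent_spaces),   # width of each indent level
        "--tidy-mark", "no",                     # do not add a generator <meta> tag
        "--force-output", "yes",                 # write output even if errors are found
        "-o", str(dst),
        str(src),
    ]


def prettify(src, dst, indent_spaces=4):
    """Run tidy on src and return its exit status (0 = clean, 1 = warnings)."""
    if shutil.which("tidy") is None:
        raise FileNotFoundError("tidy is not on PATH; install it first")
    result = subprocess.run(
        build_tidy_command(src, dst, indent_spaces),
        capture_output=True,
        text=True,
    )
    # tidy exits with 1 for warnings and 2 for errors; only errors are fatal here.
    if result.returncode >= 2:
        raise RuntimeError(result.stderr)
    return result.returncode
```

Calling prettify("page.html", "page2.html") at the end of the generating script would then produce the formatted file. Treating exit status 1 as success is a design choice: generated markup often triggers warnings, and a bare tidy call in a workflow step would otherwise fail the job on them.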


4 Comments

I understand that it's supposed to work, but when I run it on my rendered HTML, it just left-aligns everything. There is no indentation whatsoever. In the head, it also splits things across 2 lines instead of one. I don't think it works well.
Did you play with the tidy options? This solution works perfectly for me: the "Print generated page" step shows formatted HTML with indentation.
Looks like you are onto something. Please give me some time; I'm bogged down with a few things here. I will check and let you know soon.
tidy --indent auto --indent-spaces 2 --tidy-mark no --force-output yes -o index_output.html index.html seems to work for me now.
