Say I want to write yet another markdown parser and I want it to be thoroughly tested. I thought I would create two folders, markdown and html, each containing the same filenames, so that every markdown file has an html file with the expected output. To perform the tests I would then need just a single function:

def test_correct_parsing(md, html):
    assert markdown(md) == html

My questions: Is this a good strategy? If so, how can I do it with pytest?

1 Answer

You need more than that, I think, because you will only prove that you can parse those specific markdown inputs. Every testing strategy I have seen is multi-layered, for instance:

  1. a test for each specific markdown tag separately - to ensure that the parser correctly translates each into the expected html fragment - these tests should take and output text - not read/write files - easy to test - easy to automate.
  2. a test for each container tag - i.e. tags which create spans/divs/tables etc - to ensure that the container html is formed correctly, and that the contained markdown is generated correctly - again text string in / text string out.
  3. a set of tests for malformed tags - to ensure that any malformation generates the correct html/error - again text string in / text string out.
  4. a set of tests for file handling - including non-existent files, broken files, and files you have no permission to read.
  5. a test for your specific inputs as a final confidence check.
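
The string-in / string-out layers (items 1 and 3) could be sketched in pytest roughly as follows. This is only a sketch: the `markdown` function here is a toy stand-in for the parser under test, not a real library, and the expected html for malformed input is an assumption about how your parser should behave.

```python
import re

import pytest


def markdown(text: str) -> str:
    """Toy stand-in for the parser under test (hypothetical, not a real library)."""
    heading = re.fullmatch(r"(#{1,6}) (.+)", text)
    if heading:
        level = len(heading.group(1))
        return f"<h{level}>{heading.group(2)}</h{level}>"
    # Handle strong before emphasis so "**x**" is not read as nested "*".
    body = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    body = re.sub(r"\*(.+?)\*", r"<em>\1</em>", body)
    return f"<p>{body}</p>"


# Layer 1: one case per tag, reported separately under its own id.
@pytest.mark.parametrize(
    "md, html",
    [
        ("# Title", "<h1>Title</h1>"),
        ("### Deep", "<h3>Deep</h3>"),
        ("*word*", "<p><em>word</em></p>"),
        ("**word**", "<p><strong>word</strong></p>"),
    ],
    ids=["h1", "h3", "emphasis", "strong"],
)
def test_single_tag(md, html):
    assert markdown(md) == html


# Layer 3: malformed input; the expected output is an assumed policy
# (fall back to a plain paragraph), not something the answer prescribes.
@pytest.mark.parametrize(
    "md, html",
    [
        ("**unclosed", "<p>**unclosed</p>"),
        ("####### seven", "<p>####### seven</p>"),
    ],
    ids=["unclosed-strong", "too-many-hashes"],
)
def test_malformed_input(md, html):
    assert markdown(md) == html
```

Because every case has its own id, a failure report names the exact tag that broke, for example `test_single_tag[emphasis]`.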

You should also probably collect code coverage stats as you test - to ensure that you are in fact covering all your code paths.

I recommend this for several reasons :

  • It is just far more efficient to test each thing separately - that way if something fails you can easily work out which test failed - because each separate test reports on its own success/failure. If you just do one big set of tests and something fails, you then have to work out where in your html output it went wrong (if indeed you got any output at all), and then try to identify which markdown element failed.
  • If you change something - it is dead easy to test just that thing you changed.
  • With a layered approach you can test stuff as you write it - without having to have all the code written.
  • It is simply good practice - a methodology recommended by many experts for many years for these reasons and many more. Unless you have a very good reason not to, it is wise to stick to good practice - it just makes your life easier.
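
The paired-folder idea from the question still fits as the final layer (item 5). A minimal sketch, assuming the folders are named markdown and html as in the question and sit in the directory pytest is run from; `collect_pairs` is a hypothetical helper name, and `markdown` is again only a placeholder for the real parser:

```python
from pathlib import Path

import pytest


def markdown(text: str) -> str:
    """Placeholder for the parser under test (hypothetical)."""
    return f"<p>{text.strip()}</p>"


def collect_pairs(md_dir: Path, html_dir: Path) -> list:
    """Pair each <name>.md with <name>.html, one pytest case per file stem."""
    pairs = []
    for md_path in sorted(md_dir.glob("*.md")):
        html_path = html_dir / f"{md_path.stem}.html"
        pairs.append(pytest.param(md_path, html_path, id=md_path.stem))
    return pairs


# Folder names follow the question; adjust the paths to your project layout.
@pytest.mark.parametrize(
    "md_path, html_path", collect_pairs(Path("markdown"), Path("html"))
)
def test_correct_parsing(md_path, html_path):
    # A missing html file fails here with FileNotFoundError, which flags the gap.
    expected = html_path.read_text(encoding="utf-8")
    assert markdown(md_path.read_text(encoding="utf-8")) == expected
```

Each file pair shows up as its own test, so something like `pytest -k header` would run only the case built from header.md.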
4 Comments

Why can't I have all of these in different files? A header.md file that contains all forms of titles, a list.md file for lists, etc. And some other files that mix multiple kinds of elements.
Add some notes to the answer to explain why.
So ideally you should test all forms of a given tag separately. In real-world scenarios, we unit test smaller components and try to create test cases for all the possibilities we can think of. Once they pass, you can be sure that each component works on its own. Then we move on to functional and integration testing, where we combine those same forms with other elements to test the component in a real-world use case. This makes it easier to see when your component breaks and why.
If you remove this isolation between tests, failures become messy to debug. So having a file with all the forms of titles is fine. But also test those forms of titles individually, in isolation, in a separate test module that covers titles only.