Is there any better XPath to pick the div based on the text inside the parent or any of the child?
Use ., not text().
//*[contains(., 'Tag 1')]
text() does not give you the element's "text".
It gives you a list (!) of text nodes that are direct children of the current context node. When the context node is <div> in example #2, that list would be three text nodes containing only whitespace. I've highlighted them with brackets:
<div title='Title2'>[
]<input type='checkbox' />[
]<span>Tag 1</span>[
]</div>
'Tag 1' is a child of <span>, not of <div>.
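Now, contains() expects strings, not node lists. If you give it a node list, XPath 1.0 converts that list to a string by taking the string value of the very first node only (XPath 2.0 and later raise a type error instead). For the <div>, that first node is whitespace, so the test fails. The string value of an element is the concatenation of all text nodes it contains, descendants included, not just direct children.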
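To make that conversion concrete, here is a small sketch of two XPath 1.0 expressions that behave the same way. The second only spells out the implicit "first node" step; it is shown for illustration, not as something you would normally write.

```xpath
//*[contains(text(), 'Tag 1')]
//*[contains(string(text()[1]), 'Tag 1')]
```

In example #2, both select the <span>, whose first text node is 'Tag 1', and both skip the <div>.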
. refers to the context node. In example #2, that's the <div> itself. contains() again converts it to a string, but this time that string actually contains Tag 1. Another way to write this is:
//*[contains(string(.), 'Tag 1')]
That's what you thought text() would do.
Now, //* is recursive. This means the <div> will be selected, and so will the <span> and all of the <div>'s ancestors, because they all contain Tag 1 at some point.
Use something more specific than //* to fix this.
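Here are a few ways to narrow it down. Treat them as sketches rather than the one right answer, since the best fit depends on markup the question doesn't show.

```xpath
//div[contains(., 'Tag 1')]
//div[contains(., 'Tag 1') and not(.//div[contains(., 'Tag 1')])]
//*[text()[contains(., 'Tag 1')]]
```

The first restricts the match to <div> elements, but still returns any outer <div> that wraps the one you want. The second keeps only the innermost matching <div>. The third tests each direct text node on its own, so it selects only the element that directly holds the text, which is the <span> in example #2.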