mf2py is a full-featured microformats2 (mf2) parser implemented in Python.
mf2py implements the full mf2 specification, including backward compatibility with microformats1.
To install mf2py, run the following command:
pip install mf2py
Import the parser using:
import mf2py
Parse a file containing HTML:
with open('file/content.html', 'r') as file:
    obj = mf2py.parse(doc=file)
Parse a string containing HTML content:
content = '<article class="h-entry"><h1 class="p-name">Hello</h1></article>'
obj = mf2py.parse(doc=content)
Parse content from a URL:
obj = mf2py.parse(url="http://tommorris.org/")
parse is a convenience method that delegates to mf2py.Parser to do the real work. More sophisticated behaviors are available by invoking the Parser object directly.
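The obj returned by parse in the snippets above is an ordinary Python dictionary laid out like microformats2 parsed JSON, with the parsed microformats under the "items" key. As a rough sketch of how it might be consumed, the example below uses a hand-written stand-in for the result of parsing the h-entry string shown earlier. The exact contents are an assumption (real output may carry extra keys), and first_value is a hypothetical helper, not part of mf2py.

```python
# Hand-written stand-in for the dictionary that parse() returns.
# This is an assumed, abbreviated shape, not captured parser output.
obj = {
    "items": [
        {"type": ["h-entry"], "properties": {"name": ["Hello"]}},
    ],
    "rels": {},
    "rel-urls": {},
}


def first_value(item, prop, default=None):
    """Return the first value of a property, or default if it is absent.

    Hypothetical helper for illustration; mf2py does not provide it.
    """
    values = item.get("properties", {}).get(prop, [])
    return values[0] if values else default


# Walk the top-level microformats and show each one's type and name.
for item in obj["items"]:
    print(item["type"], first_value(item, "name"))
```

Property values are always lists in this layout, so taking the first element is a common shortcut when only a single value is expected.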
Retrieve parsed microformats as a Python dictionary or JSON string:
p = mf2py.Parser(...)
p.to_dict()  # returns a Python dictionary
p.to_json()  # returns a JSON string
Filter by microformat type:
p.to_dict(filter_by_type="h-entry")
p.to_json(filter_by_type="h-entry")
- Pass the optional argument img_with_alt=True to either the Parser object or to the parse method to enable parsing of the alt attribute of <img> tags, according to the issue "image alt text is lost during parsing". By default this is False to be backwards compatible.
- I passed mf2py.parse() a BeautifulSoup document, and it got modified!
Yes, mf2py currently does that. We're working on preventing it! Hopefully soon.
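Regarding the img_with_alt option described above: with it enabled, a property taken from an <img> that has an alt attribute is expected to come back as a dictionary with "value" and "alt" keys rather than a bare URL string. Code that reads such properties may therefore want to accept both shapes. The following is a minimal sketch under that assumption; url_and_alt is a hypothetical helper and the sample values are made up for illustration.

```python
def url_and_alt(value):
    """Normalize a property value to a (url, alt) pair.

    Hypothetical helper, not part of mf2py. Accepts either a plain URL
    string or a dictionary with "value" and "alt" keys.
    """
    if isinstance(value, dict):
        return value.get("value"), value.get("alt")
    return value, None


# Made-up sample values covering both assumed shapes.
samples = [
    "http://example.com/photo.jpg",
    {"value": "http://example.com/photo.jpg", "alt": "A photo"},
]

for sample in samples:
    url, alt = url_and_alt(sample)
    print(url, alt)
```

Handling both shapes in one place keeps the rest of the consuming code the same whether or not the option is turned on.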
A hosted live version of mf2py can be found at python.microformats.io.
We welcome contributions and bug reports via GitHub, and on the microformats wiki.
We aim to follow the IndieWebCamp code of conduct. Please be respectful of other contributors, and forge a spirit of positive co-operation without discrimination or disrespect.
mf2py is licensed under an MIT License.
