microdata/RDFa-Lite C Library
This library lets applications, including embedded ones, extract annotations from HTML content
with a small memory footprint and good performance. It can be integrated into other programming languages
through their C interfaces (e.g. Python.h, ruby.h).
Besides well-known server-side use cases such as search engines and crawlers, web browsers could also integrate this library.
- Functionality
- Light weight streaming parser
- Parse chunked HTML and extract both microdata and RDFa-Lite via corresponding event handlers
- Applications can reduce their memory footprint because no large HTML tree is built internally
- Data model builder
- Create a data model from the results of the underlying streaming parser
- All semantics of microdata/RDFa-Lite can be accessed via the HTML5 microdata API
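The relationship between the two layers above can be sketched in plain C. This is only an illustration of the event-handler pattern, not this library's actual API: the names `Handlers`, `Model`, `on_item_start`, `on_property` and `build_demo_model` are all hypothetical, and the events are fed by hand instead of coming from a real parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 8
#define MAX_PROPS 8
#define MAX_TEXT  64

/* Hypothetical data model: a flat list of items, each with properties. */
typedef struct { char name[MAX_TEXT]; char value[MAX_TEXT]; } Property;
typedef struct { char type[MAX_TEXT]; Property props[MAX_PROPS]; int prop_count; } Item;
typedef struct { Item items[MAX_ITEMS]; int item_count; } Model;

/* Hypothetical handler table that a streaming parser would call into. */
typedef struct {
    void (*on_item_start)(void *user, const char *type);
    void (*on_property)(void *user, const char *name, const char *value);
} Handlers;

/* Builder callback: start a new item in the model (dropped if full). */
static void builder_item_start(void *user, const char *type)
{
    Model *m = user;
    if (m->item_count >= MAX_ITEMS)
        return;
    Item *it = &m->items[m->item_count++];
    it->prop_count = 0;
    snprintf(it->type, MAX_TEXT, "%s", type);
}

/* Builder callback: attach a property to the most recently started item. */
static void builder_property(void *user, const char *name, const char *value)
{
    Model *m = user;
    if (m->item_count == 0)
        return;
    Item *it = &m->items[m->item_count - 1];
    if (it->prop_count >= MAX_PROPS)
        return;
    Property *p = &it->props[it->prop_count++];
    snprintf(p->name, MAX_TEXT, "%s", name);
    snprintf(p->value, MAX_TEXT, "%s", value);
}

/* Simulates the events a streaming parser might emit for:
 *   <div itemscope itemtype="http://schema.org/Person">
 *     <span itemprop="name">Alice</span>
 *   </div>
 */
static void feed_demo_events(const Handlers *h, void *user)
{
    h->on_item_start(user, "http://schema.org/Person");
    h->on_property(user, "name", "Alice");
}

/* Wires the builder callbacks to the simulated event source. */
void build_demo_model(Model *m)
{
    assert(m != NULL);
    memset(m, 0, sizeof *m);
    Handlers h = { builder_item_start, builder_property };
    feed_demo_events(&h, m);
}
```

An application that only needs a few values could register its own callbacks instead of the builder and never allocate a model at all, which is where the memory saving of the streaming layer comes from.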
- Source
- API Documentation
- How to build
- Install libxml, because the streaming parser is built on the libxml HTMLparser module
- Run the configure script
$ ./configure
- Build the library
$ make
- Sample Programs