Wikipedia: Java library to remove Wikipedia text markup
- by Algorist
Hi,
I downloaded a Wikipedia dump and now want to remove the Wikipedia markup from the contents of each page. I tried writing regular expressions, but there are too many cases to handle. I found a Python library, but I need a Java library because I want to integrate it into my code.
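To illustrate the regular-expression approach, here is a minimal sketch. The class name WikiMarkupStripper and its strip method are hypothetical, not part of any library, and the rules cover only templates, simple internal links, headings, and bold/italic quotes. Tables, references, file links, and HTML tags are left untouched, which is why this approach gets out of hand quickly.

```java
import java.util.regex.Pattern;

// Hypothetical helper: strips a small subset of wiki markup with regexes.
class WikiMarkupStripper {
    // Innermost template, e.g. {{citation needed}} (no braces inside).
    private static final Pattern TEMPLATE = Pattern.compile("\\{\\{[^{}]*\\}\\}");
    // [[Target]] or [[Target|label]]; links with more than one pipe are not matched.
    private static final Pattern LINK =
            Pattern.compile("\\[\\[(?:[^\\[\\]|]*\\|)?([^\\[\\]|]*)\\]\\]");
    // Heading lines such as "== History ==".
    private static final Pattern HEADING =
            Pattern.compile("^=+[ \\t]*(.*?)[ \\t]*=+[ \\t]*$", Pattern.MULTILINE);
    // Runs of 2 to 5 apostrophes used for italic and bold.
    private static final Pattern QUOTES = Pattern.compile("'{2,5}");

    static String strip(String text) {
        // Remove templates from the inside out until nothing changes,
        // so that nested templates are handled as well.
        String previous;
        do {
            previous = text;
            text = TEMPLATE.matcher(text).replaceAll("");
        } while (!text.equals(previous));

        text = LINK.matcher(text).replaceAll("$1");    // keep label or target
        text = HEADING.matcher(text).replaceAll("$1"); // keep heading text
        text = QUOTES.matcher(text).replaceAll("");    // drop bold/italic marks
        return text;
    }

    public static void main(String[] args) {
        String sample = "'''Java''' is a [[Programming language|language]] "
                + "used by [[Wikipedia]] tools.{{citation needed}}";
        // prints "Java is a language used by Wikipedia tools."
        System.out.println(strip(sample));
    }
}
```

Every additional construct needs another rule like these, so I would prefer a library that already parses the markup properly.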
Thank you.