HTML Diff & Normalization with JSoup

We are working on an Eclipse help document for the Eclipse Scout project. Our vision is to reuse content from the existing Scout book wherever possible, which means sharing text modules between the two documents. To that end, I am currently setting up each document and extracting existing text into independent modules.