The input HTML should be interspersed with anchors like this: <a name="xyz"></a> where xyz is the index heading for the following text. There should be one such anchor before each entry and an extra anchor at the end of the text; everything before the first anchor is counted as the "header" and everything after the last as the "footer". If these are empty, a default "mobile friendly" HTML header and footer specifying UTF-8 encoding will be added. Anchors may be linked from other entries; these links are changed as necessary.
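To make the expected input layout concrete, here is a minimal Python sketch of how such a file could be split into header, entries and footer. It is not taken from ohi.py: the function name split_entries and the exact anchor pattern are assumptions for illustration, and the real script may accept more anchor variants.

```python
import re

# Assumed anchor form: exactly <a name="xyz"></a> (illustrative only).
ANCHOR = re.compile(r'<a name="([^"]*)"></a>')

def split_entries(html):
    """Return (header, [(heading, text), ...], footer).

    The last anchor only marks the end of the final entry,
    so its name is not used as a heading.
    """
    # With one capture group, re.split gives:
    # [header, name1, text1, name2, text2, ..., nameN, footer]
    parts = ANCHOR.split(html)
    if len(parts) == 1:
        # No anchors at all: this sketch treats everything as header.
        return html, [], ""
    header, footer = parts[0], parts[-1]
    names = parts[1:-2:2]   # every anchor name except the closing one
    texts = parts[2:-1:2]   # the text that follows each of those anchors
    return header, list(zip(names, texts)), footer

sample = ('HEAD<a name="apple"></a>A fruit.'
          '<a name="bee"></a>An insect.<a name="end"></a>FOOT')
header, entries, footer = split_entries(sample)
print(entries)
```

Note that the closing anchor is required: without it, the last entry's text would be counted as footer.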
By default, the input HTML is read from standard input, and the output is written to the current directory as a set of HTML files, each limited to 64 KB so as not to overload a mobile browser. Opening any of these HTML files should display a textbox that lets you type the first few letters of the word you wish to look up; the browser will then jump to whatever heading is alphabetically nearest to the typed-in text. (By default, only letters of the alphabet are significant and diacritical marks are stripped from the index, but this can be changed.)
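The default lookup behaviour can be sketched roughly as follows. This is an illustration in Python rather than the code the generated pages actually run: the names sort_key and nearest_heading are hypothetical, and "alphabetically nearest" is assumed here to mean the first heading that sorts at or after the typed text.

```python
import bisect
import unicodedata

def sort_key(heading):
    """Strip diacritics, drop non-letters and lower-case the rest."""
    # NFD splits an accented letter into base letter + combining mark;
    # combining marks are not alphabetic, so isalpha() filters them out.
    decomposed = unicodedata.normalize("NFD", heading)
    return "".join(c for c in decomposed if c.isalpha()).lower()

def nearest_heading(sorted_keys, typed):
    """Return the index of the heading to jump to (assumed rule, see above)."""
    if not sorted_keys:
        raise ValueError("index is empty")
    i = bisect.bisect_left(sorted_keys, sort_key(typed))
    # Typed text that sorts after every heading jumps to the last one.
    return min(i, len(sorted_keys) - 1)

headings = ["Apple", "Café", "Zebra"]
keys = sorted(sort_key(h) for h in headings)
print(keys[nearest_heading(keys, "caf")])
```

Because both the headings and the typed text pass through the same key function, typing "cafe" or "Café" leads to the same place.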
ohi.py is free software distributed under the GNU General Public License.
Users of the Android platform might also wish to make an APK from the HTML. Here is a shell script to add Copy buttons to any hanzi strings in the HTML files; the buttons should work once the HTML is put into an APK using html2apk, but they won't work in standalone HTML.
ohi.py (see start of file for configuration). This version can also take multiple adjacent anchors, giving alternate labels to the same fragment; there should not be any whitespace between adjacent anchors.
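As an illustration of the adjacent-anchor rule, the following Python sketch groups each run of anchors with the fragment that follows it. It is an assumption-laden example, not code from the script: labels_per_fragment and both patterns are hypothetical names.

```python
import re

# A run of one or more anchors with nothing (not even whitespace) between them.
ANCHOR_RUN = re.compile(r'((?:<a name="[^"]*"></a>)+)')
ANCHOR_NAME = re.compile(r'<a name="([^"]*)"></a>')

def labels_per_fragment(html):
    """Return [([label, ...], fragment_text), ...] for each entry."""
    # parts = [header, run1, text1, run2, text2, ..., runN, footer]
    parts = ANCHOR_RUN.split(html)
    result = []
    # Skip the final run: it is the closing anchor before the footer.
    for run, text in zip(parts[1:-2:2], parts[2:-1:2]):
        result.append((ANCHOR_NAME.findall(run), text))
    return result

sample = ('<a name="colour"></a><a name="color"></a>A hue.'
          '<a name="end"></a>')
print(labels_per_fragment(sample))
```

In this sketch, whitespace between two anchors would end the run, so the first anchor would become an entry whose text is only that whitespace. That is one reason to keep adjacent anchors touching.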
ohi.py and can be used to help make a printed reference. It includes a simple HTML to LaTeX converter with support for CJK (including Pinyin), Greek, Braille, IPA, Latin diacritics, miscellaneous symbols etc., and PDF features such as cross-referencing should work. Line breaks are automatically added between entries, unless their anchor names end with * in which case they are separated by semicolons, to save paper when adding large numbers of short "see" entries. If the input has no anchors then ohi_latex will just convert simple HTML/Unicode into LaTeX.
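The separator rule for printed entries might look something like this in Python. This is a sketch under stated assumptions, not the converter's actual logic: join_entries is a hypothetical name, a plain newline stands in for whatever line break the LaTeX output uses, and the separator after an entry is assumed to be chosen by that entry's own anchor name.

```python
def join_entries(entries, line_break="\n"):
    """Join (anchor_name, text) pairs into one string.

    An entry whose anchor name ends with "*" is followed by "; "
    instead of a line break (assumed interpretation of the rule).
    """
    out = []
    for i, (name, text) in enumerate(entries):
        out.append(text)
        if i < len(entries) - 1:  # no separator after the last entry
            out.append("; " if name.endswith("*") else line_break)
    return "".join(out)

entries = [
    ("ant", "ant: an insect."),
    ("emmet*", "emmet: see ant"),
    ("pismire*", "pismire: see ant"),
    ("bat", "bat: a mammal."),
]
print(join_entries(entries))
```

With many short "see" entries in a row, this packs them onto shared lines rather than giving each its own line.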