Difference: GhostscriptWithTesseract (6 vs. 7)

Revision 7 - 2020-11-26 - RobinWatts

Line: 1 to 1
 

Ghostscript with Tesseract.

Line: 47 to 47
 cd ..
Changed:
<
<
Next, you need training data for the languages you want - currently, 'eng' is used by default, but others can be used by using -sOCRLanguage="eng,ara" etc.
>
>
Next, you need training data for the languages you want - currently, 'eng' is used by default, but others can be used by using -sOCRLanguage="eng+ara" etc.
 
wget -O tesseract/eng.traineddata https://github.com/tesseract-ocr/tessdata_fast/raw/master/eng.traineddata
Line: 92 to 92
 and for both English and Arabic, you'd use:
Changed:
<
<
debugbin/gswin32c.exe -sOCRLanguage="eng,ara"
>
>
debugbin/gswin32c.exe -sOCRLanguage="eng+ara"
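
As a rough illustration of the '+' separator introduced in this revision, the sketch below builds the -sOCRLanguage value from a list of language codes. The helper name join_ocr_languages is hypothetical and not part of Ghostscript; it only joins strings, and it assumes the matching training data for each language has already been downloaded.

```shell
# Hypothetical helper (not part of Ghostscript): join language codes
# with '+', the separator expected by -sOCRLanguage in this revision.
# The function body runs in a subshell so the IFS change does not leak.
join_ocr_languages() (
    IFS='+'
    printf '%s\n' "$*"
)

langs=$(join_ocr_languages eng ara)

# Print the option as it would appear on the command line.
echo "-sOCRLanguage=\"$langs\""
```

With the two codes above, this prints -sOCRLanguage="eng+ara", which matches the corrected command line.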
 

To get simple text extraction:

Line: 131 to 131
 

Still to do

Changed:
<
<
Passing changes upstream - in progress.
>
>
Passing changes upstream - done.
  Look into NEON SIMD - done.
 
Copyright 2014 Artifex Software Inc