author Jehan <jehan@girinstud.io> 2022-12-16 23:28:28 +0100
committer Jehan <jehan@girinstud.io> 2022-12-16 23:35:17 +0100
commit 6d31689632b48947f65536444682a487cab722f6
tree 20fcdcd96bdf7b291dad46685ea07eec9eb321e4
parent 0974920bddfbb1eb13a8d84aa1acd96822d9bf33
test: adding 2 tests for Hebrew/IBM862 recognition.
branch: wip/Jehan/improved-API
Both files contain the same text, taken from the following Wikipedia page, which was today's featured page on the Hebrew Wikipedia: https://he.wikipedia.org/wiki/שתי מסכתות על ממשל מדיני

I added it in 2 variants, since IBM862 can be used in both logical and visual order. The visual variant simply reverses the order of the letters within each line (the lines themselves stay in their original order), so that is what I did.

Note that the English title quoted in the text should probably not have been reversed. It doesn't matter much, though, since these letters are outside the Hebrew alphabet and would trigger a bad sequence score whichever order they are in, so I didn't bother fixing them.
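
For illustration only, the logical-to-visual conversion described in this message can be sketched in Python as below. This is not the script used to produce the test files (the commit does not say how they were generated); the helper names `logical_to_visual` and `encode_ibm862` are hypothetical, and Python's `cp862` codec is assumed to be an acceptable stand-in for IBM862.

```python
def logical_to_visual(text: str) -> str:
    """Reverse the characters inside each line; line order is unchanged.

    This is a plain reversal, so embedded Latin text and digits are
    reversed as well, as noted in the commit message.
    """
    return "\n".join(line[::-1] for line in text.split("\n"))


def encode_ibm862(text: str) -> bytes:
    """Encode text with Python's cp862 codec (assumed equivalent to IBM862)."""
    return text.encode("cp862")


# Hypothetical two-line sample: a Hebrew word (shin, lamed, vav, final mem)
# followed by the English title quoted in the test text, then a second line.
sample = "\u05e9\u05dc\u05d5\u05dd (: Two Treatises of Government)\n-1689.[1]"

logical_bytes = encode_ibm862(sample)
visual_bytes = encode_ibm862(logical_to_visual(sample))
```

Applying `logical_to_visual` twice returns the original text, so the same helper can turn a visual file back into logical order.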
-rw-r--r-- test/he/ibm862.logical.txt | 1
-rw-r--r-- test/he/ibm862.visual.txt | 1
2 files changed, 2 insertions, 0 deletions
diff --git a/test/he/ibm862.logical.txt b/test/he/ibm862.logical.txt
new file mode 100644
index 0000000..b22fa94
--- /dev/null
+++ b/test/he/ibm862.logical.txt
@@ -0,0 +1 @@
+ (: Two Treatises of Government) - ' , -1689.[1] (). "". , . , .
diff --git a/test/he/ibm862.visual.txt b/test/he/ibm862.visual.txt
new file mode 100644
index 0000000..5ce09f3
--- /dev/null
+++ b/test/he/ibm862.visual.txt
@@ -0,0 +1 @@
+. , . , ."" .)( ]1[.9861- , \' - )tnemnrevoG fo sesitaerT owT :(