harfbuzz - HarfBuzz text shaping library (new rewritten version)

Age	Commit message	Author	Files	Lines
2012-11-30	Add Persian test cases from Mehran Mehr	Behdad Esfahbod	3	-0/+10

2012-11-16	[Indic] Another try to unbreak Sinhala split matras	Behdad Esfahbod	2	-0/+5
	Just read the comments...
2012-11-14	Add test cases for Thai PUA shaping	Behdad Esfahbod	2	-0/+12

2012-11-14	Adjust diff rule for the new hb-shape output format	Behdad Esfahbod	1	-1/+1

2012-11-14	Add Sinhala test case for split matra U+0DDA	Behdad Esfahbod	1	-0/+1

2012-11-14	Fix test	Behdad Esfahbod	1	-1/+1

2012-11-13	Add buffer flags	Behdad Esfahbod	1	-0/+9
	New API:
	  hb_buffer_flags_t
	  HB_BUFFER_FLAGS_DEFAULT
	  HB_BUFFER_FLAG_BOT
	  HB_BUFFER_FLAG_EOT
	  HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES
	  hb_buffer_set_flags()
	  hb_buffer_get_flags()
	We use the BOT flag to decide whether to insert dottedcircle if the first char in the buffer is a combining mark.
	The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like ZWNJ/ZWJ/...
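The two behaviours described for the buffer flags can be sketched roughly as follows. This is a toy Python model, not the HarfBuzz C API: the `BufferFlags` enum, its numeric values, the `prepare` function, and the two-character ignorable set are all assumptions made for illustration.

```python
import enum
import unicodedata

class BufferFlags(enum.IntFlag):
    # Hypothetical Python stand-ins for the C flags named in the commit message.
    DEFAULT = 0
    BOT = 1                            # buffer starts at beginning of text
    EOT = 2                            # buffer ends at end of text
    PRESERVE_DEFAULT_IGNORABLES = 4

DOTTED_CIRCLE = "\u25CC"
# Tiny illustrative subset of default-ignorable characters (ZWNJ, ZWJ).
DEFAULT_IGNORABLES = {"\u200C", "\u200D"}

def prepare(text, flags=BufferFlags.DEFAULT):
    """Toy model: dotted-circle insertion and default-ignorable removal."""
    out = text
    # With BOT set, a leading combining mark has no base to attach to,
    # so a dotted circle is inserted in front of it.
    if flags & BufferFlags.BOT and out and unicodedata.category(out[0]).startswith("M"):
        out = DOTTED_CIRCLE + out
    # Unless the caller asks to preserve them, default ignorables are dropped.
    if not flags & BufferFlags.PRESERVE_DEFAULT_IGNORABLES:
        out = "".join(ch for ch in out if ch not in DEFAULT_IGNORABLES)
    return out
```

Without BOT the leading mark is left alone, since in this model the buffer may be a slice taken from the middle of a longer text where the mark does have a base.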
2012-11-13	Minor	Behdad Esfahbod	1	-1/+1

2012-11-13	Add hb_buffer_clear()	Behdad Esfahbod	1	-1/+21
	Which is like _reset(), but does NOT clear unicode-funcs.
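The difference between the two calls can be pictured with a small Python stand-in. `ToyBuffer` and its attributes are hypothetical and only mirror the one sentence above; they are not the HarfBuzz API.

```python
class ToyBuffer:
    """Hypothetical stand-in for a shaping buffer."""
    DEFAULT_FUNCS = "default-unicode-funcs"

    def __init__(self):
        self.items = []                          # buffered code points
        self.unicode_funcs = self.DEFAULT_FUNCS  # caller-configurable callbacks

    def reset(self):
        # Drops the contents and restores the default unicode-funcs.
        self.items = []
        self.unicode_funcs = self.DEFAULT_FUNCS

    def clear(self):
        # Drops the contents only; unicode-funcs stay as configured.
        self.items = []
```

A caller that installs custom unicode-funcs once and then shapes many strings would, under this model, use `clear()` between strings to avoid reinstalling them.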
2012-11-12	Add "new" Myanmar OT Script tag	Behdad Esfahbod	1	-0/+1
	Windows 8 added support for Myanmar shaping using the "mym2" script tag, even though Windows never supported the old "mymr" tag.
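One way to picture how two script tags for the same script can coexist is a preference list. The sketch below assumes the newer tag is tried first with the older one as a fallback; that ordering, and the function name `choose_script_tag`, are assumptions and are not stated in the commit.

```python
# Assumed preference order: newer tag first, older tag as fallback.
MYANMAR_TAGS = ["mym2", "mymr"]

def choose_script_tag(font_script_tags, candidates=MYANMAR_TAGS):
    """Return the first candidate tag the font declares, or None."""
    for tag in candidates:
        if tag in font_script_tags:
            return tag
    return None
```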
2012-11-12	Add Myanmar tests from UTN#11	Behdad Esfahbod	2	-0/+35

2012-11-05	Add test for non-joining Mongolian letters	Behdad Esfahbod	2	-0/+9
	For U+1880..U+1886 Uniscribe thinks they are non-joining. For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
2012-11-02	Add Tifinagh test data	Behdad Esfahbod	4	-0/+15

2012-11-02	Add Mongolian and 'Phags-pa joining test cases	Behdad Esfahbod	5	-0/+20

2012-11-01	Minor build fix	Behdad Esfahbod	1	-1/+9

2012-10-29	Ignore gid0 in test results	Behdad Esfahbod	1	-0/+1

2012-10-29	Add Ethiopic test case	Behdad Esfahbod	3	-0/+3
	This sequence: U+120B,U+135F,U+120B with the Nyala font from Win7 exposes a GPOS bug in Uniscribe, in that the positioned mark is wrongly moved as a result of a following kern. This is the one "failure" in the Ethiopic test suite :-).
	ETHIOPIC: 118900 out of 118901 tests passed. 1 failed (0.000841036%)
2012-09-07	[Indic] Find syllables before any features are applied	Behdad Esfahbod	1	-0/+1
	With FreeSerif, it seems that the 'ccmp' feature does ligature substitutions. That was then causing syllable match failures. We now find syllables before any features have been applied.
	Test sequence: U+0D9A,U+0DCA,U+200D,U+0DBB,U+0DCF
2012-09-05	Fixup test failure reporting	Behdad Esfahbod	1	-3/+5
	After we implemented dotted-circle, we were still ignoring any tests that had dotted-circle in them for any of the shapers. That meant that if we wrongly output a dotted-circle, the test was being ignored. Ouch!
	Fixing that shows regressions across the board. Most are Uniscribe bugs: NOT inserting dotted-circle when it should. Some are our machine bugs. This is in fact a nice way to catch Indic-machine deficiencies, and when I fix the regressions, our clusters should be much closer to Uniscribe.
	For now, we regressed from:
	BENGALI: 353997 out of 354285 tests passed. 288 failed (0.0812905%)
	DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
	GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
	GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
	KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
	KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
	LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
	MALAYALAM: 1048104 out of 1048416 tests passed. 312 failed (0.0297592%)
	ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
	SINHALA: 271747 out of 271847 tests passed. 100 failed (0.0367854%)
	TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
	TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
	TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
	To:
	BENGALI: 353990 out of 354285 tests passed. 295 failed (0.0832663%)
	DEVANAGARI: 707315 out of 707394 tests passed. 79 failed (0.0111678%)
	GUJARATI: 366447 out of 366506 tests passed. 59 failed (0.016098%)
	GURMUKHI: 60707 out of 60809 tests passed. 102 failed (0.167738%)
	KANNADA: 951042 out of 951913 tests passed. 871 failed (0.0915%)
	KHMER: 298962 out of 299124 tests passed. 162 failed (0.0541581%)
	LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
	MALAYALAM: 1048074 out of 1048416 tests passed. 342 failed (0.0326206%)
	ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
	SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
	TAMIL: 1091835 out of 1091837 tests passed. 2 failed (0.000183178%)
	TELUGU: 970553 out of 970573 tests passed. 20 failed (0.00206064%)
	TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
	Investigating.
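Comparing two such status reports by eye is error-prone, so a small helper can compute the per-script change in failure counts. The following is a minimal sketch that assumes the "SCRIPT: N out of M tests passed. K failed" wording shown above; `parse_stats` and `failure_deltas` are hypothetical names and are not part of the HarfBuzz test suite.

```python
import re

# Matches one status entry, e.g. "KHMER: 299106 out of 299124 tests passed. 18 failed".
LINE_RE = re.compile(
    r"(?P<script>[A-Z]+): (?P<passed>\d+) out of (?P<total>\d+) tests passed\.\s+"
    r"(?P<failed>\d+) failed"
)

def parse_stats(text):
    """Map script name -> (failed, total) for every status entry in text."""
    return {
        m["script"]: (int(m["failed"]), int(m["total"]))
        for m in LINE_RE.finditer(text)
    }

def failure_deltas(before, after):
    """Return {script: change in failure count} for scripts whose count changed."""
    old, new = parse_stats(before), parse_stats(after)
    return {
        script: new[script][0] - old[script][0]
        for script in old
        if script in new and new[script][0] != old[script][0]
    }

# Three entries copied from the report above.
before = (
    "GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%) "
    "KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%) "
    "LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)"
)
after = (
    "GURMUKHI: 60707 out of 60809 tests passed. 102 failed (0.167738%) "
    "KHMER: 298962 out of 299124 tests passed. 162 failed (0.0541581%) "
    "LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)"
)
print(failure_deltas(before, after))
```

For these three entries the helper reports 62 additional Gurmukhi failures and 144 additional Khmer failures, and omits Lao because its count did not change.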
2012-08-27	Minor	Behdad Esfahbod	1	-1/+1

2012-08-10	[test] Move around	Behdad Esfahbod	14	-1/+2

2012-08-10	[test] Add Urdu ligature sequences from CRULP	Behdad Esfahbod	13	-0/+17315

2012-07-31	Implement Unicode compatibility decompositions	Behdad Esfahbod	2	-0/+51
	Based on patch from Philip Withnall. https://bugs.freedesktop.org/show_bug.cgi?id=41095
2012-07-30	Add Hebrew test	Behdad Esfahbod	1	-0/+1

2012-07-30	[GSUB] Further adjustments to mark-attachment vs ligation interaction	Behdad Esfahbod	1	-0/+1
	The d1d69ec52e75a78575b620a1c456d528b6078170 change broke Kannada badly, since it was ligating consonants, pushing matra out, and then ligating with the matra. Adjust for that. See comments.
2012-07-29	Add Arabic tests for mark ligature component attachments	Behdad Esfahbod	2	-0/+19

2012-07-28	[GPOS] Fix mark-to-mark positioning when one of the marks is a ligature	Behdad Esfahbod	6	-0/+6
	This commit: a3313e54008167e415b72c780ca7b9cda958d07e broke MarkMarkPos when one of the marks itself is a ligature. That regressed 26 Tibetan tests (up from zero!). Fix that. Tibetan back to zero.
2012-07-24	[Indic] Reposition Gurmukhi top matras to after post	Behdad Esfahbod	1	-0/+1
	The font is forming a post-base consonant in some samples, and Uniscribe positions top matra on the post-base. Do the same. Gurmukhi failures down from 59 to 41 (0.0674242%).
2012-07-24	[Indic] Ignore Uniscribe output containing two zero-width space glyphs	Behdad Esfahbod	2	-0/+3
	Uniscribe is buggy and sometimes /eats/ a mark next to a non-joiner. Most of the Malayalam failures were actually hitting this bug. Ignore test output with two zero-width space glyphs. This is a hack until we build up the test suite infrastructure better.
	Bengali went down by 9, Devanagari by 2, Kannada by 130, Malayalam down from 1197 to 307, Sinhala down by 16, Telugu down by 26.
	New stats:
	BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
	DEVANAGARI: 693573 out of 693628 tests passed. 55 failed (0.00792932%)
	GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
	GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
	KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
	KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
	MALAYALAM: 1048109 out of 1048416 tests passed. 307 failed (0.0292823%)
	ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
	SINHALA: 271715 out of 271847 tests passed. 132 failed (0.0485567%)
	TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
	TELUGU: 970550 out of 970573 tests passed. 23 failed (0.00236973%)
2012-07-24	[Indic] Better position left-matra in Malayalam	Behdad Esfahbod	1	-0/+1
	Just put it before base, which is what's expected. Malayalam failures down from 1559 to 1197 (0.114172%).
	BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
	DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
	GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
	GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
	KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
	KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
	MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%)
	ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
	SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
	TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
	TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
2012-07-24	[Indic] Implement Reph+Ya-Phalaa interaction	Behdad Esfahbod	1	-0/+4
	The sequence Ra,H,Ya in Bengali is ambiguous, and Unicode encoded that to get Ya-Phalaa, one would place ZWJ before Halant. I.e. a ZWJ,H sequence requests subjoining, while an H,ZWJ sequence requests the half form. Implement that.
	Bengali failures go down from 377 to 297 (0.0838308%). Gujarati is down by 4 to 17 (0.0046384%). Kannada is down by 226 to 957 (0.100534%).
	Current status:
	BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
	DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
	GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
	GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
	KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
	KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
	MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%)
	ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
	SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
	TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
	TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
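The ZWJ placement rule for Ra,H,Ya can be written down as a small classifier. This is only a sketch of the rule as worded in the commit message, not the Indic shaper's actual code; the function name and the returned labels are assumptions.

```python
ZWJ = "\u200D"
HALANT = "\u09CD"  # BENGALI SIGN VIRAMA

def joiner_request(text, halant_index):
    """Classify what a ZWJ next to the halant at halant_index requests."""
    if text[halant_index] != HALANT:
        raise ValueError("halant_index does not point at a halant")
    if halant_index > 0 and text[halant_index - 1] == ZWJ:
        return "subjoined"    # ZWJ,H: e.g. Ya-Phalaa after Ra
    if halant_index + 1 < len(text) and text[halant_index + 1] == ZWJ:
        return "half"         # H,ZWJ: half form of the preceding consonant
    return "unspecified"      # plain H: left to the shaper's defaults

RA, YA = "\u09B0", "\u09AF"
print(joiner_request(RA + ZWJ + HALANT + YA, 2))
print(joiner_request(RA + HALANT + ZWJ + YA, 1))
```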
2012-07-24	[Indic] Unmark U+17D1 KHMER SIGN VIRIAM to NOT be a Virama	Behdad Esfahbod	1	-0/+1
	Fixes another 1 Khmer failure. Down to 30 (0.0100293%) now.
2012-07-24	[Indic] Reposition Khmer prebase-reordering Ra around split matras	Behdad Esfahbod	1	-0/+4
	In Khmer coeng model, a V,Ra can go after matras. If it goes after a split matra, it should be reordered to before the left part of such matra. Khmer failures down from 136 to 39 (0.0130381%).
2012-07-24	[Indic] Position Khmer U+17CE	Behdad Esfahbod	1	-0/+1
	Fixes another 6 Khmer failures. Now at 136 (0.0454661%).
2012-07-24	[Indic] In Sinhala, form forced Reph even if no other consonant found	Behdad Esfahbod	1	-0/+1
	Fixes another 10 Sinhala failures. Down to 148 (0.0544424%).
2012-07-24	[Indic] Further adjust base algorithm for Sinhala	Behdad Esfahbod	1	-0/+3
	Apparently if there is C,V,ZWJ,C, the first C will be base, but if it's C,ZWJ,V,C, the second one will be. Note that Uniscribe implements this differently, by breaking syllable in the case of C,ZWJ,V,C and putting the first consonant in one syllable and the rest in the next syllable. Sinhala failures down from 208 to 158 (0.0581209%). No changes to Khmer.
2012-07-24	[Indic] End Vowel-based syllable at ZWJ	Behdad Esfahbod	1	-0/+1
	One Devanagari test regressed, plus 10 Malayalam (at 1545 now). Fixed 120 Sinhala failures. Now at 208 (0.0765136%).
2012-07-23	[Indic] Improve Sinhala base algorithm and reph positioning	Behdad Esfahbod	1	-0/+1
	Sinhala does not have half forms. And most (all?) consonants can be base, except when preceded by ZWJ, which would request a subjoined form. Hence switch the base algorithm to categorize with Khmer, start search at start, and stop at a ZWJ. Also, mark all pos=base consonants after base to be subjoined. Mark base itself to have pos=base. Finally, adjust Sinhala's reph position to after-main. Brings down Sinhala failures from 455 to 328 (0.120656%).
2012-07-23	[Indic] exclude ligatures when matching on Indic category	Behdad Esfahbod	1	-0/+1
	If, say, an H,ZWJ,C ligature was formed, we don't want the code to detect that as a Halant. So, ignore ligatures when matching category in final_reordering. Sinhala failures down from 514 to 455 (0.167374%).
2012-07-23	[Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU	Behdad Esfahbod	2	-0/+17
	Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39. We do that by modifying the ccc for U+0E3A. Fixes the two remaining Thai failures (see previous commit).
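The effect of overriding a combining class can be sketched as follows. In the Unicode data U+0E3A has ccc 9 while U+0E38 and U+0E39 have ccc 103, so canonical ordering would put U+0E3A first. The override value 104 below is an assumption chosen only so that U+0E3A sorts after them; the value HarfBuzz actually assigns is not given in the commit message.

```python
import unicodedata

# Assumed override: any class above 103 (the ccc of U+0E38/U+0E39) works here.
MODIFIED_CCC = {"\u0E3A": 104}

def ccc(ch):
    """Combining class with the override applied."""
    return MODIFIED_CCC.get(ch, unicodedata.combining(ch))

def reorder_marks(text):
    """Stable-sort each run of combining marks (ccc != 0) by combining class."""
    out, run = [], []
    for ch in text:
        if ccc(ch) != 0:
            run.append(ch)
        else:
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)

# KO KAI followed by PHINTHU then SARA U: PHINTHU moves after SARA U.
reordered = reorder_marks("\u0E01\u0E3A\u0E38")
print([f"U+{ord(c):04X}" for c in reordered])
```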
2012-07-23	[Thai] Adjust SARA AM reordering to match Uniscribe	Behdad Esfahbod	5	-1/+40
	Adjust the list of marks before SARA AM that get the reordering treatment. Also adjust cluster formation to match Uniscribe.
	With Wikipedia test data, now I see:
	- For Thai, with the Angsana New font from Win7, I see 54 failures out of over 4M tests (0.00129107%). Of the 54, two are legitimate reordering issues (fix coming soon), and the other 52 are simply Uniscribe using a zero-width space char instead of an unknown character for missing glyphs. No idea why. The missing-glyph sequences include one that is a Thai character followed by an Arabic Sokun. Someone confused it with Nikhahit I assume!
	- For Lao, with the Dokchampa font from Win7, 33 tests fail out of 54k (0.0615167%). All seem to be insignificant mark positioning with two marks on a base. Have to investigate.
2012-07-22	[Indic] Merge in Malayalam tests	Behdad Esfahbod	1	-48/+46
	From: http://silpa.org.in/pub/tests/hb/ml/ml-harfbuzz-testdata.txt
2012-07-22	[Indic] Add extensive Sinhala tests	Behdad Esfahbod	1	-0/+4390
	Generated by: http://git.savannah.gnu.org/cgit/sinhala.git/plain/utils/gen-unicode-sinhala.py
2012-07-22	[Indic] Add Sinhala tests	Behdad Esfahbod	1	-2/+24
	Merge tests from: http://git.savannah.gnu.org/cgit/sinhala.git/plain/patches/icu-sinhala-rendering.txt
2012-07-20	Add a test case	Behdad Esfahbod	1	-0/+1

2012-07-20	[Indic] Reposition Oriya Candrabindu	Behdad Esfahbod	2	-0/+3
	Oriya failures down from 0.65% to 0.20%.
2012-07-19	[Indic] Recategorize some Kannada right matras	Behdad Esfahbod	2	-0/+8
	Kannada failures down from 3.5% to 2.93%.
2012-07-19	[Indic] Add failing test for Kannada	Behdad Esfahbod	1	-0/+1

2012-07-19	[test] Ignore tests with DOTTED CIRCLE in the output	Behdad Esfahbod	1	-0/+4

2012-07-18	[Indic] Accept a forced Rakar sequence at the end of syllable	Behdad Esfahbod	1	-0/+2
	In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra. If you put that at the end of a Consonant,Matra syllable, you get a dotted-circle from Uniscribe. Apparently adding a ZWJ before the Al-Lakuna "fixes" that. And people have been encoding that sequence... So, allow a forced "ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables. Fixes some 100 or more of the Sinhala failures. Now at 622 only (0.23%).
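The forced sequence can be recognised with a simple suffix check. This sketch only illustrates the code point pattern described above; the real change lives in the Indic syllable machine, and the helper name `split_forced_rakar` is hypothetical.

```python
ZWJ = "\u200D"
AL_LAKUNA = "\u0DCA"  # SINHALA SIGN AL-LAKUNA (the Sinhala virama)
RA = "\u0DBB"         # SINHALA LETTER RAYANNA

# The forced form described above: ZWJ, Virama, ZWJ, Ra.
FORCED_RAKAR = ZWJ + AL_LAKUNA + ZWJ + RA

def split_forced_rakar(syllable):
    """Split off a trailing forced-Rakar sequence, if one is present."""
    if syllable.endswith(FORCED_RAKAR):
        return syllable[: -len(FORCED_RAKAR)], FORCED_RAKAR
    return syllable, ""

# Consonant + matra, followed by the forced Rakar sequence.
head, tail = split_forced_rakar("\u0D9A\u0DD2" + FORCED_RAKAR)
print(len(head), len(tail))
```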