Age | Commit message | Author | Files | Lines |
2022-12-14 | test: fix test binary build for Windows. | Jehan | 1 | -2/+9 |
2022-12-14 | src: reset shortcut charset/language on Reset(). | Jehan | 1 | -0/+8 |
2022-12-14 | src: do not test with nsLatin1Prober anymore. | Jehan | 1 | -2/+9 |
2022-12-14 | src: improve confidence computation (generic and single-byte charset). | Jehan | 3 | -26/+31 |
2022-12-14 | script: generate more complete frequent characters when range is set. | Jehan | 1 | -19/+16 |
2022-12-14 | script, src: regenerate the Thai model. | Jehan | 3 | -288/+325 |
2022-12-14 | src, script: fix the order of characters for Vietnamese. | Jehan | 2 | -376/+356 |
2022-12-14 | src, script: add concept of alphabet_mapping in language models. | Jehan | 4 | -237/+192 |
2022-12-14 | script: regenerate Slovak and Slovene with better alphabet support. | Jehan | 6 | -558/+587 |
2022-12-14 | script: fix a stupid bug making same ratio for all frequent characters. | Jehan | 1 | -1/+1 |
2022-12-14 | script, src: regenerate the Vietnamese model. | Jehan | 3 | -229/+383 |
2022-12-14 | src: fix negative confidence wrapping around because of unsigned int. | Jehan | 1 | -1/+1 |
2022-12-14 | script, src: remove generated statistics data for Korean. | Jehan | 5 | -1315/+2 |
2022-12-14 | src: new nsCJKDetector specifically Chinese/Japanese/Korean recognition. | Jehan | 4 | -1/+313 |
2022-12-14 | README: fix a duplicate. | Jehan | 1 | -1/+1 |
2022-12-14 | Update README. | Jehan | 1 | -20/+105 |
2022-12-14 | src: consider any combination with a non-frequent character as sequence. | Jehan | 1 | -0/+10 |
2022-12-14 | src: add Hindi/UTF-8 support. | Jehan | 8 | -2/+501 |
2022-12-14 | src: improve confidence computation. | Jehan | 2 | -26/+108 |
2022-12-14 | script: fix a bit BuildLangModel.py when use_ascii is True. | Jehan | 1 | -3/+8 |
2022-12-14 | script, src: add generic Korean model. | Jehan | 8 | -41/+2223 |
2022-12-14 | src, test: fix the new Johab prober and add a test. | Jehan | 4 | -8/+15 |
2022-12-14 | src: build new charset prober for Johab Korean. | Jehan | 6 | -6/+8 |
2022-12-14 | add charset prober for Johab Korean | LSY | 9 | -2/+1029 |
2022-12-14 | script, src: generate the Hebrew models. | Jehan | 10 | -172/+642 |
2022-12-14 | test: 4 new tests for UTF-8. | Jehan | 4 | -0/+8 |
2022-12-14 | src: drop the SURE_YES confidence for character distribution probers. | Jehan | 1 | -1/+1 |
2022-12-14 | src: do not shortcut UTF-8 detection too early. | Jehan | 1 | -1/+3 |
2022-12-14 | src: nsEscCharsetProber also returns the correct language. | Jehan | 6 | -6/+21 |
2022-12-14 | src: make nsMBCSGroupProber report all valid candidates. | Jehan | 4 | -99/+202 |
2022-12-14 | src: allow for nsCharSetProber to return several candidates. | Jehan | 27 | -96/+110 |
2022-12-14 | src: nsMBCSGroupProber confidence weighed by language confidence. | Jehan | 1 | -2/+16 |
2022-12-14 | src: tweak again the language detection confidence. | Jehan | 1 | -13/+9 |
2022-12-14 | test: update unit test to check detected languages. | Jehan | 1 | -23/+43 |
2022-12-14 | src: reset language detectors when resetting a nsMBCSGroupProber. | Jehan | 1 | -0/+6 |
2022-12-14 | src, script: regenerate all existing language models. | Jehan | 43 | -4708/+5426 |
2022-12-14 | Using the generic language detector in UTF-8 detection. | Jehan | 29 | -42/+234 |
2022-12-14 | New generic language detector class. | Jehan | 3 | -0/+300 |
2022-12-14 | Rebuild a bunch of language models. | Jehan | 14 | -1401/+1617 |
2022-12-14 | src: add a --weight option to the CLI tool. | Jehan | 1 | -13/+72 |
2022-12-14 | src: new weight concept in the C API. | Jehan | 3 | -4/+86 |
2022-12-14 | src: fix the usage of `uchardet` tool. | Jehan | 1 | -1/+1 |
2022-12-14 | src: `uchardet` tool now shows the language code in verbose mode. | Jehan | 1 | -3/+9 |
2022-12-14 | script: update BuildLangModel.py to updated SequenceModel struct. | Jehan | 1 | -1/+2 |
2022-12-14 | src: new API to get the detected language. | Jehan | 51 | -104/+276 |
2022-12-14 | test: fix test script to use the new API and get rid of build warning. | Jehan | 1 | -1/+1 |
2022-12-14 | src: new option --verbose|-V in the `uchardet` CLI tool. | Jehan | 1 | -10/+38 |
2022-12-14 | src: new API to get all candidates and their confidence. | Jehan | 3 | -3/+51 |
2022-12-14 | src: now reporting encoding+confidence and keeping a list. | Jehan | 3 | -26/+62 |
2022-12-08 | README, doc: some README and release procedure updates. | Jehan | 2 | -9/+13 |