This is FriBidi, a Free Implementation of the Unicode BiDi algorithm.

Background
==========
One of the missing links stopping the penetration of free software in
Israel is the lack of support for Hebrew. In order to have proper
Hebrew support, the BiDi algorithm must be implemented. It is my hope
that this library will stimulate more Hebrew free software.

Of course the BiDi algorithm is not limited to Hebrew, so I expect
that our Arab neighbors will also find this software useful. 


Audience
========

It is my hope that this library will stimulate the implementation of
Hebrew and Arabic in lots of free software. Here is a small list of
projects that would benefit from the use of the FriBidi library, but
of course there are many more: Wine, Mozilla, Gtk, Gnome, Qt, KDE,
AbiWord, lynx, StarOffice.

Downloading
===========
The latest version of FriBidi may be found at:

   http://fribidi.sourceforge.net/

Building
========
See INSTALL for a description of how to build and install this library.

Implementation
==============
The library implements most of the algorithm as described in the
"Unicode Bidirectional Algorithm, Working Draft Unicode Technical Report
#9, http://www.unicode.org/unicode/reports/tr9/". The major feature
that is currently missing in FriBidi is the support for explicit overrides.

In the API I was inspired by the document "Bi-Di languages support
- BiDi API proposal", http://www.langbox.com/AraMosaic/mozilla/BiDi_API.html,
by Franck Portaneri <franck@langbox.com> which he wrote as a proposal
for adding BiDi support to Mozilla.

Internally the library uses Unicode entirely. The character property
function was automatically created from the Unicode property list
document PropList.txt available from the Unicode ftp site. This
means that every Unicode character should be treated in strict
accordance with the Unicode specification. The same is true for the
mirroring of characters, which also works for all the characters
listed as mirrorable in the Unicode specification.

Other character sets must be converted into Unicode before the library
may be used. In order to use e.g. iso8859-8, the function

     void
     fribidi_iso8859_8_to_unicode(guchar *s,
                                  /* output */
                                  FriBidiChar *us)

must be called, which translates the guchar string s into a Unicode
string. There is also a corresponding fribidi_unicode_to_iso8859_8
that may be called to convert the string back to iso8859-8 for output.
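
To illustrate what such a conversion involves, here is a minimal,
self-contained sketch. It is not FriBidi's implementation: the function
name sketch_iso8859_8_to_unicode, the explicit length parameter and the
use of uint32_t in place of FriBidiChar are assumptions made for the
example. It relies only on the fact that the Hebrew letters occupy bytes
0xE0..0xFA in iso8859-8 and code points U+05D0..U+05EA in Unicode; all
other bytes are passed through unchanged, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not part of FriBidi.
   Converts len bytes of iso8859-8 text in s into len code points in us.
   Hebrew letters (0xE0..0xFA) are shifted to U+05D0..U+05EA.
   Simplification: every other byte, including ASCII, is copied as is. */
void sketch_iso8859_8_to_unicode(const unsigned char *s, size_t len,
                                 uint32_t *us)
{
    assert(s != NULL && us != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = s[i];
        if (c >= 0xE0 && c <= 0xFA)
            us[i] = 0x05D0u + (uint32_t)(c - 0xE0); /* alef .. tav */
        else
            us[i] = c;                              /* unchanged */
    }
}
```

For example, the bytes 0xF9 0xEC 0xE5 0xED (the word "shalom") become
U+05E9 U+05DC U+05D5 U+05DD.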

The reordering of characters is done through the function:
      
     void
     fribidi_log2vis(/* input */
                     FriBidiChar *str,
                     int len,
                     FriBidiCharType *pbase_dir,
                     /* output */
                     FriBidiChar *visual_str,
                     gint        *position_L_to_V_list,
                     gint        *position_V_to_L_list,
                     gint8       *embedding_level_list
                     )
    

where
     str                    is the Unicode input string
     len                    is the length of the Unicode string
     pbase_dir              is the input and output base direction. If
                            *pbase_dir == FRIBIDI_TYPE_N then fribidi_log2vis
                            calculates the base direction on its own
                            according to the BiDi algorithm.
     visual_str             The reordered output Unicode string.
     position_L_to_V_list   Maps the positions in the logical string to 
                            positions in the visual string.
     position_V_to_L_list   Maps the positions in the visual string to 
                            the positions in the logical string.
     embedding_level_list   Returns the embedding level of each character. Even
                            levels indicate LTR characters, and odd levels
                            indicate RTL characters. The main use of this
                            list is in interactive applications when, e.g.,
                            the embedding level determines cursor display.

If any of the output pointers is NULL, then that information is not
calculated.
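
The two position lists describe the same permutation seen from opposite
sides, and the parity of an embedding level gives the direction. The
following sketch shows both relationships in plain C. The helper names
invert_position_map and level_is_rtl are hypothetical and are not part
of the FriBidi API; plain int is used instead of gint and gint8 so that
the example builds without glib.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of FriBidi.
   Given v_to_l (visual index -> logical index), fills l_to_v
   (logical index -> visual index), i.e. the inverse permutation. */
void invert_position_map(const int *v_to_l, int *l_to_v, size_t len)
{
    for (size_t v = 0; v < len; v++) {
        /* Each entry must be a valid logical index. */
        assert(v_to_l[v] >= 0 && (size_t)v_to_l[v] < len);
        l_to_v[v_to_l[v]] = (int)v;
    }
}

/* Hypothetical helper: odd embedding levels are RTL, even ones LTR. */
int level_is_rtl(int level)
{
    return (level & 1) != 0;
}
```

An interactive application could, for instance, look up the embedding
level of the character under the cursor and use level_is_rtl to decide
on which side of that character to draw the cursor.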

A test program test_fribidi has been written to test out the algorithm.
test_fribidi currently works on iso8859-8 by default, but by adding
the flag -capital_rtl it treats capital letters as RTL, as is done
for illustration purposes in the Unicode specification. 

What it looks like
==================

Here is the output of 
    ./test_fribidi -capital_rtl tests/test-capital-rtl

car is THE CAR in arabic            => car is RAC EHT in arabic           
CAR IS the car IN ENGLISH           =>           HSILGNE NI the car SI RAC
he said "IT IS 123, 456, OK"        => he said "KO ,456 ,123 SI TI"       
he said "IT IS (123, 456), OK"      => he said "KO ,(456 ,123) SI TI"     
he said "IT IS 123,456, OK"         => he said "KO ,123,456 SI TI"        
he said "IT IS (123,456), OK"       => he said "KO ,(123,456) SI TI"      
HE SAID "it is 123, 456, ok"        =>        "it is 123, 456, ok" DIAS EH
<H123>shalom</H123>                 =>                 <123H/>shalom<123H>
<h123>SAALAM</h123>                 => <h123>MALAAS</h123>                
HE SAID "it is a car!" AND RAN      =>      NAR DNA "!it is a car" DIAS EH
HE SAID "it is a car!x" AND RAN     =>     NAR DNA "it is a car!x" DIAS EH
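
The first line of this output can be reproduced with a toy that is far
simpler than the real algorithm. The sketch below is not FriBidi code
and the name toy_capital_rtl is hypothetical. It assumes an LTR base
direction, treats capital letters as RTL and spaces as the only
neutrals, and reverses each run that starts and ends with a capital
letter. Digits, punctuation and RTL base directions, which the other
lines above exercise, are deliberately not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy, not part of FriBidi.
   Reverses, in place, every maximal run that starts and ends with a
   capital letter and contains only capital letters and spaces.
   Assumes an LTR base direction. */
void toy_capital_rtl(char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    size_t i = 0;
    while (i < n) {
        if (!isupper((unsigned char)s[i])) {
            i++;
            continue;
        }
        /* Extend over capitals and spaces, remembering the last
           capital so that trailing spaces stay where they are. */
        size_t last = i;
        for (size_t j = i;
             j < n && (isupper((unsigned char)s[j]) || s[j] == ' ');
             j++) {
            if (s[j] != ' ')
                last = j;
        }
        /* Reverse s[i..last]. */
        for (size_t a = i, b = last; a < b; a++, b--) {
            char t = s[a];
            s[a] = s[b];
            s[b] = t;
        }
        i = last + 1;
    }
}
```

Applied to "car is THE CAR in arabic", the toy yields
"car is RAC EHT in arabic", matching the first line above.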

Executable
==========
There is also a command-line utility called fribidi that loops over
the text of a file and performs the BiDi algorithm on each line. Run
fribidi with the help option to learn its usage. 

Bugs and comments
=================

Report FriBidi bugs at:

	http://sourceforge.net/bugs/?group_id=2722

And send your comments to:

	fribidi-discuss@lists.sourceforge.net