KEEPING-PROGRAMS-LINEAR



One of the essential pieces of most UI programs is the event loop.
Blocking the event loop in one subsystem of a program prevents other
subsystems from receiving events.  This is obviously not desirable, so
it is important to make sure that operations don't block the event loop.
Instead, they must be handled asynchronously, i.e., post a request for
the operation to get done and wait for notification that it is finished.
Factoring the code to work in these situations can be cumbersome and
unclear.  Something as simple as:

do_an_operation ()
do_another_operation ()
do_a_third_operation ()

has to be factored into a non-linear program, where one operation is
requested and its reply is waited for in the event loop; then, on reply,
the next operation is requested, and so on.
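
To make the problem concrete, here is a minimal sketch of what the
hand-factored version tends to look like.  Everything in it is
hypothetical: request_operation and run_event_loop are a toy stand-in
for a real event loop (a fixed-size queue of replies), and the three
callbacks just bump a counter.  Note that the steps end up written in
reverse order, each one requesting the next from its reply handler.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ReplyFunc) (void *user_data);

/* Toy event loop (an assumption for illustration): a FIFO of replies
 * that are waiting to be delivered. */
#define MAX_PENDING 8
typedef struct { ReplyFunc func; void *user_data; } PendingReply;
static PendingReply pending[MAX_PENDING];
static size_t n_pending = 0;

/* "Request" an operation: the reply arrives later, from the loop. */
void
request_operation (ReplyFunc on_reply, void *user_data)
{
    assert (n_pending < MAX_PENDING);
    pending[n_pending].func = on_reply;
    pending[n_pending].user_data = user_data;
    n_pending++;
}

/* Deliver queued replies until none are left.
 * Returns how many replies were delivered. */
int
run_event_loop (void)
{
    int delivered = 0;
    while (n_pending > 0) {
        PendingReply reply = pending[0];
        size_t i;
        for (i = 1; i < n_pending; i++)
            pending[i - 1] = pending[i];
        n_pending--;
        reply.func (reply.user_data);
        delivered++;
    }
    return delivered;
}

/* The three-line program, split across reply handlers.  The user data
 * is an int counting how many operations have completed. */
void
on_third_operation_done (void *user_data)
{
    *(int *) user_data += 1;
}

void
on_second_operation_done (void *user_data)
{
    *(int *) user_data += 1;
    request_operation (on_third_operation_done, user_data);
}

void
on_first_operation_done (void *user_data)
{
    *(int *) user_data += 1;
    request_operation (on_second_operation_done, user_data);
}

void
start_operations (int *completed)
{
    request_operation (on_first_operation_done, completed);
}
```

Calling start_operations only queues the first request; nothing
completes until run_event_loop is entered, and the control flow is
spread over four functions instead of three lines.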

I guess the goal should be to come up with an api that makes a
program as simple as, or nearly as simple as, the above but still
returns to the event loop between and during steps.

Maybe something like:

transaction = transaction_new ()
transaction_add_action (transaction, do_an_operation)
transaction_add_action (transaction, do_another_operation)
transaction_add_action (transaction, do_a_third_operation)
transaction_commit (transaction)

The above api would invoke each added action in turn after the
previously invoked action finished.  A big hole in the above
api, though, is failure handling.  If the second action fails,
there should be a way to say it failed and to cancel the third
action before it runs.

If it isn't clear, what we need is a state machine where there
are two possible state transitions: go to the failure state, or go
to the next action state.  Another interesting point is that states
have data associated with them, and there isn't any mechanism
provided in the above api for associating data with the states.

Each action should be able to query information about the
transaction it's currently in, and also be able to keep its own 
state information.  One interesting point is the transaction
object itself is generic above.  It probably needs to be,
because having specialized transaction implementations for every
possible type of transaction would take a lot of coding time.
Transactions need to be easy to define.  Since the transaction
objects themselves are generic, transaction-specific information
must be defined by the actions in the transaction.  Early
actions should define the state data needed by later
actions.

A common idiom in glib for callbacks is to provide a "user data"
pointer and a destroy notification function.  The user data pointer is
passed to the callback when the callback is called and the
destroy notification function is invoked when the operation
associated with the callback is complete.  The idea is you can
allocate some state data for use by the callback and then free
it from the destroy notification function.  If we were to apply
that idea to the transaction example above we would end up with:

transaction = transaction_new ()
transaction_add_action (transaction, do_an_operation,
                        transaction_state_data, free_transaction_state_data)
transaction_add_action (transaction, do_another_operation,
                        transaction_state_data, free_transaction_state_data)
transaction_add_action (transaction, do_a_third_operation,
                        transaction_state_data, free_transaction_state_data)
transaction_commit (transaction)

Where transaction_state_data would encode the current state of
the transaction along with any transaction-specific data.  The
problem with this approach is there is a lot of redundancy.  We
want the same data available from every action along the way, so later
actions can read data set by earlier actions.  Since we want it throughout
the transaction, we should only have to specify that data at most once.

Another idea might be something like:

transaction = transaction_new (transaction_state_data, free_transaction_state_data)
transaction_add_action (transaction, do_an_operation)
transaction_add_action (transaction, do_another_operation)
transaction_add_action (transaction, do_a_third_operation)
transaction_commit (transaction)

and we then could pass the state data to each action automatically
or via an accessor function on the transaction object.  A
problem with this approach (and the approach before this one) is
it enforces a tight binding between all the actions in the
transaction.  If transaction_state_data is statically defined
then all the actions "know" about all the data that all the
other actions need and use.  It means an action is tightly
coupled to its transaction.  This is undesirable because an
action really only cares about the state data it's operating on.
It only needs to "know" about the data it depends on and the
data it's outputting for later actions to use.  By limiting the
data available to an action, we make the action more primitive.
In other words, we can make sure that it is usable in different
situations, for multiple independent transactions.

One way to do this, at the loss of type safety, is to make
transaction_state_data be a generic container type like a
GHashTable.  An action would then rely on certain named keys
being set to do its work and would set other named keys for
later actions to do their work.  This could be done in the above
suggested api without changes, but the api could be simplified
by removing the user_data argument entirely and having the hash
table be provided by the transaction object.

Maybe something like:

from do_an_operation: 
  transaction_set_state_data (transaction, "an-operation-key", key,
                              free_an_operation_key)
from do_another_operation:
  key = transaction_get_state_data (transaction, "an-operation-key")
  copy_key (key, copy)
  transaction_delete_state_data (transaction, "an-operation-key")

In this way, the current state is determined from the current
action and defined transaction state data, and all state data is
cleaned up automatically when the transaction finishes, or
earlier if it is explicitly removed from the transaction object
with transaction_delete_state_data.
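
As a rough sketch of the keyed state data idea, here is a tiny store
written in plain C with a linked list in place of a GHashTable.  The
names (StateStore, state_store_set and so on) are made up for
illustration and are not part of any existing api; count_release is
only an example destroy function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*DestroyFunc) (void *data);

typedef struct StateEntry {
    char *key;
    void *data;
    DestroyFunc destroy;        /* may be NULL */
    struct StateEntry *next;
} StateEntry;

typedef struct {
    StateEntry *entries;        /* NULL when the store is empty */
} StateStore;

/* Remove the entry for key, running its destroy function.
 * Returns 1 if an entry was removed, 0 if the key was not set. */
int
state_store_delete (StateStore *store, const char *key)
{
    StateEntry **link;
    for (link = &store->entries; *link != NULL; link = &(*link)->next) {
        StateEntry *entry = *link;
        if (strcmp (entry->key, key) == 0) {
            *link = entry->next;
            if (entry->destroy != NULL)
                entry->destroy (entry->data);
            free (entry->key);
            free (entry);
            return 1;
        }
    }
    return 0;
}

/* Set key to data, replacing (and destroying) any previous value. */
void
state_store_set (StateStore *store, const char *key,
                 void *data, DestroyFunc destroy)
{
    StateEntry *entry;
    assert (key != NULL);
    state_store_delete (store, key);
    entry = malloc (sizeof *entry);
    assert (entry != NULL);
    entry->key = malloc (strlen (key) + 1);
    assert (entry->key != NULL);
    strcpy (entry->key, key);
    entry->data = data;
    entry->destroy = destroy;
    entry->next = store->entries;
    store->entries = entry;
}

/* Look up key.  Returns NULL if it was never set or was deleted. */
void *
state_store_get (const StateStore *store, const char *key)
{
    const StateEntry *entry;
    for (entry = store->entries; entry != NULL; entry = entry->next)
        if (strcmp (entry->key, key) == 0)
            return entry->data;
    return NULL;
}

/* What the transaction would do when it finishes: destroy the rest. */
void
state_store_clear (StateStore *store)
{
    while (store->entries != NULL)
        state_store_delete (store, store->entries->key);
}

/* Example destroy function: the data is an int that counts how many
 * times it has been released. */
void
count_release (void *data)
{
    *(int *) data += 1;
}
```

An early action would call state_store_set, a later one
state_store_get, and the transaction itself would call
state_store_clear on success, failure, or cancel.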

One thought is that this functionality overlaps with GObject data
(g_object_set_data_full ()).  It might be better to just use that.

Another important consideration is that actions should only
chip away at a problem, not block.  So they will need to be
invoked multiple times and notify the transaction when they are
done.  Each action handler could return one of three possible
values:

NOT_FINISHED,
FAILED,
SUCCEEDED

If an action isn't finished it will be called again until it is
finished.  If an action fails, then the whole transaction fails
and all state is cleaned up.  If it succeeds, the transaction 
transitions to the next action.

In either the failed or success cases, there should probably be
a handler invoked to note when the transaction has finished.
The handler should get passed the transaction state.

So we probably want something like

transaction = transaction_new (on_transaction_complete, user_data)
transaction_add_action (transaction, do_an_operation)
transaction_add_action (transaction, do_another_operation)
transaction_add_action (transaction, do_a_third_operation)
transaction_commit (transaction)

Should we pass user_data to transaction_new ?  I think so, because
it's per-transaction data that the user will probably always want
to get at when the transaction completes.
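
Here is a minimal sketch of how the three return values and the
completion handler could fit together.  It is plain C with a
fixed-size action array, and it makes several assumptions that the
notes above do not settle: transaction_init and transaction_step are
hypothetical names, the completion handler receives a success flag,
and the demo actions reach their state through user_data rather than
through named state data.  transaction_step is meant to be called
once per event loop iteration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    ACTION_NOT_FINISHED,
    ACTION_FAILED,
    ACTION_SUCCEEDED
} ActionResult;

typedef struct Transaction Transaction;
typedef ActionResult (*ActionFunc) (Transaction *transaction);
typedef void (*CompleteFunc) (Transaction *transaction,
                              int succeeded, void *user_data);

#define MAX_ACTIONS 16

struct Transaction {
    ActionFunc actions[MAX_ACTIONS];
    size_t n_actions;
    size_t current;            /* index of the action to run next */
    int finished;              /* set once on_complete has been called */
    CompleteFunc on_complete;  /* may be NULL */
    void *user_data;
};

void
transaction_init (Transaction *t, CompleteFunc on_complete, void *user_data)
{
    t->n_actions = 0;
    t->current = 0;
    t->finished = 0;
    t->on_complete = on_complete;
    t->user_data = user_data;
}

void
transaction_add_action (Transaction *t, ActionFunc action)
{
    assert (t->n_actions < MAX_ACTIONS);
    t->actions[t->n_actions++] = action;
}

static void
transaction_finish (Transaction *t, int succeeded)
{
    t->finished = 1;
    if (t->on_complete != NULL)
        t->on_complete (t, succeeded, t->user_data);
}

/* Give the current action one chance to make progress.
 * Returns nonzero while the transaction still needs more steps. */
int
transaction_step (Transaction *t)
{
    if (t->finished)
        return 0;
    if (t->current == t->n_actions) {   /* nothing (left) to do */
        transaction_finish (t, 1);
        return 0;
    }
    switch (t->actions[t->current] (t)) {
    case ACTION_NOT_FINISHED:           /* call the same action again */
        break;
    case ACTION_FAILED:                 /* later actions never run */
        transaction_finish (t, 0);
        break;
    case ACTION_SUCCEEDED:              /* move on to the next action */
        if (++t->current == t->n_actions)
            transaction_finish (t, 1);
        break;
    }
    return !t->finished;
}

/* Demo state and handlers, for illustration only. */
typedef struct { int work_left; int completions; int succeeded; } DemoState;

ActionResult
chip_away (Transaction *t)
{
    DemoState *state = t->user_data;
    if (state->work_left > 0)
        state->work_left--;
    return state->work_left > 0 ? ACTION_NOT_FINISHED : ACTION_SUCCEEDED;
}

ActionResult
always_fail (Transaction *t)
{
    (void) t;
    return ACTION_FAILED;
}

void
record_completion (Transaction *t, int succeeded, void *user_data)
{
    DemoState *state = user_data;
    (void) t;
    state->completions++;
    state->succeeded = succeeded;
}
```

The sketch leaves out state cleanup on failure and any main loop
integration; a real version would presumably drive transaction_step
from an idle or fd source instead of a bare loop.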

Other thoughts...

It would be nice to be able to not call an incomplete
action's handler more than necessary.  Maybe some sort of api
like, "don't call me again until this fd is ready" would be
useful.  We probably want to be able to set that from within
the action, since the action won't know what the fd is
until it has run once.  Maybe an api like:

void transaction_pause_for_fd (Transaction *transaction, int fd, GIOCondition io_condition)

Of course there should be a way to cancel a transaction as well
(maybe transaction_cancel (transaction)) and when a transaction
is canceled there should be a way to clean up the fd.  It
might be sufficient to expect the idiom

transaction_set_state_data (transaction, "my-action-fd", fd, close_fd)
transaction_pause_for_fd (transaction, fd, G_IO_IN)

But that might be error prone, so it's probably better to
enforce the state directly in the pause call.

transaction_pause_for_fd (transaction, "my-action-fd", fd, G_IO_IN, close_fd)

We should also post a g_warning if a transaction pauses on more
than one fd.

There are other types of things to wait on that would probably
be useful to add to the api.  For instance, 

void transaction_pause_for_a_while (Transaction *transaction, guint a_while);

Sources already have their own mechanism for destroy notification, though,
so we don't need to worry about using named state data.  That means the
first example (waiting on an fd) could be:

transaction_pause_for_fd (transaction, fd, G_IO_IN, close_fd)
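
A rough sketch of the bookkeeping behind "don't call me again until
this fd is ready" follows.  To stay self-contained it uses POSIX
poll(2) with a zero timeout instead of a GSource and GIOCondition, so
it only illustrates the idea and is not the proposed api.  The Pause
type and the pause_* functions are hypothetical, the warning goes to
stderr rather than through g_warning, and running the cleanup
function only on cancel is an assumption of the sketch.

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

typedef void (*CloseFunc) (int fd);

typedef struct {
    int fd;              /* -1 when the transaction is not paused */
    short events;        /* poll(2) events to wait for, e.g. POLLIN */
    CloseFunc close_fd;  /* run if the pause is dropped by a cancel */
} Pause;

void
pause_init (Pause *pause)
{
    pause->fd = -1;
    pause->events = 0;
    pause->close_fd = NULL;
}

/* Ask not to be called again until fd is ready.  Returns 1 on
 * success, or 0 (with a warning) if already paused on another fd. */
int
pause_for_fd (Pause *pause, int fd, short events, CloseFunc close_fd)
{
    if (pause->fd != -1) {
        fprintf (stderr, "warning: already paused on fd %d\n", pause->fd);
        return 0;
    }
    pause->fd = fd;
    pause->events = events;
    pause->close_fd = close_fd;
    return 1;
}

/* Nonzero when the action handler may be called again.  The check
 * never blocks because the poll timeout is zero. */
int
pause_is_over (Pause *pause)
{
    struct pollfd pfd;
    if (pause->fd == -1)
        return 1;
    pfd.fd = pause->fd;
    pfd.events = pause->events;
    pfd.revents = 0;
    if (poll (&pfd, 1, 0) <= 0)
        return 0;                /* not ready yet, or poll failed */
    pause->fd = -1;              /* fd stays open; the action owns it */
    return 1;
}

/* Cancel: drop the pause and let the cleanup function close the fd. */
void
pause_cancel (Pause *pause)
{
    if (pause->fd != -1 && pause->close_fd != NULL)
        pause->close_fd (pause->fd);
    pause->fd = -1;
}

/* Example cleanup function. */
void
close_fd (int fd)
{
    close (fd);
}
```

A transaction would check pause_is_over before invoking the current
action's handler; with real main loop integration the fd would be
watched by the loop instead of being polled on every iteration.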

Another interesting point is actions may sometimes be compound.
In those cases it might be useful to allow an action to prepend
actions before it in the queue.  Perhaps something like:

transaction_add_action (transaction, do_another_operation_first,
                        rollback_the_other_operation)

Another thing to think about is main loop integration.  We
should allow transactions to be attached to arbitrary contexts.

transaction_attach (transaction, g_main_context_get_default ());

One last thing is, after a transaction is successfully completed, 
will each action need to do some sort of clean up?  Or when that's
required is it sufficient to just expect the user to add additional
actions at the end of the chain to do the clean up?

Might need some experimentation to see.

Also, it should be possible to chain transactions together
(treat a transaction as an action).

add_action should be callable from within an action as well.
If that happens then the action should get inserted into the action
queue before the current action and the current action pointer should
get rewound. Rollback handling needs to be thought through in this scenario.
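
One possible reading of the insert-and-rewind behaviour, sketched in
plain C.  The assumptions here are mine: actions live in a fixed-size
array, several actions added by one handler run in the order they
were added, and a handler that adds actions is called again after
them no matter what it returned (unless it failed).  Rollback is not
handled at all.  compound_action and the step_* handlers are demo
code; the log string records the order in which actions finished.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    ACTION_NOT_FINISHED,
    ACTION_FAILED,
    ACTION_SUCCEEDED
} ActionResult;

typedef struct Transaction Transaction;
typedef ActionResult (*ActionFunc) (Transaction *transaction);

#define MAX_ACTIONS 16

struct Transaction {
    ActionFunc actions[MAX_ACTIONS];
    size_t n_actions;
    size_t current;    /* index of the action to run next */
    int in_action;     /* nonzero while an action handler is running */
    size_t inserted;   /* actions added by the handler that is running */
    int expanded;      /* demo state for compound_action */
    char log[MAX_ACTIONS + 1];  /* demo: order actions finished in */
};

void
transaction_init (Transaction *t)
{
    memset (t, 0, sizeof *t);
}

/* Outside a handler: append.  Inside a handler: insert in front of
 * the running action, after anything it has already inserted. */
void
transaction_add_action (Transaction *t, ActionFunc action)
{
    size_t at = t->in_action ? t->current + t->inserted : t->n_actions;
    assert (t->n_actions < MAX_ACTIONS);
    memmove (&t->actions[at + 1], &t->actions[at],
             (t->n_actions - at) * sizeof t->actions[0]);
    t->actions[at] = action;
    t->n_actions++;
    if (t->in_action)
        t->inserted++;
}

/* One step.  The result describes the whole transaction, so a
 * transaction could itself be wrapped as an action. */
ActionResult
transaction_step (Transaction *t)
{
    ActionResult result;
    if (t->current == t->n_actions)
        return ACTION_SUCCEEDED;
    t->in_action = 1;
    t->inserted = 0;
    result = t->actions[t->current] (t);
    t->in_action = 0;
    if (result == ACTION_FAILED)
        return ACTION_FAILED;
    if (t->inserted > 0)    /* "rewound": current now indexes the first */
        return ACTION_NOT_FINISHED;          /* inserted action */
    if (result == ACTION_SUCCEEDED)
        t->current++;
    return t->current == t->n_actions ? ACTION_SUCCEEDED
                                      : ACTION_NOT_FINISHED;
}

static void
log_finished (Transaction *t, char name)
{
    size_t len = strlen (t->log);
    assert (len < MAX_ACTIONS);
    t->log[len] = name;
    t->log[len + 1] = '\0';
}

ActionResult step_a (Transaction *t) { log_finished (t, 'a'); return ACTION_SUCCEEDED; }
ActionResult step_b (Transaction *t) { log_finished (t, 'b'); return ACTION_SUCCEEDED; }
ActionResult last_action (Transaction *t) { log_finished (t, 'd'); return ACTION_SUCCEEDED; }

/* A compound action: the first call queues its prerequisites, the
 * second call (after they have run) does its own work. */
ActionResult
compound_action (Transaction *t)
{
    if (!t->expanded) {
        t->expanded = 1;
        transaction_add_action (t, step_a);
        transaction_add_action (t, step_b);
        return ACTION_NOT_FINISHED;
    }
    log_finished (t, 'c');
    return ACTION_SUCCEEDED;
}
```

Because the array index of the current action does not change when
actions are inserted in front of it, the rewind falls out for free in
this layout; a tree of actions would need explicit bookkeeping.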
Depending on the order things should get run, we may want to keep the actions 
in a tree so we can track dependency information.