Nice: Design documentation
==========================

Socket ownership
----------------

For UDP candidates, one socket is created for each component and bound
to INADDR_ANY. The same local socket is used for the host candidate,
the STUN candidate, and the TURN candidate. The socket handles are
stored in the Component structure.

The library uses the source address of each incoming packet to
identify which remote candidate, if any, sent it. A packet whose source
matches no known remote candidate may indicate a peer-derived candidate.
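
The lookup described above can be sketched as follows. This is a
minimal illustration, not libnice's implementation: the ExampleAddr and
ExampleRemoteCandidate types and both functions are hypothetical, and
addresses are simplified to IPv4.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified transport address (IPv4 only). */
typedef struct {
    uint8_t  ip[4];
    uint16_t port;
} ExampleAddr;

/* Hypothetical remote candidate record; not libnice's own type. */
typedef struct {
    ExampleAddr addr;
    int         id;
} ExampleRemoteCandidate;

/* Two addresses match when both the IP bytes and the port are equal. */
int example_addr_equal(const ExampleAddr *a, const ExampleAddr *b)
{
    return a->port == b->port && memcmp(a->ip, b->ip, sizeof a->ip) == 0;
}

/* Returns the remote candidate whose address equals the packet's source
 * address, or NULL when the source is unknown. An unknown source could
 * then be considered as a new peer-derived candidate. */
const ExampleRemoteCandidate *
example_find_remote_by_source(const ExampleRemoteCandidate *cands, size_t n,
                              const ExampleAddr *source)
{
    for (size_t i = 0; i < n; i++) {
        if (example_addr_equal(&cands[i].addr, source))
            return &cands[i];
    }
    return NULL;
}
```

Because all candidates of a component share one socket, the source
address is the only per-packet information available for this lookup.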

XXX: Describe the subtle issues with ICMP error handling when one
socket is used to send to multiple destinations.

Real-time considerations
------------------------

One potential use for libnice code is providing network connectivity
for media transport in voice and video telephony applications. This
means that the libnice code is potentially run in a real-time context
(for instance under POSIX SCHED_FIFO/SCHED_RR scheduling policy) and
ideally has deterministic execution time.

To be real-time friendly, operations with non-deterministic execution
time (dynamic memory allocation, file and other resource access) should
be done during the startup/initialization phase. During an active
session (connectivity has been established and non-STUN traffic is
being sent), code should be as deterministic as possible.
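
One common way to keep allocation out of the active session is a pool
that is set up once at initialization. The sketch below is only an
illustration of that idea: ExamplePool, ExamplePacket, the pool size,
and the buffer size are all assumptions, not libnice structures.

```c
#include <stddef.h>

#define EXAMPLE_POOL_SIZE 8      /* assumed capacity, chosen for illustration */
#define EXAMPLE_PACKET_MAX 1500  /* assumed buffer size */

typedef struct {
    unsigned char data[EXAMPLE_PACKET_MAX];
    size_t        len;
} ExamplePacket;

typedef struct {
    ExamplePacket  slots[EXAMPLE_POOL_SIZE];
    ExamplePacket *free_list[EXAMPLE_POOL_SIZE];
    size_t         n_free;
} ExamplePool;

/* Initialization phase: mark every preallocated slot as free. */
void example_pool_init(ExamplePool *pool)
{
    for (size_t i = 0; i < EXAMPLE_POOL_SIZE; i++)
        pool->free_list[i] = &pool->slots[i];
    pool->n_free = EXAMPLE_POOL_SIZE;
}

/* Active session: constant-time acquire, no call into the allocator.
 * Returns NULL when the pool is exhausted. */
ExamplePacket *example_pool_get(ExamplePool *pool)
{
    if (pool->n_free == 0)
        return NULL;
    return pool->free_list[--pool->n_free];
}

/* Active session: constant-time release back to the pool. */
void example_pool_put(ExamplePool *pool, ExamplePacket *pkt)
{
    if (pool->n_free < EXAMPLE_POOL_SIZE)
        pool->free_list[pool->n_free++] = pkt;
}
```

Both get and put run in constant time, so their cost during a session
does not depend on the state of the system allocator.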

Memory management
-----------------

To work on platforms where available memory may be constrained, libnice
should gracefully handle out of memory situations. If memory allocation
fails, the library should return an error via the originating public
library API function.

Use of glib creates some challenges to meet the above:

- A lot of glib's internal code assumes memory allocations will
  always succeed. Use of these glib facilities should be limited.
  While the glib default policy (see g_malloc() documentation) of
  terminating the process is acceptable for applications, it is not
  acceptable for library components.
- Glib has weak support for preallocating structures needed at
  runtime (for instance, use of timers creates a lot of memory
  allocation activity).

To work around the above limitations, the following guidelines need
to be followed:

- Always check return values of glib functions.
- Use safe variants: g_try_malloc(), etc.
- Current issues (last update 2007-05-04)
     - g_slist_append() will crash if alloc fails
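
The pattern of reporting allocation failure to the caller can be
sketched as follows. To stay self-contained, the sketch uses the
standard malloc() and takes the allocator as a parameter so that a
failure can be simulated; ExampleSession, ExampleAllocFn, and both
functions are hypothetical names, not part of libnice or glib.

```c
#include <stdlib.h>
#include <string.h>

/* Allocator signature, injectable so that failure can be simulated. */
typedef void *(*ExampleAllocFn)(size_t);

typedef struct {
    char *username;
} ExampleSession;

/* Replaces the stored username with a copy of 'name'.
 * Returns 0 on success, or -1 if the allocation fails, in which case
 * the session is left unchanged so the caller can report the error. */
int example_session_set_username(ExampleSession *s, const char *name,
                                 ExampleAllocFn alloc)
{
    size_t len = strlen(name) + 1;
    char *copy = alloc(len);

    if (copy == NULL)
        return -1;

    memcpy(copy, name, len);
    free(s->username);
    s->username = copy;
    return 0;
}

/* Test helper that always fails, standing in for an out-of-memory case. */
void *example_failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

With glib, g_try_malloc() follows the same contract of returning NULL
on failure instead of terminating the process, so the same check
applies; memory obtained that way would be released with g_free().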

Timers
------

Management of timers is handled by the 'agent' module. Other modules 
may use timer APIs to get timestamps, but they do not run timers. 

Glib's timer interface has some problems that have affected the design:

 - an expired timer will destroy the source (a potentially costly
   operation)
 - it is not possible to cancel or adjust the timer expiration
   time without destroying the associated source and creating
   a new one, which again causes malloc/frees and is potentially
   a costly operation
 - on Linux, glib uses gettimeofday() which is subject to clock
   skew, and no monotonic timer API is available

Due to the above, 'agent' code runs fixed interval periodic timers
(started with g_timeout_add()) during candidate gathering, connectivity
check, and session keepalive phases. Timer frequency is set separately
for each phase of processing. A more elegant design would use dynamic
timeouts, but this would be too expensive with glib timer
infrastructure.
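
The fixed-interval approach can be sketched as a tick handler that
scans pending work and acts on whatever is due, instead of arming one
timer per deadline. The ExampleCheck type, the function name, and the
retransmission policy below are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pending item, e.g. a connectivity check. */
typedef struct {
    uint64_t next_tx_ms;  /* absolute time of the next (re)transmission */
    int      active;      /* non-zero while the item is still pending */
} ExampleCheck;

/* Called once per timer tick with the current timestamp. Every active
 * item whose deadline has passed is handled and rescheduled.
 * Returns the number of items that were due on this tick. */
size_t example_on_tick(ExampleCheck *checks, size_t n,
                       uint64_t now_ms, uint64_t retransmit_ms)
{
    size_t fired = 0;

    for (size_t i = 0; i < n; i++) {
        if (checks[i].active && now_ms >= checks[i].next_tx_ms) {
            /* A real agent would send or retransmit here. */
            checks[i].next_tx_ms = now_ms + retransmit_ms;
            fired++;
        }
    }
    return fired;
}
```

In the agent, a handler of this kind would be driven by the periodic
source created with g_timeout_add(); as long as the glib callback
returns TRUE the source stays alive, so no source is destroyed or
recreated per deadline. The trade-off is that deadlines are only
honoured with the granularity of the tick interval.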

Control flow for NICE agent API (NiceAgentClass)
------------------------------------------------

The main library interface for applications using libnice is the
NiceAgent GObject interface defined in 'nice/agent.h'.

The rough order of control flow is as follows:

- client should initialize glib with g_type_init()
- creation of NiceAgent object instance
- setting agent properties such as STUN and TURN server addresses
- connecting the GObject signals with g_signal_connect() to application
  callback functions
- adding local interface addresses to use with
  nice_agent_add_local_address()

And continues when making an initial offer:

- creating the streams with nice_agent_add_stream()
- attach the mainloop context to connect the NiceAgent sockets to
  the application's event loop (using nice_agent_attach_recv())
- start candidate gathering by calling nice_agent_gather_candidates()
- the application should wait for the "candidate-gathering-done" signal
  before going forward (so that ICE can gather the needed set of local
  connectivity candidates)
- get the information needed for sending offer using
  nice_agent_get_local_candidates() and
  nice_agent_get_local_credentials()
- client should now send the session offer
- once it receives an answer, it can pass the information to NiceAgent
  using nice_agent_set_remote_candidates() and
  nice_agent_set_remote_credentials()

Alternatively, when answering an initial offer:

- the first five steps are the same as above (making initial offer)
- pass the remote session information to NiceAgent using 
  nice_agent_set_remote_candidates() and
  nice_agent_set_remote_credentials()
- client can send the answer to session offer

Special considerations for a SIP client:

- Upon sending the initial offer/answer, client should pick one
  local candidate as the default one, and encode it to the SDP
  "m" and "c" lines, in addition to the ICE "a=candidate" lines.
- Client should connect to "new-selected-pair" signals. If this
  signal is received, a new candidate pair has been set as 
  a selected pair (highest priority nominated pair). See 
  ICE specification for a definition of "nominated pairs". 
- Once all components of a stream have reached the
  "NICE_COMPONENT_STATE_READY" state (as reported by
  "component-state-changed" signals), the client should check
  whether its original default candidate matches the latest
  selected pair. If not, it needs to send an updated offer if
  it is in controlling mode. Before sending the offer, the client
  should read the "controlling-mode" property to confirm that
  it is still in controlling mode (the role might change during
  ICE processing due to ICE role conflicts).
- The "remote-attributes" SDP attribute can be created from
  the information provided by "component-state-changed" (which
  components are ready), "new-selected-pair" (which candidates
  are selected) and "new-remote-candidate" (peer-reflexive
  candidates discovered during processing) signals.
- Forked calls are not yet supported by the API (multiple
  sets of remote candidates for one local set of candidates).
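
The updated-offer decision from the list above can be sketched for a
single component as follows. The ExampleCandidateAddr type and the
function are hypothetical; a client would fill the inputs from the
"new-selected-pair" and "component-state-changed" signals and from the
"controlling-mode" property.

```c
#include <string.h>

/* Hypothetical, simplified candidate address as it would appear in SDP. */
typedef struct {
    char     ip[46];
    unsigned port;
} ExampleCandidateAddr;

/* Returns 1 when an updated offer should be sent: all components are
 * ready, the agent is still in controlling mode, and the default
 * candidate advertised on the "m"/"c" lines differs from the local
 * candidate of the selected pair. Returns 0 otherwise. */
int example_needs_updated_offer(const ExampleCandidateAddr *default_cand,
                                const ExampleCandidateAddr *selected_local,
                                int all_components_ready,
                                int controlling_mode)
{
    if (!all_components_ready || !controlling_mode)
        return 0;

    return strcmp(default_cand->ip, selected_local->ip) != 0 ||
           default_cand->port != selected_local->port;
}
```

A stream with several components would apply the same comparison to
each component and send one updated offer if any of them differs.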

Restarting ICE:

- ICE processing can be restarted by calling nice_agent_restart()
- Restart will clear the set of remote candidates, so the client must
  call nice_agent_set_remote_candidates() after receiving
  a new offer/answer for the restarted ICE session.
- Restart will reinitialize the local credentials (see 
  nice_agent_get_local_credentials()).
- Note that to modify the set of local candidates, a new stream
  has to be created. For the remote party, this looks like an ICE
  restart as well.

Handling fallback to non-ICE operation:

- If we are the offering party, and the remote party indicates
  it doesn't support ICE, we can use nice_agent_set_selected_pair()
  to force selection of a candidate pair (for the remote party,
  the information on the SDP 'm=' and 'c=' lines needs to be used
  to generate one remote candidate for each component of the
  streams). This function will halt all ICE processing (excluding
  keepalives), while still allowing media to be sent and received
  (assuming NATs won't interfere).

Notes about sending media:

- Client may send media once all components of a stream have reached
  the NICE_COMPONENT_STATE_CONNECTED or NICE_COMPONENT_STATE_READY
  state (as reported by "component-state-changed" signals), and a
  selected pair is set for all components (as reported by
  "new-selected-pair" signals).
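
The condition above can be tracked in the application with a small
per-component record that is updated from the two signals. The sketch
below is illustrative only: the enum mirrors the libnice component
state names but is defined locally, and ExampleComponentStatus and the
function are hypothetical.

```c
#include <stddef.h>

/* Local stand-in for the component states reported by the agent. */
typedef enum {
    EXAMPLE_STATE_DISCONNECTED,
    EXAMPLE_STATE_GATHERING,
    EXAMPLE_STATE_CONNECTING,
    EXAMPLE_STATE_CONNECTED,
    EXAMPLE_STATE_READY,
    EXAMPLE_STATE_FAILED
} ExampleComponentState;

typedef struct {
    ExampleComponentState state; /* from "component-state-changed" */
    int has_selected_pair;       /* set on "new-selected-pair" */
} ExampleComponentStatus;

/* Returns 1 when every component of the stream is CONNECTED or READY
 * and has a selected pair, 0 otherwise (including an empty stream). */
int example_stream_can_send_media(const ExampleComponentStatus *comps,
                                  size_t n)
{
    if (n == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        int connected = comps[i].state == EXAMPLE_STATE_CONNECTED ||
                        comps[i].state == EXAMPLE_STATE_READY;
        if (!connected || !comps[i].has_selected_pair)
            return 0;
    }
    return 1;
}
```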

STUN API
--------

The underlying STUN library takes care of formatting and parsing STUN
messages (lower layer).

Applications should only need to use the higher layer API which then
uses the lower layer API.

The following STUN usages are currently implemented by the
transaction layer:

- Binding discovery (RFC5389 with RFC3489 backward compatibility)
- Binding keep-alive
- ICE connectivity checks
- TURN
- STUN retransmission timers


STUN message API
----------------

The STUN message API provides thin wrappers to parse and format STUN
messages. To achieve maximum cross-architecture portability and retain
real-time friendliness, these functions are fully "computational" [1].
They also make no assumption about endianness or memory alignment
(reading single bytes or using memcpy()).

Message buffers are provided by the caller (so these can be
preallocated). Because STUN uses a relatively computer-friendly binary
format, STUN messages are stored in wire format within the buffers.
There is no intermediary translation, so the APIs can operate directly
with data received from or sent to the network.
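
As an illustration of operating directly on wire-format data, the
sketch below reads the 20-byte STUN header (as laid out in RFC 5389)
from a caller-provided buffer using only single-byte reads and
memcpy(). It is not the library's API: ExampleStunHeader and all
function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_STUN_HEADER_LEN   20
#define EXAMPLE_STUN_MAGIC_COOKIE 0x2112A442u

typedef struct {
    uint16_t type;
    uint16_t length;             /* payload length, header excluded */
    uint32_t cookie;
    uint8_t  transaction_id[12];
} ExampleStunHeader;

/* Big-endian reads built from single bytes: no alignment or host
 * byte-order assumptions. */
static uint16_t example_read_u16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t example_read_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parses the header in place; returns 0 on success, -1 if the buffer
 * cannot hold a well-formed STUN message. The cookie is reported but
 * not enforced, so the caller can decide how to treat RFC 3489 peers. */
int example_stun_parse_header(const uint8_t *buf, size_t len,
                              ExampleStunHeader *out)
{
    if (len < EXAMPLE_STUN_HEADER_LEN)
        return -1;
    if (buf[0] & 0xC0)            /* two most significant bits must be 0 */
        return -1;

    out->type   = example_read_u16(buf);
    out->length = example_read_u16(buf + 2);
    out->cookie = example_read_u32(buf + 4);
    memcpy(out->transaction_id, buf + 8, sizeof out->transaction_id);

    if (out->length % 4 != 0)     /* attributes are padded to 4 bytes */
        return -1;
    if (len < (size_t)EXAMPLE_STUN_HEADER_LEN + out->length)
        return -1;
    return 0;
}
```

No memory is allocated and the input buffer is never modified, which
matches the goal of keeping these functions purely computational.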

[1] With one exception: The random number generator might access the
system entropy pool (/dev/urandom) if available.