Nice: Design documentation ========================== Socket ownership ---------------- For UDP candidates, one socket is created for each component and bound to INADDR_ANY. The same local socket is used for the host candidate, STUN candidate as well as the TURN candidate. The socket handles are stored to the Component structure. The library will use the source address of incoming packets in order to identify from which remote candidates, if any (peer-derived candidates), packets were sent. XXX: Describe the subtle issues with ICMP error handling when one socket is used to send to multiple destinations. Real-time considerations ------------------------ One potential use for libnice code is providing network connectivity for media transport in voice and video telephony applications. This means that the libnice code is potentially run in real-time context (for instance under POSIX SCHED_FIFO/SHCED_RR scheduling policy) and ideally has deterministic execution time. To be real-time friendly, operations with non-deterministic execution time (dynamic memory allocation, file and other resource access) should be done at startup/initialization phase. During an active session (connectivity has been established and non-STUN traffic is being sent), code should be as deterministic as possible. Memory management ----------------- To work on platforms where available memory may be constrained, libnice should gracefully handle out of memory situations. If memory allocation fails, the library should return an error via the originating public library API function. Use of glib creates some challenges to meet the above: - A lot of glib's internal code assumes memory allocations will always work. Use of these glib facilities should be limited. While the glib default policy (see g_malloc() documentation) of terminating the process is ok for applications, this is not acceptable for library components. - Glib has weak support for preallocating structures needed at runtime (for instance use of timers creates a lot of memory allocation activity). To work around the above limitations, the following guidelines need to be followed: - Always check return values of glib functions. - Use safe variants: g_malloc_try(), etc - Current issues (last update 2007-05-04) - g_slist_append() will crash if alloc fails Timers ------ Management of timers is handled by the 'agent' module. Other modules may use timer APIs to get timestamps, but they do not run timers. Glib's timer interface has some problems that have affected the design: - an expired timer will destroy the source (a potentially costly operation) - it is not possible to cancel, or adjust the timer expiration timer without destroying the associated source and creating a new one, which again causes malloc/frees and is potentially a costly operation - on Linux, glib uses gettimeofday() which is subject to clock skew, and no monotonic timer API is available Due to the above, 'agent' code runs fixed interval periodic timers (started with g_timeout_add()) during candidate gathering, connectivity check, and session keepalive phases. Timer frequency is set separately for each phase of processing. A more elegant design would use dynamic timeouts, but this would be too expensive with glib timer infrastructure. Control flow for NICE agent API (NiceAgentClass) ------------------------------------------------ The main library interface for applications using libnice is the NiceAgent GObject interface defined in 'nice/agent.h'. The rough order of control follow is as follows: - client should initialize glib with g_type_init() - creation of NiceAgent object instance - setting agent properties such as STUN and TURN server addresses - connecting the GObject signals with g_signal_connect() to application callback functions - adding local interface addresses to use with nice_agent_add_local_address() And continues when making an initial offer: - creating the streams with nice_agent_add_stream() - attach the mainloop context to connect the NiceAgent sockets to the application's event loop (using nice_agent_attach_recv()) - start candidate gathering by calling nice_agent_gather_candidates() - the application should wait for the "candidate-gathering-done" signal before going forward (so that ICE can gather the needed set of local connectiviy candidates) - get the information needed for sending offer using nice_agent_get_local_candidates() and nice_agent_get_local_credentials() - client should now send the session offer - once it receives an answer, it can pass the information to NiceAgent using nice_agent_set_remote_candidates() and nice_agent_set_remote_credentials() Alternatively, when answering to an initial offer: - the first five steps are the same as above (making initial offer) - pass the remote session information to NiceAgent using nice_agent_set_remote_candidates() and nice_agent_set_remote_credentials() - client can send the answer to session offer Special considerations for a SIP client: - Upon sending the initial offer/answer, client should pick one local candidate as the default one, and encode it to the SDP "m" and "c" lines, in addition to the ICE "a=candidate" lines. - Client should connect to "new-selected-pair" signals. If this signal is received, a new candidate pair has been set as a selected pair (highest priority nominated pair). See ICE specification for a definition of "nominated pairs". - Once all components of a stream have reached the "NICE_COMPONENT_STATE_READY" state (as reported by "component-state-changed" signals), the client should check whether its original default candidate matches the latest selected pair. If not, it needs to send an updated offer it is in controlling mode. Before sending the offer, client should check the "controlling-mode" property to check that it still is in controlling mode (might change during ICE processing due to ICE role conflicts). - The "remote-attributes" SDP attribute can be created from the information provided by "component-state-changed" (which components are ready), "new-selected-pair" (which candidates are selected) and "new-remote-candidate" (peer-reflexive candidates discovered during processing) signals. - Supporting forked calls is not yet supported by the API (multiple sets of remote candidates for one local set of candidates). Restarting ICE: - ICE processing can be restarted by calling nice_agent_restart() - Restart will clean the set of remote candidates, so client must afterwards call nice_agent_set_remote_candidates() after receiving a new offer/answer for the restarted ICE session. - Restart will reinitialize the local credentials (see nice_agent_get_local_credentials()). - Note that to modify the set of local candidates, a new stream has to be created. For the remote party, this looks like a ICE restart as well. Handling fallback to non-ICE operation: - If we are the offering party, and the remote party indicates it doesn't support ICE, we can use nice_agent_set_selected_pair() to force selection of a candidate pair (for remote party, the information on SDP 'm=' and 'c=' lines needs to be used to generate one remote candidate for each component of the streams). This function will halt all ICE processing (excluding keepalives), while still allowing to send and receive media (assuming NATs won't interfere). Notes about sending media: - Client may send media once all components of a stream have reached state of NICE_COMPONENT_STATE_CONNECTED or NICE_COMPONENT_STATE_READY, (as reported by "component-state-changed" signals), and a selected pair is set for all components (as reported by "new-selected-pair" signals). STUN API -------- The underlying STUN library takes care of formatting and parsing STUN messages (lower layer), Applications should only need to use the higher layer API which then uses the lower layer API. The following STUN usages are currently implemented by the transaction layer: - Binding discovery (RFC5389 with RFC3489 backward compatibility) - Binding keep-alive - ICE connectivity checks - TURN - STUN retransmission timers STUN message API ---------------- STUN message API provide thin wrappers to parse and format STUN messages. To achieve maximum cross-architectures portability and retain real-time friendliness, these functions are fully "computational" [1]. They also make no assumption about endianess or memory alignment (reading single bytes or using memcpy()). Message buffers are provided by the caller (so these can be preallocated). Because STUN uses a relatively computer-friendly binary format, STUN messages are stored in wire format within the buffers. There is no intermediary translation, so the APIs can operate directly with data received from or sent to the network. [1] With one exception: The random number generated might access the system entropy pool (/dev/urandom) if available.