CRUSH:
Connection Reliability
Using Stream Handoff
Ivan Gevirtz
September 23, 2005
CRUSH Extensions details the CRUSH product line
Background
The problem of Firewall Traversal
Voice and Video over IP products work best in a peer-to-peer mode with the minimum number of hops and intermediating equipment. The ability to do this is limited by the client's ability to establish direct connections. For nodes with direct, externally routable Internet addresses, this is not really a problem. However, most home and corporate users reside behind Firewalls, Network Address Translators (NATs), and/or Intrusion Detection Systems. These devices limit the ability of peers to communicate directly, especially if both peers are behind such devices.
External Addresses Discovery
When the two peers are behind certain kinds of NAT devices, all that is necessary for the peers to communicate is for each peer to let the other know its external address. The opposite peer can send to that address, and the data will get through. This external address can be determined by asking an address discovery device out in the Internet. Clique currently uses either its own proprietary "DUDP" discovery server, or a SIP compatible STUN server to fill this purpose. Once the external address is known, the address is relayed to the other client (often by using an out-of-band mechanism), and both clients use UDP Hole Punch to establish a direct connection. However, with certain kinds of network devices, such as symmetric NATs, this address discovery approach won't work. Instead, Clique has a "shotgun" port spray technique which can effectively be used to establish a connection even in this kind of restrictive environment. Intrusion Detection Systems (IDS) and symmetric NAPT firewalls may block this approach, but at the moment the patent-pending "Shotgun Port Spray" technique is quite effective.
Traditional Solutions
The simplest solution to the problem of peer-to-peer connection establishment across NAT and firewall devices is to use a relay server stationed on the public Internet. One example of this is a TURN server. These servers have Internet routable addresses, and can provide a proxying solution that maintains the connection between the two clients. However, this is sub-optimal, especially when the connection bandwidth is large, as it is in the case of video. When the packet payloads are large, relay server technology becomes expensive. Establishing a direct connection between the peers is preferable. In addition, relay servers don't allow for end-to-end connection security, as they have to change packet headers during the relaying shuffle. Session Border Controllers (SBC) are similar to relay servers; however, they sit at the edge of a private LAN-type network, typically in the "Demilitarized Zone" (DMZ). Another traditional solution is to use UPnP. In some situations where peers are both behind certain UPnP devices, each peer can map to a known external address, and the situation becomes trivial.
Summary of the Invention
CRUSH
Connection Reliability Using Stream Handoff (CRUSH) is a technique to guarantee rapid, successful establishment of connections, while simultaneously improving the clients' chances of establishing a direct peer to peer connection. This technique works by having CRUSH clients initially establish connectivity using a CRUSH relay server. This guarantees immediate connection establishment in almost all network topologies. Once the connection is established, each CRUSH client then tries to establish a peer to peer connection while simultaneously relaying media (or other connection data) packets through the server. These CRUSH clients attempt to establish a direct peer to peer connection using all available peer to peer NAT traversal techniques, including UPnP, UDP hole punch, and the patent-pending Clique "Shotgun Port Spray" technique. Once a peer to peer connection is established, each client begins to send the other its media (or other connection data) directly. The clients use this data to establish synchronization parameters between the two media paths. This data may be the (relatively small) audio channel, and the clients may utilize the RTP sequence and timing information to establish cross synchronization. This audio data is sent through both the relay channel and the peer to peer channel. When both clients are synchronized, each informs the other and initiates a soft handoff. This soft handoff may be mediated by the CRUSH server. At this point, the clients switch from sending the video through the relay server to sending it over the direct channel, thus releasing resources on the relay server. Alternatively, the synchronization may happen by simple use of industry standard jitter buffering technologies, which induce delay in order to drop duplicate packets (one via the peer to peer path, the other via the CRUSH server) and handle out of order packets.
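The client-side sequence described above can be pictured as a small state machine. The following is a minimal sketch only: the state names and event names are hypothetical labels chosen for illustration, and the source does not define a wire protocol or any such enumeration.

```python
from enum import Enum, auto


class CrushState(Enum):
    RELAYING = auto()        # media flows through the CRUSH relay server only
    PROBING = auto()         # still relaying, also trying direct paths
    SYNCHRONIZING = auto()   # audio sent on both paths to align timing
    DIRECT = auto()          # media flows peer to peer, relay released


# Hypothetical event names (assumptions, not taken from the source).
TRANSITIONS = {
    (CrushState.RELAYING, "relay_connected"): CrushState.PROBING,
    (CrushState.PROBING, "direct_path_found"): CrushState.SYNCHRONIZING,
    (CrushState.SYNCHRONIZING, "handoff_acknowledged"): CrushState.DIRECT,
}


def next_state(state, event):
    """Return the next state, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run(events):
    """Feed a sequence of events to a client that starts on the relay."""
    state = CrushState.RELAYING
    for event in events:
        state = next_state(state, event)
    return state
```

If no direct path is ever found, the client in this sketch simply stays in PROBING and media keeps flowing through the relay, which matches the guarantee described in the text.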
More Information about CRUSH
To enhance (shorten) connect time, and ensure connectivity between endpoints regardless of network configuration (i.e. for high-end paying customers), a service provider should add CRUSH relay servers to their media infrastructure. The CRUSH server may either reside on the public Internet or may be dual homed like an SBC, with one end on the Internet and another on a client's private network. This server proxies media in a manner similar to TURN. The connection seeking clients begin by sending their RTP media traffic to the CRUSH server. Then these clients attempt to establish a direct (peer to peer) connection utilizing all available mechanisms. Once this peer to peer connection is established, the CRUSH clients then begin the process of switching from using the relay server to using the peer to peer connection. This handoff process may require latency synchronization by matching timing information on RTP data packets. To save bandwidth, only the audio needs to be sent simultaneously along both paths. Once synchronization is achieved, video can be abruptly switched between the channels.
Initial use of CRUSH relay server. This CRUSH server can do many things to facilitate rapid peer to peer connection establishment. For example, the CRUSH server can determine if the two endpoints are addressable on the same internal network, and initiate Anti-Tromboning -- where each client connects peer-to-peer on the same internal network without any intermediating NAT. Thus, the CRUSH server acts like both a TURN server and an SBC.
Simultaneous attempts to establish peer to peer connection using various methods, including UPnP, UDP hole punch, Clique's patent-pending Shotgun Port Spray, Anti-Tromboning, and ICE type heuristics. Meanwhile, media flows through the CRUSH server, thus guaranteeing a prompt and reliable connection.
Upon establishment of the peer to peer connection, each client sends audio along both paths. Once audio along the new path is synchronized, the client does either a soft (both paths) or a hard video handoff. Soft is preferable if bandwidth is available, but hard should work, because audio allows for synchronization at the client side. In some cases, when bandwidth is plentiful, the handoff can use both audio and video, which may afford an even more seamless handoff.
CRUSH obviates/obsoletes the procedures outlined in ICE. Whereas ICE takes the approach of attempting to utilize the most preferable connection establishment methodology and gradually degrades to using the most expensive case (relay) for establishing connectivity, CRUSH takes the opposite approach. It guarantees connectivity by taking the most reliable approach (relay), and then attempts to find a better, more direct approach. Once one is found, it uses a soft handoff, which may involve synchronization by audio packets, to move to the better approach. This makes initiation take much less (real!) time, and guarantees reliability of connection.
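The difference in time to first media can be illustrated with simple arithmetic. The sketch below follows this document's characterization of ICE (methods tried one after another, relay last) and models the direct attempts as sequential for simplicity. The function names and all durations are hypothetical and are not measurements.

```python
def ice_style_time_to_media(direct_attempts, relay_setup_s):
    """direct_attempts: list of (name, seconds, succeeded) in preference order.
    Media starts only once some method has succeeded; relay is the last resort."""
    elapsed = 0.0
    for name, seconds, succeeded in direct_attempts:
        elapsed += seconds
        if succeeded:
            return elapsed, name
    return elapsed + relay_setup_s, "relay"


def crush_style_time_to_media(direct_attempts, relay_setup_s):
    """Media starts as soon as the relay is up. The direct attempts run
    afterwards and only determine when a handoff becomes possible."""
    return relay_setup_s, "relay"


# Hypothetical durations in seconds; every direct method fails here.
attempts = [("upnp", 1.0, False), ("hole_punch", 2.0, False), ("port_spray", 4.0, False)]
ice_time, ice_path = ice_style_time_to_media(attempts, relay_setup_s=0.5)
crush_time, crush_path = crush_style_time_to_media(attempts, relay_setup_s=0.5)
```

Under these assumed numbers the rank-ordered approach reaches media after 7.5 seconds, while the relay-first approach reaches media after 0.5 seconds and loses nothing if a direct path is found later.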
One of the Netgear Firewalls had an option to "Block UDP Flood". Interestingly enough, it blocked UDP floods *inbound and outbound*. The inbound was expected -- to block port scans, but the outbound wasn't. The outbound is to prevent DDoS attacks -- to prevent others from using your computer to mount DoS attacks. CRUSH presents an interesting way to alleviate this problem -- Because CRUSH gives us more TIME to establish the peer to peer connection, we can slow the port spray down -- punch the hole and keep it alive and then punch the next port after a longer period of time. Instead of taking 2 seconds to do the port spray, take 2 minutes. It's an arms race, but CRUSH gives us the ability to still penetrate.
Benefits & Advantages
Detailed Steps
These steps align with the numbered steps in the CRUSH Visio diagram 1.
1. Connect to CRUSH Server
Client A calls Client B via TURN server functionality in CRUSH server. Each client streams their media to the CRUSH server. The CRUSH server swaps packets from A with those from B, thus acting as a relay server. This ensures a guaranteed connection, as well as fast call setup times.
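The packet swap performed by the server in this step can be sketched as a lookup from each client's address to its peer's address. This is an illustrative in-memory model only: the class name, method name, and the example addresses are assumptions, and a real relay would also handle sockets, authentication, and allocation.

```python
class RelaySession:
    """Pairs two client addresses and forwards each packet to the other side."""

    def __init__(self, addr_a, addr_b):
        # Each address maps to its peer, so routing is a single lookup.
        self.peer_of = {addr_a: addr_b, addr_b: addr_a}

    def route(self, source_addr, payload):
        """Return (destination, payload), or None for an unknown sender."""
        destination = self.peer_of.get(source_addr)
        if destination is None:
            return None
        return destination, payload


# Example addresses are from documentation-reserved ranges.
client_a = ("198.51.100.1", 4000)
client_b = ("203.0.113.7", 5000)
session = RelaySession(client_a, client_b)
```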
2. Establish Peer To Peer Connection
Clients continue to exchange media via TURN functionality in CRUSH server. Meanwhile, clients attempt to establish a peer to peer link. Clients may follow an "ICE" like heuristic approach, trying various methodologies to establish the Peer to Peer connection. These methodologies may include shared subnet/intranet detection, "trombone" network detection, UPnP port mapping, SBC/ALG utilization, STUN external address discovery along with UDP Hole Punch, and/or Clique's patent-pending Shotgun Port Spray. Because the connection is already established using the CRUSH server, the Clique Shotgun Port Spray can use more time to intelligently fool IDS and port flood detectors. The Clique Shotgun Port Spray can randomize the port order it probes, and probe these ports slowly. All the while, the Clique Shotgun Port Spray sends "Keep Alive" packets through all the existing pinholes to the destination.
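Because the relay already carries the call, the probing can be spread out in time and randomized in order. The following sketch builds such a schedule. The function name, its parameters, and the even pacing are assumptions made for illustration; the actual Shotgun Port Spray parameters are not given in the source.

```python
import random


def build_probe_schedule(center_port, spread, total_seconds, seed=None):
    """Return a list of (send_time_seconds, port) covering every port within
    `spread` of `center_port`, in random order, evenly paced over the window."""
    low = max(1, center_port - spread)
    high = min(65535, center_port + spread)
    ports = list(range(low, high + 1))
    # Shuffle so the probes do not look like a sequential port scan.
    random.Random(seed).shuffle(ports)
    interval = total_seconds / len(ports)
    return [(index * interval, port) for index, port in enumerate(ports)]


# Hypothetical example: 101 ports probed over two minutes instead of two seconds.
schedule = build_probe_schedule(center_port=40000, spread=50, total_seconds=120.0, seed=1)
```

Keep Alive packets for holes that are already open would be sent on a separate timer; that part is omitted here.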
3. Synchronize Latency Differences Along Both Pathways
Once a Peer to Peer link is established, the two clients must prepare to handoff from the CRUSH relay server to the peer to peer connection. In many cases, this step may reduce to simply using a good "jitter buffer" which can reorder and drop duplicate packets, and handle some buffering of rate-changing senders. Since these buffers are designed to induce delay to compensate for variable inter-packet arrival delays (jitter), they may be ideal to accomplish the latency synchronization. The path using the CRUSH relay server will, in almost all cases, have more latency than the peer to peer path. This means that switching to the peer to peer path will not require the jitter buffer to increase in duration; however, it will likely have to buffer more data to maintain the buffer's time duration. Many jitter buffers handle this seamlessly. The key here is to make sure that all data packets enter the same jitter buffer, regardless of source (CRUSH server vs. peer to peer).
In some cases, the two peers may need to compute the differential latencies along the two paths. This is done by simultaneously sending timing packets along both paths and measuring the differential arrival times. The timing packets may be the audio already being sent via the relay server. In general, the peer to peer path will be faster, as it involves less intermediating hardware. There may be some special cases (e.g. optimized network paths) where this is not the case. Once the differential latency is known, the client adjusts its receive/jitter buffer to compensate for this difference. For example, if the relay server path uses a 220 ms jitter buffer, and the peer to peer path is 30 ms faster (on average), then the peer to peer jitter buffer is set to be 250 ms, so that the packets will be released from both buffers at the same time. When this is done, the client is ready to handoff. In the case where the Peer to Peer path is longer, the jitter buffer can be reduced. There may be times when reducing the size of the jitter buffer is undesirable. In these cases, the viewer may experience a one time jump during the handoff. In some cases "Soft" handoff techniques may still be employable to minimize or eliminate this effect. The buffer lengthening may not actually happen on the jitter buffer. It may happen via a synchronization buffer which gets the packets from the peer to peer pathway and then passes them into the jitter buffer after the indicated delay. In addition, blank packets may be generated and used to compensate for timing.
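A single buffer that accepts packets from both paths, as described above, might look like the following. This is a simplified sketch: the class and method names are assumptions, sequence number wraparound is ignored, and the playout delay that a real jitter buffer applies before releasing packets is left out.

```python
import heapq


class DedupJitterBuffer:
    """Reorders packets by sequence number and drops duplicates, whichever
    path (relay or direct) delivered them."""

    def __init__(self):
        self._heap = []          # min-heap of (seq, payload, path)
        self._seen = set()       # sequence numbers currently buffered
        self._next_seq = None    # lowest sequence number still acceptable

    def push(self, seq, payload, path):
        """Accept a packet; return False if it is a duplicate or arrives too late."""
        if seq in self._seen:
            return False
        if self._next_seq is not None and seq < self._next_seq:
            return False
        self._seen.add(seq)
        heapq.heappush(self._heap, (seq, payload, path))
        return True

    def pop_ready(self):
        """Release everything buffered so far, in sequence order."""
        released = []
        while self._heap:
            seq, payload, _path = heapq.heappop(self._heap)
            self._seen.discard(seq)
            self._next_seq = seq + 1
            released.append((seq, payload))
        return released
```

Because both paths feed the same instance, a packet that arrives once via the CRUSH server and once via the peer to peer path is played only once.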
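The buffer adjustment in the example above is a small calculation. The sketch below averages the arrival-time difference of packets seen on both paths and derives the peer to peer buffer depth from it. The function names and the sample arrival times are hypothetical.

```python
def differential_latency_ms(relay_arrivals, direct_arrivals):
    """Average of (relay arrival - direct arrival) over packets seen on both
    paths. Inputs map sequence number -> arrival time in milliseconds.
    A positive result means the direct path is faster."""
    common = relay_arrivals.keys() & direct_arrivals.keys()
    if not common:
        raise ValueError("no packets were received on both paths")
    total = sum(relay_arrivals[seq] - direct_arrivals[seq] for seq in common)
    return total / len(common)


def direct_path_buffer_ms(relay_buffer_ms, diff_ms):
    """Buffer depth for the direct path so both paths release together.
    A negative diff (direct path slower) shrinks the buffer instead."""
    return max(0.0, relay_buffer_ms + diff_ms)


# Hypothetical arrival times: each audio packet arrives 30 ms earlier directly.
relay = {1: 1130, 2: 1150, 3: 1170}
direct = {1: 1100, 2: 1120, 3: 1140}
diff = differential_latency_ms(relay, direct)
buffer_ms = direct_path_buffer_ms(220, diff)
```

With these inputs the difference is 30 ms and the peer to peer buffer comes out at 250 ms, matching the figures in the text.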
4. Handoff from CRUSH Server
Once the latency differences are accounted for, the client sends a "Handoff" message to the remote client along the peer to peer path. Upon receipt of the remote client's "Handoff" message, the client switches its video path. In other words, the client stops sending video to the CRUSH server, and instead sends it along the peer to peer path. The "Handoff" message may include suggested instructions on when to handoff, such as at the next key frame, or at a specific timestamp or packet sequence number. It may also request a "soft" handoff. A "soft" handoff is when both audio and video are sent along both paths for a period of time. This is to ensure the handoff is completely seamless. Standard call signaling, such as SIP re-INVITE or CONTENT-MODIFY, may be used for this exchange.
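One way to represent the optional hints carried by a "Handoff" message is shown below. This is a sketch under stated assumptions: the field names, the precedence among hints, and the decision function are all hypothetical, since the source does not specify a message format. Sequence number and timestamp wraparound are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HandoffMessage:
    soft: bool = False                   # send audio and video on both paths for a while
    at_next_key_frame: bool = False      # switch when the next key frame is sent
    at_timestamp: Optional[int] = None   # RTP timestamp to switch at
    at_sequence: Optional[int] = None    # RTP sequence number to switch at


def should_switch(msg, seq, timestamp, is_key_frame):
    """Decide whether the outgoing video packet should be the first one sent
    on the peer to peer path, given the remote client's suggestion."""
    if msg.at_sequence is not None:
        return seq >= msg.at_sequence
    if msg.at_timestamp is not None:
        return timestamp >= msg.at_timestamp
    if msg.at_next_key_frame:
        return is_key_frame
    return True  # no suggestion given: switch immediately
```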
5. Continue Call to Completion in Peer to Peer mode
The jitter buffers on the clients receive packets from both pathways. As proper jitter buffers do, they handle reordering packets and dropping duplicates. As such, they have no problem dealing with duplicate packets from both pathways, such as duplicate audio during latency synchronization. From the player's point of view, nothing has changed, and no packets have been lost. However, once both clients have handed off, the CRUSH relay server stops getting packets. When this happens, it will time out and free the appropriate resources. CRUSH clients could signal to the CRUSH server when they hand off, but this should not be necessary.
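The server-side timeout can be sketched as a table of last-packet times that is swept periodically. The class name, method names, and the 30 second timeout used in the example are assumptions for illustration; the source does not state a timeout value.

```python
class RelayAllocation:
    """Tracks the last packet time of each relayed call so idle calls can be freed."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self.last_packet_at = {}  # call_id -> time of most recent packet

    def on_packet(self, call_id, now):
        """Record that the call is still sending through the relay."""
        self.last_packet_at[call_id] = now

    def reap(self, now):
        """Free and return every call that has been silent for too long."""
        expired = [call_id for call_id, seen in self.last_packet_at.items()
                   if now - seen >= self.idle_timeout_s]
        for call_id in expired:
            del self.last_packet_at[call_id]
        return expired
```

No explicit signal from the clients is needed in this model: a call that has handed off simply stops refreshing its entry.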
Additional capabilities
There may be times when a given media path becomes undesirable. For example, if a client switches from a wireless network to a wireline network, it may want to use the wireline network. In this case, the client can use CRUSH to switch networks. CRUSH would guarantee a quick handoff from one network or network type to another. It would use step 3 above to synchronize latency with the CRUSH server path. It would then use step 4 to handoff TO the CRUSH server, and then step 5 to continue in CRUSH server mode. It could then follow the whole process to establish a peer to peer connection using the new network infrastructure.
In addition, the CRUSH server may perform other functions, including external address discovery, determination of port-increment intervals, and out-of-band message passing, including passing the other peer's addresses (internal and external). The CRUSH server may also bridge private networks, as SBCs do. The CRUSH server may also connect with metering and monitoring equipment and may tie in with billing systems. CRUSH servers may also be used to authenticate endpoints and pass connection establishment information and hints to opposite endpoints. CRUSH servers may provide conference calling capabilities.
Claims
General
Synchronization
Handoff
Use Relay server
Terminology
UDP Pinholing -- Most firewalls, by default, block inbound UDP but allow outbound UDP. They usually assume that an outbound UDP request to a given server may generate a response. To accommodate this, once an outbound UDP packet is sent, the firewall keeps a hole allowing response UDP packets to get in.
UDP Hole Punch -- When two peers create a UDP pinhole to each other to allow a bidirectional channel of UDP packets to traverse the firewall, this is called UDP Hole Punch. Usually, this involves out of band communication of externally visible addressing information.
UPnP Port Mapping -- UPnP devices allow a client to request an externally routable address that they can use to send and receive packets. A client that utilizes UPnP appears to others to be directly connected to the internet, at least on the given interface address and port.
Shotgun Port Spray -- Clique's patent-pending technique for allowing two clients behind very restrictive firewalls to connect is called the Shotgun Port Spray. In this technique, both clients repeatedly send packets from various addresses to various addresses. This has the effect of UDP Hole Punch on a lot of different addresses. Once one packet manages to have a destination address that had been previously punched open, Shotgun Port Spray manages to connect the clients. This is a heuristic approach, involving guessing good addresses to use. Multiple external address discovery methods can help determine which addresses are good to try. This may not work with all firewalls, and may also trip IDS or flood alarms.
Keep Alive -- Once a UDP hole is punched, it will eventually time out. Sending dummy "Keep Alive" packets through this hole has the effect of maintaining it. To extend the time period over which the Shotgun Port Spray can meaningfully operate, Keep Alives should periodically be sent.
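The UDP Pinholing and Keep Alive entries above can be illustrated with a toy firewall model. This is a sketch only: the class name, the 30 second hole timeout, and the rule that only outbound packets refresh a hole are assumptions, and real devices differ in how they track and expire state.

```python
class PinholeFirewall:
    """Toy model of a firewall that admits inbound UDP only through holes
    opened by recent outbound packets."""

    def __init__(self, hole_timeout_s):
        self.hole_timeout_s = hole_timeout_s
        # (local_port, remote_addr, remote_port) -> time of last outbound packet
        self.holes = {}

    def outbound(self, local_port, remote_addr, remote_port, now):
        """An outbound packet (media or Keep Alive) opens or refreshes a hole."""
        self.holes[(local_port, remote_addr, remote_port)] = now

    def inbound_allowed(self, local_port, remote_addr, remote_port, now):
        """Inbound traffic passes only if a matching hole has not timed out."""
        opened = self.holes.get((local_port, remote_addr, remote_port))
        return opened is not None and now - opened < self.hole_timeout_s
```

In this model, a Keep Alive sent before the timeout elapses is what lets a later inbound packet still get through.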
Related Technologies
TURN -- Traversal Using Relay NAT (Relay Server) is a device on the public Internet. Both clients connect to this publicly accessible relay server, and they can then communicate with each other. The server just reflects packets from one client to the other. Because each client initiates and talks to a public address, this technique is guaranteed to work. However, video utilizes large bandwidth, and thus these servers need to be fast and have massive bandwidth. In addition, this server adds latency to the connection, both because of increased path lengths and because of relay server processing time. As such, this is an expensive method. CRUSH improves on TURN because it retains the guaranteed connection, but disintermediates the relay server once a peer to peer connection is established.
Session Border Controller (SBC) -- A Session Border Controller is a SIP VOIP application specific solution which, among other things, attempts to address firewall traversal. It has an Internet addressable IP address which enables it to function like a relay server for NAT traversal. It has to be installed in the DMZ (edge) of a service provider's network. It has all the disadvantages of general relay servers, such as TURN.
Anti-Tromboning -- A technique used by Session Border Controllers to allow the media path to be "released" if two endpoints are on the same subnet. This allows them to have a peer-to-peer connection. CRUSH may employ techniques similar to anti-tromboning, however, they may happen after call establishment and the beginning of media flow. CRUSH then will have a "soft" handoff once the peer to peer connection is established. CRUSH allows for a faster time to media flow, while maintaining all the benefits of anti-tromboning.
STUN -- Simple Traversal of UDP Through NATs. STUN is a technique aimed at helping a client know its external address, and what kind of firewall it is behind. In CRUSH, it serves the same purpose as the DUDP discovery server. The advantage of STUN is that it is standards based, and works alongside SIP.
Clique Shotgun Port Spray -- This technique improves on standard hole punch methods because it increases the chances of connecting even in an environment with restrictive NATs, such as symmetric NATs. It does this by first trying the port reported by a discovery server (DUDP or STUN), and then sending connection packets to a range of ports centered around the reported port. This port range widens over time until a connection is established. Because each side attempts this technique, there is a good chance that an incoming packet will have a tuple which will match the NAT's state table and be sent through to the endpoint. In other words, say Client A sends discovery packets to client B. Client B, in turn, sends discovery packets to client A at the same time. If the tuple of one of client B's packets matches that of a discovery packet previously sent by client A, it will look to the firewall like a UDP response to client A's packet, and will be directed to client A. From the NAT's perspective, A sent B a UDP request, and B is now starting to send A its UDP response. As long as B keeps sending packets with the same tuple parameters to A before the NAT times out that "connection", B will be able to communicate with A. So these shotgun packets serve two purposes: to attempt to connect to the other endpoint, and to open NAT holes for the other client to get through. CRUSH uses the Shotgun Port Spray technique with audio packets to establish a peer to peer connection and thus hand off from the relay server.
ICE -- The ICE methodology is a heuristic for establishing a peer to peer connection in the presence of a NAT. It takes the opposite approach to CRUSH. ICE mandates trying the most desirable channel first, and then each approach in rank order. The last approach is to use a TURN server. ICE is less desirable than CRUSH because it is much slower for call setup, does not use the Shotgun Port Spray, and does not ever hand off from the TURN server.
Universal Plug and Play (UPnP) -- UPnP is a way to request an external address from a NAT device. If UPnP is available and utilized, the NAT traversal problem is eliminated.
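The widening port range described in the Clique Shotgun Port Spray entry above can be sketched as a generator that yields the new ports to try in each round. The function name, the fixed growth step, and the upper bound on the spread are assumptions for illustration; the source only says that the range starts at the reported port and widens over time.

```python
def widening_port_ranges(reported_port, step, max_spread):
    """Yield the list of new ports to try in each round: first the reported
    port, then a window around it that grows by `step` ports per side."""
    yield [reported_port]
    spread = 0
    while spread < max_spread:
        previous = spread
        spread = min(spread + step, max_spread)
        # Only the ports newly covered by the wider window are returned.
        below = range(reported_port - spread, reported_port - previous)
        above = range(reported_port + previous + 1, reported_port + spread + 1)
        yield [port for port in list(below) + list(above) if 1 <= port <= 65535]


# Hypothetical example: reported port 5000, growing 2 ports per side per round.
rounds = list(widening_port_ranges(5000, step=2, max_spread=4))
```

Here the first round tries port 5000 alone, the second tries 4998, 4999, 5001 and 5002, and the third tries 4996, 4997, 5003 and 5004.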
This will be the first patent for Ivan Gevirtz.