FYI XMPP does text and presence very well and in a straightforward way, as opposed to SIMPLE, and the Jingle specs are rather straightforward too. XMPP is especially well suited for mobile. No idea how SIP fares there, but it's light-years ahead of Skype at least.
However, every P2P internet technology these days has to worry about at least two layers of NAT (one on your end, one on the receiving end), which makes streaming data in a smart way (e.g. without involving a proxy) a nightmare to solve. IIRC ICE addresses this by helping the two connecting parties find a direct path to each other without itself acting as a proxy, but implementations are poor...
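
To make the ICE part a bit more concrete, here is a rough sketch of how ICE ranks candidate addresses so that direct paths get tried before relayed ones. The priority formulas and the type-preference values (host 126, peer-reflexive 110, server-reflexive 100, relay 0) are the recommended ones from the ICE spec (RFC 8445) as far as I remember them. The function names and the example candidate list are made up for illustration, and the sketch does none of the actual STUN connectivity checks.

```python
# Recommended type preferences from the ICE spec: higher means "try first".
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)"""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(controlling, controlled):
    """Priority of a candidate pair, from the two candidate priorities.

    G is the controlling agent's candidate priority, D the controlled agent's:
    2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)
    """
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def rank(cand_types):
    """Sort candidate types so the most direct ones come first (hypothetical helper)."""
    return sorted(cand_types, key=candidate_priority, reverse=True)


if __name__ == "__main__":
    # Illustrative candidates: a local address, a NAT-mapped address, a TURN relay.
    print(rank(["relay", "srflx", "host"]))
```

The relay candidate ends up last, which is the "avoid a proxy unless nothing else works" behaviour. The hard part in practice is everything this leaves out: gathering the candidates and running the checks through both NATs.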