Updated on 15 Jun, 2026

Introduction

At this point, we know:

  • Why WebRTC exists
  • How NAT works
  • How STUN/TURN work
  • What MediaStreams are
  • What RTCPeerConnection is

However, one enormous mystery still remains.

When you see code like:

const pc = new RTCPeerConnection();

nothing happens.

No call starts.

No audio flows.

No video appears.

Why?

Because creating a PeerConnection is like buying a phone.

Owning a phone does not automatically create a conversation.

Something else must happen.

You must:

  1. Find the other person.
  2. Exchange information.
  3. Agree on communication parameters.
  4. Establish connectivity.
  5. Start communication.

This entire process is called:

Connection Establishment

The Biggest Misconception in WebRTC

Ask a beginner:

How does Alice connect to Bob?

Common answer:

Alice creates RTCPeerConnection
Bob creates RTCPeerConnection
Done

Unfortunately:

Nothing works

Why?

Because Alice and Bob are strangers.

Consider two laptops.

Alice Laptop
Bob Laptop

Questions immediately arise:

Where is Bob?
What codecs does Bob support?
What IP address should I use?
Can Bob receive video?
Can Bob receive audio?
What encryption method should we use?

The browsers don't know.

This information must be exchanged.

This information exchange is called: Signaling

Understanding Signaling

Before media can flow, peers must communicate.

But here's the paradox.

We need communication before communication exists.

How is that possible?

The answer:

Use another communication channel first.

Analogy: Phone Calls

Imagine Alice wants to call Bob.

Before the conversation begins:

Alice needs:

Bob's Phone Number

Without it:

No Call Possible

Similarly, WebRTC peers need a way to exchange connection information.

What Is Signaling?

Signaling is:

The process of exchanging connection metadata before media communication begins.

Notice something important.

Signaling is NOT media transport.

It does not carry:

Audio
Video

Instead it carries:

Connection Information

What Information Must Be Exchanged?

Let's think like engineers.

What does Alice need to know?

Capability Information

Example:

Alice supports:

VP8
VP9
H264

Bob supports:

VP8
H264

They must discover:

Common Codec = H264 or VP8
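
The discovery step above is essentially a set intersection. Here is a minimal sketch, assuming each peer's codecs are available as a plain array of names. The helper name findCommonCodecs is made up for illustration; real browsers do this inside the offer/answer exchange rather than in application code.

```javascript
// Hypothetical helper: returns the codecs both peers support,
// keeping the order of the first peer's preference list.
function findCommonCodecs(localCodecs, remoteCodecs) {
  const remote = new Set(remoteCodecs.map((name) => name.toUpperCase()));
  return localCodecs.filter((name) => remote.has(name.toUpperCase()));
}

const aliceCodecs = ["VP8", "VP9", "H264"];
const bobCodecs = ["VP8", "H264"];

console.log(findCommonCodecs(aliceCodecs, bobCodecs)); // [ 'VP8', 'H264' ]
```

An empty result would mean the peers share no codec and cannot exchange that kind of media.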

Network Information

Example:

Alice Public Address
Bob Public Address

must eventually be shared.

Security Information

Example:

Encryption Fingerprints
Keys
Certificates

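must be exchanged. In WebRTC, only the certificate fingerprint travels through signaling; the certificates and keys are exchanged later, directly between the peers, during the DTLS handshake.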

Media Information

Example:

Audio Enabled
Video Enabled
Screen Sharing Enabled

must be known.

Signaling Is Not Part of WebRTC

This surprises many people.

WebRTC does NOT define signaling.

WebRTC intentionally avoids specifying:

How Signaling Works

Why?

Because every application is different.

Example:

Google Meet: Backend Signaling

Discord: Backend Signaling

WhatsApp: Custom Signaling

Gaming Platform: Custom Signaling

The WebRTC team decided:

Let developers choose their own signaling architecture.

Common Signaling Technologies

Applications typically use:

WebSockets: Most common.
	Browser <---> Server

Socket.IO: Popular abstraction

HTTP APIs: Possible but less common

gRPC: Sometimes used in backend systems
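
Since WebSockets are the most common choice, here is a minimal sketch of a signaling channel built on one. Everything application-specific in it is an assumption: the createSignalingChannel name, the JSON envelope with type and payload fields, and the placeholder URL in the usage comment. WebRTC prescribes none of these.

```javascript
// Hypothetical wrapper around any WebSocket-like object (something with
// send() and an onmessage property). The { type, payload } envelope is an
// application-level convention chosen for this sketch.
function createSignalingChannel(socket) {
  const handlers = new Map(); // message type -> callback

  socket.onmessage = (event) => {
    let message;
    try {
      message = JSON.parse(event.data);
    } catch {
      return; // ignore anything that is not valid JSON
    }
    if (!message || typeof message !== "object") return;
    const handler = handlers.get(message.type);
    if (handler) handler(message.payload);
  };

  return {
    // Send one signaling message, e.g. send("offer", { type, sdp }).
    send(type, payload) {
      socket.send(JSON.stringify({ type, payload }));
    },
    // Register a callback for one message type.
    on(type, handler) {
      handlers.set(type, handler);
    },
  };
}

// In a browser this might be used as follows (the URL is a placeholder):
//   const signaling = createSignalingChannel(new WebSocket("wss://example.com/signal"));
//   signaling.on("offer", (offer) => { /* handle the offer */ });
```

The channel only moves small JSON messages. Audio and video never pass through it.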

The Beginning of a Call

Let's follow a real call.

Alice clicks:

Start Call

The application creates:

const pc = new RTCPeerConnection();

At this moment:

No Connection
No Media Flow

The browser merely initialized the communication engine.
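
This idle state can be observed directly. The sketch below reads a few standard RTCPeerConnection properties; the summarizeConnection helper itself is made up for illustration, and the final part only runs where the browser API is available.

```javascript
// Hypothetical helper: summarizes the observable state of a peer connection.
// The properties it reads are standard RTCPeerConnection attributes.
function summarizeConnection(pc) {
  return {
    signalingState: pc.signalingState,         // "stable" on a new connection
    iceConnectionState: pc.iceConnectionState, // "new" on a new connection
    connectionState: pc.connectionState,       // "new" on a new connection
    hasLocalDescription: pc.localDescription !== null,
    hasRemoteDescription: pc.remoteDescription !== null,
  };
}

// Only runs where the browser API exists (not in plain Node.js).
if (typeof RTCPeerConnection !== "undefined") {
  const pc = new RTCPeerConnection();
  console.log(summarizeConnection(pc));
  pc.close();
}
```

On a freshly created connection both description flags are false: nothing has been proposed and nothing has been received.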

Step 1: Add Media

Alice obtains media.

const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });

Tracks are attached.

stream.getTracks().forEach((track) => pc.addTrack(track, stream));

Now the PeerConnection knows:

Audio Exists
Video Exists
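
Put together, Step 1 might look like the sketch below. The attachLocalMedia helper is hypothetical, and the mediaDevices object is passed in as a parameter (an assumption made so the function can be exercised outside a browser). Note that getUserMedia needs a constraints object saying which kinds of media to capture.

```javascript
// Hypothetical helper: captures microphone and camera, then attaches every
// track to the peer connection. In a page, `mediaDevices` would be
// navigator.mediaDevices.
async function attachLocalMedia(pc, mediaDevices) {
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  for (const track of stream.getTracks()) {
    // The second argument associates the track with its stream.
    pc.addTrack(track, stream);
  }
  return stream;
}

// Browser usage (inside an async function):
//   const stream = await attachLocalMedia(pc, navigator.mediaDevices);
```

The returned stream can also be shown in a local preview element, which is why the helper hands it back.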

Step 2: Create an Offer

Now comes the first mysterious WebRTC operation.

const offer = await pc.createOffer();

What Problem Does the Offer Solve?

Imagine buying a house.

You tell the seller:

I offer: $100,000

Move-in Date: July

Furniture Included

You are proposing terms.

The seller can:

Accept
Reject
Counter

The offer describes what you want.

WebRTC's offer works similarly.

Definition of Offer

An offer is:

A proposal describing how Alice would like communication to occur.

It contains:

Supported Codecs
Media Types
Security Parameters
Network Information
Transport Information

Keep in mind:

  • An offer is not media.
  • An offer does not contain audio or video.
  • It contains metadata about communication.

The SDP Document

The offer is represented as: SDP (Session Description Protocol).

SDP is simply a text document.

Example:

Audio Supported
Video Supported
Codec List
Transport Details

Think:

Meeting Contract

between browsers.

Why SDP Exists

Without SDP:

Alice would need to manually explain:

I support VP8
I support H264
I support Opus
I support Encryption X
I support UDP

for dozens of parameters.

SDP standardizes this process.

What createOffer() Actually Does

When the browser executes:

createOffer()

it examines:

Media Tracks
Available Codecs
Security Configuration
Transport Configuration

and generates an SDP document.
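
In practice, generating the offer is paired with a second call, setLocalDescription(), which tells the browser to adopt that offer as its own side of the session. A minimal sketch, assuming a hypothetical makeOffer helper:

```javascript
// Hypothetical helper: produces an offer and applies it locally.
async function makeOffer(pc) {
  // Resolves to a plain object of the form { type: "offer", sdp: "..." }.
  const offer = await pc.createOffer();
  // After this call, pc.signalingState becomes "have-local-offer".
  await pc.setLocalDescription(offer);
  // Return a plain object that is safe to JSON-serialize for signaling.
  return { type: pc.localDescription.type, sdp: pc.localDescription.sdp };
}

// Browser usage (inside an async function):
//   const offer = await makeOffer(pc);
//   console.log(offer.sdp); // the SDP text generated by the browser
```

Logging offer.sdp in a browser console is a quick way to see what a real SDP document looks like.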

Example Conceptual SDP

Not real SDP.

Simplified for learning:

I Support:

Audio:
  Opus

Video:
  VP8
  H264

Transport:
  UDP

Encryption:
  DTLS

Video Sending:
  Enabled

This is essentially what an offer communicates.
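
Real SDP is terser than the conceptual version. Each "m=" line opens a media section, and each "a=rtpmap:" line maps a payload type number to a codec name and clock rate. The sketch below parses a short hand-written fragment; the fragment and the listCodecs helper are illustrative assumptions, and a browser-generated offer is much longer and may use different payload type numbers.

```javascript
// Hand-written SDP fragment for illustration only.
const sampleSdp = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96 102",
  "a=rtpmap:96 VP8/90000",
  "a=rtpmap:102 H264/90000",
].join("\r\n");

// Hypothetical helper: groups codec names by media kind using the
// "m=" (media section) and "a=rtpmap:" (payload type mapping) lines.
function listCodecs(sdp) {
  const codecs = {};
  let currentKind = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      currentKind = line.slice(2).split(" ")[0]; // "audio" or "video"
      codecs[currentKind] = codecs[currentKind] || [];
    } else if (line.startsWith("a=rtpmap:") && currentKind) {
      const mapping = line.split(" ")[1]; // e.g. "opus/48000/2"
      codecs[currentKind].push(mapping.split("/")[0]);
    }
  }
  return codecs;
}

console.log(listCodecs(sampleSdp)); // { audio: [ 'opus' ], video: [ 'VP8', 'H264' ] }
```

Applications rarely need to parse SDP themselves. The point is only that the "contract" is ordinary, readable text.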

Step 3: Send Offer to Bob

The application sends:

Offer SDP

through signaling.

Example:

Alice Browser
|
WebSocket
|
Signaling Server
|
Bob Browser

This is not WebRTC. It is signaling infrastructure.
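
The server's job in that diagram is small: take a message from one peer and forward it to another without interpreting it. Here is an in-memory stand-in that shows the idea. The SignalingRelay class, its method names, and the peer ids are all invented for this sketch; a real deployment would put the same logic behind WebSocket connections.

```javascript
// In-memory stand-in for a signaling server (not a real network server).
class SignalingRelay {
  constructor() {
    this.peers = new Map(); // peer id -> delivery callback
  }

  // A peer "connects" by registering a callback for incoming messages.
  join(peerId, onMessage) {
    this.peers.set(peerId, onMessage);
  }

  // Forward a message to one peer. The relay never inspects the SDP.
  relay(fromId, toId, message) {
    const deliver = this.peers.get(toId);
    if (!deliver) return false; // recipient is not connected
    deliver({ from: fromId, ...message });
    return true;
  }
}

const relay = new SignalingRelay();
relay.join("bob", (message) => console.log("Bob received:", message.type));
relay.relay("alice", "bob", { type: "offer", sdp: "v=0 ..." });
// prints: Bob received: offer
```

Because the relay treats the SDP as an opaque string, the same server can carry offers, answers, and any other message the application invents.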

Step 4: Bob Receives Offer

Bob receives:

Offer SDP

The browser analyzes it.

Questions include:

Can I support these codecs?
Can I support these transports?
Can I support this media?

Step 5: Create Answer

Now Bob applies the offer as his remote description and generates a reply:

await pc.setRemoteDescription(offer);
const answer = await pc.createAnswer();

What Is an Answer?

An answer is:

Bob's response to Alice's proposal.

Think:

Offer
   ↓
Answer

Just like contract negotiation.

Example:

Alice says:

I support:
VP8
VP9
H264

Bob says:

I support:
VP8
H264

The answer identifies:

Common Ground
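
On Bob's side, the order of calls matters: the browser can only compute an answer once it has been given the offer it is answering. A minimal sketch, assuming a hypothetical answerOffer helper and an application-defined signaling.send function (neither is part of WebRTC):

```javascript
// Hypothetical helper for the answering side.
async function answerOffer(pc, offer, signaling) {
  // Apply Alice's proposal first.
  await pc.setRemoteDescription(offer);
  // Only valid once a remote offer has been set.
  const answer = await pc.createAnswer();
  // Adopt the answer as Bob's own side of the session.
  await pc.setLocalDescription(answer);
  // Ship it back through the application's signaling channel.
  signaling.send("answer", { type: answer.type, sdp: answer.sdp });
  return answer;
}
```

The browser picks the common ground itself while building the answer. The application does not compare codec lists by hand.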

Why Answer Exists

Communication only works if both peers agree.

Example:

Alice: Only Codec X
Bob: Only Codec Y

Result:

No Communication Possible

Answer ensures compatibility.

The Negotiation Process

The complete negotiation becomes:

Alice
|
Offer
|
Bob
|
Answer
|
Alice

After this exchange, both peers understand:

Media Configuration
Codecs
Security
Transport
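
The last arrow in that diagram, the answer arriving back at Alice, corresponds to one call on her side: setRemoteDescription(). A minimal sketch with a hypothetical acceptAnswer helper; the state check is a defensive choice made for this example, not something the API requires.

```javascript
// Hypothetical helper for the offering side.
async function acceptAnswer(pc, answer) {
  // An answer only makes sense while our own offer is outstanding.
  if (pc.signalingState !== "have-local-offer") {
    throw new Error(`Unexpected answer in state "${pc.signalingState}"`);
  }
  // After this call, pc.signalingState returns to "stable".
  await pc.setRemoteDescription(answer);
}
```

Once both sides hold a local and a remote description, the negotiation round is complete.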

Important Observation

At this point:

Still No Video
Still No Audio

Offer/Answer only establishes:

Agreement

not connectivity.

The Missing Piece

Imagine Alice and Bob agree:

Let's talk.

But where?

What phone number?

What address?

What route?

We still need:

Connectivity Discovery

This is where:

ICE enters the story.
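
As a small preview, ICE candidates usually travel over the same signaling channel as the offer and answer. The sketch below shows one way the wiring might look. The wireIceCandidates helper and the signaling.send and signaling.on functions are application-defined assumptions; the icecandidate event and addIceCandidate() are standard RTCPeerConnection features.

```javascript
// Hypothetical wiring of ICE candidates to a signaling channel.
function wireIceCandidates(pc, signaling) {
  // The browser fires "icecandidate" each time it discovers a possible route.
  pc.addEventListener("icecandidate", (event) => {
    if (event.candidate) {
      signaling.send("candidate", event.candidate.toJSON());
    }
    // A null candidate marks the end of gathering; nothing to send.
  });

  // Candidates from the other peer are handed to the local connection.
  signaling.on("candidate", (candidate) => {
    pc.addIceCandidate(candidate).catch((err) => {
      console.error("Failed to add ICE candidate:", err);
    });
  });
}
```

Each candidate is one possible address at which a peer might be reachable.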

The Real Lifecycle So Far

Create PeerConnection
         ↓
Add Media
         ↓
Create Offer
         ↓
Send Offer
         ↓
Receive Offer
         ↓
Create Answer
         ↓
Send Answer
         ↓
Receive Answer

Now both peers agree on communication parameters.

Next they must figure out:

How to actually reach each other.

Mental Model

Think of a WebRTC call like planning a meeting.

Offer/Answer

Answers:

What are the meeting rules?

ICE

Answers:

Where is the meeting?
How do we get there?

Media Transport

Answers:

Let's start talking.

 
