updated 30 Dec, 2022

TCP/IP Networking introduction

This article gives an overview of the main networking concepts:

  • what a host and a network are
  • the route between hosts
  • client and server hosts, connecting nodes
  • IP and MAC host addresses
  • network examples
  • IP routing: how data is sent within a network and between networks
It also explains why things work the way they do.

What's a host?

A host is a device that sends data to other devices or receives data from other devices. Examples of hosts:

  • laptop or phone that loads a web page from the Internet
  • web browser on a desktop computer that sends user data to a web server when the user logs in to a web site
  • smart TV that streams video from the video service
  • WiFi outlet that receives commands sent from mobile phone application

A pair of devices can be connected physically to each other by wire or WiFi. To enable data exchange, hosts are connected to other hosts directly or via intermediate devices such as switches or routers. Groups of connected devices are called networks.

A network is a TCP/IP network if the data between devices is sent using the TCP/IP network protocols. A network protocol is a set of rules that defines the data format and how the data is exchanged.

Route between hosts

As mentioned above, a direct physical connection between two hosts is not required for data exchange. Data can be sent via several intermediate connected hosts: host A could be connected to host D1, D1 to D2, D2 to D3, and D3 to host B. In this case, hosts A and B can exchange data by sending it from A to D1, from D1 to D2, from D2 to D3, and from D3 to B.

The route from A to B is the sequence of hosts, in order, that the data passes through on its way from A to B:
A → D1 → D2 → D3 → B
Host D1 is the next host in the route for A, D3 is the next host in the route for D2, and so on.
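
The idea of a "next host" can be sketched in a few lines of Python. The list representation and the next_host helper are illustrative assumptions, not part of any networking API:

```python
# Route from A to B, in the order the data passes through the hosts.
route = ["A", "D1", "D2", "D3", "B"]

def next_host(route, current):
    """Return the host that follows `current` in the route, or None at the destination."""
    i = route.index(current)
    return route[i + 1] if i + 1 < len(route) else None

print(next_host(route, "A"))   # D1
print(next_host(route, "D2"))  # D3
print(next_host(route, "B"))   # None
```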

Roles in data exchange: client and server hosts

A client is a host that starts data communication to ask another host to do something: send requested data back (for example, a web page or a video stream) or update data on that host. The data sent by a client is called a request.

A server is a host with installed software that can respond to client requests. The data sent back by a server is called a response. Server examples:

  • weather forecast web site: browser on laptop (client) sends the user location and time range, and the web server returns the weather forecast web page (response)
  • database server saves the user's first and last name in the database
  • database server retrieves the list of the user's orders in an online store

There are also devices that don't send or receive requests. They are used to connect clients and servers and only re-send the received data to another device, so they are always in the middle of the data route from the source to the destination. Such devices are called nodes and are not considered hosts.

Host address

Each host should have an address to identify it. The sent data contains the receiver (destination) address to deliver the data to the correct destination host, and the sender (source) address to send the possible response back. The address should be unique within a group of hosts to distinguish the host from others.

A 32-bit IP address is used to identify hosts in TCP/IP networks. It is usually written as four decimal numbers, each representing 8 bits (0 to 255), separated by dots:

50.3.253.2
192.168.0.1
10.20.0.159
An IP address is assigned to a host by software according to certain rules, such as uniqueness and allowed value ranges, which will be discussed later.
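
To make the link between the dotted form and the underlying 32-bit number concrete, here is a minimal sketch. The helper names ip_to_int and int_to_ip are made up for this illustration; the last line cross-checks the result against Python's standard ipaddress module:

```python
import ipaddress

def ip_to_int(dotted):
    """Pack four decimal numbers (0-255 each) into one 32-bit integer."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid dotted IP address: {dotted!r}")
    value = 0
    for p in parts:
        value = (value << 8) | p  # shift previous bytes left, append the next one
    return value

def int_to_ip(value):
    """Unpack a 32-bit integer back into the dotted decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

n = ip_to_int("192.168.0.1")
print(n)             # 3232235521
print(int_to_ip(n))  # 192.168.0.1
assert n == int(ipaddress.IPv4Address("192.168.0.1"))
```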

Network devices also have a hardware 48-bit MAC address, which is written as 6 groups of 8 bits (two hexadecimal digits each) separated by colons or hyphens:

98:fa:9b:37:67:3b
Such an address is set in the hardware and is rarely changed, because the hardware manufacturer is assumed to have made it unique:
  • the address space is divided into non-overlapping ranges, and each range is allocated to only one networking hardware manufacturer
  • a hardware manufacturer uses each address from its range for only one piece of hardware
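
A small sketch of parsing this notation follows. The parse_mac helper is hypothetical, and treating the first three bytes as the manufacturer's range is an assumption that holds for the common case only (manufacturer ranges can also be identified by longer prefixes):

```python
def parse_mac(text):
    """Parse 'aa:bb:cc:dd:ee:ff' or 'aa-bb-cc-dd-ee-ff' into 6 raw bytes."""
    groups = text.replace("-", ":").split(":")
    if len(groups) != 6 or any(len(g) != 2 for g in groups):
        raise ValueError(f"not a valid MAC address: {text!r}")
    return bytes(int(g, 16) for g in groups)  # each group is one 8-bit value in hex

mac = parse_mac("98:fa:9b:37:67:3b")
print(len(mac) * 8)      # 48
print(mac[:3].hex(":"))  # 98:fa:9b  (assumed manufacturer prefix)
print(mac[3:].hex(":"))  # 37:67:3b  (device-specific part)
```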

What's a network?

As mentioned before, a network is a group of connected devices that exchange data and can access the same resources. Let's look at some examples to see what this means. The simplest network is 2 computers connected by wire that send data to each other.

A more complex example is all the computers in a classroom, which can exchange data with:

  • other computers in this classroom
  • set of public Internet web sites
  • specific school resources: printer, school web site

All classroom computers are connected to a router, and the router is connected to the outside world by wire. So a classroom computer sends data to the router, and the router forwards it in one of 2 possible ways:

  • to another computer in the classroom, if the router sees that the destination IP is in the classroom network
  • otherwise outside over the red wire, to the Internet or to school computers outside the classroom

How to check by IP address if a host belongs to specific network?

How does the router determine whether the destination is in the classroom network? Every host (computer) in the classroom has an IP address of the form 10.50.20.NNN, with the same first 3 bytes: 10.50.20. So if an IP address looks like 10.50.20.NNN, it is inside the network 10.50.20.0 (the variable part NNN is replaced with a 0 byte to make the network IP address).
If a network address is 10.50.0.0, then all its host IPs look like 10.50.KKK.NNN. In other words, they start with 10.50, the prefix formed by the non-zero bytes of the network IP.

10.50.30.5 doesn't start with 10.50.20, so it doesn't belong to the 10.50.20.0 network: the host is outside the network.
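
The prefix rule described above can be sketched directly. The function names are made up for this illustration, and the code follows the simplified rule of this article, so it only works when the network prefix ends on a whole-byte boundary:

```python
def network_prefix(network_ip):
    """Drop the trailing 0 bytes of a network address: '10.50.20.0' -> ['10', '50', '20']."""
    parts = network_ip.split(".")
    while parts and parts[-1] == "0":
        parts.pop()
    return parts

def belongs_to(host_ip, network_ip):
    """True if the host IP starts with the non-zero byte prefix of the network IP."""
    prefix = network_prefix(network_ip)
    return host_ip.split(".")[:len(prefix)] == prefix

print(belongs_to("10.50.20.7", "10.50.20.0"))  # True
print(belongs_to("10.50.30.5", "10.50.20.0"))  # False
print(belongs_to("10.50.30.5", "10.50.0.0"))   # True
```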

Why did we connect all computers to a router?

Why are all computers connected to one device? To keep the network simple to build. Wire is the most common and robust way to connect computers today, but connecting every computer to every other one requires a lot of wires and connection points, which becomes too complex when there are many computers in the network. Connecting 50 computers to a central device requires 50 wires and 100 connection points (ports), whereas connecting each computer to every other requires 1225 wires and 2450 connection points.
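
The numbers above follow from simple counting: one wire per computer for the central device, and one wire per pair of computers otherwise, with two connection points per wire in both cases. A short sketch with made-up function names:

```python
def central_device_wiring(n):
    """Each of n computers has one wire to the central device; every wire has 2 ends."""
    wires = n
    return wires, 2 * wires

def each_to_each_wiring(n):
    """One wire per pair of computers: n * (n - 1) / 2 pairs; every wire has 2 ends."""
    wires = n * (n - 1) // 2
    return wires, 2 * wires

print(central_device_wiring(50))  # (50, 100)
print(each_to_each_wiring(50))    # (1225, 2450)
```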

Two networks, connected to routers

What if we have two classrooms? The computers in each classroom could form a separate network, for example because a computer in one classroom should not be able to send data to a computer in the other classroom, or because the two classrooms should have access to different sets of web sites. Again, it could also be done to simplify the physical structure.

Every classroom has its own router, to which the classroom computers are connected, and these routers are connected to router M. So data sent from a classroom A computer to a host outside the classroom passes through router A and router M. These networks form a hierarchy:

  • Network A 10.50.20.0 for classroom A
  • Network B 10.50.30.0 for classroom B
  • School network 10.50.0.0 that contains router M and networks A, B.

Router A has two IP addresses: 10.50.20.1 in the classroom A network, so that classroom A computers can communicate with it, and 10.50.0.2 in the school network, to communicate with router M. Similarly, router B has two IP addresses: 10.50.30.1 in network B and 10.50.0.3 in the school network.

If a classroom A host sends data to a host from classroom B, the data will go through router A to router M then to router B. So it's possible to restrict such data exchange by blocking requests from network A on router M or router B.

Such a network hierarchy also makes it possible to divide the address space between groups of equipment in an organization (for example, classrooms) and to manage requests on router M better: a request is sent directly to the router of the appropriate sub-network, for example to router A if the request's destination IP looks like 10.50.20.NNN.

To summarize, the two main functions of a router are:

  • forwarding: decide to which connected network to re-send the data, based on the destination IP address of the data. Devices from several networks can be connected to a router, and the router has its own IP address in each connected network to enable the data exchange
  • security: block data exchange (don't re-send the data further) based on the source or destination IP address of the data
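
Both functions can be sketched for router M from the example above. This is a simplified model, not how a real router is configured: the "/24" suffix (the first 24 bits, i.e. 3 bytes, form the network prefix) and the blocking rule are assumptions made for the illustration.

```python
import ipaddress

# Sub-networks behind router M and the router that leads to each of them.
FORWARDING = {
    ipaddress.ip_network("10.50.20.0/24"): "router A",
    ipaddress.ip_network("10.50.30.0/24"): "router B",
}

# Hypothetical security rule: classroom A may not send data to classroom B.
BLOCKED = [
    (ipaddress.ip_network("10.50.20.0/24"), ipaddress.ip_network("10.50.30.0/24")),
]

def router_m_decision(src_ip, dst_ip):
    """Return 'drop', the name of the next router, or 'outside'."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for src_net, dst_net in BLOCKED:
        if src in src_net and dst in dst_net:
            return "drop"  # security: don't re-send the data further
    for network, router in FORWARDING.items():
        if dst in network:
            return router  # forwarding: destination is in a connected sub-network
    return "outside"

print(router_m_decision("10.50.20.7", "10.50.30.2"))  # drop
print(router_m_decision("10.50.30.2", "10.50.20.7"))  # router A
print(router_m_decision("10.50.20.7", "50.3.253.2"))  # outside
```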

Two networks, connected to switches and router

Another network layout is to connect the classroom computers to a switch. A switch doesn't have an IP address and just re-sends data between the computers connected to it, for example from 10.50.30.14 to 10.50.30.1. The switches are in turn connected to a router that has an IP address in each classroom network, so that classroom computers can communicate with it.

How the data is sent in this network:

  • if a classroom host sends data to another host in same classroom then it's sent to the classroom switch and the switch sends it to the destination host.
    Example: 10.50.20.7 sends data to switch A, switch A sends it to 10.50.20.23
  • if a classroom host sends data to a host outside the classroom network, then the data is sent to the classroom switch, the switch sends it to the router, and the router sends it outside over the blue wire

The same rule is used to check whether a host belongs to a network: check whether its IP address starts with the network IP address prefix.
Example. 10.50.20.7 and 10.50.20.23 are in the 10.50.20.0 network because their IP addresses start with 10.50.20

Example. The IP route from 10.50.20.7 to 10.50.30.2:
10.50.20.7 → Router M → 10.50.30.2
It doesn't include switches because they don't have IP addresses.

IP routing

Let's see how data is sent between hosts in networks that are not connected to the same router:
  • Host 10.50.10.2 from network 10.50.10.0 on the left wants to send data to the host with IP address 10.50.40.5. The data is sent to router A. The destination IP is in another network, 10.50.40.0 on the right, so the data should be sent outside
  • The 10.50.40.0 network is not connected to router A, so the data should be sent to router B or to router C to reach other networks.
  • Router A has gathered information about other networks and knows that router B provides a way to network 10.50.40.0, so it sends the data to router B at IP address 10.50.20.2
  • Router B knows that the destination IP 10.50.40.5 is in the network 10.50.40.0 connected to it, so the data is sent to the destination host 10.50.40.5

So the IP route of the data is: 10.50.10.2 → Router A → Router B → 10.50.40.5

What router A did was find the next point in the data delivery route to the destination, based on the destination IP address. This is called IP routing for short.
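
Router A's decision can be sketched as a lookup in a small table. The table layout and the next_point helper are assumptions made for this illustration, as is the "/24" suffix (the first 3 bytes form the network prefix). Router C is left out because the example does not give its address.

```python
import ipaddress

# What router A knows: (network, IP of the next router), None = directly connected.
ROUTER_A_TABLE = [
    ("10.50.10.0/24", None),          # network connected to router A itself
    ("10.50.40.0/24", "10.50.20.2"),  # reachable via router B
]

def next_point(dst_ip):
    """Return the IP address the data is sent to next, or None if no route is known."""
    dst = ipaddress.ip_address(dst_ip)
    for network, next_router in ROUTER_A_TABLE:
        if dst in ipaddress.ip_network(network):
            # Directly connected: deliver to the destination itself.
            return dst_ip if next_router is None else next_router
    return None

print(next_point("10.50.40.5"))  # 10.50.20.2
print(next_point("10.50.10.2"))  # 10.50.10.2
print(next_point("50.3.253.2"))  # None
```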

Why do we need MAC address, if we have IP address?

A software module that sends data to another host with a specified IP address uses another module (at a lower level, closer to the hardware), which sends the data over the wire to the next host in the IP route. The latter uses physical MAC addresses for the source and destination, because it doesn't know anything about IP addressing. So IP addresses have to be translated to physical MAC addresses.
More precisely, sending data to an IP address is done by the IP protocol (layer 3), and the IP protocol uses data link layer protocols (layer 2) to send the data physically. Data link layer protocols use MAC addresses and are not aware of IP addresses.

It follows that we need the MAC address of the next host in the IP route. Any two consecutive hosts in an IP route are always in the same network, so we can look for the next host in the sender's own network. The sender host gets the destination MAC address from the destination IP address as follows:

  • Look it up in the local system cache (called the ARP table), which maps host IP addresses to MAC addresses
  • If the destination IP is not found in the ARP table, i.e. its MAC address is unknown on the sender host, the sender asks ALL hosts in its network who has this IP by sending an ARP request, and gets a response with the MAC address from the host that has the specified IP.
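
The two steps above can be modelled in a few lines. This is a simulation, not real ARP traffic: the host list, the MAC addresses, and the function names are all made up, and the "ask everybody" step is replaced with a loop over a dictionary.

```python
# Hypothetical hosts in the sender's network: IP address -> MAC address.
NETWORK_HOSTS = {
    "10.50.20.1": "02:00:00:00:00:01",
    "10.50.20.23": "02:00:00:00:00:17",
}

arp_table = {}  # local cache on the sender: IP address -> MAC address

def arp_request(ip):
    """Stand-in for asking ALL hosts: only the host that owns the IP answers."""
    for host_ip, mac in NETWORK_HOSTS.items():
        if host_ip == ip:
            return mac
    return None  # nobody in the network has this IP

def resolve_mac(ip):
    """Step 1: look in the ARP table. Step 2: ask the network and remember the answer."""
    if ip in arp_table:
        return arp_table[ip]
    mac = arp_request(ip)
    if mac is not None:
        arp_table[ip] = mac
    return mac

print(resolve_mac("10.50.20.23"))  # 02:00:00:00:00:17 (answered by the host)
print(resolve_mac("10.50.20.23"))  # 02:00:00:00:00:17 (now served from the cache)
print(resolve_mac("10.50.20.99"))  # None
```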

The main point: data is sent by MAC address within the same network only, between two consecutive hosts in the IP route.
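
To illustrate this point, the sketch below follows the data along the route 10.50.10.2 → Router A → Router B → 10.50.40.5 from the routing example. All MAC addresses are invented, and the dictionary "frames" are a simplified stand-in for real layer 2 frames. Note how the IP addresses stay the same on every step while the MAC addresses change.

```python
# One entry per step of the route: (from, to, source MAC, destination MAC).
# The MAC addresses are hypothetical.
STEPS = [
    ("10.50.10.2", "Router A", "02:00:00:00:10:02", "02:00:00:00:10:01"),
    ("Router A", "Router B", "02:00:00:00:20:01", "02:00:00:00:20:02"),
    ("Router B", "10.50.40.5", "02:00:00:00:40:01", "02:00:00:00:40:05"),
]

def build_frames(src_ip, dst_ip, steps):
    """Build one frame per step: same IP addresses in each, different MAC addresses."""
    return [
        {"src_ip": src_ip, "dst_ip": dst_ip, "src_mac": src_mac, "dst_mac": dst_mac}
        for _sender, _receiver, src_mac, dst_mac in steps
    ]

for frame in build_frames("10.50.10.2", "10.50.40.5", STEPS):
    print(frame)
```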

Why do we need IP address, if we have MAC address?

As we saw above, IP addresses are assigned to hosts to group them into networks. They are changeable, so that the network hierarchy and uniqueness across many networks can be maintained, and they are used in IP routing. MAC addresses, in contrast, are considered constant; their values are essentially random and don't contain any information about the network structure, so they can't be used in routing.