I’m sure you know how the web works. Some kind of client device sends a request to some kind of domain, which gets translated by DNS into an IP address where the server lives.
After some routing karate, the request is sent to the server, and the server sends back a response. End of story.
If that much detail is enough for you, then that’s all I have for you. But I hope it’s not.
Why can this be helpful?
- Identify performance opportunities and drawbacks.
- What to compress before sending down the wire?
- How to solve network issues in fetching data from the server?
- What is costing you a lot?
And finally, understanding how things work helps you communicate better with other software engineers at least, and probably with everyone else.
What happens when you visit a website?
Let’s take my previous blog post as an example, https://algorithmyou.com/2020/02/15/artificial-intelligence/how-google-maps-work-fast-route-planning/
The link above answers three questions for your browser:
- What kind of request is it? (https)
- Where do I send the request? (algorithmyou.com)
- What is it that I want to request? (2020/02/15/artificial-intelligence/how-google-maps-work-fast-route-planning/ which is a path to a blog about google maps)
So, HTTP/HTTPS is what we call the protocol, algorithmyou.com is the network location, and the final part is the path and query that you ask the server to find a response for.
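If you want to see that split for yourself, here is a minimal sketch using Python’s standard `urllib.parse` module. It only pulls the link apart; it doesn’t send anything over the network.

```python
from urllib.parse import urlsplit

url = (
    "https://algorithmyou.com/2020/02/15/artificial-intelligence/"
    "how-google-maps-work-fast-route-planning/"
)

parts = urlsplit(url)

print(parts.scheme)  # the protocol
print(parts.netloc)  # the network location
print(parts.path)    # the path
print(parts.query)   # the query (empty for this link)
```

This particular link has no query string, so the last value is just an empty string.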
There is some hidden information in the image above. Let’s make it clear.
In order to perform this request, your computer must have the following:
- A network connection.
- An IP Address (18.104.22.168)
- A port (80/443 in case of the web)
But how does TCP work?
TCP (Transmission Control Protocol) is what carries the request. It uses the IP address and port to open a connection through what we call a handshake. You probably remember it from the comics all over the internet.
Your Computer: “Hey, I’m interested in sending you some data“
The Server: “Okay, I hear you’re interested in sending me some data, I too want to send you some data“
Your Computer: “Cool! I’m ready to accept from you some data as well“
And now the handshake is done and the connection is open. Data is sent back and forth in small units called packets.
If one of the two machines doesn’t send any data for a given period, the connection can time out. And if one of the two machines wants to end the connection, it can send a close-connection message.
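Here is a small sketch of that conversation using Python’s `socket` module, with both “machines” living on your own computer (127.0.0.1). The helper name `answer_one_client` and the messages are made up for illustration. The handshake itself is done by the operating system when the client connects, so you won’t see the three steps in the code, only the result: an open connection that both sides can send data on.

```python
import socket
import threading


def answer_one_client(server_sock):
    """Hypothetical helper: accept one connection and reply to one message."""
    # accept() hands us a connection whose handshake is already complete.
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"server got: " + data)


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.settimeout(5)           # give up instead of waiting forever
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

worker = threading.Thread(target=answer_one_client, args=(server,))
worker.start()

# create_connection() triggers the handshake (SYN, SYN-ACK, ACK).
client = socket.create_connection(("127.0.0.1", port), timeout=5)
client.sendall(b"hello")
reply = client.recv(1024)
client.close()                 # the "close connection" message

worker.join()
server.close()
print(reply)
```

The `timeout=5` arguments are the time-out mentioned above: if the other side stays silent for that long, the call fails instead of hanging.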
Internet layers overview
There are a lot of layers involved in carrying out the process. The application layer for the web is HTTP, with its different methods such as GET and POST.
The transport layer is TCP, which deals with ports such as 80, 443, and 8080.
The internet layer deals with IP addresses. And finally, the network access layer deals with MAC addresses.
So the application layer deals with the data, which gets split into TCP segments, which get wrapped into IP packets, which get wrapped into frames, which are finally turned into bits to be pushed down the wire.
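To make the wrapping idea concrete, here is a toy sketch. The “headers” are just readable labels I made up (real headers are compact binary structures), the MAC address is a placeholder, and the function names `encapsulate` and `to_bits` are hypothetical. The point is only the order: each layer puts its own header in front of whatever the layer above handed down.

```python
def encapsulate(http_data: bytes) -> bytes:
    """Wrap application data in toy transport, internet, and link headers."""
    segment = b"[TCP port=443]" + http_data            # transport layer
    packet = b"[IP dst=18.104.22.168]" + segment       # internet layer
    frame = b"[ETH mac=aa:bb:cc:dd:ee:ff]" + packet    # network access layer
    return frame


def to_bits(frame: bytes) -> str:
    """Turn every byte of the frame into 8 bits, ready for the wire."""
    return "".join(f"{byte:08b}" for byte in frame)


frame = encapsulate(b"GET / HTTP/1.1")
bits = to_bits(frame)

print(frame)
print(len(bits), "bits")
```

On the receiving side the same thing happens in reverse: each layer strips its own header and passes the rest up.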
HTTP is stateless
As a result of HTTP being stateless, the server does not remember us between requests. We need to provide a lot of extra information with each request (cookies and headers, for example) to remind the server who we are and which step we’re at in what we’re trying to achieve.
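Here is a rough sketch of what that looks like from the server’s side. Everything in it is hypothetical: the `SESSIONS` store, the `session_id` cookie name, and the `handle_request` function. The thing to notice is that the handler knows nothing about you unless the request itself carries the reminder.

```python
# Hypothetical session store: session id -> what the server knows about it.
SESSIONS = {"abc123": {"user": "alice", "step": "checkout"}}


def handle_request(headers: dict) -> str:
    """Answer one request using only what the request itself carries."""
    cookie = headers.get("Cookie", "")
    # Take whatever follows "session_id=" (empty if the cookie is missing).
    _, _, session_id = cookie.partition("session_id=")
    session = SESSIONS.get(session_id)
    if session is None:
        return "401 Unauthorized: who are you?"
    return f"200 OK: hello {session['user']}, you are at step {session['step']}"


print(handle_request({}))                               # no reminder sent
print(handle_request({"Cookie": "session_id=abc123"}))  # reminder sent
```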
The web request battle
The request is submitted by you. Your browser finds the server’s IP address with a DNS lookup, which takes the domain (algorithmyou.com) and returns an IP address. That IP address might point directly to the application server, to a load balancer, or to any other kind of reverse proxy.
Let’s keep it simple: the IP address refers directly to the application server that is going to handle the operation.
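If you’re curious what the lookup step looks like in code, here is a minimal sketch. The helper name `resolve` is my own; underneath it calls Python’s `socket.getaddrinfo`, which asks your operating system’s resolver (that checks local entries first and then DNS). The example resolves `localhost` so it works without a network connection.

```python
import socket


def resolve(hostname: str, port: int = 443) -> list:
    """Hypothetical helper: return the IP addresses the OS finds for a host."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] is the IP address as text.
    return sorted({info[4][0] for info in infos})


addresses = resolve("localhost")
print(addresses)
```

Swap in `"algorithmyou.com"` to look up a real domain. That needs a working network connection, and the addresses you get back can differ depending on where and when you run it.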
A TCP connection gets established. The data gets broken down into segments, encoded into binary, and sent through that connection over the wires of the internet to the other side, where the server lives.
The server picks it up then responds “Oh, Cool! I have it” and starts doing whatever it was programmed to be doing.
Then the server responds again: “It’s 200! And here is the result data for your request.” Your browser bursts into happy tears, because 200 means success.
Your computer sends back what we call “ACK” which is basically your computer telling the server that the response successfully arrived and the bits and bytes are dancing.
Current HTTP versions tend to keep the connection open for further requests, which saves the cost of establishing new connections. Skipping the repeated setup and teardown is a performance benefit.
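Here is a sketch of connection reuse using only Python’s standard library, with a tiny local server so nothing leaves your machine. The handler class and the `/a` and `/b` paths are made up. The handler records the client’s port for every request, so if both requests show the same port, they travelled over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_client_ports = []  # one entry per request the server receives


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = b"hello from " + self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# One connection, two requests.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
results = []
for path in ("/a", "/b"):
    conn.request("GET", path)
    response = conn.getresponse()
    results.append((response.status, response.read()))
conn.close()

server.shutdown()
server.server_close()
print(results)
print(seen_client_ports)
```

Without the reuse, each request would pay for its own handshake and teardown before any useful data moved.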
How does data travel down the wire?
Data is sent down the wire in cycles. That’s the main reason we break it down into chunks and segments.
It turns out that the pure data you can carry per cycle is about 1KB (on a typical Ethernet link, a packet carries at most about 1,460 bytes of TCP payload).
Sometimes a segment or packet of data gets lost or arrives incomplete or corrupted. When that happens, the receiver never acknowledges it, and the sender has to send that packet again.
In conclusion, if you have a JS file that is 500KB, it’s going to take around 500 cycles to transfer that file, and that’s only IF there is no data loss and everything goes smoothly.
Be careful with large files. They’re one of your primary problems if you’re trying to build a web application.
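The arithmetic behind that estimate is simple enough to sketch. The function name `packets_needed` is mine, and the default payload size is the rough 1KB figure from above, not a measured value. It also ignores lost packets, so treat the result as a best case.

```python
import math


def packets_needed(file_size_bytes: int, payload_per_packet: int = 1024) -> int:
    """Best-case number of packets: every packet is full and none is lost."""
    return math.ceil(file_size_bytes / payload_per_packet)


js_file = 500 * 1024  # a 500KB JS file

print(packets_needed(js_file))        # with about 1KB of data per packet
print(packets_needed(js_file, 1460))  # with a typical Ethernet-sized payload
```

With 1KB per packet that’s 500 packets, and even with the larger 1,460-byte payload it’s still 351. Shrinking the file (compression, splitting, not shipping what you don’t need) cuts that number directly.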
I’m going to reverse the direction of my post starting in the coming section. Let’s start with the Atom.
That covers the abstract computer science level. Now let’s see how it happens down at a very low level: physics.
These are electrically conductive atoms. Yeah, I know, I’m not good at designing.
Electrons orbit around the nucleus, which is made of protons and neutrons. Good conductors such as copper have just 1 electron in their outermost shell, and that electron is only loosely held.
You can think of a shell as the white circle around the nucleus.
Now add two poles, one negatively charged and one positively charged. Electrons are negatively charged, so they want to move toward the positive pole.
As one atom’s electron moves toward the positive pole, the adjacent atom’s electron leaves its atom and moves to the next atom, also toward the positive pole.
That causes a chain reaction, and that’s how electric current happens.
Congratulations. You know how electricity happens.
The birth of the bit & byte
If I zoom out of the 3-atom example in the image above, you will see this picture.
Notice they all move in the same direction, because this is direct current. You might have heard of alternating current, which goes in both directions.
Direct Current is what computers and batteries use.
The current goes in one direction as a result of a force called voltage.
A higher voltage pushes more electrons through. So if you set up a very short clock cycle to measure the voltage regularly, you get high and low readings, which are BINARY: 0 and 1.
That’s how you go from atoms to bits. You can say that anything at or above 0.5 volts is a 1 and anything below 0.3 volts is a 0. Put 8 measurements side by side and you get a byte.
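Here is that idea as a small sketch. The thresholds are the ones from the paragraph above; the voltage samples and the function name `read_bit` are made up for illustration. Readings that fall in the gap between the two thresholds are treated as unreadable.

```python
HIGH_THRESHOLD = 0.5  # at or above this many volts counts as 1
LOW_THRESHOLD = 0.3   # below this many volts counts as 0


def read_bit(voltage: float) -> int:
    """Turn one voltage measurement into a bit."""
    if voltage >= HIGH_THRESHOLD:
        return 1
    if voltage < LOW_THRESHOLD:
        return 0
    raise ValueError(f"{voltage} volts is in the undefined gap")


# Eight made-up measurements, one per clock cycle.
samples = [0.1, 0.7, 0.2, 0.1, 0.8, 0.1, 0.2, 0.1]

bits = [read_bit(v) for v in samples]
byte_value = int("".join(str(b) for b in bits), 2)

print(bits)
print(byte_value)
```

These eight samples read as 01001000, which is the number 72.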
Then the binary data gets converted into numbers. Numbers have meanings in encoding lookup tables such as ASCII or UTF-8. The server receives binary data that is converted into sequences of numbers, which combine into a meaningful message from your computer asking the server to return my blog post.
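As a tiny sketch of that last step, here are 24 bits I picked for the example, cut into bytes, turned into numbers, and then looked up in the ASCII table.

```python
bit_stream = "010001110100010101010100"  # 24 bits = 3 bytes

# Cut the stream into groups of 8 bits and read each group as a number.
numbers = [int(bit_stream[i:i + 8], 2) for i in range(0, len(bit_stream), 8)]

# Look the numbers up in the ASCII table to get text.
text = bytes(numbers).decode("ascii")

print(numbers)
print(text)
```

The numbers come out as 71, 69, 84, which ASCII maps to “GET”, the first word of the HTTP request your browser sends.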
There are a lot more details in almost every part of the cycle I mentioned.
If your web application is slow, it most likely needs a better policy for downloading files. You don’t need to scale up your server yet.
There is a conference talk on the same topic. Actually, it inspired this blog post: How A Web Request Work