Updated: Jun 16, 2022
In our previous article we studied Netflix system design. In this article we will discuss how to design a high performance tiny URL or URL shortener service.
Design a URL shortner service.
For a given input URL (long URL) our service should generate and return a shortened URL to the user.
When the user clicks on the shortened URL, our service should redirect the user to the original URL.
The system should be scalable, highly available.
Our system should be performant.
For security purposes, the generated short URL should be as random as possible, it should not be predictable.
A tiny URL or URL shortener is a service which takes a long URL and converts it into an equivalent short URL containing lesser characters. When the user clicks this shortened URL, he would be redirected to the same destination address as the original URL.
For example, lets say our original URL is:
If we pass this long URL to a URL shortner service,
Why URL shortening is necessary?
There are several use cases where a shortened URL is preferred over a long URL:
Short URLs look clean when they are placed on websites, files, social media etc.
Shortened URLs are useful on services that have a restriction on number of characters that can be posted on them. Ex: Tweets on Twitter.
To mask a URL. Sometimes you would not want to expose the original URL as is to the end users. Short URLs in such a case do the job of masking the original URL while preserving the destination address. Ex: For masking affiliate links.
Some URLs are just too long and it is better off having a shortened version to represent them. Short URLs in such a case can be used as a placeholder.
Shortened URLs can also be used for tracking purposes. Most URL shortner services usually provides us with additional metrics on the shortened URL like number of clicks etc, which can be extremely useful for business insights.
Social campaigns with shorter URLs perform better. People tend to trust short URLs as compared to longer ones, especially when the long URL contains special characters. As a result shortened URLs produce better social engagement.
For our design it is safe to assume that a system like this would be read heavy i.e. the probability of users creating a short URL from long URL will be less as compared to users clicking on a short URL and getting redirected to the original URL.
For the purpose of this example let us assume that the read to write ratio is 100:1 which means for every short URL created there will be 100 redirections or 100 clicks on that short URL.
Another important metric that will be useful for our design is the number of short URL generation requests that our service is expected to receive per month. It is important that we clarify all these details with the interviewer beforehand because it will give us better clarity to proceed with our design. For our case let us assume our service gets 100 million requests per month on an average.
Considering all the above assumptions:
Total no. short URL generation requests per month = 100 million.
Therefore no. short URL requests per second = 100 million /(30 days * 24 hours * 3600 seconds ) ~ 40 URLs/sec.
Total short URL clicks or redirections per second (assuming 100:1 read to write ratio) = 40 URLs/sec * 100 = 4000 URLs/sec.
Most popular browsers support 2000 characters in a URL. So, lets say our long URL will at max take up to 2000 characters or 2KB.
Most URL shortener services create a short URL with 15-16 characters (We will see more on this later in our discussion). So we can say the short URL size is ~ 16 bytes.
Additionally we might need few more bytes to store metadata like creation timestamp, user details etc, lets say 50 bytes.
So, total storage needed per shortened URL ~ 2.1 KB.
Storage needed for 1 month = 2.1 KB * 100 million = 210 GB.
Storage needed for 1 year = 210 GB * 12 months ~ 2.5 TB. For 5 years this will be 12.5 TB and 25 TB for 10 years.
At first look designing a URL shortner service may not look like a big deal. All that we need is a server that runs the logic for converting the long URL to short URL, a database to store the mapping of long URL to short URL and when the user requests for the short URL you redirect the user to its corresponding long URL using this mapping data. Simple right?
Yes, this approach works fine as long as we have to serve only a small set of users. But what if our service gets thousands of requests per second? Will our server be able to handle the load? Will our database be able to handle the volume of data? How do we ensure that the short URLs generated are all unique? How to make sure that the data stored on our database is not corrupted?
All these subtle aspects associated with designing a URL shortner service makes it such a popular system design interview question as it allows the interviewer to test the candidate on various design aspects.
Interview Tip: In most system design interviews you will be given a loose ended or generic problem statement (similar to the one given here). It is your job as an interviewee to ask relevant follow up questions and streamline the requirements.
Want to master coding? Looking to learn new skills and crack interviews? We recommend you to explore these tailor made courses:
For this particular use case, an immediate question to start with could be, what is the scale that our service needs to operate at? What are the monthly active users and what are the number of requests/second our service can expect? This will help us decide on the design aspects like server specification, type of database to use, length of the shortened URL that our service has to create and also calculate the data estimates.
We will start with a simple approach first and evolve our design incrementally as we move along. First, let us think of what components we need in order to build a service like this. Users send request to our service from a mobile or desktop browser, our service should have the logic/code to convert this long URL to short URL. We will get into the details of how we can build this logic later in our discussion, for now let us consider that we need some kind of logic and this logic has to be hosted somewhere on a server for our users to reach us. Once the request reaches the server, the code on the server runs to generate the short URL and this will be sent back to the user as a response.
We would also need to store this short URL to long URL mapping data in a database so that the next time when the user clicks on the short URL our service redirects the user to the original URL destination. We will see what type of database to use for our service later in our discussion, for now lets just assume we need some kind of a database to store this data.
With the above design, now when the user clicks on the short URL, the request reaches the server, the server then connects to the database to get the short to long URL mapping and redirects the request to long URLs destination address and there you go, we have our URL shortner service!
As you can see, the above approach is simple, we just need to have a sever and a database to make this work. However, this approach of a single server hosting the shortening logic and a database to store the mapping works fine only for a small system that has to serve less number of requests, but it wont scale. If our service becomes really popular and if we start getting thousands of request per second, this design wont work, lets see why.
As we start getting thousands of requests per second, a single server instance may not be able to handle all the load. The increased load may cause the server to crash resulting in downtime in our service and bad experience to users of our service. We can overcome this problem either by scaling up (vertical scaling) or scaling out (horizontal scaling).
In vertical scaling we use a bigger machine having more memory and CPU to cope with the increased load. As the numbers of requests grow, we add more memory and CPU to the existing machine (server instance) to meet the demand. However this approach has a limitation, we can keep adding more memory and CPU only up to a certain extent because of hardware the limits.
Another approach to scale the system is using horizontal scaling. In horizontal scaling as the load increases we add more machines as opposed to using bigger machines. This approach works really well for large systems serving huge number of requests.
To optimize this further, we could use a hybrid model having a mix of horizontal and vertical scaling approaches, i.e. we keep adding more machines to meet the demand (horizontal scaling) and each of these machines is a big machine with adequate CPU and memory depending on our cost estimates.
With all of these ideas in place our system would look as shown below:
You might be wondering looking at the above diagram, when there are n server instances, how does the system know which server has to handle which request. This is a valid point, we just cannot have multiple servers and expose them as end points to users. We need to have an intermediate component which interprets the requests and re-directs them to specific server instance using some of kind of a logic. This job is done by the load balancer. Load balancer as the name suggests, balances the load by distributing the requests across our servers. There are various kinds of load balancers, each type having its own logic on how to distribute the load, but for our use case lets keep this simple by assuming the load balancer redirects the requests depending on which server is free or available to process the request. The load balancer also acts as a single point of contact for all our users, they don't have to know the individual server IP addresses of our sever instances. All the user requests land on the load balancer and the load balancer is responsible for re-routing these requests to a specific server instance.
So far our system looks as shown below:
We can optimize this design further by adding a caching layer to our service. With the current design our servers have to talk to the database every time the user clicks on the short URL, in order to retrieve the short URL to long URL mapping. Database calls can be slow and expensive as compared to fetching the data from a cache which is essentially an in-memory storage. We can improve the response time of our APIs by caching frequently accessed short URLs so that when we get a request for a short URL our servers will first check to see if the data is available in cache, if yes it retrieves the data from cache, otherwise it fetches it from database.
Apart faster reads, cache can also help reduce the load on our database. This is because data available in cache can be served from the cache itself without having to reach to the database, this in turn reduces the number of calls that we make to the database, thereby reducing the load on our database.
There various caching solutions available in the market. Few popular ones include Redis, Memcached, Varnish etc. We can pick any one them for our use case.
If you want to learn more about caching, when to use a cache and various caching strategies, we highly encourage you to go through this article on caching.
Short URL Logic
We now have most of the components in place to build our system, but we have still not discussed the most important aspect. How do we convert the given long URL to a short URL?
Short URL Composition
Before understanding how to convert a long URL to short URL, let us go a level above and discuss how the shortened URL should look like. Most popular URL shortening services have two parts to a shortened URL. The first part is the domain name of the service and the second part is a set of random characters.
It is important we ensure that the domain name of our service is crisp, yet meaningful. For the purpose of this example, let us assume the domain name of our service is urls.com (short form for URL shortner). The second part of the short URL is a string formed by a set of random characters. This random string should be unique for each short URL created for a long URL.
Random String Length
What should be the length of the randomly generated string? Length of the random string should be such that it is not so long that it defeats the purpose of having a shortened URL, nor too small either, otherwise we would run out of combinations after a certain point.
We can generate random characters using algorithms like base62 or md5. Both base62 and MD5 algorithm outputs consist of 62 character combinations (a-z A-Z 0-9). Lets say we decide to go with random character string of length 7, we would have 62^7 ~ 3.5 trillion combinations to work with. This looks sufficient because even if our system gets 1000 requests per second, it would take 110 years to exhaust these many combinations. So if we combine both the parts (domain name+random string) the total length of the short URL that our service creates would be equal to 15 characters, , 8 characters for the first part, 7 characters for the second part, plus an additional 1 character for '/' in between, which seems acceptable (Ex: urls.com/3NVyw2j).
How to generate random string?
Now lets see which algorithm to choose for generating the random string. There are various hashing or encoding algorithms which we can be used for this purpose. Base62 and MD5 are the popular ones. Both base62 and MD5 produces output consisting of 62 characters combinations (a-z, A-Z and 0-9). Base62 takes a integer input and MD5 takes string as the input and generates a random string output. For base62 algorithm, we need to convert our long URL string into an integer first and then pass it to the algorithm and for MD5 algorithm we can directly pass the long URL as input. Most languages have ready to use implementation (core/third party libraries) for both the algorithms.
Now lets us analyze each of these algorithms and see which one would fit better for our use case.
Using MD5 Algorithm
MD5 algorithm takes a string as input and produces a fixed length output. For each input string MD5 algorithm mostly produces unique output, there is still a chance of it producing the same output for two different inputs, but this is very rare. But the thing with MD5 algorithm is that it produces a really long output. This is a problem because we have a restriction of 7 characters for our random string. We can solve this problem by taking only a part of the output produced by MD5 algorithm, say for instance we consider only the first 7 characters of the MD5 output. But there is a problem with this approach as well. The chances of collision increases if we consider only the first 7 characters of the MD5 output, meaning for different long URLs we might end up with the same 7 characters as output which is not what we want. For each long URL we always need to have a unique set of 7 characters in the short URL or it will result in data corruption.
Lets see if we can do this more efficiently using base62 algorithm.
Using Base62 Algorithm
Base62 also produces an output consisting of a combination of same 62 letters as md5. Base62 takes integer type as input. So we need to first convert the long URL to a random number then pass this random number to the base62 algorithm.
Can we use the output of base62 as is and store it into the database? The answer is no because, if you notice we first convert the long URL to random number and then pass it to base62 algorithm. So there is a possibility that we end up getting the same random number for different long URLs and therefore if we store the output of base62 algorithm directly into the database it can lead to data corruption.
This means, each time we generate random characters using base62, we first need to check if that data is already present in database, if yes we regenerate the random number and again pass it to base62 algorithm. We repeat this until we obtain a unique random string and then store it in the database. Now this is clearly not an efficient approach.
Also apart from being inefficient, the above approach works if our service is just hosted on a single server instance. If we have multiple servers running to meet the demand, this approach may not always work. Imagine having two replicas of the server running, and a request comes into to each of these servers simultaneously, assume the base62 generates the same random string for both the long URLs. Now when each of these servers try to save this data into the database which can lead to data corruption.
This can be solved by using constraints in DB query like insert if not present, and if this query fails regenerate the random string, but again we are adding to the inefficiency and complexity.
How can we make this work efficiently? If you look at the problem at its core, the reason why we end up with duplicate random string is because we pass duplicate random number to base62 algorithm. If we can somehow pass a unique random number to base62 every time which can avoid this problem completely.
And this brings us to our final approach which is the counter based approach.
Counter Based Approach
The idea is to make use of a counter instead of generating the random number each time. When the server gets a request to convert a long URL to short URL, it first talks to the counter to get a count value, this value is then passed to the base62 algorithm to generate random string. Making use of a counter to get the integer value ensures that the number we get will always be unique by default because after every request the counter increments its value.
For example, lets assume our counter starts from 1. Upon getting the first request to convert a long URL to short URL, our service will ask counter to provide a unique number, counter returns 1 to our service, it then increments its current value. Next time our service requests the counter for a unique number, it returns 2, for the 3rd request it returns 3 and so on.
As you can seen this is a much efficient compared to our previous approaches where we had to check if the random string obtained is unique or duplicate each time before we could write it to our database.
Even with the counter based approach we have to take care of few things. For instance, the here counter is a single point of failure, if the counter goes down our entire service stops working. What if the counter runs out of numbers? Also if our service has multiple server instance running to manage the load and if two or more servers request the counter for number at the same time, will the counter be able to handle this by properly incrementing and send them unique numbers?
If we plan to implement the counter ourselves, we would have to handle all these complexities ourselves and for that reason we would be making use of a third party service called Apache Zookeeper which takes care of all these complexities for us so that we just have to focus on the core business logic.
Let us briefly understand Zookeeper. Zookeeper is an open-source distributed service which can be used for maintaining configurations, managing co-ordination between services in a cluster, distributed synchronization, naming nodes in a cluster and many other things. In addition to the above functionalities zookeeper also provides a shared counter. The counter provided by zookeeper contains a range of values within it. For each new server created in our service, zookeeper assigns one of these ranges.
If a server runs out of counter range assigned to it, zookeeper assigns it a new range. Zookeeper is a distributed service, if the counter node fails, zookeeper automatically spins up a new counter instance. Also as we get more users using our service, we can add as many servers as we want to handle this load, zookeeper takes care of assigning and maintaining the counter values for all these servers.
What database would be ideal for our service? Should it be SQL or No-SQL? In order to be able to make this choice, we need to understand the type of data that we need to store in our database and also the scale.
Coming to the type of data, we only need to store the short URL to long URL mapping. When the user gives us the short URL, we just need to get the long URL using the 7 character random string in the short URL. We do not need to perform any relational queries or joins to store or get this data. So we do not need a SQL database. As far as No-SQL databases are concerned, there are various types of No-SQL databases like document database, graph database, column database, key-value database etc. As we can clearly see, our service uses a simple key value mapping. Therefore we can go with a No-SQL key-value pair database. Some of the popular ones are Amazon DynamoDB, Oracle NoSQL Database, InfinityDB, Aerospike etc. As far as the scale is concerned, most of these No-SQL solutions scale well with increasing data.
It is a good practice to keep the number of active server instances in our system elastic. During peak hours we need to have more server instances running to cope with the increased load, other times we reduce this count to a preset value.
Having this flexibility with the number of server instances is important for various reasons:
Makes sure our system does not go down when the number of requests spike during peak hours.
On cloud we are mostly billed per hour or on demand basis. So, if deploy our service on cloud, running extra server instances, more than what is needed, can have an impact on cost.
If we do not run our service on cloud, we will have to bare the cost of having extra reserved servers upfront, to ensure our service is always available.
Most popular cloud solutions provide the auto scaling feature. If we deploy our service to a popular cloud provider like AWS we can easily leverage its auto scaling capability. Once integrated, it will automatically increase or decrease the number of active server instances depending on the number of requests we get.
Now putting together everything we have discussed so far together, our URL shortner service design looks as shown below:
Lets summarize what we have studied so far and how the final design works.
URL Shortening Request
When our service gets a request for converting a long URL to short URL, for the sake of simplicity lets assume the long URL in request is www.google.com. When this request reaches our service it first hits the load balancer. Load balancer scans through all the available server instances, checks which server is free to take the incoming request and forwards the request to the appropriate server instance.
The server then performs the following steps:
Contacts zookeeper to get the counter value and passes the obtained counter value as input to the base62 algorithm.
Base62 returns a unique random string which is then appended to create a short URL. For the purpose of this example let us assume the random string obtained is 3NVyw2j, so the short URL will be urls.com/3NVyw2j.
This long URL to short URL mapping is stored in the database.
User Clicks On The Short URL
When this short URL is clicked or pasted in browser, the request will again hit the load balancer which forwards the request to an appropriate server instance.
The server now does the following steps:
Extract the random string from short URL (3NVyw2j in our case).
Using this random string, retrieve the corresponding long URL from database.
Redirect the request to the original long URL destination address.
Save the short URL to long URL mapping in cache so that the next time the user requests this short URL we can quickly get it from cache instead of querying the database.
And finally lets analyze if our design accomplishes all the functional and non-functional requirements. Our design covers both the points mentioned as part of the functional requirement.
As far as the non-functional requirements are concerned, our design ensures that the system is scalable, we have seen throughout our design discussion how the system can increase the number of server instances when there is an increased load and how the load balancers does the job of balancing the load among the available server instances. Also our choice of database is made keeping scalability in mind.
Is our system highly available? Yes, if one of our server instances goes down, the load balancer re-directs the request to one of the other available ones. Also our auto scaling policy ensures a new server instance is created to replace the failed instance. The counter that we used is a distributed counter and the zookeeper ensures it is always available and not a single point of failure.
The system is performant. The algorithm for long URL to short URL conversion is an efficient one. We have also made use of caching, to cache the most frequently requested short URLs which also helps improve the performance.
And for the third non-functional requirement, since we use a counter value as input to the base62 algorithm we are guaranteed to get a random short URL each time. And with this we have covered all the design objectives.
That is all for this article, thank you for taking your time to read this. If you have any questions or doubts, please let us know in the comments section below, we will be happy to answer you.
If you found this article useful, do not forget to subscribe to our website, your support motivates us to bring out more such articles in future (scroll down to the bottom of the page to find the subscription form).
Code Recipe Limited Time Offer: Get 100% discount on Code Recipe Membership Plan. Join now and get exclusive access to premium content for free. Hurry! Offer ends soon - Join now.