A brief overview of how bots interact with websites (and how you can too)

The two main methods people use to interact with a website through code.

João Ramiro
4 min read · Feb 7, 2021

Nowadays, every site that is more than a simple landing page needs to fetch information from its servers. It does this through an API (Application Programming Interface). For a website, an API is essentially a set of known URLs that the site calls to retrieve a specific piece of information or trigger a certain action.

Take Instagram, for instance: to fetch the latest stories, they probably have a URL such as api.instagram.com/getStories, to which the app sends a message identifying the user who wants to load stories.
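As a rough illustration, such a message might look like the sketch below. The URL is the guess from the paragraph above; the `user_id` field, its value and the use of POST are made up for this example and do not describe Instagram's real API.

```python
import json

# Hypothetical endpoint taken from the text; not a documented Instagram URL.
STORIES_URL = "https://api.instagram.com/getStories"


def build_stories_message(user_id: str) -> str:
    """Serialize the (assumed) request body that identifies the user."""
    return json.dumps({"user_id": user_id})


# Only builds and prints the message; nothing is sent over the network.
body = build_stories_message("joao")
print(f"POST {STORIES_URL}")
print(body)
```

The server would read the user from the body and answer with that user's stories, which the page then renders.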

Some of these sites open their APIs to the public so that developers like you and me can build on them. (Twitter lets you post tweets and a panoply of other things via its API.)

This sounds like a good thing, so why doesn’t every website have a public API? It seems like it would be mutually beneficial, right? Well, there are two main reasons:

  1. The website is very simple and an API would not be of much use — why would your local coffee shop make an API available?
  2. The website wants to hide those types of features from people with evil intent and protect itself against bots.

However, neither of these two is actually an impediment to a computer program interacting with these websites programmatically.

If you think about it, when you interact with a site or an application, say Instagram, all you are doing is clicking on a certain button or scrolling to a certain part, and that can easily be simulated by a bot through code. Furthermore, when you click on a button or some other element, all Instagram is doing is running some code in the background that in turn makes requests to their API to fetch some information, publish your post, etc. A robot can certainly do that as well!

Here is how these two methods are actually implemented:

1. Simulating user actions

One of the most popular tools for browser automation, a process in which a bot simulates user actions, is Selenium. You can use it to simulate clicks, enter text, scroll, read information from the page and much, much more. Let's see how that can be achieved in Python.

What such a script does is simply:

  1. Opening a browser
  2. Clicking a button that has the id seeMore.
  3. Reading the information after it is loaded onto the page.

2. Simulating Requests

As stated before, most actions you perform on a site trigger a request to be sent. Using your favorite browser's developer tools (the Network tab), you can easily see every request that is being sent. All there is to do after that is replicate the request you wish to simulate.

If the code in the previous section triggered a request to example.com/api with the body {'payload':'GetInfo'}, using the requests package in Python would achieve the same outcome.

Here’s what is happening:

  1. A POST request is sent to the website’s servers.
  2. After the request is validated as successful, the data it retrieved can be read.
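A minimal sketch of those two steps with the requests package is shown below. The URL and payload come from the text; sending the body as JSON, the `https` scheme and the JSON response are assumptions, since the real format depends on what the browser's network tab shows for the site in question.

```python
import requests

API_URL = "https://example.com/api"  # endpoint from the text; https is assumed
PAYLOAD = {"payload": "GetInfo"}     # body from the text


def build_request(url, payload):
    """Prepare the POST request without sending it, so it can be inspected."""
    return requests.Request("POST", url, json=payload).prepare()


def fetch_info(url, payload, timeout=10):
    """Send the POST request and return the decoded response."""
    response = requests.post(url, json=payload, timeout=timeout)
    response.raise_for_status()  # raise if the server answered with a 4xx/5xx status
    return response.json()       # assumes the API replies with JSON


# Inspect what would be sent; nothing goes over the network here.
prepared = build_request(API_URL, PAYLOAD)
print(prepared.method, prepared.url)
print(prepared.body)
```

Calling `fetch_info(API_URL, PAYLOAD)` performs the actual request. Sites often expect the same headers or cookies the browser sent, so those may need to be copied from the observed request as well.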

Which Method Is Better?

Which method is best really depends on the site and on what you are trying to achieve. If you want to read information from requests that the website makes when you perform a certain action, the Requests route will likely be the recommended one. However, this method may not always be suitable. For example, when you click a button that shows and hides some information on the page, and all the site is doing is running some JavaScript to show and hide it, no requests are actually performed, so the Selenium route would be the suitable one.

Conclusion

Both bots and knowledgeable developers (a group you are now part of 😉) can make code access a website in ways that were not originally intended. When you think about it, whether it is simulating a user performing actions or simulating how a key makes a lock rotate, if you are knowledgeable enough you can always find some unintended way to do it, whether you are a developer or the LockPickingLawyer.

Did you like this article? Maybe you’ll like:

  1. How I made an instagram bot that publishes a post every day
