By Panda-Admin | Nov. 15, 2023

Inside a Torrents Tracker bot

In this post I will briefly describe what the Torrents Tracker bot is made of: which services, libraries, and solutions are used.

Intro

Below is a sample diagram of how the bot's back end is organized.

Python 3.11 is the main programming language. Some libraries require a different version: libtorrent, for example, has to run on Python 3.10, because that is the latest Python version this library supports.

Let's go step by step, through all the technologies used.

To work with Telegram, the bot uses the callback (webhook) method: when the bot service starts, it registers its address on Telegram's server, and Telegram then sends requests to that address. As the address we use our domain, red-panda-dev.xyz, with a subdomain for easy access.
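The registration step can be sketched with the standard library alone. This is a minimal sketch, not the bot's actual code: the token, the `bot.example.com` address, and the secret are placeholders, and the request is only built, not sent. `setWebhook` and its `url` and `secret_token` parameters are part of the Telegram Bot API; the helper name `build_set_webhook_request` is made up for illustration.

```python
import json
import urllib.request

TELEGRAM_API = "https://api.telegram.org"


def build_set_webhook_request(token: str, webhook_url: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) a Bot API setWebhook call."""
    payload = {"url": webhook_url, "secret_token": secret}
    return urllib.request.Request(
        url=f"{TELEGRAM_API}/bot{token}/setWebhook",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; a real deployment would read them from configuration.
request = build_set_webhook_request(
    token="123456:TEST-TOKEN",
    webhook_url="https://bot.example.com/webhook",
    secret="hypothetical-secret",
)
# Passing `request` to urllib.request.urlopen would perform the registration.
```

Once registered with a `secret_token`, Telegram includes that value in the `X-Telegram-Bot-Api-Secret-Token` header of every update it posts, which gives the receiving side something to verify.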

When a customer starts the bot or executes any command, Telegram sends a POST request describing the customer's action to the previously registered address. This request first reaches the reverse proxy, for which we use Nginx.

Based on its configuration, Nginx forwards the request to the API service. For the API service we use Flask, a good, time-tested micro-framework. The API service verifies the request, serializes it, and passes it to the Telegram bot's task queue.
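The verify, serialize, and enqueue steps can be sketched independently of Flask. This is an illustration under assumptions, not the service's real code: it assumes verification means checking the `X-Telegram-Bot-Api-Secret-Token` header that Telegram sends when a webhook is registered with a secret, and it uses an in-process `queue.Queue` as a stand-in for the bot's task queue. `WEBHOOK_SECRET`, `bot_tasks`, and `handle_webhook` are hypothetical names.

```python
import hmac
import json
import queue

# Hypothetical shared secret, the same value that was given to setWebhook.
WEBHOOK_SECRET = "hypothetical-secret"

# In-process stand-in for the bot's task queue.
bot_tasks: queue.Queue = queue.Queue()


def handle_webhook(headers: dict, body: bytes) -> int:
    """Verify and enqueue one Telegram update; return an HTTP status code."""
    supplied = headers.get("X-Telegram-Bot-Api-Secret-Token", "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(supplied.encode(), WEBHOOK_SECRET.encode()):
        return 403
    try:
        update = json.loads(body)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return 400
    # Every Telegram update object carries an update_id.
    if not isinstance(update, dict) or "update_id" not in update:
        return 400
    bot_tasks.put(update)
    return 200
```

In a Flask view the same function could be fed from `request.headers` and `request.get_data()`; note that the plain `dict` used here is case-sensitive, unlike a framework's header mapping.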

To work with the Telegram Bot API we use the aiogram library. It makes working with the bot convenient, stores the state of customer dialogs, and is actively supported by the community. The bot logic receives the processed message, finds the matching handler for the command, and executes it. If necessary, it accesses the database or creates a task, for example to process a link to a torrent file.

Once a customer starts working with the bot, information about the customer is added to our PostgreSQL database. The database stores the customer ID so that the bot can initiate messages to the customer and so that torrent files can be bound to a specific customer.

Various additional data is stored in a cache, for which Redis is used. In particular, it holds the state of the customer dialog and the customer's last actions with the bot.

Pending tasks are handled by serverless functions. The main server sends tasks to a queue, and the functions receive them from that queue. SQS is used as the task queue.
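A function that consumes such a queue might look like the sketch below. It assumes the functions are AWS Lambda-style handlers triggered by SQS, where each invocation receives a batch of messages under `Records` and the message text sits in `body`. The task types and the `process_task` dispatcher are hypothetical and are not taken from the bot's code.

```python
import json


def process_task(task: dict) -> str:
    """Hypothetical dispatcher; the task types are illustrative only."""
    if task.get("type") == "parse_page":
        return f"parse {task['url']}"
    if task.get("type") == "explore_torrent":
        return f"explore {task['magnet']}"
    raise ValueError(f"unknown task type: {task.get('type')!r}")


def handler(event: dict, context: object = None) -> list:
    """Entry point: handle every queue message delivered in one invocation."""
    results = []
    for record in event.get("Records", []):
        task = json.loads(record["body"])  # the producer sent JSON text
        results.append(process_task(task))
    return results


# A hand-written event in the shape an SQS trigger delivers.
sample_event = {
    "Records": [
        {"body": json.dumps({"type": "parse_page", "url": "https://example.com/t/1"})},
        {"body": json.dumps({"type": "explore_torrent", "magnet": "magnet:?xt=urn:btih:abc"})},
    ]
}
results = handler(sample_event)
```

Raising on an unknown task type makes the invocation fail, so the queue can redeliver the message or move it to a dead-letter queue, depending on how the queue is configured.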

Some tasks run on a schedule, for example checking the status of proxies ("Proxies checker" on the diagram). The services used to check torrent status are accessed through SOCKS5 proxies. Periodically a task takes a pool of proxies and checks their status by opening, through each proxy, the various websites our service works with. If a site does not open, the proxy's "rating" in the database decreases; if the site opens normally, the "rating" increases. Proxies with a low "rating" are periodically deleted. Replenishing the proxy list is currently done in a semi-manual mode.
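The rating logic can be sketched as follows. The step of one point per site, the removal threshold `MIN_RATING`, and the in-memory `Proxy` objects are all assumptions made for illustration; the real service keeps ratings in the database, and the actual check (opening a site through a SOCKS5 proxy) is passed in here as the `opens` callable rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Proxy:
    address: str
    rating: int = 0


MIN_RATING = -3  # assumed threshold; proxies at or below it are dropped


def run_check(proxies: Iterable[Proxy], sites: Iterable[str],
              opens: Callable[[Proxy, str], bool]) -> None:
    """Raise or lower each proxy's rating once per checked site."""
    sites = list(sites)
    for proxy in proxies:
        for site in sites:
            proxy.rating += 1 if opens(proxy, site) else -1


def prune(proxies: Iterable[Proxy]) -> List[Proxy]:
    """Keep only proxies whose rating is still above the threshold."""
    return [p for p in proxies if p.rating > MIN_RATING]


# Simulated run: one proxy always works, the other never does.
pool = [Proxy("good"), Proxy("bad")]
for _ in range(2):
    run_check(pool, ["https://example.com", "https://example.org"],
              opens=lambda proxy, site: proxy.address == "good")
pool = prune(pool)
```

Keeping the check itself behind a callable means the rating rules can be tested without any network access.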

Torrents are also checked periodically on a schedule by the "Page parser" function. It parses pages with torrents and searches for Magnet links or torrent files.

If a Magnet link is found on the page, a message is sent to the "Torrent explorer" function, which connects to the torrent network and retrieves fresh data about the torrent.
If there is only a torrent file on the page, the function downloads it to S3 storage and also creates a task for "Torrent explorer" to retrieve data about this file from the torrent network.
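The parsing and branching described above can be sketched with the standard library. This is an illustration under assumptions: it only looks at `href` attributes of `<a>` tags, it treats any link whose path ends in `.torrent` as a torrent file, and the task dictionaries returned by `plan_task` are hypothetical. The S3 upload itself is left out.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_torrent_sources(html: str, page_url: str) -> dict:
    """Split a page's links into magnet links and torrent-file URLs."""
    collector = LinkCollector()
    collector.feed(html)
    magnets = [h for h in collector.hrefs if h.startswith("magnet:?")]
    files = [
        urljoin(page_url, h)  # resolve relative links against the page
        for h in collector.hrefs
        if urlparse(h).path.lower().endswith(".torrent")
    ]
    return {"magnets": magnets, "torrent_files": files}


def plan_task(sources: dict) -> Optional[dict]:
    """Prefer a magnet link; fall back to a torrent file if that is all there is."""
    if sources["magnets"]:
        return {"type": "explore_torrent", "magnet": sources["magnets"][0]}
    if sources["torrent_files"]:
        return {"type": "download_torrent_file", "url": sources["torrent_files"][0]}
    return None
```

Real tracker pages vary a lot, so a production parser would likely need site-specific rules on top of this generic link scan.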

After successful retrieval of torrent data from the network, the "Torrent explorer" function sends it to the API service to update the information in the database and send a notification to the user if there are any changes in the torrent.
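Deciding whether a notification is needed comes down to comparing the stored data with the freshly retrieved data. The sketch below shows one simple way to do that; the field names (`name`, `files`) and the message format are hypothetical, since the post does not say which torrent attributes are tracked.

```python
from typing import Optional


def detect_changes(stored: dict, fresh: dict) -> dict:
    """Return {field: (old, new)} for every field whose value changed."""
    return {
        field: (stored.get(field), value)
        for field, value in fresh.items()
        if stored.get(field) != value
    }


def build_notification(name: str, changes: dict) -> Optional[str]:
    """Produce a message text only when something actually changed."""
    if not changes:
        return None
    return f"{name} changed: {', '.join(sorted(changes))}"


stored = {"name": "Show S01", "files": ["e01.mkv"]}
fresh = {"name": "Show S01", "files": ["e01.mkv", "e02.mkv"]}
changes = detect_changes(stored, fresh)
message = build_notification(stored["name"], changes)
```

Returning `None` for an empty change set lets the caller update the database on every run while messaging the user only when there is something new.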

The user's RSS feed is generated automatically, based on successfully processed torrents, information about which is stored in the database.
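Generating such a feed from database rows can be sketched with `xml.etree.ElementTree`. The row fields (`name`, `page_url`, `info_hash`) and the channel description are assumptions for illustration; the element names follow RSS 2.0.

```python
import xml.etree.ElementTree as ET


def build_rss(title: str, link: str, torrents: list) -> str:
    """Render processed torrents as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Tracked torrents"
    for torrent in torrents:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = torrent["name"]
        ET.SubElement(item, "link").text = torrent["page_url"]
        # The hash is an identifier, not a URL, so mark it as non-permalink.
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = torrent["info_hash"]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "My torrents",
    "https://example.com/feed",
    [{"name": "Show S01", "page_url": "https://example.com/t/1", "info_hash": "abc123"}],
)
```

Building the tree with ElementTree rather than string formatting means titles containing `&` or `<` are escaped automatically.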

Fin

Originally there was one more service in the scheme, Celery, but running tasks that are both long (parsing some websites takes up to 2-3 minutes) and RAM-hungry took too many computing resources. Moving this logic into functions saved us from buying more powerful servers that would sit idle most of the time, and it also simplified horizontal scaling. It solved the problem of Python version requirements as well: all services run on the latest stable version, 3.11, with all its advantages, while the "Torrent explorer" function uses 3.10 because of the limitations of the libraries it uses.

Example of serverless functions cost:

Join our bot today and take your torrenting experience to the next level: @torrents_tracker_bot
