In our previous blog post, we looked at different (free and paid) solutions to detect the use of anonymity tools during attacks executed on our Remote Desktop Protocol (RDP) honeypots. Because those results were inconclusive, this blog post evaluates the different proxy detection tools by comparing their results against our dataset of truth.
In the initial phase of the project (published previously), five different tools (ip-api.com, ipapi.is, ipqualityscore.com, Virus Total and Neustar) were employed to identify proxies. Within the scope of this project, the word “proxy” encompasses a spectrum of tools used to hide IP addresses, which includes basic Virtual Private Networks (VPNs), data centers, anonymity networks, and residential proxies. The tools tested varied considerably in the number of IP addresses they flagged. Two reasons might explain these discrepancies. First, the databases behind the tools may be outdated. Second, an IP address that is not flagged is not necessarily a non-proxy; the tool may simply lack the information needed for a conclusive determination. To determine the most accurate proxy detection tool, we compare their results with information that we know to be true.
The Database of Truth
During login attempts, RDP records the timestamp, the credentials tried by the attacker, and their IP address. However, once the attacker is connected to our RDP honeypot, the client sends information about the real location of the computer, regardless of the information given by the IP address. Specifically, RDP fetches the client's internal IP address as well as its time zone. By comparing those two artifacts with the external IP address and its location, it is possible to detect whether the internal IP address has been hidden, in other words, whether a proxy has been used. The diagrams in Figure 1 below illustrate how the internal and external information is compared to identify the proxies, resulting in a dataset that will be used as a point of reference to verify the APIs’ responses.
Figure 1. Process to detect the proxy IP addresses based on the internal and external information
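The comparison can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the exact decision rules are those of Figure 1, and the sketch covers only the time zone half of the comparison. The `Session` class and its field names are hypothetical, and the offsets are assumed to be already available as minutes from UTC.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """One successful login (hypothetical record layout)."""
    external_ip: str
    # UTC offset, in minutes, of the location the external IP geolocates to.
    external_utc_offset_min: Optional[int]
    # UTC offset, in minutes, reported by the RDP client once connected.
    client_utc_offset_min: Optional[int]


def is_proxy(session: Session) -> Optional[bool]:
    """Return True when the client's time zone disagrees with the external IP's.

    Returns None when either side is missing, since no conclusion can be drawn.
    """
    if session.external_utc_offset_min is None or session.client_utc_offset_min is None:
        return None
    return session.external_utc_offset_min != session.client_utc_offset_min


# Example with documentation-range addresses (not real attackers).
print(is_proxy(Session("203.0.113.5", 60, 480)))    # offsets differ
print(is_proxy(Session("198.51.100.7", 60, 60)))    # offsets match
```

Returning `None` for incomplete records, rather than `False`, keeps "no proxy" separate from "not enough information", a distinction that matters again when the APIs' answers are interpreted.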
The dataset of truth is therefore composed of all successful logins to our honeypot. The honeypots have been running since 2019 and have received 57 million login attempts. However, only 1,774 of those attempts were successful, and many of them came from the same attacker connecting more than once. After narrowing this number to unique IP addresses and to attacks with complete information about the attacker's location, the resulting database consists of 253 IP addresses.
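As a rough illustration of that narrowing step, the sketch below keeps one record per external IP address and drops records whose client-side information is incomplete. The dictionary keys (`external_ip`, `client_ip`, `client_timezone`) are hypothetical names chosen for the example, not the honeypot's actual schema.

```python
def build_reference_set(logins):
    """Keep the first complete record seen for each external IP address."""
    unique = {}
    for record in logins:
        # Skip sessions lacking the client-side artifacts needed for comparison.
        if record.get("client_ip") is None or record.get("client_timezone") is None:
            continue
        # setdefault keeps the first record and ignores repeat connections.
        unique.setdefault(record["external_ip"], record)
    return list(unique.values())


# Tiny made-up sample: one repeat connection and one incomplete record.
sample = [
    {"external_ip": "203.0.113.5", "client_ip": "10.0.0.2", "client_timezone": "UTC+8"},
    {"external_ip": "203.0.113.5", "client_ip": "10.0.0.2", "client_timezone": "UTC+8"},
    {"external_ip": "198.51.100.7", "client_ip": None, "client_timezone": "UTC+1"},
]
print(len(build_reference_set(sample)))
```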
Analyzing the Results
The results of running the dataset through the APIs are shown in Graph 1. In reality, 154 IP addresses were proxies, representing just over 60% of the tested IP addresses. The rest of the results are quite similar to the first part of the study, with Virus Total flagging the fewest IP addresses and ipqualityscore.com detecting proxies seven times more often. The other APIs stand in the middle, ranging from 40 to 63 IP addresses flagged.
Graph 1. Number of proxies identified by each tool compared to the reference dataset
However, the number of flagged items does not attest to the veracity of the information. Are the tools flagging the right IP addresses as proxies? A deeper analysis is needed. Graph 2 shows the true and false positives, along with the true and false negatives. As previously mentioned, the classification of an IP address as not being a proxy does not necessarily confirm its non-proxy status; instead, it may indicate a lack of sufficient information to reach a definitive conclusion. In other words, a designation of “0” signifies either “no proxy” or an indeterminate status (N/A). It is therefore more informative to focus only on the true positives (light blue bars) and false positives (yellow bars), that is, on the share of flagged IP addresses that were in fact proxies. By this measure, ip-api.com is the least accurate, as only 75.93% of the IP addresses it flagged as proxies were in fact proxies. The two other free tools, ipapi.is and ipqualityscore.com, both stand a bit higher, with 81.25% and 78.33% respectively of IP addresses flagged correctly. Finally, the tools with the highest accuracy are also the costliest: Neustar (92.5%) and Virus Total (94.12%). It can also be noted that they flagged the fewest IP addresses in comparison with the other APIs.
Graph 2. Representation of the True and False Positives and Negatives of the five tools
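The percentages above correspond to what is usually called precision: true positives divided by all positives. The Python sketch below shows how such figures could be derived from a reference dataset and one tool's answers. The function names and the sample data are made up for illustration; the post does not publish the underlying counts.

```python
def confusion_counts(truth, flagged):
    """Count TP, FP, FN, TN for one tool.

    truth:   dict mapping IP -> True if the reference dataset says proxy.
    flagged: dict mapping IP -> True if the tool flagged it; a missing IP
             is treated as not flagged ("0", i.e. no proxy or unknown).
    """
    tp = fp = fn = tn = 0
    for ip, is_proxy in truth.items():
        predicted = flagged.get(ip, False)
        if predicted and is_proxy:
            tp += 1
        elif predicted and not is_proxy:
            fp += 1
        elif not predicted and is_proxy:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Share of flagged IPs that really were proxies (NaN if nothing was flagged)."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Made-up example: four reference IPs, of which the tool flags two.
truth = {"a": True, "b": True, "c": False, "d": False}
flagged = {"a": True, "c": True}
tp, fp, fn, tn = confusion_counts(truth, flagged)
print(tp, fp, fn, tn, precision(tp, fp))  # 1 1 1 1 0.5
```

Precision deliberately ignores the negatives, which fits the reasoning above: since a “0” can mean either “no proxy” or “unknown”, the false negative count says little about a tool's correctness.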
Other features
After evaluating the proxy detection function of the tools, let us explore their other features. For instance, one of Virus Total’s key functions is assessing the reputation of IP addresses. In our test, we found that 12%, or 30 of the IP addresses, were labeled as “harmless.” However, it is important to note that all the tested IP addresses had attempted to log into the honeypots at least once, so they are far from harmless.
Graph 3. Virus Total maliciousness status of attackers’ database
Another example of the inaccuracy of such tools is ipqualityscore’s abuser score. The API provides a field for the abuser score of the Autonomous System Number (ASN, which identifies a network or group of IP addresses operated by a single organization); the scores are shown in Graph 4. A surprising number of IP addresses fell into the ‘very low’ category, which further demonstrates the inaccuracy of some tools. For comparison, over a period of 3 months of activity on our honeypots, the average number of attacks per IP address is over 2,200. Yet, the tool classifies the majority of these IP addresses as non-abusive.
For both previous examples, it is imperative to understand how the features work: they rely on user submissions. Each user account can report an IP address as abusive. Depending on the reputation of the account and the number of reports, the website decides whether to categorize the IP address as abusive, malicious, suspicious, etc. To unflag an IP address in those cases, the owner of the IP address must make a request to the service, which may take some processing time, so it is often a cumbersome and time-consuming process to resolve.
Graph 4. Diagram of the abuser scores
Conclusion
Access to reliable sources of information about proxy detection seems to be more complex than anticipated, as results vary widely from tool to tool. Several reasons may explain this phenomenon. The first is that the tools default to not-a-proxy, or non-malicious, much like a presumption of innocence. The second is that, as hard as it is to find proxies in the wild, it must be even harder to un-flag an item, which could explain the numerous false positives observed. Consequently, perhaps the most rigorous method to identify proxies is to use a “middle-aged” API: not too recent for accuracy but not too polluted by years of data.
Proxy detection is complex due to the breadth and constantly changing nature of the internet. Also, keep in mind that the present research did not assess all the tools on the market. Hence, defining proper use cases and remaining aware of the risks of inaccuracy is imperative.
For research purposes, the analysis in this blog post helped us determine the optimal balance for selecting the most reliable source of information. One API flagged a high number of proxies, aligning closely with the expected number of proxies. Therefore, considering the API with the highest number of flagged proxies (ipqualityscore.com), as well as the ones with the most accurate information (Virus Total and Neustar), we are able to provide a reliable basis for analyzing the behavior of attackers who use proxies versus those who do not. We will use this approach in the next blog post to analyze the behavior of attackers in relation to their proxy status.
We would like to thank Dr. Andréanne Bergeron for the supervision of this research project and for further writing and reviewing of this blog post.
Author: Constance Prevot