To explore how a user’s environment influences password creation strategies, we present a blogpost series that considers several perspectives – the macrosocial influence of your country (where you live), the influence of your peers (who your friends are), and a technical understanding of how passwords are attacked – with the aim of improving password security and mitigating the risk of poorly secured passwords.
This first blogpost of the series shows how various characteristics of countries influence password strength. By analyzing the mean time to crack the 200 most common passwords in each country, we show that (1) the literacy level of the population, (2) voice and accountability, (3) the country’s commitment to cybersecurity, and (4) the level of exposure to data breaches significantly predict password strength.
Password Habits
Passwords constitute the first line of defense for computer-based technologies, but they have been in use for millennia: the Roman military, for example, reportedly used passwords to distinguish allies from enemies. Today, even though more attack-resistant authentication mechanisms exist, passwords remain the most popular authentication technique.
People use passwords for many things over the course of a typical day, often relying on a creation strategy that makes their passwords easy to remember. Past studies have shown that users tend to choose weak passwords, which are easy to remember but, for the same reason, easy for others to guess. Studies also reveal that passwords are often reused, a practice that has been shown to increase the security risk in every context where that individual uses a password.
Researchers have demonstrated that a person’s environment and exposure to the Internet influence their online security behavior. Several macrosocial elements might explain why users perform differently depending on their environment. First, if cybersecurity habits differ between countries, the characteristics of the government might influence users. Second, the characteristics of the population, which relate directly to users, might also explain the difference in performance. Finally, external factors such as cyber-attacks and a country’s level of victimization might be part of the explanation.
Analysis
Each year, the company NordPass releases a list of the 200 most common passwords by country. The list is compiled from the many cybersecurity incidents (data breaches containing users’ passwords) that occurred in 2021. In total, the list draws on 4 terabytes of data and covers 49 countries. The raw dataset is available here.
- The list comprises between 169,656 and 146,837,497 users per country.
- The mean time to crack a password is 578.54 minutes (about 9.6 hours), with a range from 0 to 5,356,080 minutes (5,356,080 minutes = 89,268 hours = 3,719.5 days, or about 10 years and 69.5 days).
- Most of the passwords included in the lists (61%) can be cracked in less than a minute.
Please note that the mean time to crack appears high. The method NordPass uses to estimate the time to crack is unspecified, and some passwords that we judge to be weak (e.g., kallynlavallee) are associated with centuries to crack in their model. We consider this an important limitation of the dataset. That said, the same unspecified method is applied consistently across countries, so we can rely on the metric for our comparative analysis.
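The unit conversions quoted in the list above are easy to re-check. The short sketch below reproduces them; the 365-day year is our assumption (leap days are ignored), and the function name `breakdown` is purely illustrative.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap days are ignored


def breakdown(minutes):
    """Convert minutes to (hours, days, whole years, leftover days)."""
    hours = minutes / MINUTES_PER_HOUR
    days = hours / HOURS_PER_DAY
    years, leftover_days = divmod(days, DAYS_PER_YEAR)
    return hours, days, int(years), leftover_days


# Longest time to crack reported in the dataset, in minutes.
max_hours, max_days, max_years, max_leftover = breakdown(5_356_080)

# Mean time to crack, converted from minutes to hours.
mean_hours = 578.54 / MINUTES_PER_HOUR

print(max_hours, max_days, max_years, max_leftover)
print(round(mean_hours, 1))
```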
Measures
While our analysis focuses on the five measures outlined here, selected because they were the most directly correlated with password strength, a total of 29 different measures were considered when exploring possible models. Password strength is represented by the mean time to crack the passwords in each country’s list of the 200 most common passwords. Several macrosocial variables were then considered to build a model explaining the level of password strength. The 29 measures were tested to avoid ending up with two highly correlated variables in the model. Different models were then tested using combinations of variables from the list, with an emphasis on their relative importance in the literature. The contribution of the selected variables to the model was very stable, and most were chosen because they were strongly correlated with password strength. The five variables chosen for the final model are presented in this section.
Voice & Accountability, one of the six governance indicators defined by the World Bank, reflects perceptions of the extent to which a country’s citizens are able to participate in selecting their government, as well as freedom of expression, freedom of association, and a free media.
The Global Cybersecurity Index (GCI) is a trusted reference that measures the relative commitment of countries to cybersecurity at a global level. It is composed of 25 indicators spread across five pillars – (i) Legal Measures, (ii) Technical Measures, (iii) Organizational Measures, (iv) Capacity Development, and (v) Cooperation – which are then aggregated into an overall score. It is the most comprehensive measure of individual countries’ commitment to cybersecurity, compared with the many other measures frequently published by companies with industry interests.
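As an illustration of what screening candidate measures for high correlation can look like, here is a minimal sketch using NumPy. It is not the procedure used for this post: the 0.8 threshold, the function name `highly_correlated_pairs`, the measure names, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def highly_correlated_pairs(data, names, threshold=0.8):
    """Return (name_a, name_b, r) for column pairs whose |Pearson r| exceeds threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Synthetic stand-in for country-level measures (one row per country).
rng = np.random.default_rng(0)
n_countries = 49
a = rng.normal(size=n_countries)
b = a + rng.normal(scale=0.1, size=n_countries)  # near-duplicate of a
c = rng.normal(size=n_countries)                 # independent of a and b

data = np.column_stack([a, b, c])
names = ["measure_a", "measure_b", "measure_c"]  # hypothetical names

flagged = highly_correlated_pairs(data, names)
print(flagged)
```

When a pair is flagged, only one of the two measures would normally be kept in a candidate model, since both carry nearly the same information.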
The Cyber Exposure Index is based on data collected from publicly available sources in the dark web and deep web and from data breaches. From this data, signs of sensitive disclosures, exposed credentials and hacker-group activity against companies are identified.
The literacy rate measures the percentage of adults in a country who can read and write their common language. A higher literacy rate is an indication of higher standards of education and of the employability of the population in general.
Gross domestic product (GDP) per capita is gross domestic product divided by midyear population. GDP is the sum of gross value added by all resident producers in an economy plus any product taxes and less any subsidies not included in the value of the products. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources.
Analysis
Multiple linear regression (MLR), also known as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. It extends simple linear regression, which uses just one explanatory variable; both are typically estimated by ordinary least squares (OLS).
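To make the technique concrete, the sketch below fits a multiple linear regression by ordinary least squares with NumPy and reports R². It runs on synthetic data, not on the NordPass dataset, and the helper name `fit_ols`, the two predictors, and the coefficients are assumptions chosen for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y = b0 + X @ b by least squares; return (coefficients, R squared)."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residuals = y - X1 @ beta
    ss_res = float(residuals @ residuals)          # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return beta, 1.0 - ss_res / ss_tot


# Synthetic example: 49 observations, two explanatory variables.
rng = np.random.default_rng(42)
X = rng.normal(size=(49, 2))
true_beta = np.array([1.0, 2.0, -0.5])  # intercept, then one slope per variable
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.1, size=49)

beta, r2 = fit_ols(X, y)
print(beta, r2)
```

With real data, each column of `X` would hold one country-level measure and `y` the mean time to crack; a statistics package would also provide the significance tests reported in the next section.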
Results
The overall regression was statistically significant (R² = 0.36, F = 23.46, p < .002). The importance of the impact of each variable on password strength is represented below.
Among these variables, only GDP per capita does not significantly affect password strength (time to crack in seconds), which is surprising given that GDP is often depicted as having a substantial impact on users’ behavior on the Internet. The other four variables were found to be significant predictors of password strength, with exposure to cybersecurity breaches being the most significant.
What Does it Mean for Password Strength?
The most significant finding from the results of the analysis is that a user’s environment, specifically the country in which they reside, influences their password creation strategy.
- More precisely, password strength is higher in countries where citizens participate in selecting their government and enjoy greater freedoms. This might be explained by the fact that democratic countries also have greater access to the Internet. Since the Internet is synonymous with access to information and freedom, undemocratic countries tend to resist widespread Internet use by their citizens. When the Internet is widely accessible, people learn how to use it and supporting structures can be developed. Previous studies have also shown that freedom affects scores on the Cybersecurity Capacity Scale.
- A government’s level of investment in cybersecurity has an impact on the security of the country’s users. The literature shows that the commitment of countries to fighting online crime is economically beneficial. The present analysis goes further, suggesting that this type of investment predicts stronger passwords.
- Literacy is an important aspect to consider in this study because it is directly connected to the use of technologies. To seek, evaluate, and use information found on the Internet, users must navigate largely text-based menus and links and read large volumes of text. The challenges that low-literacy users face when creating and managing passwords are documented, and research indicates that these challenges are more prevalent among them than in the literate population. When a user’s level of cybersecurity knowledge increases, their cybersecurity behaviors improve. However, if users cannot obtain this information because they are unable to read, their security will be negatively impacted. The result of the study is therefore not surprising: when the level of literacy of a population increases, the strength of its passwords increases as well.
- The results show that the number of cybersecurity incidents within a country is positively correlated with password strength – which is to say, the more a country is under attack, the stronger its people’s passwords are. This suggests that people might be more sensitive to the importance of protecting data with strong passwords when they are exposed to more cybersecurity incidents. In such cases, users are aware of the meaning of a data breach, and it influences their behavior and password formation strategies. This demonstrates the resilience of users when they live in a hostile environment.
Based on these results, we can conclude that the macrosocial environment influences password strength. Countries have an impact on the level of protection of their users. A user’s country of origin and/or residence necessarily has an impact on their social identity. This means that our social identity, which is shaped at several levels (e.g., macro and micro), might have an impact on password choice. The next post of the series will go deeper into the influence of social identity by exploring the impact of your network on your passwords.