Every day, millions of companies, software engineers, and average people manage their data through APIs. The Internet of Things (IoT), for example, is responsible for connecting devices, like your phone to your refrigerator. Such technological advancements wouldn’t be possible without APIs. Just look at some of the fields API developers are working right now.
Whether we realize it or not, we have become very dependent on APIs, and they handle sensitive and confidential data about our companies and our personal lives. With so many people depending on them, risk management regarding APIs is becoming an increasingly relevant topic in the software industry. To keep them secure, we must first understand how malicious users could exploit their vulnerabilities for their benefit, then determine how to prevent those actions from happening.
Broken, exposed, or hacked APIs are among the most common causes of security breaches, which can lead to leaks of highly confidential financial, medical, and personal data. However, not all data is the same, so it shouldn’t all be protected in the same way. How you implement API security should depend on the kind of data you’re handling. Security protocols like transport layer security (TLS) and web services (WS) security, are just the tip of the iceberg. Engineers need to adapt to the new frontiers that have emerged in recent years.
APIs support thousands of possible integrations with all kinds of devices and software. Being under pressure to deliver new releases as quickly as possible, responsible and well-intended programmers sometimes rush and make mistakes. In fact, in the introduction to the paper titled Uncovering Assumptions Underlying Secure Authentication and Authorization, researchers at Microsoft Research and the University of Virginia explained that even when developers follow accepted programming procedures, they deliver insecure code. The group tested three sets of apps, including apps in the Windows 8 app store, using multiple social media logins, and found that 67 to 86 percent of the apps had security vulnerabilities that could allow users to steal system credentials.
What is the solution? Rigorously scrutinizing the security of APIs (especially those that are open to the public) and setting aside time to continuously improve processes, based on new technologies and best practices. As we analyze the scenarios presented in the list below, let’s try to think creatively and visualize how a malicious user could take advantage of our API’s features or even render it unusable.
Now, without further ado, let’s see what the most common vulnerabilities of an API are, and how to prevent them.
Let’s say you’re a student at the Seal Bay College, and you just logged on to its website. Now, you click a button labeled “grades.” As you may already know, your grades are only a piece of data (also known as a resource) in the college’s database. Each bit of data has a unique ID so it can be identified and delivered to the correct user, and it’s sometimes visible in the URL once you access that data. Now, going back to the example, the university website requests the data from the server through an API using its ID. So when you receive your grades, the ID in the URL says “/accounts/ id123/grades_info.”
Suddenly, an evil voice inside you says: “What if I replaced id123 with id234? Would I be able to see everyone else’s grades?” Out of curiosity, you give it a try and, lo and behold, it works!
This strategy is referred to as an “object-level attack”. The attacker can read, manipulate, change, or delete pieces of data by altering the API call. In this example, the attacker could substitute the resource ID they accessed with the resource ID belonging to any other user. The lack of authorization and data coherence checks allows a user to view specific resources to which they shouldn’t have access. This attack is also known as an insecure direct object reference (IDOR ).
There are other ways more experienced users might misuse an object ID. Continuing with the example of a student, they could modify their grades using an application like Postman to communicate with the API directly, changing the method from GET to PUT with their ID and add new, illegitimate data. If this action is possible for users to perform without encountering problems, there’s a critical vulnerability that demands urgent attention.
There are at least two problems that have to exist for an object-level attack to be carried out. As we saw in the previous example, the first one is overly predictable IDs. Database resources can be particularly easy to identify if they’re named with sequential numbers like 1, 2, 3, 4 … and so on. The second problem is much more serious: lack of permission management. Each user should have unique permissions with data associated in their own account. When they don’t, there’s no limit to the amount of information they or others can access or modify.
How to prevent it: First, you have to implement API keys. “API key authentication” refers to a technique that requires users to access the API with a unique key. In this scheme, the password is usually a long string of characters that’s different from the account owner’s password. The key is a hidden code that’s assigned to the user automatically. With the API key, the server knows that it must allow the client to access certain data, but now it has the option to limit the data it accesses and the functions it can fulfill. Also, you must make sure that requests sent to the API manually by the user never work. Rather, the user should only be able to access and modify the data according to the permissions of their API key.
The second solution is to use random IDs whenever possible, depending on the organization and size of your database.
Authentication is the process of confirming that a request is made legitimately by the person who owns the identity through which it is being made. With poorly implemented authentication protocols, users can assume the identity of others.
Suppose an attacker gets a leaked list of usernames and passwords from a web database. If the website’s authentication endpoints are unprotected against brute force or credential stuffing attacks, an attacker can repeatedly attempt to gain access. They could use the list of usernames and passwords to determine which combinations work. Therefore, the problem lies in the possibility of attackers accessing an authentication endpoint by entering a user’s credentials.
How to prevent it: Deploying tools like Captcha, speed limit, and account lockout can help prevent bots or brute force scripts from using the authentication endpoint to test different credential combinations and gain access to that account. Cloudflare is an example of services that serve as an intermediate layer to prevent this type of attack.
Another strategy to prevent broken user authentication is to give generic responses, with no indication as to whether the information is correct. For example, if you issue a “non-existent user” message, attackers may attempt to send requests until an “existing user” is received. A better response is to display a message like, “Instructions sent to your email.”
An API can expose much more data than the user needs. Disclosing the properties of an object, regardless of the sensitivity of that data, enables attackers to interact directly with the API to obtain all the information they want.
For example, let’s say you’re now a teacher at the Seal Bay College mentioned above, and you have access to the grades of all your students. So, you log on to the website and enter the name and class of one of your students. The website requests the API to get the information of “user/321312.” The API’s response is as follows:
{ “user”: { “name”: “John Connor”, “sex”: “man”, “age”: 21, “address”: “av lope de vega 121”, “phone”: “+153232132”, “mail”: “john@gmail.com”, ““average_grades”:“B+””: null } }
As you can see, the API response exposes private information that shouldn’t be available to anyone other than the user who created the account.
This vulnerability exists because the API is relying on the client (the website) to filter the data. However, if the teacher were able to access and communicate with the API directly, they (and anyone who can access their account) can obtain critical and private information.
How to prevent it: Never rely on the client to filter data! Review all API responses and adapt them to match what the API consumer really needs. Additionally, identify all the sensitive data or personally identifiable information (PII) to avoid storing it, or filter its use.
This issue happens to APIs unprotected against an excessive number of calls or large files. An attacker can use this situation to perform a denial of service (DoS) attack and exploit authentication flaws with brute force attacks. This vulnerability is the result of the API having no restrictions regarding user requests. There are three common ways that malicious users or hackers can use it:
Furthermore, attackers can use “zip bombs,” which are archive files designed so that unpacking them uses an excessive amount of resources and overloads the API.
How to prevent it: The intention of these attacks is commonly to terminate the service temporarily, damage the servers, or affect users by causing the service to malfunction. The best way to avoid them is to include restrictions that filter requests before processing them, such as adding a file size limit, a fixed number of files to be processed, etc. You should also restrict the number of queries per user to limit the number of API requests.
Remember that we talked about API keys? Vulnerabilities can result if they’re not implemented correctly. See, API keys can be stored anywhere within a computer’s request, including the URL, and this is exactly where our problem begins. Let’s illustrate this idea with the website of our favorite college, Seal Bay. An attacker could modify an API call from GET /api /user/id324/my_grades to POST /api/admin/id324/all_grades and gain access to functions that they’re not allowed to have.
This type of problem occurs when the API relies on the client to determine the user level or the administrator level as appropriate. Attackers can discover the API methods to access administrative functions and invoke them directly. However, it wouldn’t happen if non-privileged users couldn’t access these functions without authorization. Different from the first vulnerability, broken object level authorization, exploiting this issue requires the attacker to send API calls to endpoints they shouldn’t have access to but are available to anonymous users or normal, non-privileged users. This type of flaw is often easy to identify and can allow attackers to access unauthorized functions. Accessing administrative functions is one of the main objectives of this type of attack.
How to prevent it: In this scenario, you can prevent the problem by using API keys. Now, we can go a step further and talk about appropriate API key management methods like OAuth. OAuth provides a standard way for the client to get a key from the server by walking the user through a simple set of steps. From the user’s perspective, all OAuth requires is entering credentials. Behind the scenes, an open communication channel between the client and the server enables the client to get a valid hidden key.
This vulnerability can be similar to vulnerability number 1, broken object level authorization, since both occur within the API call. However, unlike the first vulnerability, here the attacker doesn’t attempt to modify the API call. Rather, the attacker focuses on the “section” hosting the data sent to the database, the body. To understand this, imagine that the Seal Bay’s website app gives you the option to edit basic information about yourself in your user profile. For example, you can adjust your name, age, etc. In this case, the API call would look like this: PUT /api/v1/users/me with the following legitimate information within the body of that request: {“user_name”:“mike”,“age”:24}
But the body also contains the following information:
{“user_name”:“mike”,“password”:3213, “role”: “user”}
A knowledgeable attacker could try to edit the body as follows:
{“user_name”:“mike”,“password”:3213,“role”: “admin”}
Notice the replacement of “user” by “admin.” This attack can succeed if the API takes data from the client and processes it without proper filtering. Thus, malicious actors can try to guess the properties of objects and provide additional properties in their requests.
This problem is the result of the API working with data structures without proper filtering. Modern frameworks encourage developers to use functions that automatically transform client input into code and then add the objects to the database. Spring and .NET are two of the many frameworks that allow this process. While this feature is useful for easily setting server-side values, it doesn’t prevent arbitrary parameters from being injected.
How to prevent it: Although this practice may seem very convenient in terms of time, don’t automatically bind incoming data and internal objects. Instead, you must precisely define the data schemes, types, and patterns that will be accepted during the design stage and stick to them. Additionally, some middlewares are in charge of receiving the data and mapping only the data corresponding to the internal object.
Role-based access control (RBAC) refers to the permissions that roles in a system have to access API endpoints.
RBAC works by granting permissions to each role that uses the API, based on its requirements. For example, the role “student” within Seal Bay’s website is associated with a set of permissions that allow the user to check their grades and print them, review incoming exams, the names of their professors, and so on. However, the “administrator” role is associated with a set of permissions that allow the user to add or remove users, add or remove information about those users, customize the website, create events, etc.
The danger is that, when the reliability and functionality of the permissions don’t get adequately tested, regressions emerge. These regressions could allow an attacker to carry out something called privilege escalation, in which one or more users exploit their permissions until they gain access to information that they shouldn’t, potentially resulting in information theft and more critical attacks.
Privilege escalation is a common method attackers use to gain unauthorized access to data within a security perimeter. They start by finding weak spots in an organization’s defenses and gain access to other sensitive endpoints from there.
To see a real-life example of how a vulnerability can exist in this scheme, let’s review one of the most recent and scandalous security breaches related to RBAC vulnerabilities.
In November 2018, Google+ rolled out an upgrade to its platform updating one of its API, called “people:get.” Its use case was to allow third-party developers to access basic information about users. But an overlooked RBAC vulnerability caused a situation in which , instead of third-party developers being allowed to see unharmful data, they could see all the sensitive data from private profiles or profiles marked as non-public. The Google team identified the bug and fixed it within three days, but, during those three days, more than 52 million users were at risk of having their personal information exposed by hackers.
Although the Google Plus team said there were no signs of data breaches, this type of issue could’ve resulted in a privilege escalation attack. The problem with this type of attack is that traditional security scanning solutions can’t detect these problems because hackers use completely legitimate API calls to access sensitive information, as the API endpoints allow them to do so.
How to prevent it: The best way to prevent privilege escalation vulnerabilities is to continuously assess all endpoints/API functionalities and their assignments to different roles. These evaluations should be performed on an ongoing basis to ensure that there are no regressions. Role-based access control vulnerabilities must also be discovered, tracked, and fixed as early as possible in the development cycle.
Finally, as a best practice, you should NOT allow roles to have overlapping permissions. For example, let’s say in the API that provides data for the Seal Bay’s website, you create a Chancellor role and assign it permissions to view, develop, and edit educational events such as tests, courses, and school elections. Then you create an Administrator role and assign it permissions to view and record educational events such as courses, school elections, and inter-university competitions. Any user with Chancellor or Administrator roles will be able to view, create, edit, and log many events of the same category, with few exceptions. Creating an effective access control implies unifying these types of overlapping permissions, as this type of redundancy can vastly affect efficiency.
When malicious code is inserted into vulnerable software to create an attack, it’s called injection. Injection attacks have been around for more than a decade, yet many web applications deployed today are vulnerable to them. The situation has worsened as web-app development tools for non-programming audiences, such as uKit.com, have made it easy for new developers to build web applications without worrying about security flaws.
In an SQL injection, for example, the attacker inserts SQL code into an API request field so that the database server performs an unwanted action. How does something like this play out in real life?
Let’s say a student wants to access Seal Bay’s website as an administrator to change their average grades. They could access the sign-in page, where they would see the new user form. Suppose that in the API backend, the SQL code that registers a new user is as follows:
INSERT INTO users (name, password, email, role) VALUES (%s, %s, %s, ‘user’);
In this example, each “%s” is replaced with a value the user provides in the registration form: name, password, and email.
So instead of entering what any user would enter in the name box, for example:
John
The user enters:
John’, ‘12345’, ‘johnmason@gmail.com’, ‘admin’); —
As you can see, this piece of SQL code completes the query in the API backend. If this input is not parsed, the result is that the SQL query ends up like this:
INSERT INTO users (name, password, email, role) VALUES(‘john’, ‘12345’, ‘johnmason@gmail.com’, ‘admin’); --, ‘’, ‘’, ‘user’);
Besides completing the SQL query by manually adding the “admin” value in the field that corresponds to the user role, the attacker uses hyphens (–) to hide the rest of the legitimate code as a comment because, when completing the original query, the excess code at the end would cause an error. As a result, the attacker inserts an administrator user into the database.
To review, the user entered a new user script with their details and administrator permissions. So it became overwritten in the backend, a user with admin permissions was inserted, and the rest of the SQL code was hidden as a comment.
Any web API requiring parsers or processors, like SQL, NoSQL, LDAP, OS commands, XML parsers, or object-relational Mapping (ORM), is vulnerable to attack. To perform an injection attack, an attacker must find unprotected user-input boxes within the web page or web application.
A web page or web application that has an injection vulnerability applies user input directly to the database. The attacker can create, query, modify, or delete input content. This type of content is often called “malicious payload” and is the primary gear of the attack. Once the attacker submits this content, malicious commands get executed in the database.
How to prevent it: The oldest and best known solution is to use parameterized queries. These queries do not concatenate the variables (such as name or email) to the SQL query, but instead filter the input and build the query using the filtered data as parameters.
Seen in a practical sense, declarations with the question mark (“?”) placeholder are used whenever a user-supplied value is required.
Let’s see an example in Golang:
stmt, err := db.Prepare(“INSERT INTO users (name, password, email, role) VALUES (?, ?, ?, ‘user’)”) if err != nil { panic(err) } _, err = stmt.Exec(user_name, user_password, user_email, ‘user’) if err != nil { panic(err) }
The parameters applied to the placeholder “sanitize” the entry. That is, they ensure that whatever the user enters is treated as a literal string and NOT as part of the query. Parameters “escape” characters that could be interpreted as code before processing, thus preventing an attack. Escaping is the method of replacing special characters with those to be interpreted literally. For example, replacing quotation marks (“) with "
In summary, if we make a conscientious effort to filter user input and create parameters that filter the data, we will be much less likely to be victims of one of the most common attacks on the web.
It’s often productive to replace an API with an improved version when the previous version had data exposure errors. But not blocking access to the faulty version can create a critical vulnerability.
Let’s say, for example, the Seal Bay College is redesigning its application, and it forgot an old API version and left it unprotected, with access to the user database. When using one of the latest released applications, an attacker finds the API address, api.collegeservice.com/v1. Replacing the v2 endpoint with v1 has given them access to the old unprotected API, exposing the PII of thousands of users.
Older API versions are generally unpatched and are easy entry points for compromising systems without even violating any security mechanisms. Such attacks, coupled with not having an accurate record of how many versions of an API exist and what their access points are, can make their exploitation go unnoticed for months and even years.
How to prevent it: Maintain an updated inventory with all the API hosts and versions implemented. Also, use security protocols to limit access to everything that shouldn’t be public, even if it’s an old API version and it seems unlikely that someone will try to access it. Separate access to production and non-production data. Implement additional external controls like API firewalls. Finally, properly removing old API versions or backport security fixes prevents many problems.
Most studies on hacks and infiltrations show that the time to detect an infiltration is more than 200 days, and that it is usually identified by external parties rather than internal processes or monitoring. The lack of proper logs, monitoring, and alerts allows attackers to go unnoticed.
When creating an API, tracking endpoint usage can be troublesome. While commercial products are geared more towards browser analytics like Google Analytics, well-known backend tools for tracking API logs are rarer.
Furthermore, although having a record of events (logs) is very important, most entries are useless messages that obscure and make it difficult to see the pivotal entries, those security alerts that system defenders should analyze. In recent years, many organizations have created databases (usually in spreadsheets) that detail each record that users are generating or could generate without any filtering. It includes a list of all computing devices, as well as servers, routers, firewalls, proxies, switches, antimalware software, application logs, and more. Many of these devices generate dozens of records. But this process only makes it more difficult for the API security team to identify when an attack has occurred.
What’s more, most companies lack a clear understanding of what logs they have or should collect, not to mention what types of events are truly malicious and how to detect them.
How to prevent it: Log failed login attempts, denied access, input validation errors, failure in security policy checks, or any event that indicates a data breach vulnerability. Additionally, format the logs so that other tools can read them as well and protect them as highly confidential information. Logs should include enough details to identify the attackers without describing sensitive user data.
Now, how should you deal with excess data? Several tools, such as Loggy and Matomo, help filter and organize API logs. Still, many companies also choose to create their databases where they can better break down the results provided by these applications.
What do you think? Are any vulnerabilities missing from this list? The truth is, there are many ways in which an API could be mishandled and, therefore, compromised by other users. Here we have discussed the most common and some of the most catastrophic. Security, authorization, and validation protocols are essential to safeguard critical company and user data. By vigorously following best practices from the start, we can reduce the likelihood of attacks.
To be a great project manager, you must be a good leader, coworker, and supervisor. ..
Artificial intelligence (AI) is knocking technology out of the park, especially regarding machine learning (ML). ..
You’ve decided to migrate your team. Now, you’re probably wondering how to make the switch. Let’s find out. ..