Why work with a database API?
Why would you choose to work with database APIs? What is the benefit to you? Here are some of the most common reasons:
- No need to learn ANOTHER query language. Sometimes you might not have the bandwidth to add another language into your repertoire. Additionally, you run the risk of going to great effort to learn another query language only to find that you don't use it much.
- No need to learn how the data is stored. Is it a network model? A star schema? A snowflake schema? We don't care. All we are interested in is retrieving and saving data.
- No need to install and run databases locally. Do you really want this eating up your machine's resources?
- No need to upset the operations team and database administrators. Do you want to run complex queries that could consume significant resources such as CPU or RAM? Do you need a wide range of ports opened for access? Neither situation is likely to be looked on favorably by database administrators.
All it takes to gain access to precisely the data you want is adding an API on top of your database. Let's look at the strengths and limitations of some of the most popular API architectures, which will make choosing the best one for your needs easier.
REST
REST was introduced in 2000 as a lightweight alternative to heavyweight solutions like SOAP, RMI, or CORBA. REST stands for "Representational State Transfer" and comprises a set of HTTP standards (verbs and status codes) and design principles that you conform to when designing an API. You send HTTP requests via methods such as GET, POST, PUT, and DELETE to read, create, update, and delete data.
Figure 1. REST API specification (OpenAPI) generated by Swagger.
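As a quick illustration of how those verbs map to CRUD actions, here is a minimal sketch using only the Python standard library. The base URL and `/users` resource are hypothetical, and the requests are built but not sent.

```python
# Build (but don't send) HTTP requests for each CRUD action.
# The base URL and /users resource are hypothetical examples.
import json
import urllib.request

BASE = "https://api.example.com/v1"

def build_request(method, path, body=None):
    """Construct an HTTP request for the given verb, path, and JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE + path, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    return req

create = build_request("POST", "/users", {"name": "Ada"})       # create
read = build_request("GET", "/users/42")                        # read
update = build_request("PUT", "/users/42", {"name": "Ada L."})  # update
delete = build_request("DELETE", "/users/42")                   # delete
print(read.get_method(), read.full_url)
```

Sending each request with `urllib.request.urlopen` (or any HTTP client) would then perform the corresponding operation against the server.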
Strengths
- REST allows you to decouple the client and the server, so changes can be made to a single component without risking a system-wide impact. A tightly coupled design can hamper a system, preventing it from being portable, reliable, scalable, or secure.
- You have better control over versioning, so consumers can keep using the API when changes are made. Designed correctly, you can also improve the API without impacting consumers on previous versions.
- REST has excellent support for API management tools, which can help you in designing, deploying, and maintaining your API. You'll find a wider range of tools compatible with REST compared to other API technologies.
- Serverless platforms such as Netlify and Vercel have great integration through REST APIs, giving you greater control over the data you can retrieve.
Limitations
- Payloads can be quite large: depending on the data you want to retrieve, you may be passing JSON or XML documents carrying many queries back and forth.
- REST, in its most common implementation, has no discoverability. If the API you're using doesn't have documentation for some reason, it can be challenging to know how to use it.
- It's not well suited to command-like operations; it works best for simple actions such as create, read, update, and delete (CRUD).
GraphQL
Facebook created GraphQL for internal use before releasing it as an open-source project in 2015. GraphQL allows clients to request exactly the data they want from the server, even if it's from multiple sources.
Figure 2. Key characteristics of GraphQL.
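To make that concrete, a GraphQL request is an ordinary HTTP POST whose JSON body carries the query text. A minimal sketch in Python follows; the schema (person, lastName, city) and the endpoint are hypothetical.

```python
# Compose a GraphQL query asking for exactly two fields of a person.
# The schema (person, lastName, city) is a hypothetical example.
import json

query = """
{
  person(id: 42) {
    lastName
    city
  }
}
"""

payload = json.dumps({"query": query})
# POSTing `payload` to the single /graphql endpoint with
# Content-Type: application/json would return only the requested
# fields, e.g. {"data": {"person": {"lastName": ..., "city": ...}}}
print(payload)
```

The server returns only the fields the client asked for, which is the core of GraphQL's "request exactly the data you want" model.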
Strengths
- Discoverability in GraphQL is excellent because of its auto-generated API documentation. Any change to the API is automatically reflected in the documentation, so you can easily explore the possible queries, mutations, fields, and types.
- GraphQL allows you to enter custom payloads, allowing for more flexibility over the data that is retrieved from the server.
- GraphQL is compatible with JSON and is served over HTTP, both commonly known technologies that have lots of documentation and tutorials.
Limitations
- GraphQL has only a single endpoint, which can make versioning, monitoring, and security challenging. To resolve this, you may have to define multiple GraphQL APIs, which can be time-consuming.
- GraphQL hasn't been around as long as REST, so there is a less robust set of tooling to help with implementation. You'll likely spend more time implementing in GraphQL: defining your schemas, creating resolver functions, and constructing your endpoint.
- GraphQL can be hard on your database. Because queries are so easy to compose, it's easy to fall into the N + 1 problem, where each level of nesting can add N subsequent requests, making retrieval very inefficient.
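The N + 1 pattern can be illustrated without any GraphQL machinery at all. In this sketch an in-memory dict stands in for the database and a counter stands in for queries issued; a naive resolver fetches the parent list with one query, then issues one more query per child.

```python
# Illustration of the N + 1 access pattern a naive resolver produces:
# one query for the parent list, then one more per nested record.
# The in-memory "tables" and call counter stand in for a real database.

db_calls = 0
AUTHORS = {1: "Ada", 2: "Grace"}
POSTS = [{"id": i, "author_id": (i % 2) + 1} for i in range(10)]

def fetch_posts():
    global db_calls
    db_calls += 1            # 1 query for the list
    return POSTS

def fetch_author(author_id):
    global db_calls
    db_calls += 1            # +1 query per post (the "N")
    return AUTHORS[author_id]

result = [{"id": p["id"], "author": fetch_author(p["author_id"])}
          for p in fetch_posts()]
print(db_calls)  # 11 database calls for 10 posts: the N + 1 problem
```

Batching or caching layers (DataLoader-style) exist precisely to collapse those N child lookups into one query.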
gRPC
gRPC reached its 1.0 release in August 2016. It was created by Google as a means of connecting the large numbers of microservices running across its data centers. It started life as an internal framework called Stubby, which Google used until deciding to open-source its next rebuild. gRPC stands for Google Remote Procedure Call.
For message payloads, gRPC leverages protocol buffers (also called "Protobuf"). Requests are encoded as binary data for maximum compression and fast serialization. Each protocol buffer message is a small logical record of information containing a series of name-value pairs. The schemas describing the messages are stored in .proto files with a minimal structure that looks like this:
message Person {
  string last_name = 1;
  string email = 2;
  string city = 3;
}
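Those field numbers are not decoration: on the wire, each field is identified only by its number and wire type, never by its name. As a sketch (assuming single-byte tags and lengths, which holds for small messages), the last_name field of the Person message above could be hand-encoded like this:

```python
# Hand-encode one length-delimited (string) field, mirroring the
# protocol buffer wire format: tag = (field_number << 3) | wire_type.

def encode_string_field(field_number, value):
    """Encode a string field; assumes tag and length each fit in one byte."""
    tag = (field_number << 3) | 2        # wire type 2 = length-delimited
    data = value.encode("utf-8")
    return bytes([tag, len(data)]) + data

encoded = encode_string_field(1, "Lovelace")   # last_name = 1
print(encoded.hex())  # 0a084c6f76656c616365
```

Only the number 1 (not the name "last_name") travels over the network, which is why the binary encoding is so compact, and why field numbers must stay stable.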
Strengths
- gRPC makes use of HTTP/2, which employs binary framing, making messages more compact and faster to process. HTTP/2 is also fully multiplexed: it can send multiple requests between browsers and servers over a single transmission control protocol (TCP) connection at the same time, so pages load faster and network resources are shared more fairly. HTTP/2 also reduces round-trip times (RTT), so you can get your website loading faster with less optimization.
gRPC also lets the server and client communicate in several different ways. Communication can be:
- Synchronous – gRPC takes care of the threading, and the application only needs to handle the remote procedure call (RPC).
- Asynchronous – Instead of gRPC taking care of the threading, you can do it yourself, meaning greater control of the threading aspect.
- Streaming – You can have multiple requests going between server and client, sending multiple messages back and forth on a single connection.
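Stripped of the gRPC machinery, the difference between the unary and streaming styles is essentially the difference between returning one value and yielding many over one connection. A plain-Python sketch (the echo functions are hypothetical, not part of any gRPC API):

```python
# Illustration only: unary vs streaming call shapes, mimicked in plain
# Python so the control flow is visible without a gRPC dependency.

def echo(request):
    """Unary call: one request in, one response out."""
    return "echo: " + request

def echo_stream(requests):
    """Streaming call: responses are yielded as requests arrive,
    many messages flowing over a single logical connection."""
    for req in requests:
        yield "echo: " + req

print(echo("hi"))                     # one request, one response
print(list(echo_stream(["a", "b"])))  # many messages, one "connection"
```

In real gRPC these shapes come from the service definition in the .proto file, with generated stubs handling the transport.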
gRPC abstracts away the complications of defining message contracts across different languages by providing a "uniform horizontal layer": a standard way of specifying contracts that ensures interoperability.
Figure 3. gRPC communication interoperability principle.
Limitations
- In gRPC, you define your message and services in a .proto file. Due to the relationship between the server and the client, any changes to the .proto file will mean that you'll have to regenerate the client.
- There is no discoverability in gRPC because of the .proto file. If you don't have the .proto file, you won't know what requests the service is capable of receiving.
- gRPC uses protocol buffers to exchange data between a client and a server in a binary format. Protocol buffers let you define once how you want your data structured; you can then read and write it from a variety of data streams and languages via generated source code. That structure comes with restrictions: each field is bound to a numbered slot in the wire format, so you cannot change or reuse the numbers of existing fields, and new fields must be given fresh numbers to remain compatible with older messages.
Using APIs and data together
Alongside this connection between your data service and your applications, you may prefer one API over another for building applications. But how do you decide which APIs are the right approach for the future?
The first thing to understand is that not all APIs and microservices are stateless. Some need persistence: a way to retrieve their state even if the component restarts. Technologies like databases or data caches support these operations.
To access the persistence store you need a set of services to handle Create, Read, Update, and Delete (CRUD) operations; this family of services could be labeled data-oriented services. Developers may choose to implement those CRUD features themselves, between the API they pick and the data service they use. However, that creates something needing additional support and leads to technical debt over time. Alternatively, there are open-source projects that provide that API connection as a reusable technical layer and act as a platform data gateway. Whether you choose gRPC, REST, or GraphQL - or a mix of different APIs for different purposes - you can make it easier through abstraction.
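Such a data-oriented service layer can be sketched as a small CRUD interface that any API flavor can sit on top of. In this sketch an in-memory dict stands in for the real persistence store, and all names are hypothetical.

```python
# Minimal sketch of a data-oriented service: CRUD expressed once
# against an abstract store, so the API on top (REST, GraphQL, gRPC)
# can vary without touching persistence code. The dict stands in for
# a real database.

class DataService:
    def __init__(self):
        self._store = {}
        self._next_id = 1

    def create(self, record):
        record_id = self._next_id
        self._next_id += 1
        self._store[record_id] = record
        return record_id

    def read(self, record_id):
        return self._store.get(record_id)

    def update(self, record_id, record):
        if record_id in self._store:
            self._store[record_id] = record
            return True
        return False

    def delete(self, record_id):
        return self._store.pop(record_id, None) is not None

svc = DataService()
uid = svc.create({"name": "Ada"})
print(svc.read(uid))
```

A REST handler, GraphQL resolver, or gRPC service implementation would each call the same four methods, which is exactly the abstraction a data gateway provides as a maintained layer.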
Secondly, you can consider your use case and how you expect your application to develop or change over time. If you expect a lot of requests for data back and forth, then REST may be less of an option due to the number and size of the JSON payloads that will be created. On the other hand, REST only supports simple operations, so if you expect your application to need only a limited number of ways to interact, REST is a great fit. For example, as a company you can expose public APIs limited to a business domain, such as a customer service or payment service. The idea is to reduce the number of resources and ways to interact, making the API simple to use and consume. This is a single-purpose operation, where REST is a good fit.
For more complex applications that will request larger volumes of data over time, options like GraphQL and gRPC may be more appropriate. GraphQL's ability to support schema changes and different query permutations can make it easier to keep up over time. However, it can also require significant performance tuning to ensure that problems don't arise. For instance, GitHub decided to move to GraphQL in 2017, exposing generic services and letting developers consume what they need, how they want.
gRPC is about performance, supporting cross-platform communication and running across different platforms. For instance, if you know your clients will use a wide variety of technologies, instead of building multiple clients you can share your .proto interfaces and let people generate clients in the languages of their choice. gRPC also makes sense for internal component-to-component communication, where it gives maximum performance along with support for streaming and asynchronous operations.
To help choose between the variety of APIs and technologies available to you, focus on two characteristics of your use cases. First, you need to know who will invoke your services: these are your customers - internal, external, or both - and they will have different requirements for scope and language. You will also have to decide what they need most in terms of performance, ease of use, and discoverability. Second, you need to know the scope of the service, such as business bounded context, data, and operations.
One approach to these problems is to use a data gateway alongside the API you choose. This lets you select an API your developers feel comfortable with, while creating a layer of abstraction between the API and the database. Many developers do this for themselves as a matter of course, to make a particular database easier to use over time, but it adds another component for the team to maintain. Open-source data gateway projects like Stargate offer a different option, keeping that ease of management while leaving updates covered by project maintainers.
Lastly, there is no silver-bullet approach to implementing APIs. You can drive your decisions based on each API's strengths and weaknesses, and apply logic based on your customers and their scope, but there is no absolute right answer. Sometimes the skill set in your teams will decide for you, but you should still be aware of what exists in the industry - if not to build APIs, then to consume them.
For developers, understanding which API is right for them is a mix of personal preference, performance requirements, and long-term impact. Picking the right API approach at the start can make it much easier to scale up an application over time, particularly as your team has to support hundreds of thousands or millions of concurrent user requests. When looking at this level of scale, simplifying the interaction between APIs, databases and infrastructure can help you concentrate on what you want to achieve: better software.