This is Part 5 of the ProgrammableWeb API University Guide to GraphQL: Understanding, Building and Using GraphQL APIs.
In previous installments of this series, we looked at the history that led up to the advent of GraphQL. We also looked at the GraphQL specification and then at a GraphQL API demonstration application we created using Apollo Server 2.0. The purpose of the demonstration application is to provide examples for getting a fully functional GraphQL API up and running. Finally, we took a look at GraphQL in terms of the Semantic Web. We looked at some of the ways that GraphQL fulfills the promise of the Semantic Web. We also discussed some of its shortcomings.
In this installment, we're going to examine how companies have gone about making GraphQL work in the real world. First, we'll examine how Apollo, the company that publishes Apollo Server, approached creating a commercial-grade implementation of the GraphQL specification.
Then, we'll look at the variety of challenges and successes that GitHub faced when actually implementing the GraphQL version of its API at web scale. Next, we'll consider GraphQL from the viewpoint of Crucial, a third-party development company that does GraphQL implementations for a number of companies across an assortment of sectors.
Finally, we'll focus on a particular aspect of designing for GraphQL that is common to just about all implementations: whether to model to the domain or model to the use case. For this part of the investigation, we will compare the approaches of two companies, Trulia and Edmunds.com.
Implementing GraphQL in the real world is rarely a trivial undertaking. To go into the details of any company's implementation requires a book's worth of information in itself, clearly beyond the scope of this article. Yet, what becomes apparent from the case studies presented is that all companies ask a common set of questions: What do I model? What are the technical challenges at hand? How do I make GraphQL work for my company and my company's customers? Answering these questions is the focus of this investigation.
The place to start is with Apollo, publishers of Apollo Server, one of the leading implementations of the GraphQL specification.
Apollo, Building a Universal Solution
"The biggest challenge in doing app dev today is the need to connect every one of those places your data lives to every possible customer experience." — Matt DeBergalis, Co-founder, Apollo
While it's true that any person or company with the time and expertise can create a GraphQL specification implementation from scratch, it just doesn't make sense to do so. It's just too much work. There are already a number of production quality implementations available and most of them are open source. To "roll your own", so to speak, is akin to reinventing the wheel, which is never a good use of time and money.
One of the companies that provides an enterprise-ready implementation of the GraphQL specification is Apollo, publishers of Apollo Server 2.0. Apollo Server 2.0 runs on the Node.js platform (server-side JavaScript). Apollo Server already has a significant following at the enterprise level and is used by companies such as Airbnb, CNBC and SurveyMonkey, to name a few of its A-List players.
Apollo didn't stumble upon GraphQL. They went there on purpose as a follow-up to Meteor, the company's initial framework for developing high-quality JavaScript applications.
"A modern UI engineer wants to build a great customer experience. That's what gets them out of bed in the morning." Matt DeBergalis told ProgrammableWeb. "In practice, what they do all day is move data around. They call REST APIs. They write Redux functions. They manage the state... It's the plumbing that makes the product possible. And so Meteor had this really forward-thinking approach that made it possible to write a whole application in just a handful of lines of code."
Meteor, which was released in 2011, had a considerable impact on application development. But it also had a shortcoming. The product was specific to the MongoDB database technology. At the time, MongoDB enjoyed a considerable market share in the NoSQL database space. But every day, more competing technologies appeared. Apollo had to adjust to stay viable.
In late 2015 Apollo began work on converting Meteor's data layer into a standalone product. The company wanted something that could work with any data source, not just a Mongo database, and something that could scale to the largest applications on the Internet. GraphQL fit the bill in more ways than one. According to DeBergalis:
"I think you need two ingredients to have a successful technology that gets mass adoption. It has to solve a hard problem because developers won't waste their time learning a technology that gives them only marginal benefit. It has to have a 10x or 100x improvement on the current state of things. But it also has to be really thoughtful and ergonomic and enjoyable. Product developers vote with their feet, right? This isn't the old days when management would mandate a technical solution in all companies, large and small. What we see over and over again is developers calling the shots on which system they're going to use or which language they're going to bring in or what technology they use to manage their containers; all those things up and down the stack. And so we saw GraphQL having both of those properties. We realized it solved a fundamental problem and it did it with elegance. It's something that a React developer touches and then never wants to go back."
And so, Apollo Server was born.
The thing to understand about GraphQL is that it's not a database frontend. It's a general API architectural style; an alternative to other API architectural styles like REST and RPC. Thus, it can support many databases and APIs at once depending on the use case(s) that the GraphQL implementation is intended to represent. DeBergalis describes the essential benefit as follows:
"Our approach with Apollo is we've built it for the product engineering team and we've designed Apollo to be something you can adopt as a product engineer on APIs you already have. So you don't have to ask your backend team to make fundamental changes to how they store data or what the API structure is. If you have REST or Thrift or you need to talk directly to a database, Apollo is perfectly happy in each of those settings. It has a lot of advantages for the infrastructure side of the house for sure, but I think the story fundamentally is application developers driving this interest and this excitement in having a data graph rather than having to have legacy points of pointing APIs for all their data."
When asked about the difficulties that Apollo experienced when creating Apollo Server, DeBergalis responded,
"The app[lication] dev[elopment] ecosystem is rapidly moving. People talk about Javascript fatigue. What we're trying to do with Apollo is build something that addresses a lot of these really fundamental challenges for developers, but in the context of that rapidly moving ecosystem. And that means being really responsive as an open-source community. It means integrating with lots of different APIs as they emerge. Just a lot of the block and tackle work that we're really excited to do here. Uh, and you know, from far away it feels like, eh, how hard is it to write a GraphQL client? But in practice for the most part, now it is"
As DeBergalis considers the future of GraphQL he points out that at some point, all companies large and small will need to adopt the data graph as the primary way to represent data on the Internet. He finds the advantages too compelling to do otherwise. His opinion is that to not adopt the data graph puts a company at risk of losing competitive advantage.
Also, DeBergalis revealed that Apollo is working on a way to make it possible to unite GraphQL domains into a single aggregation. For example, a GraphQL implementation in one Internet domain that has movie data could be unified with an implementation of GraphQL in another domain that has movie reviews. The result is that a developer can write an application that takes advantage of information that crosses domains.
Editor's note: On May 30, 2019, Apollo announced the release of Apollo Federation. Apollo Federation is a technology that allows developers to combine a variety of GraphQL APIs into a single representation.
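The movie example can be sketched without any GraphQL library. The snippet below is only an illustration of the idea and not how Apollo Federation is implemented: movieService, reviewService and resolveMovieWithReviews are hypothetical names, and both "services" are in-memory stand-ins for GraphQL APIs that would live in separate domains.

```typescript
// Hypothetical stand-ins for two GraphQL APIs living in separate domains.
interface Movie { id: string; title: string; }
interface Review { movieId: string; rating: number; text: string; }

const movieService = {
  // Pretend this is a query against the movie-data domain.
  getMovie: async (id: string): Promise<Movie | null> => {
    const movies: Movie[] = [{ id: "m1", title: "Metropolis" }];
    return movies.find((m) => m.id === id) ?? null;
  },
};

const reviewService = {
  // Pretend this is a query against the movie-review domain.
  getReviews: async (movieId: string): Promise<Review[]> => {
    const reviews: Review[] = [
      { movieId: "m1", rating: 5, text: "A classic." },
      { movieId: "m2", rating: 3, text: "Uneven." },
    ];
    return reviews.filter((r) => r.movieId === movieId);
  },
};

interface MovieWithReviews extends Movie { reviews: Review[]; }

// One entry point that presents both domains as a single graph node.
async function resolveMovieWithReviews(id: string): Promise<MovieWithReviews | null> {
  const movie = await movieService.getMovie(id);
  if (movie === null) return null;
  const reviews = await reviewService.getReviews(movie.id);
  return { ...movie, reviews };
}
```

The application developer calls one function and never needs to know that the title and the reviews come from different places.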
What becomes clear from talking with Matt is his passion for empowering developers to make better products. While he understands that Apollo is a business and as such is subject to the demands of competition in order to survive, he also thinks that his company answers to a greater calling. As he said at the end of the conversation:
"My hope is that there's a path forward here where Apollo enables something that's a lot more universal. Where the access to this data and the services that are on the Internet stops being the stumbling block that prevents people from building a good application. And it becomes really just a function of ideas and use cases that really resonate and make sense."
Clearly, Apollo has an ambitious mission and GraphQL is a critical factor to its success.
GitHub: Implementing a Public GraphQL API That Works
"We wanted to write a public API when almost every implementation of GraphQL or every major success of GraphQL that you've probably heard about, or will hear about is private." — Kyle Daigle, GitHub's Director of Ecosystem Engineering, GitHub
GitHub needed the technology to solve a very specific problem: managing its customers' requests for special data structures.
Back around 2015 the company was having problems with its REST API. The API, which was about seven years old at the time, had become quite popular with a lot of integrators, both customers and big-name companies such as Travis-CI and CircleCI. As the API grew in popularity, key companies started to ask for special additions to the API payloads. GitHub accommodated them as best it could, but after a while things became unwieldy. A better solution was needed. As Kyle Daigle, GitHub's Director of Ecosystem Engineering, told ProgrammableWeb during a recent interview:
"We started to look at other options for what the next version of our API could be. And so we looked at some of the ones that were kind of prescient at the time like JSON:API was sort of in its glory at the moment. Falcor from Netflix had just recently been released. There was OData, and some other options that we started to poke at. We did some proof of concepts and ultimately got pretty excited about GraphQL because it allowed us to give integrators access to data in a format [where] they could define what data they were actually looking for."
GitHub started its research by going to the source, Facebook. The company's engineers asked Facebook the questions common to any due diligence investigation. For example, what are some of the hurdles to overcome? What do you wish you knew when you started?
Facebook was helpful, but the information provided had limited value. GitHub had a secret it couldn't share with Facebook or anybody for that matter. As Daigle revealed:
"...the super unique thing of us that we couldn't really talk to anyone about is we wanted to provide a public API when almost every implementation of GraphQL or every major success of GraphQL that you've probably heard about or we'll hear about is a private API."
In many ways, GitHub was on its own when it came to solving problems that were more than elementary.
The company spent about a year developing its GraphQL API, announcing its release at GitHub Universe in September of 2016. GitHub spent six months after the premiere gathering user feedback. One of the key issues that GitHub encountered was the lack of user education about GraphQL. Not many people were aware of GraphQL back in 2016 let alone how to use it. GitHub found that it had to spend a good deal of time getting its user base accustomed to working with the technology.
The cost of recursion in a single thread environment
In addition to the external issues around user adoption and education, GitHub had an internal problem it had to address. GitHub is a Ruby application. This constraint makes the actual implementation of GraphQL challenging. Daigle described the challenge this way:
"The major technical limitation that we have is that we're a Ruby application, and Ruby is single threaded. GraphQL inherently wants to work at a concurrent environment because the whole purpose is, 'I can access data along the graph and load it concurrently, hypothetically in a lot of cases.' But unfortunately, our architecture doesn't really allow for that."
GitHub had real concerns about users creating GraphQL queries that would be so burdensome as to cripple its data access mechanisms. The way the company addressed the issue was to use Ruby promises. Promises allow a single-threaded application to emulate concurrency by deferring work instead of waiting on it. A function that returns a promise does not block the functions that follow it in a sequence of calls. Rather, the call returns immediately with a placeholder for the eventual result, and the pending work can be interleaved with other work. Each function in the chain has "then" behavior that's executed once that function's task is complete. For example, promises allow a function such as getUsers() to get a list of users from a data source and "then", when the list has been retrieved, another function, getComments(), is executed to get the comments made by each user. The benefit of a promise in this example is that the initial getUsers() function call does not stop program flow. Although getComments() has to wait for the list of users for obvious reasons, the application can call getUsers() and move on to other work.
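The getUsers() and getComments() flow can be sketched as follows. GitHub's implementation is in Ruby and is not shown in this article, so this TypeScript sketch only illustrates the general promise pattern. The data, the function bodies and the log array are invented for the example.

```typescript
interface User { id: number; name: string; }
interface Comment { userId: number; body: string; }

// Hypothetical data source calls; each returns immediately with a promise.
function getUsers(): Promise<User[]> {
  return Promise.resolve([
    { id: 1, name: "ada" },
    { id: 2, name: "grace" },
  ]);
}

function getComments(users: User[]): Promise<Comment[]> {
  const all: Comment[] = [
    { userId: 1, body: "LGTM" },
    { userId: 2, body: "Needs tests" },
    { userId: 3, body: "Orphaned comment" },
  ];
  const ids = new Set(users.map((u) => u.id));
  return Promise.resolve(all.filter((c) => ids.has(c.userId)));
}

// Records the order in which things happen.
const log: string[] = [];

// getComments() waits on getUsers() via "then", but the caller is not blocked.
const pending: Promise<Comment[]> = getUsers().then((users) => {
  log.push("users loaded");
  return getComments(users);
});

// This line runs before the callback above, because getUsers() returned at once.
log.push("caller moved on");
```

The dependent step still runs after the step it depends on, but everything unrelated is free to proceed in the meantime.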
Figuring out how to charge for use
Another internal issue that GitHub had to address was how to charge for usage of the GraphQL API.
Most GraphQL implementations offer a costing metric in which queries are assigned point values determined by the size of the data retrieved. GitHub takes a different direction in order to simplify cost determination on the part of the user.
GitHub charges according to the number of objects retrieved and query complexity. Daigle describes the costing model:
"What we end[ed] up doing is calculating the number of objects that you're getting, which usually people can roughly understand — five issues, five comments are getting down to 30 things and then so on and so forth down the chain. And we put a top-level limit on that number."
GitHub's goal is to make cost determination as simple as possible. "I don't want you sitting there with a calculator or trying to call our API and say, okay, what's the overall cost of this?" Daigle said.
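Counting objects this way is simple enough to sketch in a few lines. The code below is an assumption-laden illustration and not GitHub's actual calculation: ConnectionRequest, countObjects, isWithinLimit and the MAX_OBJECTS ceiling are all hypothetical. It does reproduce Daigle's arithmetic, in which five issues with five comments each come to 30 objects.

```typescript
// A requested connection: how many items (`first`) and what is nested under each item.
interface ConnectionRequest {
  first: number;
  children?: ConnectionRequest[];
}

// Count the objects a query could return: `first` items at this level,
// plus, for every one of those items, the objects of each nested connection.
function countObjects(req: ConnectionRequest): number {
  const nested = (req.children ?? []).reduce(
    (sum, child) => sum + countObjects(child),
    0,
  );
  return req.first + req.first * nested;
}

// Hypothetical ceiling; a real limit is whatever the API publishes.
const MAX_OBJECTS = 1000;

function isWithinLimit(req: ConnectionRequest, limit: number = MAX_OBJECTS): boolean {
  return countObjects(req) <= limit;
}

// Five issues, each with five comments: 5 + 5 * 5 = 30 objects.
const issuesWithComments: ConnectionRequest = { first: 5, children: [{ first: 5 }] };
```

Because the total depends only on the shape of the query, a user can estimate it before sending the request.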
Implementing security
Once GitHub addressed pricing, there was one more significant issue to address: security. According to Daigle, "The most complex thing about GraphQL and a public API is overlaying a permission structure."
Securing a GraphQL API is significantly more complex than securing a REST API. Many if not most companies secure their REST APIs at the access level: a given user either has access to the URL for a specific resource or does not. For example, you can have access to a list of users at http://example.com/api/users, but you might not have access to the user detail at http://example.com/api/user/12324. It all depends on the scope of your permission.
Unlike a REST API which might be composed of a number of URLs, access to resources in a GraphQL API requires a single URL. In order to secure data represented by a GraphQL API, permission scope needs to be applied to various objects and their fields declared within the given query. For example, a user might have permission to query the API for all users by first name and last name, like so:
{
  Users {
    firstName
    lastName
  }
}
Yet, that same user might not have permission to view email addresses. Thus, the following query is forbidden.
{
  Users {
    firstName
    lastName
    email
  }
}
The question then becomes how, in GraphQL, to apply a permissions model that is consistent and easy to maintain. The way GitHub enforces permission scope is to use server-side GraphQL directives (a subject covered in Part 3 of this series on How To Design, Launch, and Query a GraphQL API Using Apollo Server). Engineers apply directives associated with particular degrees of scope to both objects and fields in the GraphQL schema. Then, when a query is executed, runtime behavior within the GraphQL API examines the scope(s) assigned to the owner of the query in order to determine the objects and fields that the owner has permission to access.
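The runtime check can be sketched in plain TypeScript. This is a simplified illustration under stated assumptions, not GitHub's implementation: the scope names, the requiredScopes table (which stands in for directives attached to the schema) and the forbiddenFields function are all hypothetical.

```typescript
// Hypothetical scope names and schema annotations, standing in for
// server-side directives attached to types and fields.
type Scope = "read:user" | "read:user:email";

const requiredScopes: Record<string, Record<string, Scope>> = {
  User: {
    firstName: "read:user",
    lastName: "read:user",
    email: "read:user:email",
  },
};

// Return the requested fields the caller is NOT allowed to read.
function forbiddenFields(
  typeName: string,
  requested: string[],
  granted: Set<Scope>,
): string[] {
  const fields: Record<string, Scope | undefined> = requiredScopes[typeName] ?? {};
  return requested.filter((field) => {
    const needed = fields[field];
    // Unknown fields are rejected as well, rather than silently allowed.
    return needed === undefined || !granted.has(needed);
  });
}

// The owner of the query holds only the basic scope.
const callerScopes = new Set<Scope>(["read:user"]);
```

With these scopes, a request for firstName and lastName yields no forbidden fields, while adding email to the request flags that one field.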
In terms of programming, it's a laborious effort for GitHub. But the effort is necessary. An API is only as good as the security model it supports.
Addressing the need for viable backend libraries
In addition to its performance and security optimizations, GitHub is in the process of creating a useful set of backend client libraries intended to make working with GraphQL easier from a server-side perspective.
GitHub presently publishes a GraphQL SDK for Ruby and there are a variety of other clients for the server side such as Octokit. Daigle's position on writing server-side client libraries is to allow developers to get as close to the GraphQL query syntax as possible:
"I still believe that the best way to do it, [writing client libraries for server-side execution], is to make onboarding a lot easier, so you don't have to dig into the pure GraphQL syntax, but in a way that gets you closer to the experience. ... I don't believe that hiding the queries is going to be particularly helpful in the long run for most users."
Daigle also commented about using nullable fields. The GraphQL spec allows developers to declare fields in an object that can be null. This is called a nullable field. Early on in its discussions with Facebook, GitHub was advised to not declare fields nullable by default, that all fields should be expected to return data unless compelled otherwise. However, GitHub learned that this caused problems in their distributed systems. Daigle points out:
" 'Everything in our [GraphQL] schema is available' is what Facebook told us at the very beginning. And we were like, okay, well should we? And their response was 'you should definitely learn from our mistakes…'. So we did that. But it turns out that I think it's in some ways a major flaw in how GraphQL is structured because it gives you this feeling of [being] very clean and pure, trying to stick to the truth [that] this object will [always] be here. This field is optional, so obviously it can be nullable so on and so forth. But once you get to a complex system size, really everything needs to be nullable to some degree. Unless you plan on never delivering partial results, which again seems pretty unlikely."
At the beginning of its efforts, GitHub engineers marked fields as non-nullable, meaning that data needed to be present. Yet, many times, queries ran in which fields that were marked as having to return data actually had no data available. Often this resulted in HTTP 500 errors. (A 500 error code is typically a "catch-all" error response that's generated when server-side code fails or the server in general fails.)
Today, GitHub marks most fields as nullable. The company's position is, according to Daigle, that it's better to give users no data than to return a 500 error.
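The trade-off can be illustrated with a toy executor. Under the GraphQL specification, a null produced for a non-nullable field cannot be returned, so the error propagates to the parent object. The sketch below mimics that for a single object. It is not GitHub's code, and FieldDef, executeObject, strictSchema and lenientSchema are hypothetical names.

```typescript
interface FieldDef {
  nullable: boolean;
  resolve: () => string | null;
}

interface Result {
  data: Record<string, string | null> | null;
  errors: string[];
}

// Tiny stand-in for a GraphQL executor resolving the fields of one object.
// A null in a non-nullable field cannot be returned, so the whole object is
// nulled out and an error is reported; a null in a nullable field is kept.
function executeObject(fields: Record<string, FieldDef>): Result {
  const data: Record<string, string | null> = {};
  const errors: string[] = [];
  for (const [name, def] of Object.entries(fields)) {
    const value = def.resolve();
    if (value === null && !def.nullable) {
      errors.push(`Cannot return null for non-nullable field "${name}"`);
      return { data: null, errors };
    }
    data[name] = value;
  }
  return { data, errors };
}

// Hypothetical case: a backing service has no bio for this user.
const strictSchema: Record<string, FieldDef> = {
  login: { nullable: false, resolve: () => "octocat" },
  bio: { nullable: false, resolve: () => null },
};
const lenientSchema: Record<string, FieldDef> = {
  login: { nullable: false, resolve: () => "octocat" },
  bio: { nullable: true, resolve: () => null },
};
```

With the strict schema the caller loses the login along with the missing bio, whereas the lenient schema still delivers a partial result.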
Limited Implementation of GraphQL Interfaces
When questioned as to whether GitHub makes use of GraphQL interfaces (an interface is a programming structure that declares fields that a GraphQL type must implement), Daigle responded that the company shied away from using them. The benefit did not justify the work required to support interfaces. Daigle says GitHub supports interfaces, but only in very special circumstances:
"We ended up falling away from using interfaces [with] the exception of where there's a very common pattern across the database or across your data structure. For example, structures that are archivable or spammy or anything where every object is touched in the same way — database ID or anything like that. I think that's one of the things that we've used interfaces for. But very rarely do we actually use it for two objects that share a common core."
GraphQL does indeed work for GitHub
Anybody who slings code for a living knows GitHub. The website has been transformational in the way software is made and shared. Millions of developers throughout the world rely on GitHub. For many, it's become the de facto source code repository of record.
Despite the growing pains, today GraphQL is the primary way GitHub exposes itself to the world at the API level. To quote from the company's website:
"GitHub chose GraphQL for our API v4 because it offers significantly more flexibility for our integrators. The ability to define precisely the data you want — and only the data you want — is a powerful advantage over the REST API v3 endpoints. GraphQL lets you replace multiple REST requests with a single call to fetch the data you specify."
Clearly GitHub is going "all in" with GraphQL and in doing so has inspired other companies to take the plunge, the rationale being, "if it's good enough for GitHub, it's good enough for us." Yet, as we've seen with the GitHub example, adopting GraphQL requires effort. Even more so if you're a third-party developer beholden to the expectation of a client unfamiliar with the technology, as is the case of the noted third-party GraphQL development company, Crucial.
Crucial: Creating An Architecture You Can Have Confidence In
"If you're not confident about something, you find less joy in it." -- Zach Lendon, Co-Founder, Crucial
Zach Lendon has a difficult job. He's a co-founder of Crucial, a company that does third-party digital transformations using technologies such as GraphQL. In-house development groups implement GraphQL for an enterprise that has, for the most part, already accepted the value of the technology. Crucial, by contrast, not only has to provide expert implementations of GraphQL from the get-go, but even before starting work, Lendon has to convince his customers that GraphQL is the right technology to address the needs at hand.
Remember, in the scheme of things, GraphQL is relatively new. Adoption can be risky, particularly in industries such as insurance, finance, and healthcare. These types of companies are historically risk-averse. Yet, Crucial has been very successful, even with companies that are reluctant to take the plunge with something as new as GraphQL.
We're all still trying to figure it out
Lendon admits that implementing GraphQL has its challenges. One of the biggest challenges for him is that in many ways GraphQL is still in an embryonic stage of its evolution. While a few of the "cooler" companies such as Facebook, GitHub and Airbnb have gone through the large number of development iterations required to get GraphQL right, for mainstream businesses that are just beginning with GraphQL, adoption can be daunting. There's not a lot of history to fall back on. As Lendon shared in a recent interview with ProgrammableWeb:
"Running GraphQL at scale is a space where the patterns aren't super clear. There's a lot of information in the community about setting up GraphQL servers, but as they get more advanced, as you distribute them out and build them in a larger capacity you end up having to learn the lessons on your own."
The organization of the business counts
Lendon finds that one of the key factors to implementing GraphQL successfully in any company is to make sure that the business's organizational structure can map onto the intended digital transformation that takes place when implementing GraphQL. As Melvin Conway, creator of Conway's Law, asserted back in 1967,
"...organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."
According to Lendon, if a company's development staff is compartmentalized, then the approach is to have each workgroup create its own "micro-schema" which is then aggregated together into a common GraphQL schema using some type of schema stitching/aggregation technique such as the module-based approach recommended by Martijn Walraven at Apollo. Lendon points out that determining which approach to take is a balancing act:
"Sometimes if you have different teams working on [a] graph, you end up having micrographs and it just makes more sense to stitch it up. If you have one team working on the graph, then you have a singular graph. … With GraphQL we're still finding what that balance is. Like how should GraphQL organizations build teams to work on this technology? What's the right way to deploy them? What scenarios are different? That's what I felt as the tug of war for the past couple of years."
In addition to making sure that the technical implementation of GraphQL is in sync with the organizational structure of his clients, Lendon also concerns himself with ongoing technical issues.
The challenge of pagination
One such technical issue that concerns Lendon is how to implement pagination over large datasets. After all, with a technology like GraphQL that can return an entire graph with one query, the results payload can easily run orders of magnitude bigger than when looking to solve the same problem with REST. Lendon admits that pagination is a complex topic that requires a lot of attention. (In fact, ProgrammableWeb has discovered that pagination is a concern of many of the people and companies interviewed for this installment.)
His approach is to follow the Relay connection specification when possible while also making sure his clients are willing to incur the cost and overhead required to implement pagination comprehensively. Lendon shares the following:
"I always try to follow the Relay a connection specification. I try to do cursor edges and nodes and follow the pattern they have there. I also ask [clients] if they really have to do pagination. Or, can we just return a [entire] data set? Can we build it differently in the graph and take out the complexity? I think that when you actually look at their use cases, some clients really don't necessarily need pagination. There are a lot of use cases where people don't want to go page after page, especially on mobile devices. And so I think that half the time you can also just not do pagination. it's a complicated area to understand."
Education is the key to adoption even in risk-averse organizations
When asked whether his clients had been reluctant to implement GraphQL due to a lack of historical precedent or perceptions of technological immaturity, Lendon responded:
"I think because it's new, there are some security parts of organizations that certainly raise questions. But I've been surprised having personally worked with large financial organizations and large medical device organizations, that most of the GraphQL hesitation has just been because of lack of knowledge. When you communicate what it does and how it works. There has not been a lot of resistance. I think GraphQL has provided them with a nice way to solve a problem in that they have these front-end systems that are changing a lot and they don't want to strongly couple them a lot to the backend systems and the REST APIs. GraphQL provides them a nice way to say, 'well, let's build this abstraction in here that's really focused on our clients and their specific needs'."
Looking toward the future
Lendon plans to expand Crucial's activities with GraphQL. He sees GraphQL as a transformational technology and hopes that it's adopted by mainstream companies into their mission-critical business processes:
"I think what will be interesting in the years to come is to see if GraphQL goes further down [the technology stack] as these companies using GraphQL have more direct interaction with financially sensitive data and with sensitive HIPAA data. That will be interesting to watch."
As Crucial continues to do more work in mainstream finance and healthcare, the odds are that Lendon's vision for GraphQL will be realized in those business sectors.
Trulia and Edmunds: How to Create a Schema
Fundamental to designing for GraphQL is determining what the API should represent. Should the API model a company's entire domain or should it describe particular use cases within a domain? There is no cut and dried way to answer the question. Each company needs to decide for itself based on a variety of factors such as the intended purpose of the API, the degree of complexity of the company's data and processes, the engineering team's facility with GraphQL, and the availability of resources in terms of staff, time and money. There is no one size fits all solution. Different companies will come up with different models.
In order to get a real-world perspective on the issue, ProgrammableWeb interviewed two companies that implemented GraphQL from each perspective. Trulia, the real estate website for buyers, renters, and sellers, takes the "Model the Domain" approach. Edmunds, the website for automobile buyers, created its GraphQL API based on specific use cases. Each approach has merit. Each approach has ramifications as you're about to learn.
Trulia unifies its REST APIs into a single source of truth using GraphQL
The problem that led Trulia to adopt GraphQL was one that's common to many companies. Each of the company's mobile applications and its website had a special way to access data. The disparity was having a growing impact on developers. According to Sandosh Vasudevan, Staff Software Engineer at Trulia:
"Historically, we have all these different APIs and the [mobile] apps would talk to one API and the web app would talk to another API. And they would try to filter the data themselves. There was a lot of inconsistency. So we wanted to come up with a way where we decouple all these different clients, but have them talk to a single interface about all data questions. So that's why we evaluated GraphQL and it seemed like a really good idea."
Trulia decided that instead of having each application use a particular set of REST APIs, all applications, both native mobile and web-based, would use a single GraphQL API. Under the hood, that GraphQL API might be getting data from any one of a number of pre-existing REST APIs and microservices, but for all intents and purposes, the single source of truth to applications was to be an API running under GraphQL.
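The arrangement can be sketched as a single entry point that fans out to existing backends and returns only the fields a client asks for. This is an illustration, not Trulia's code: the Listing shape, the two fetch functions standing in for REST endpoints, the listing function and the example URL are all hypothetical.

```typescript
interface Listing {
  id: string;
  price: number;
  address: string;
  photos: string[];
}

// Hypothetical stand-ins for pre-existing REST endpoints.
const fetchListingCore = async (id: string) => ({
  id,
  price: 450000,
  address: "12 Elm St",
});
const fetchListingPhotos = async (id: string) => [
  `https://example.com/photos/${id}/1.jpg`,
];

// One entry point for every client; it hides which backend owns which field
// and returns only the fields that were requested.
async function listing<K extends keyof Listing>(
  id: string,
  fields: K[],
): Promise<Pick<Listing, K>> {
  const [core, photos] = await Promise.all([
    fetchListingCore(id),
    fetchListingPhotos(id),
  ]);
  const full: Listing = { ...core, photos };
  const selected = {} as Pick<Listing, K>;
  for (const field of fields) selected[field] = full[field];
  return selected;
}
```

A mobile app might call listing("42", ["id", "price"]) while the web app asks for every field. Both talk to the same interface and see the same data. A real GraphQL server would also skip backends whose fields were not requested, which this sketch does not attempt.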
One of the interesting positive side effects of the decision was that moving to GraphQL as the single source of truth for Trulia promoted more collaboration between groups and departments. Previously, development groups were isolated and independent because each application had its own set of APIs for accessing the data that needed to be supported. Implementing a common GraphQL interface made it so that everybody had to be part of the implementation initiative. As Vasudevan describes the benefit of collaboration:
"The biggest [benefit] is a collaboration between different teams. Before we would have all the data needs between apps and web and they would talk to the data services on their own and then have two different requirements. But now we are able to start from a single schema. So we are collaborating more. Everyone is on the same page and we are also matching our GraphQL schema to our internal domain model. So everything from what's in the database, what are product requirements, what our user stories are...now has become universal knowledge and everybody is aware of that."
Vasudevan reports that the collaborative discussions have become very detailed:
"We are asking more and more questions as to what data is really important and do we even need it. As you know, with GraphQL it's discouraged to remove any fields. We are having to think twice about what fields we want and how and what format we want. So it's just a lot of collaboration. [There are] a lot of good questions that are coming out of it."
As a result of using GraphQL, Trulia's data has become more consistent. But there have been some drawbacks, mostly due to GraphQL being a relatively new technology. As Vasudevan reports:
"For example, initially when we were coming up with our first GraphQL schema, we didn't know how to map new fields to our existing APIs; whether it has to be one on one or whether we will only want to expose a subset of the fields. So there were a lot of questions around how do we do that."
Another drawback that Trulia experienced was network latency. To address the issue, Trulia has started to move many of the REST APIs that are used under the hood by the GraphQL API into a common datacenter under AWS. Also, the company is examining how to optimize performance under Apollo Server and Node.js. (Apollo Server is the engine that powers Trulia's GraphQL API.)
Moving forward, Trulia's plan is to make GraphQL the single source of truth for all of its applications' data. Vasudevan concludes by saying,
"We want to continue going down this route of implementing GraphQL for every single data need at Trulia and make it like a single source of authority for any of our clients."
In addition to making GraphQL a critical component of its technological efforts moving forward, Trulia has implicitly made the architectural decision to use GraphQL to model its entire domain. As Vasudevan said, "The most important thing, I would say, is data consistency. Apps and web all use the same data."
Yet, as we mentioned earlier, modeling to the domain is one approach to using GraphQL. The other is modeling to the use case, which is the approach that Edmunds took.
Edmunds takes a use case approach to adopting GraphQL
Edmunds adopted GraphQL due more to serendipity than direct intention. The company has always had a history of exploring promising technologies to meet the demands of its user base, which, in addition to its 17 to 20 million monthly visitors, includes a network of auto dealers that use various marketing tools that Edmunds designs for their special use. GraphQL got the attention of the then-CTO back in 2015 as a possible successor to REST. In 2018, Edmunds got serious about the technology. Upon examination, the company determined that GraphQL provided solutions for many of the technical problems that Edmunds was trying to solve. To be exact, Edmunds had three pressing issues to address. In an interview with ProgrammableWeb, Nitin Mahajan, executive director of engineering at Edmunds, described the issues as follows:
"[First] we were trying to solve problems like... time to market; how fast we want to launch a feature. A second aspect is performance. We want our pages to be super fast as soon as possible. And number three is innovation with that. What I mean is having nice documentation and having good visibility to a broader audience of developers or product teams.
We were building internally similar features as GraphQL was solving. So we said, 'hey it's worth exploring and makes sense to adopt this new technology. Let's not reinvent the wheel and let's join the bandwagon to benefit from this'."
In late 2018, the company made the decision to use GraphQL in its next significant feature launch. The feature it decided upon was Vehicle Rankings. The feature allows users to view a ranked listing of motor vehicles according to certain criteria. Mahajan explained:
"So what it means is that as a consumer, you want to know what are the top five cars in different segments or sub-segments. I'll give you an example. You want to know what are the top five cars in SUV mid-size. Or I want to see the top five sedans, luxury cars. So that's what we call rankings. And you can go to Edmunds and if you want to see SUV, you can type in, www.edmunds.com/suv. That page and all those ranking pages are fully powered by GraphQL."
Initially the ranked listings feature accessed data directly from a database. This caused some problems that the company's engineers addressed by having the GraphQL implementation get data from existing REST APIs instead. But, as Mahajan pointed out, this created some new issues. As he said:
"Now it does lead into another GraphQL problem which is data mismatch because we have now three layers of data definition. One is at the database level, then REST API has its own domain. And we introduced a third one which is GraphQL. It's a tricky problem. I think more or less we will look at trade-offs. And unfortunately I didn't come across any standards which say, 'hey on this, this is the way.' So you pick your problems and you proceeded with those."
Unlike Trulia, which decided from the beginning to make GraphQL the only source of data access for all applications across its entire domain, Edmunds' approach to adoption was to implement GraphQL on a case-by-case basis. In this instance, the case was Vehicle Rankings. All other features on the site were left untouched.
Implementing GraphQL on a case-by-case basis gave the company a good deal of flexibility to manage its technical budget. Because Vehicle Rankings required returning only a list of the top five vehicles, Edmunds could get an implementation of GraphQL released while avoiding implementation of a complex technical feature such as pagination. As Mahajan tells it:
"As of now, the implementation we have luckily does not have the pagination use case because we are limiting it to get a list of those top five vehicles. So we pretty much touched on all the line items in a single response."
Deferring mastery of GraphQL pagination, which can be a high-risk undertaking, until a later release with a use case that absolutely needs it allowed the company to have a successful experience with GraphQL quickly while leaving the door open for more complex development in the future.
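The reason a fixed top-five list sidesteps pagination can be sketched in a few lines. The following is an illustrative assumption, not Edmunds' implementation: the `Vehicle` shape, the `topRanked` function, the segment identifiers and the ratings are all made up. The point is only that a bounded result fits in a single response, so no cursors or page arguments are needed.

```typescript
// Hypothetical shape of a ranked vehicle.
interface Vehicle {
  name: string;
  segment: string;
  rating: number;
}

// The feature only ever shows a fixed, small number of results.
const TOP_N = 5;

// Returns the highest-rated vehicles in a segment, capped at `limit`.
function topRanked(
  vehicles: Vehicle[],
  segment: string,
  limit: number = TOP_N
): Vehicle[] {
  return vehicles
    .filter((v) => v.segment === segment) // filter() returns a copy, so the input is not reordered
    .sort((a, b) => b.rating - a.rating) // highest rating first
    .slice(0, limit);
}

// Made-up sample data: seven midsize SUVs and one sedan.
const sample: Vehicle[] = [
  { name: "SUV A", segment: "suv-midsize", rating: 7.1 },
  { name: "SUV B", segment: "suv-midsize", rating: 8.4 },
  { name: "SUV C", segment: "suv-midsize", rating: 6.5 },
  { name: "SUV D", segment: "suv-midsize", rating: 9.0 },
  { name: "SUV E", segment: "suv-midsize", rating: 7.8 },
  { name: "SUV F", segment: "suv-midsize", rating: 8.1 },
  { name: "SUV G", segment: "suv-midsize", rating: 5.9 },
  { name: "Sedan A", segment: "sedan", rating: 9.5 },
];

console.log(topRanked(sample, "suv-midsize").map((v) => v.name));
```

A later use case that needs to browse the whole list would have to add arguments for page size and position, which is the work the company chose to defer.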
Taking an approach that focuses on implementing technology according to a specific use case is emblematic of the "backends for frontends" (BFF) design pattern, an approach to software development that Mahajan champions. The basic concept behind the BFF pattern is to create a separate backend service that's intended to be consumed by a specific frontend application or interface. The BFF pattern is useful when you want to avoid the work of creating a single general-purpose backend that has to serve multiple consumers.
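As a rough illustration of the BFF idea, the sketch below shapes the same domain record in two different ways, one per frontend. Everything here is hypothetical: the `Car` record, the two view types and the two functions are invented for the example, and plain functions stand in for what would be separate backend services in a real deployment.

```typescript
// Hypothetical domain record available to every backend.
interface Car {
  id: string;
  make: string;
  model: string;
  msrp: number;
  reviews: string[];
  imageUrls: string[];
}

// View served by a hypothetical backend for a rankings page:
// the page only needs a display title and a price.
interface RankingCard {
  title: string;
  price: number;
}

function toRankingCard(car: Car): RankingCard {
  return { title: `${car.make} ${car.model}`, price: car.msrp };
}

// View served by a hypothetical backend for a dealer tool:
// the tool cares about review volume and a lead image instead.
interface DealerSummary {
  id: string;
  reviewCount: number;
  leadImage: string | null;
}

function toDealerSummary(car: Car): DealerSummary {
  return {
    id: car.id,
    reviewCount: car.reviews.length,
    leadImage: car.imageUrls[0] ?? null, // null when the car has no images
  };
}

// Made-up sample record.
const exampleCar: Car = {
  id: "car-1",
  make: "ExampleMake",
  model: "ExampleModel",
  msrp: 30000,
  reviews: ["Comfortable ride", "Good value"],
  imageUrls: [],
};

console.log(toRankingCard(exampleCar));
console.log(toDealerSummary(exampleCar));
```

Each view stays small because it answers to one consumer. The trade-off is that the shaping logic is written once per frontend rather than once for everyone.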
The road to Edmunds making a commitment to designing schemas according to use case was not direct. When the company first started contemplating how to implement GraphQL, it took an approach similar to Trulia's: to design one grand schema to represent all aspects of the company's domain. For Edmunds, that approach didn't work. Mahajan describes the evolution to designing to the use case:
"During our first launch, our goal was to make the schema in a way that people understand domains. People could make sense out of, 'Hey this is a car. A car has reviews, it has images, real inventory, all those things.' We were trying to create this nice graph where product people, dev people, everybody could understand car buying," said Mahajan. "But on the flip side, what I see now is that as we see more and more use cases, that type of schema is too complex to fulfill all their needs. I've been reading a lot about what Airbnb is doing, what GitHub is doing, what Netflix has done and it all comes down to this pattern, BFF, backend for frontend."
Clearly, the company has found designing to the use case to be more effective. As Mahajan said, "So, in a nutshell, I should say design a schema for the use case and not with the domain or UI elements."
Putting It All Together
One thing that becomes evident across all the case studies we've examined in this installment is that GraphQL's power and flexibility allow developers to define the data they want, as they want it, in a declarative manner against a unified data graph. It's a compelling value proposition, particularly given the fact that mobile devices are becoming powerful enough to perform data aggregation chores that previously could only be executed in the datacenter. GraphQL is well-positioned to become a go-to technology for data access across all platforms. It provides a lot of bang for the buck.
Yet, making GraphQL useful in the real world requires significant work. When it comes to effective adoption, the devil really is in the details. GraphQL is a relatively new technology without a lot of historical precedents. Teams coming to GraphQL for the first time will experience some steep learning curves, both in understanding the general specification and in absorbing concepts and techniques that are particular to the given implementation. For example, as we saw with the GitHub experience, its engineers had to pay special attention to the Ruby-ness of its GraphQL implementation. Those working on implementations in other frameworks will surely be faced with their own special set of challenges.
Beyond the basics, GraphQL presents technical challenges in creating effective object types, managing network latency, and paginating queries. The variety and scope of such challenges make it hard to formulate general, repeatable design patterns. While it's true that a good deal of knowledge is being centralized within the GraphQL community and the GraphQL Foundation, as you read in Crucial's example, many companies are still trying to figure things out on their own.
Finally, even after the coding aspects are mastered, there's still the issue of implementing GraphQL to the domain versus implementing it to the use case. As you read above, Trulia modeled to the domain while Edmunds modeled to the use case. These are two distinctly different approaches. Each approach was valid for the company's needs, but each carried implications, especially around growth and maintenance.
Still, despite the challenges at hand, GraphQL is a transformational technology that's here to stay. As time goes on, we're sure to see more formal design patterns and best practices emerge. Although still in its infancy, the GraphQL Foundation might very well become the common resource for the GraphQL development community at large. The organization's long-term viability is a matter of support by developers and companies alike. A community is only as strong as the unified activities of its membership. A community that's formally unified will make GraphQL development easier. Otherwise, the risk is that developers will keep having to figure things out from scratch and continuously reinvent the wheel. GraphQL is too important, too powerful a technology to suffer such a fate.



