Microservice REST API Design in 10 Steps – Part I

The NFJS podcast is back for the 2017 Season! This week I sit down with Rohit and discuss his 10-step process for Microservice API design.

Full Transcript:

[background music]

Michael Carducci:

You’re listening to the “No Fluff Just Stuff Podcast.” I’m joined here with Rohit, a regular on the tour here. I’m very pleased to have you in. We’re going to talk a little bit about microservices and your book that’s coming out.

I want to jump right in with the microservices, because I really feel, among other things, it’s a slightly maligned topic. Certainly, it’s a popular word. It’s a buzzword. It’s the free square in conference Bingo, if you will.

There’s a lot of situations where microservices are being jumped into because they’re the new, cool thing and not necessarily because they’re going to solve the right domain problem. I think many people are approaching this in the wrong direction.

I know you’re trying to address this in your book. Just quickly, what’s your background? Give us your insight on that.

Rohit Bhardwaj:

That’s the main pain point which I see. Hi, this is Rohit Bhardwaj. The way I look at it is that people have a monolithic app, and we come up with a business object, some structure. We’re trying to solve a REST API. We just build a REST API on top of it, not really looking at it from a use case perspective.

This is a pain point, which I see throughout the microservices area. Not only just microservices, but any time when we want to have any service provided, even if it’s regular API we are providing. We want to make sure that we go through some level of methodology.

This methodology which I’m putting in my book, which will be for designing cloud-native, RESTful microservices APIs, goes through a 10-step process, instead of jumping onto a solution.

It doesn’t make sense for anyone to jump onto a solution and say, “Hey, this is my REST API. It will be the Holy Grail to solve all problems in the world.” That’s the way we deal with any of these APIs, and that’s where the pain point comes for me.

Michael Carducci:

You really hit the nail on the head there, that we need to look at these things as more of an evolutionary, agile architecture, instead of replacing this monolith with this distributed monolith that is going to have to evolve over time.

Rohit Bhardwaj:

Exactly. The way I put it, in my words, is that initially, we need to define what the problem is. What are we trying to solve here? That’s what we do in our regular approach.

We come up with a domain model, domain data analysis, and then we come up with the conceptual model. But we miss out on the part of what we are developing, and where is this going to be used? For example, if I am building a microservice, where it will be used, when it will be used, who will be using it, what will be its peak load? Those characteristics are missed.

The biggest characteristic that’s missed is why we are developing this in the first place. That comes down to the scoping of our approach. The Zachman Framework actually goes through a process where you look at the conceptual model, and you come up with your logical model and physical model on top of it.

The problem that comes is that we have already defined our physical models, like the data structure. We already defined that. We already defined our logical model, so nobody wants to change that. They want to build on top of it a façade layer, or some kind of layer, a proxy layer, to talk to a monolith, or some way of connecting it. That’s where it brings in a lot of problems when we are trying to scale.

We say that, “Hey, you still need to define the problem.” Find out what are we trying to solve, then jump right into the use case. What is your use case which you are trying to solve?

Let’s take the airline industry. When we go to any of the sites, Priceline.com, or we go to Orbitz, what do we do? We say that I want to go from Boston to Chicago. I want to go between these dates. I want to go with that date range.

Basically, what it does is query around 50 different airlines. They come back with the results.

If you think about it, the user is impatient. They want a response time of one second or two seconds. How is it possible to do that if the airline is going to take more than two seconds to respond?
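
[Illustration, not part of the conversation: one common way to keep a fan-out search inside a response-time budget is to query every airline concurrently and return whatever has arrived by a deadline. The sketch below assumes this approach; the airline names, latencies, fares, and function names are all hypothetical, and the network call is simulated with a sleep.]

```python
import asyncio

# Hypothetical airlines with simulated response times (seconds) and fares.
SIMULATED_LATENCY = {"fast-air": 0.01, "steady-air": 0.02, "slow-air": 5.0}
SIMULATED_FARE = {"fast-air": 230, "steady-air": 200, "slow-air": 150}


async def query_airline(name: str) -> dict:
    """Stand-in for a network call to one airline's fare service."""
    await asyncio.sleep(SIMULATED_LATENCY[name])
    return {"airline": name, "fare": SIMULATED_FARE[name]}


async def search_fares(deadline_s: float) -> list[dict]:
    """Query all airlines concurrently; keep only answers that beat the deadline."""
    tasks = [asyncio.create_task(query_airline(name)) for name in SIMULATED_LATENCY]
    done, pending = await asyncio.wait(tasks, timeout=deadline_s)
    for task in pending:
        task.cancel()  # give up on airlines that are too slow
    return sorted((task.result() for task in done), key=lambda r: r["fare"])


results = asyncio.run(search_fares(deadline_s=0.5))
print(results)
```

[The slow airline is simply left out of this response, which matches the trade-off discussed later: a partial answer in time rather than a complete answer too late.]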

Michael Carducci:

It’s not going to be an uncommon situation, knowing what I know about some airlines’ applications that have been more or less unchanged in 40 years, or thereabouts.

Rohit Bhardwaj:

Exactly, and if you think about it, what they’re really looking for is a quick response time. The response time has to be, nowadays, in my company, and other companies also, we have a response time of less than 500 milliseconds.

We are trying to achieve 250 milliseconds response time for our REST APIs. How are we going to achieve that?

Not only that, if you talk to the product owners, say, “OK, I’m building this REST API,” who’s going to use it? Or if they come back and say that, “Oh, this’ll be used by just admin. Probably there are 10 admins in the system, not many admins,” you would not need to build a robust REST API. A broad REST API, you can put it on top of the same monolithic app, which you already have. You would not want to build that use case for heavy-duty work.

Michael Carducci:

We do see that. Maybe it’s something that’s worth mentioning, is going through this discovery process, asking the question, “Do we really need a microservice?”

Microservices bring with them complexity and other issues that need to be dealt with. It only makes sense to bring on those complications and those issues if you’re going to get the benefits that come with it. If you don’t really need those benefits, or you’re not going to reap those benefits, you’re carrying complexity.

Rohit Bhardwaj:

You are always coming back with an excuse of doing a microservice. That’s not the main idea behind it.

Microservices can definitely solve the problems if you are looking at the scalability aspect of it, if you are looking for availability. Let’s say the product that I want is 99.99 percent available. If you say 99.99 percent available, that’s four nines. That’s about five minutes of downtime per month. It’s a pretty tall order to maintain.
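
[Illustration, not part of the conversation: the downtime allowed by an availability target is simple arithmetic. The sketch below assumes a 30-day month and a 365-day year; the function name is hypothetical. Four nines comes out a little under the five minutes per month quoted above.]

```python
def downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of allowed downtime for an availability target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


per_month = downtime_minutes(99.99, 30)   # assumes a 30-day month
per_year = downtime_minutes(99.99, 365)   # assumes a 365-day year
print(f"{per_month:.2f} minutes per month, {per_year:.2f} minutes per year")
```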

Not only that, the important thing is availability. If there are 10 airlines, and one airline brings back the results in sub-second response time, who would be on the top of the list?

Michael Carducci:

It depends on how the application’s been defined, as well.




Depending on where they’re putting these, if they’re not stored in the results as they come in, then…

Rohit Bhardwaj:

They would be [inaudible 6:54]. That’s the problem, which we all face. What they need to do is come up with a design which supports the performance, the performance which can do the filtering in the fastest way possible.

That brings us to the number three aspect of my 10-step process, which has to do with what kind of use cases you’re trying to support. If you think from the CAP theorem perspective, that is, consistency, availability, and partition tolerance, the important thing to remember here is that if I want a 250-millisecond response time, I probably want to make sure my data is partitioned.

My data is partitioned with respect to dates, so I can say, “OK, I want to get my flight information for this day, going from here to here.” Because that information is already partitioned as part of a JSON object, I can return the value, compare my pricing, and send this pricing out really fast for the user.
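
[Illustration, not part of the conversation: a minimal in-memory sketch of partitioning fares by route and date so that a search is a single lookup. It only assumes the idea described above; the key layout, function names, airport codes, and airline names are hypothetical and do not represent any particular database.]

```python
import bisect
from datetime import date

# Partition key is (origin, destination, travel date); each partition keeps
# its fares sorted by price so a read is a single lookup, with no join.
fare_partitions: dict[tuple[str, str, date], list[tuple[int, str]]] = {}


def store_fare(origin: str, destination: str, day: date, price: int, airline: str) -> None:
    """Insert a fare into its partition, keeping the cheapest fare first."""
    partition = fare_partitions.setdefault((origin, destination, day), [])
    bisect.insort(partition, (price, airline))


def cheapest_fares(origin: str, destination: str, day: date, limit: int = 3) -> list[tuple[int, str]]:
    """Return up to `limit` cheapest fares for one route and day."""
    return fare_partitions.get((origin, destination, day), [])[:limit]


travel_day = date(2017, 3, 1)
store_fare("BOS", "ORD", travel_day, 240, "example-air")
store_fare("BOS", "ORD", travel_day, 200, "sample-air")
print(cheapest_fares("BOS", "ORD", travel_day))
```

[The sorting cost is paid at write time, so the read path does no work beyond fetching one partition.]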

That’s where we have to say, “OK, maybe consistency’s not that important in this situation.” It will be eventually consistent, but the most important thing is I want the highest availability, and also to support partition tolerance.

To do that, there are a lot of technologies which are available right now. Cassandra is a great database, which you can use for a 10X more highly available system. Netflix uses that system, and there are a lot of companies which are using it right now, at this moment.

The important thing here is what we are going to do with that data. You want to make sure you understand what use case you are trying to support. Based on that use case, for example, in the airline industry we just discussed, you want to make sure that it’s a highly available system.

That comes with a lot of other problems, if you think about it. You may say, “Hey, I can do that.” The airline is saying, “OK, I am available for $200. You can book me.”

What could go wrong there? Because it’s not consistent…

Michael Carducci:

You’ve got no lock. It’s very possible to get a quote on a ticket where there’s one ticket left.

Rohit Bhardwaj:

There’s only one ticket left. What can we do here?

The industry’s moving in the direction where they can do overbooking. Overbooking is part of this, because you want to make sure you are overbooked rather than under-booked. That’s the key part here.

Michael Carducci:

The other thing that’s happening more and more that I see, that I actually think is a great idea, is when you get the results back from an airline that they’re not only saying, “Here’s the lowest available fare class for this particular flight on this date,” but also they tell you how many seats are left in that fare class.

You can look at that and say, “Oh, there’s three seats left,” or one seat left. That doesn’t solve the problem of these inconsistent read type things, that you’ll read a price quote, and as soon as you go to book it’s not available anymore.

But it sets expectations, and I think that’s important as well in terms of building a user-friendly, performant system.

Rohit Bhardwaj:

Yeah, you hit the nail on my fifth point, which I’ll definitely come to, which has to do with finding the failure points. A failure point is, “Hey, I’m going to book for three tickets, and two other people are booking it.”

I have to find out whether I can really go in that time period. One of the failure points which can come in — and that kind of drives our design, the microservice design, the API design, the RESTful API design which we are looking for.

That takes care of all the problems which can happen, and that, if you think about it, what can go wrong? You ask that question. You build that REST API, you’re going to build that REST API, what can go wrong while you are trying to do that?

You can find out, “Oh, maybe there’s a delay, there’s a delay lag,” or, “Maybe we are overbooked.” What if you are overbooked? Then what are you going to do? You can do a compensation pattern, where you are compensating that person. United, Delta, and other airlines, they all started doing that.

“Anybody who would like to volunteer, we are overbooked.” People volunteer at that time. That’s a better model than saying, “No, we cannot provide you the ticket.”

Michael Carducci:

It’s interesting that you can actually get some analytics around that, as well, and figure out what is the financial gain of overbooking, and what is the financial cost. What Delta does that I’ve seen, that I actually think is a great idea, is they have an auction system for volunteering your seat.

As you check in, they’ll say, “We’re overbooked, would you be willing to volunteer to give up your seat if there’s limited availability, and if so, what would you like in compensation?”

There’s a sliding scale. You can say, “I want a $500 voucher. I want a $100 voucher,” and they process. It creates this auction economy, where you can say, “All right, we need to give up six seats, and we’re going to go for the six lowest bids,” as it were.

If there are four people who say, “I would be happy to give up my seat for a $100 voucher,” they are the first four. You’ve got this supply and demand thing. It encourages people to be a little more generous with what compensation they would accept, and it gives the airline the chance to give people what they want, and what they ask for, which I think is an interesting idea.

I want to pause for a minute and review here for a moment. You’ve hit on a lot of really great points. Starting at the beginning, the step that so many people overlook is stop and think about the problem. “What are we really trying to achieve? What is the real problem we’re trying to solve?”

I see this, that so many people are excited to jump into a new solution through a new architecture, a new technology. The problem is almost inconsequential, and as a result, we end up with these really painful microservice implementations.

I remember Matt Stine posted a very popular tweet. He said, “If your microservices must be deployed as a complete set in a specific order, please put them back in a monolith and save yourself some pain.”

That seems to be a very real consequence of people not thinking through this entire process. Just to recap what we’ve had so far, what are a few of your suggestions? What are a few of your thoughts on this definition process in the beginning?

Rohit Bhardwaj:

That’s the main thing. The important thing here is that when you come up with your conceptual model — let’s say if you talk about the airline industry only, you’re the buyer. He’s the customer. The customer goes to a site, and then that site needs to process all these airlines.

What’s happening? A customer can go to multiple airlines. An airline can have multiple customers coming to it. Now, the important thing here is that when you come up with this conceptual model, you have to take it further and say, “OK, I want to make sure that my results are back in 250 milliseconds.”

To recap our process behind this is that you want to make sure to not only just solve that particular problem, but design yourself based on it. That’s the next element which comes in which has to do when we say that use cases — and you’re trying to come up with a use case.

The important thing is that you have to find out how the customer is going to use your application. What’s the behavior-driven development? Once you come up with the behavior-driven development, you will know what the application flow will look like.

Let’s say, “I would like to know the cheapest airfare for any day. I want to find out, and I want in an order coming to me.” If the airlines have already processed that data, ready to go in a cache, or ready to go in a Cassandra database, I don’t have to do a join.

A join is bad, if you think about it from the performance perspective, because you have millions and millions of rows in the database. Instead of doing a join at run time, you have already processed all the values, ready to go. That’s what Cassandra promises, if you use Cassandra: to get the data really fast from the system, in sub-second response time.

That is the main use case from the airline’s perspective we need to apply here. We need to make sure that the airlines are able to send all the scores back within some time. We get a two to three second response time. We can figure out which airline you want to book in this case.

Michael Carducci:

There’s one other thing I wanted to ask you about. You mentioned not just saying, “OK, I have this existing monolith. I’m going to build some services off that,” but to actually stop and take the opportunity to do some of this redesign work.

My question, though, is do you think it makes sense to keep that as a potentially valid approach as part of a more evolutionary architecture, because the reality is not a lot of people have the luxury to say, “OK, we have this monolith, but we’re not going to reuse any of this. We’re going to rebuild everything in microservices.” There usually has to be some kind of transition period.

Rohit Bhardwaj:

Exactly, and that’s where people shoot themselves in the foot. What they do is they still try to linger around with the old monolithic model and figure out what their business objects are, and then they build the REST API on top of it.

Those are the microservices still in process, and coming up with a design, instead of coming from the top to bottom. “What does my use case look like?” and how I’m going to satisfy that particular use case.

That’s the kind of part which is missing from this whole picture, which I was describing, where you have to come up with an ideal design. When you come up with your ideal design, you make sure of what my API will look like, how it will be designed.

There is very much a possibility that underneath the hood, you’re still using monoliths, but you create the facade pattern. When using the facade pattern, your ideal design is not going to change. How the people are going to use your application, that’s not going to change.

The basic part which will change is the response time, because currently, the monolithic app may be able to just provide 50 concurrent users. You’re working towards making it 500 concurrent users.

That’s where the evolutionary design comes in, the agile design comes in, where you’re not making all the aspects of your monolithic app into all the services you want to provide, but provide ideal design, which is go to this facade layer, and build a facade layer.

You can still have the proxies, still calling your own monolithic app. You don’t need that much response time.
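
[Illustration, not part of the conversation: a minimal sketch of the facade idea described above, where callers see one stable search interface while the backend behind it can change from a proxy to the monolith to something faster. All class and method names here are hypothetical.]

```python
from typing import Protocol


class FareBackend(Protocol):
    """Anything that can look up fares for a route."""

    def find_fares(self, origin: str, destination: str) -> list[dict]: ...


class MonolithProxy:
    """Stand-in for a proxy that calls the existing monolithic app."""

    def find_fares(self, origin: str, destination: str) -> list[dict]:
        return [{"airline": "example-air", "price": 240}, {"airline": "sample-air", "price": 200}]


class PrecomputedStore:
    """Stand-in for a faster read store introduced later."""

    def find_fares(self, origin: str, destination: str) -> list[dict]:
        return [{"airline": "sample-air", "price": 200}, {"airline": "example-air", "price": 240}]


class FareSearchFacade:
    """The API callers depend on; its shape stays fixed when the backend changes."""

    def __init__(self, backend: FareBackend) -> None:
        self._backend = backend

    def search(self, origin: str, destination: str) -> list[dict]:
        fares = self._backend.find_fares(origin, destination)
        return sorted(fares, key=lambda fare: fare["price"])


today = FareSearchFacade(MonolithProxy()).search("BOS", "ORD")
later = FareSearchFacade(PrecomputedStore()).search("BOS", "ORD")
print(today == later)
```

[Swapping the backend changes response time and capacity, not the contract, which is the point made above about the ideal design not changing.]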

Michael Carducci:

You’re not necessarily advocating throwing the baby out with the bathwater, as it were, but at the same time, not being shackled by the architecture of your monolith. Don’t start from there and build out from there.

Start with, “What do we really need to do?” Then figure out a way to meld the two layers as you evolve your broader architecture.

Rohit Bhardwaj:

Exactly. The problem which comes is that people always try to think through and say, “I got a transaction, and I want to maintain that transaction. I already coded for that. Why can’t I just reuse that?” If you think about this use case…

Michael Carducci:

That sounds like you’re entering a world of pain if you try to…

Rohit Bhardwaj:

If you do that, exactly. You already got it coded, but it’s not solving the purpose. Now, you have to change, but you don’t have to change the whole thing. Let the transactions still go through the same way.

Why I say that: how many people go into the office, and the first time they are looking to buy something Boston to Chicago, they get a quote, and just say, “OK, I’m going to buy this”? Not many people.

Michael Carducci:

They’re price shopping.

Rohit Bhardwaj:

They’re just price shopping.

Michael Carducci:

They’re going to have a discussion and things like that.

Rohit Bhardwaj:

They got to price shop and everything else. Finally, they talk to their spouse, and if they say yes, then they’re going. Otherwise, they’re not.

Michael Carducci:

“Can I get the time off of work?” There’s a whole lot of process.

Rohit Bhardwaj:

The whole process going on. Do you think that whatever they have done so far, we need to store it in a transactional database?

Michael Carducci:

No, not at all. It’s all ephemeral.

Rohit Bhardwaj:

It’s all ephemeral. It’s not really needed as part of the transaction. When they see the buy button, at that point…

Michael Carducci:

That needs to be a transaction.

Rohit Bhardwaj:

Exactly. Let that be handled through our monolithic app the way that it needs to handle it. You may want to do the performance improvements there, too, but the important thing here is that now your scale is a million people looking for a flight, and the people who are actually buying number a thousand.

That’s where you know where you want to focus from the availability perspective. Now, you are solving two problems here. One problem is the transactional ACID nature, using the SQL Server database, or Oracle database, or EnterpriseDB database. Don’t throw them away.

Those are still there, but what you do is, using this model of splining, you use the same transactional data and spline the data to a Cassandra database. Now, when you’re splining the data to the Cassandra database, what’s really happening is that, in the Cassandra database, the data is available in a highly available format.

Michael Carducci:

And highly performant format. You can get that data very quickly.

Rohit Bhardwaj:

Exactly. That’s a win-win situation, where you may say to the product [inaudible 20:57], “Hey, my data will be three seconds delayed. Is that OK, and what’s the risk?” That’s the next question you’re going to ask, like, “Hey, what is the risk? What are my safe points in this?”

Michael Carducci:

A lot of customers have short attention spans. If you have that delay in this thing, and I click, and I’m still waiting, and still waiting, the results are still trickling in, I might just lose interest. What is the statistic for the attention span of an average Web visitor now? It’s some number, single digit of seconds.

Rohit Bhardwaj:

It’s anything more than two seconds, I’m out of that site, maybe maximum five seconds. You don’t want to do more than that. The important thing is that we want to drive in what our design looks like, and make sure that we support our use case.

That way, we get real benefit out of creating the facade layer, and not only the facade layer. People make a mistake while building the facade layer also. They say like, “I want to solve everything in the world.”

That is not the way to create ideal design. You only solve the problem which is given to you. Do not try to solve all the problems. They say that, “I would want to make sure that my data, I can get customers by cost. I also want to get the similar data by representatives,” like someone else is booking for them.

In that case, they are still hitting the same database table. Instead of that, they can have duplicate database tables. Depending upon the usage, they have in ascending order or descending order. Duplication is perfectly fine in this example.
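
[Illustration, not part of the conversation: a minimal sketch of the "duplicate tables" idea, where the same record is written once per query pattern and stored in the order that query reads it. The field names and the two in-memory "tables" are hypothetical.]

```python
# Two copies of the same bookings, each shaped for one query.
bookings_by_customer: dict[str, list[dict]] = {}
bookings_by_representative: dict[str, list[dict]] = {}


def record_booking(booking: dict) -> None:
    """Write the booking to both tables, pre-sorted the way each query reads it."""
    by_customer = bookings_by_customer.setdefault(booking["customer"], [])
    by_customer.append(booking)
    by_customer.sort(key=lambda b: b["cost"])  # ascending cost

    by_rep = bookings_by_representative.setdefault(booking["representative"], [])
    by_rep.append(booking)
    by_rep.sort(key=lambda b: b["cost"], reverse=True)  # descending cost


record_booking({"customer": "alice", "representative": "rep-1", "cost": 300})
record_booking({"customer": "alice", "representative": "rep-1", "cost": 200})
customer_costs = [b["cost"] for b in bookings_by_customer["alice"]]
rep_costs = [b["cost"] for b in bookings_by_representative["rep-1"]]
print(customer_costs, rep_costs)
```

[The cost of this design is extra storage and a second write per booking; the benefit is that neither read needs a join or a sort.]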

Now, 15 years back, I would have said differently. That was a perfect strategy, to have a normalized form. The database needs to be in the normalized form, but not anymore. You have to design looking at your application flow, what the data will look like, and store the data in the same format and partition so that you can get the data really fast.

That’s the key part for us in the facade layer, which we are building.

Michael Carducci:

The practicum system, it goes back to the defining the problem points that you mentioned, that actually, it defines the performance and scalability issue as well. A lot of people want to optimize for a million concurrent users, when they only really have 100, 200 users.

Again, you’re creating a very painful world to architect and build your application. That’s just one of the things that I want to say, but I think you’re really speaking to a different audience, the people who are dealing with these issues.

Rohit Bhardwaj:

The thing is, that’s exactly, too. You won’t want to build something…and cost is the main factor. You can build an architecture which can solve all the problems, space-based architecture. You can come up with a really, really costly solution.

Is that solution something useful, which you can really sustain? Is it a sustainable model? That’s all part of the architect. He has to make sure that whatever solution you come up with, it has to suffice what is really needed for you at a given time.

Michael Carducci:

It reminds me of conversations I’ve had with business owners in different roles in my career, saying, “We need 100 percent availability.” “OK. We can provide that for you. It’s going to cost this much money. Now, if you want four nines, that means you’re going to have so many minutes of downtime a year. It’s only going to cost this much money.”

That’s a decision for you as a business to make. Is it worth two million extra dollars to avoid 35 minutes of downtime over the course of the year?

Rohit Bhardwaj:

Exactly, and the important thing is what is the mitigation strategy? What is your mitigation strategy for any of the APIs which you’re designing in this case? That’s always, always the case. You have always got a trade-off, like you want to make sure that you build your application with some trade-offs in mind.

One of the trade-offs, which you correctly pointed out, is that you want to give the system what’s the available time, plus if there’s a downtime. Maybe that downtime is at a time where it’s perfectly fine to do the batch operations.

The companies do that. They have a lot of batch operations that they want to perform. That is a new model. The new REST API model says that, let’s say, for example, in DC wires and other places, you have a constant load.

The CPU is utilized 80 percent of the time, 70 to 80 percent of the time. Now, if the usage goes down, the batch processing kicks in. They fill in the node. You want to utilize your images at the fullest granularity.

That’s where the valley comes in. You have to find out what your peaks are, what your valleys are, and come up with a limit structure where some APIs are kicking in at the time when the load is down, and then it comes up. That way, you have uniformity of the load.

Michael Carducci:

This is where we’ll start coming to the cloud native aspect of it, because there are capabilities that we can be leveraging, but we have to be designed to take advantage of those capabilities. For example, my application is in Azure.

I have my host that I can do whatever I want with, and then I have applications that run on top of that host. Then as well, on the same host, taking advantage of that same fixed cost for the hour of time, I have Web jobs.

These run in the background basically in spare cycles. When my application is getting hit really hard, these Web jobs are throttled back. When my Web traffic dies down, the Web jobs spin up, and they process queues, and they do all the things that they do.
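
[Illustration, not part of the conversation: a minimal sketch of running queued batch work only in the valleys of the load curve. It is a generic simulation, not the API of any cloud platform; the threshold, load samples, and job names are hypothetical.]

```python
from collections import deque


def schedule_batch_jobs(load_samples: list[float], jobs: list[str], threshold: float = 0.7) -> list[tuple[int, str]]:
    """Run one queued job per tick, but only in ticks where load is below the threshold."""
    queue = deque(jobs)
    schedule = []
    for tick, load in enumerate(load_samples):
        if queue and load < threshold:
            schedule.append((tick, queue.popleft()))
    return schedule


# Hypothetical CPU load per tick: a peak, a valley, another peak, another valley.
load = [0.9, 0.85, 0.4, 0.3, 0.8, 0.2]
schedule = schedule_batch_jobs(load, ["reindex", "report", "cleanup"])
print(schedule)
```

[Background work is held back during the peaks at ticks 0, 1, and 4 and fills the spare capacity in between.]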

Rohit Bhardwaj:

Also the other problem which we see here is that people try to build their own infrastructure, instead of using some of the existing ones which are available. Like for example, you have security. You want to make sure you update your security aspect of error handling, logging, all those layers.

Michael Carducci:

These are solved problems.

Rohit Bhardwaj:

Every domain has the same problem. Those problems, by the way, are already solved by our tooling offering. If you think from the Spring perspective, Spring does solve a lot of these problems. Spring will solve these problems.

Why don’t we just use those technologies to solve some of those problems? Some of those design patterns would really help drive towards the scalability aspect of it, where the important thing is resiliency.

You are just not looking for “I have a problem, and then I will solve it when I get time. On Monday, I will go and solve that problem.” Instead of that, you come up with what are the failure points you will have, how your application will react to that particular failure point. That’s another key part here.

“Hey, maybe not all the airlines are returning me, but it doesn’t matter. I can return with the five airlines for now.” That happened with me. I was not getting some cheaper airline for some time. I thought, “Hey, maybe all the fares have gone.”

I came back after some time. Sure enough, I got that airline in the search. That airline was taking a long time to return me the results. That’s another aspect of it where you have to find out what your failure points are, and then based on that, you actually have a common design layer.

Those design layers or mediators are going to help us drive through our aspect of it to take care of this.

Michael Carducci:

We’ve covered the first 5 points of your 10-point process. This has been really great stuff. We talked about your process for defining the problem, defining the use cases, and consistency, availability, and partition tolerance.

The ideal design, we didn’t really touch on that. Do you want to circle back to that before we wrap up for part one of this podcast, in terms of naming, microformats, and everything else?

Rohit Bhardwaj:

The important thing here is that when you come up with the ideal design, you won’t just come up with the ideal design. You want to make sure that, for every industry, you go to any industry, it already has predefined formats for what the JSON should look like.

Why don’t we really reuse them? For example, you can go to schema.org and find out how the organization events are being processed. Others are iana.org and microformats.org. You can go to these sites and basically get more information regarding how the processing can be done.

Whatever business objects you are coming up with, the ideal design should always take into account the schema which you are building. You don’t want to start from scratch. That’s something which people try to do.

People try, “Maybe I won’t call it a short name,” or something. It’s already there, already predefined how the address will look, how it will be used throughout different applications. Why not leverage the person, the same object for the person? You can leverage it, and a person can have many attributes which are predefined for everyone.
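
[Illustration, not part of the conversation: a small example of reusing published vocabulary instead of inventing field names. It builds a JSON-LD document using the schema.org Person and PostalAddress types; the property names come from schema.org, while the person and address values are made up.]

```python
import json

# A person described with schema.org property names rather than ad hoc ones.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "givenName": "Jane",
    "familyName": "Doe",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Boston",
        "addressRegion": "MA",
        "postalCode": "02101",
        "addressCountry": "US",
    },
}

print(json.dumps(person, indent=2))
```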

[inaudible 30:50] is the same system. [inaudible 30:51] has already defined all these things. That’s what I would suggest. As we said, from the five-step process, you define the problem, define the use cases, and find out from the use cases what kind of pattern you’re really looking for. Do you want more consistency, or availability, or partition tolerance?

Come up with your ideal design, looking at also reconciling the names we are coming up with, with the schema.org. All right, thank you.

Michael Carducci:

Really great stuff. Thank you so much, Rohit, for joining us. Thank you everybody for listening in. Do definitely check out part two where we’re going to go through the second part of this 10-part process. Again, thank you, and thank you for listening to the No Fluff Just Stuff Podcast.

Join us on the tour. We’ve got tour dates up on NoFluffJustStuff.com, and we’ve got some great destination shows coming up this year. I look forward to seeing you on the road.


