Sagas

When building distributed systems, it is crucial to ensure that the system remains consistent even in the presence of failures. One way to achieve this is by using the Saga pattern.

Sagas are a way to manage transactions that span multiple services. Each step registers a compensation that undoes its effect, so when the workflow fails halfway through, the steps that already completed can be rolled back and the system returns to a consistent state.

Implementing Sagas in Restate

Let’s assume we want to build a travel booking application. The core of our application is a workflow that first books a flight, then rents a car, and finally books a hotel. When a later step fails, we want to undo the bookings that already succeeded.

Restate lets us implement this purely in user code:

  • Wrap your business logic in a try-block, and throw a terminal error for cases where you want to compensate and finish.
  • For each step you do in your try-block, add a compensation to a list.
  • In the catch block, in case of a terminal error, you run the compensations in reverse order, and rethrow the error.

Restate guarantees that the handler runs to completion, so once a terminal error is thrown, all registered compensations will run:

Language of the next code block: typescript

import * as restate from "@restatedev/restate-sdk";

// flightClient, carRentalClient, hotelClient, and BookingRequest are assumed to be defined elsewhere.
const bookingWorkflow = restate.service({
  name: "BookingWorkflow",
  handlers: {
    run: async (ctx: restate.Context, req: BookingRequest) => {
      // create a list of undo actions
      const compensations: Array<() => Promise<unknown>> = [];
      try {
        // For each action, we register a compensation that will be executed on failures
        compensations.push(() =>
          ctx.run("Cancel flight", () => flightClient.cancel(req.customerId))
        );
        await ctx.run("Book flight", () =>
          flightClient.book(req.customerId, req.flight)
        );

        compensations.push(() =>
          ctx.run("Cancel car", () => carRentalClient.cancel(req.customerId))
        );
        await ctx.run("Book car", () =>
          carRentalClient.book(req.customerId, req.car)
        );

        compensations.push(() =>
          ctx.run("Cancel hotel", () => hotelClient.cancel(req.customerId))
        );
        await ctx.run("Book hotel", () =>
          hotelClient.book(req.customerId, req.hotel)
        );
      } catch (e) {
        // Terminal errors are not retried by Restate, so undo previous actions and fail the workflow
        if (e instanceof restate.TerminalError) {
          // Restate guarantees that all compensations are executed;
          // run them in reverse order so the latest step is undone first
          for (const compensation of compensations.reverse()) {
            await compensation();
          }
        }
        throw e;
      }
    },
  },
});

restate.endpoint().bind(bookingWorkflow).listen(9080);

Example not available in your language?

This pattern is implementable with any of our SDKs. We are still working on translating all patterns to all SDK languages. If you need help with a specific language, please reach out to us via Discord or Slack.

When to use Sagas

Restate runs invocations until completion, with infinite retries and recovery of partial progress. In that sense, you do not need to run compensations between retries: Restate resumes each retry attempt from the point where the invocation failed.

However, there can still be cases in your business logic where you want to stop a handler from executing any further and run compensations for the work done so far.
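
As a rough illustration of such a case, the dependency-free sketch below stops a booking when a business rule rejects it and then unwinds the steps that already ran. Everything in it is hypothetical: `TerminalError` is a local stand-in for `restate.TerminalError`, and `runBooking`, `paymentDeclined`, and the `log` array are invented for the example. In a real handler each step would be wrapped in `ctx.run`.

```typescript
// Local stand-in for restate.TerminalError so the sketch runs without the SDK.
class TerminalError extends Error {}

type Compensation = () => Promise<void>;

// Hypothetical workflow: books a flight and a car, then applies a business rule.
async function runBooking(paymentDeclined: boolean, log: string[]): Promise<void> {
  const compensations: Compensation[] = [];
  try {
    compensations.push(async () => { log.push("cancel flight"); });
    log.push("book flight");

    compensations.push(async () => { log.push("cancel car"); });
    log.push("book car");

    // Retrying cannot fix a declined card, so stop the handler for good.
    if (paymentDeclined) {
      throw new TerminalError("payment declined");
    }
    log.push("confirm");
  } catch (e) {
    if (e instanceof TerminalError) {
      // Undo the most recent step first.
      for (const compensation of [...compensations].reverse()) {
        await compensation();
      }
    }
    throw e;
  }
}
```

Calling `runBooking(true, log)` rejects with the terminal error after the car and then the flight have been cancelled, while `runBooking(false, log)` runs through to the confirmation without touching the compensations.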

You will also need sagas to end up in a consistent state when you cancel an invocation (via the CLI or programmatically). For example, if an invocation gets stuck because an external system is not responding, you might want to stop executing the invocation while keeping the overall system state consistent.

Registering compensations

Because this is all implemented in pure user code, there are no restrictions on what you can do in compensations, as long as they are idempotent. For example, you can reset the state of the service, call other services to undo previously executed calls, or use ctx.run actions to delete previously inserted rows in a database.
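
To make the idempotency requirement concrete, here is a small sketch of a compensation that is safe to run more than once. The in-memory `reservations` map, the `refundsIssued` counter, and the `cancelReservation` helper are hypothetical stand-ins for an external booking system.

```typescript
type Status = "reserved" | "cancelled";

// Hypothetical in-memory stand-in for an external booking system.
const reservations = new Map<string, Status>([["flight-42", "reserved"]]);
let refundsIssued = 0;

// Idempotent: only the first call for a reserved booking changes anything.
// Unknown or already-cancelled IDs are treated as success, not as errors.
function cancelReservation(id: string): void {
  if (reservations.get(id) !== "reserved") {
    return;
  }
  reservations.set(id, "cancelled");
  refundsIssued += 1;
}

cancelReservation("flight-42");
cancelReservation("flight-42"); // e.g. re-run after a retry: no second refund
cancelReservation("never-booked"); // the booking step may never have gone through
```

Treating a missing booking as success matters when a compensation is registered before its action, as in the handler above: the cancel call can then run for a booking that was never made.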

Adding compensations

Depending on the characteristics of the API, adding the compensation might look different:

  1. The flight and car rental APIs require you to first create a reservation, and then use the ID you get back to confirm or cancel it. In this case, we add the compensation after creating the reservation (because we need the ID).

  2. The payment API in this example requires you to generate an idempotency key yourself, and executes in one shot. Here, we add the compensation before performing the action, using the same UUID. This way, we ensure that a payment that fails with a terminal error does not go through.
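
The two cases can be sketched side by side. This is a dependency-free illustration under stated assumptions, not the actual client code: `reserveFlight`, `cancelFlight`, `charge`, `refund`, `book`, and the `newKey` parameter are all hypothetical. Inside a Restate handler the key would have to stay stable across retries; the TypeScript SDK's `ctx.rand.uuidv4()` is assumed to be one way to get such a value.

```typescript
type Compensation = () => Promise<void>;

// Hypothetical in-memory stand-ins for the external APIs.
const flights = new Map<string, string>(); // reservation ID -> status
const payments = new Map<string, string>(); // idempotency key -> status

async function reserveFlight(): Promise<string> {
  const id = `flight-${flights.size + 1}`;
  flights.set(id, "reserved");
  return id;
}
async function cancelFlight(id: string): Promise<void> {
  flights.set(id, "cancelled");
}
async function charge(key: string, failAfterCharging: boolean): Promise<void> {
  payments.set(key, "charged");
  // Simulates a failure that is reported after the charge was applied.
  if (failAfterCharging) throw new Error("payment failed");
}
async function refund(key: string): Promise<void> {
  // No-op when the charge never happened.
  if (payments.get(key) === "charged") payments.set(key, "refunded");
}

async function book(failPayment: boolean, newKey: () => string): Promise<void> {
  const compensations: Compensation[] = [];
  try {
    // Case 1: the ID only exists after reserving, so register afterwards.
    const flightId = await reserveFlight();
    compensations.push(() => cancelFlight(flightId));

    // Case 2: we choose the key ourselves, so register before charging.
    const paymentKey = newKey();
    compensations.push(() => refund(paymentKey));
    await charge(paymentKey, failPayment);
  } catch (e) {
    // The sketch compensates on every error; the handler above
    // only does so for terminal errors.
    for (const compensation of [...compensations].reverse()) {
      await compensation();
    }
    throw e;
  }
}
```

Registering the refund first means a charge that was applied but reported as failed is still undone, which is why the one-shot case registers its compensation ahead of the call.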