Finding Harmony in Microservices: A Guide to Composing Your Monolith. Orchestration vs Choreography

Roman Glushach
6 min read · Jun 22, 2023

Microservices are a popular architectural style that breaks down an application into small, independent, and loosely coupled services that communicate with each other to achieve a business goal. However, designing and managing a complex workflow of microservices can be challenging, especially when it comes to deciding how the services should coordinate their actions and handle failures. The two main coordination styles are orchestration and choreography.

Orchestration

Orchestration is a centralized approach to microservice coordination, where a single service (the orchestrator) controls the workflow of the entire business transaction. The orchestrator is responsible for invoking other services (the participants) in a predefined order, passing data between them, handling errors, triggering compensating actions, and monitoring the overall progress. The participants are unaware of the global workflow and just execute their assigned tasks.
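
As a minimal sketch of this idea, the orchestrator below calls three participants in a fixed order and passes a shared context between them. The participants are plain in-process functions standing in for remote services, and every name here (`Orchestrator`, `validate_payment`, `reserve_inventory`, `ship_order`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

# A participant takes the workflow context and returns an updated copy.
Step = Callable[[dict], dict]


@dataclass
class Orchestrator:
    """Runs participants in a predefined order and records progress."""
    steps: list[tuple[str, Step]]
    completed: list[str] = field(default_factory=list)

    def run(self, context: dict) -> dict:
        for name, step in self.steps:
            context = step(context)      # invoke the participant, pass data along
            self.completed.append(name)  # progress is visible in one place
        return context


# Hypothetical participants: each one knows only its own task.
def validate_payment(ctx: dict) -> dict:
    return {**ctx, "paid": ctx["amount"] > 0}


def reserve_inventory(ctx: dict) -> dict:
    return {**ctx, "reserved": ctx["paid"]}


def ship_order(ctx: dict) -> dict:
    return {**ctx, "shipped": ctx["reserved"]}


orchestrator = Orchestrator([
    ("validate_payment", validate_payment),
    ("reserve_inventory", reserve_inventory),
    ("ship_order", ship_order),
])
result = orchestrator.run({"order_id": "o-1", "amount": 25})
print(orchestrator.completed, result["shipped"])
```

Only the orchestrator knows the order of the steps, so changing the workflow means changing one list rather than every participant.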

Orchestration tools

Kubernetes is a well-known orchestrator, but it orchestrates containers and infrastructure rather than business workflows; Docker Swarm and Mesos play a similar role. The Cloud Native Computing Foundation (CNCF), which oversees the Kubernetes project, also supports the development of other orchestrators. These projects vary in maturity, with Crossplane in the advanced incubation phase and five others (Fluid, Karmada, Open Cluster Management, Volcano and wasmCloud) still in early development. For coordinating the steps of a business transaction across services, a workflow engine such as Camunda, Temporal or AWS Step Functions is the closer fit.

Advantages

  • provides a clear and explicit view of the workflow logic and the status of each step
  • simplifies the implementation of complex business rules and validations that span multiple services
  • enables easy modification of the workflow by changing the orchestrator logic

Disadvantages

  • creates a tight coupling between the orchestrator and the participants, which reduces their autonomy and reusability
  • introduces a single point of failure and a performance bottleneck in the system
  • often relies on synchronous request/response calls between services, which can increase latency and limit scalability

Use cases

  • Order processing: The orchestrator manages the steps involved in processing an order, such as validating payment information, checking inventory availability, shipping items, sending notifications, etc.
  • Travel booking: The orchestrator coordinates the actions required to book a trip, such as reserving flights, hotels, cars, etc., applying discounts or coupons, confirming reservations, etc.
  • Workflow automation: The orchestrator executes a series of tasks based on predefined rules or conditions, such as sending emails, generating reports, updating databases, etc.

Possible problems and solutions

  • The orchestrator becomes a bottleneck or a single point of failure in the system, affecting the performance and reliability of the workflow: run several orchestrator instances and keep the workflow state in durable storage, for example by building on a workflow engine, a persisted state machine, or a message queue, so that another instance can resume a workflow after a crash.
  • The orchestrator needs to handle errors and compensating actions from the participants, increasing the complexity and maintenance of the workflow logic: use the saga pattern to implement long-running transactions that span multiple services. Each service performs a local transaction; if a step fails, the orchestrator invokes the compensating actions of the steps that already completed, in reverse order.
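
An orchestrated saga can be sketched roughly as follows. This is an illustration under simplifying assumptions: the steps are local functions rather than remote calls, compensations are assumed never to fail, and the names (`SagaStep`, `run_saga`, `charge`, `refund`, and so on) are hypothetical.

```python
from typing import Callable, Optional

Action = Callable[[dict], None]


class SagaStep:
    """A local transaction paired with the action that undoes it."""

    def __init__(self, name: str, action: Action,
                 compensate: Optional[Action] = None) -> None:
        self.name = name
        self.action = action
        self.compensate = compensate


def run_saga(steps: list[SagaStep], ctx: dict) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse."""
    done: list[SagaStep] = []
    for step in steps:
        try:
            step.action(ctx)
            done.append(step)
        except Exception as exc:
            ctx["error"] = f"{step.name}: {exc}"
            for prev in reversed(done):
                if prev.compensate is not None:
                    prev.compensate(ctx)
            return False
    return True


log: list[str] = []  # records which actions ran, for the demo only


def charge(ctx: dict) -> None:
    log.append("charge")


def refund(ctx: dict) -> None:
    log.append("refund")


def reserve(ctx: dict) -> None:
    log.append("reserve")


def release(ctx: dict) -> None:
    log.append("release")


def ship(ctx: dict) -> None:
    raise RuntimeError("carrier unavailable")  # simulated failure


saga_ctx = {"order_id": "o-1"}
ok = run_saga(
    [SagaStep("payment", charge, refund),
     SagaStep("inventory", reserve, release),
     SagaStep("shipping", ship)],
    saga_ctx,
)
print(ok, log)  # False ['charge', 'reserve', 'release', 'refund']
```

A production saga would also persist its progress so that compensation can continue after the orchestrator itself restarts.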

Choreography

Choreography is a decentralized approach to microservice coordination, where each service (the choreographer) decides when and how to perform its part of the workflow based on events or messages from other services. There is no central authority that controls the workflow; instead, the services cooperate autonomously by following a set of rules or agreements. Each choreographer knows only the events it subscribes to and the events it publishes, so the global workflow is not written down in any single place; it emerges from these individual reactions.
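
The sketch below illustrates the same order flow as a choreography. It assumes a tiny in-memory `EventBus` as a stand-in for a real broker and delivers events synchronously, which a real broker would not do; the event names and service functions are hypothetical.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Stand-in for a broker: delivers each event to its topic's subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        self.history: list[str] = []  # published topics, kept for inspection

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        self.history.append(topic)
        for handler in list(self._subscribers[topic]):
            handler(payload)


bus = EventBus()
shipped: list[dict] = []


# Each service reacts to one event and announces its own outcome.
def payment_service(event: dict) -> None:
    bus.publish("PaymentCompleted", {**event, "paid": True})


def inventory_service(event: dict) -> None:
    bus.publish("InventoryReserved", {**event, "reserved": True})


def shipping_service(event: dict) -> None:
    shipped.append(event)
    bus.publish("OrderShipped", {**event, "shipped": True})


bus.subscribe("OrderPlaced", payment_service)
bus.subscribe("PaymentCompleted", inventory_service)
bus.subscribe("InventoryReserved", shipping_service)

bus.publish("OrderPlaced", {"order_id": "o-1"})
print(bus.history)
```

No function in this sketch lists the whole sequence. The order of steps exists only in the subscriptions, which is what makes choreography loosely coupled and also harder to follow.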

Choreography tools

Kafka, RabbitMQ and Amazon SQS are some examples of event brokers and message queues. Service meshes like Istio and the newer runtime Dapr can also be integrated into an event-driven architecture.

Advantages

  • promotes loose coupling between services, which enhances their autonomy and reusability
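  • removes the central coordinator as a single point of failure and performance bottleneck (although the event broker itself must then be highly available)
  • enables asynchronous communication between services, so that a service does not have to wait for the others, which improves throughput and scalability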

Disadvantages

  • obscures the workflow logic and the status of each step, making it harder to understand and monitor
  • complicates the implementation of complex business rules and validations that span multiple services
  • requires careful design of events or messages to avoid inconsistency and duplication

Use cases

  • Social media: The choreographers publish or subscribe to events related to user activities, such as posting comments, liking posts, following users, etc., and update their state accordingly.
  • Online gaming: The choreographers exchange messages with other players or game servers based on user actions or game events, such as moving characters, shooting weapons, scoring points, etc., and update their state accordingly.
  • IoT: The choreographers react to events or messages from sensors or devices based on user preferences or environmental conditions, such as turning lights on or off, adjusting temperature, playing music, etc., and update their state accordingly.

Possible problems and solutions

  • The choreographers lack visibility and traceability of the workflow progress and status, making it difficult to debug and monitor the system: have each service emit events or messages that describe its actions and outcomes, tag every event with a correlation ID that identifies the workflow instance, and use a centralized event store or log to capture and query them.
  • The choreographers need to ensure consistency and idempotency of the workflow outcome, avoiding duplication or omission of events or messages: make consumers idempotent by recording the IDs of events they have already processed, and use an event sourcing pattern, where each service stores its state as a sequence of events or messages and uses them to reconstruct its state or replay its actions when needed.
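
The second point can be illustrated with a small sketch. It assumes that every event carries a unique `event_id` assigned by its publisher; the `Account` aggregate and the event kinds are hypothetical and exist only to show duplicate handling and replay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str  # assumed to be unique per event
    kind: str      # "Deposited" or "Withdrawn"
    amount: int


class Account:
    """Stores state as an append-only event list and ignores duplicates."""

    def __init__(self) -> None:
        self.events: list[Event] = []
        self._seen: set[str] = set()

    def handle(self, event: Event) -> bool:
        if event.event_id in self._seen:
            return False  # duplicate delivery: already applied, do nothing
        self._seen.add(event.event_id)
        self.events.append(event)
        return True

    def balance(self) -> int:
        # Current state is derived from the event history, not stored directly.
        total = 0
        for e in self.events:
            total += e.amount if e.kind == "Deposited" else -e.amount
        return total


def replay(events: list[Event]) -> Account:
    """Rebuild an account from its stored events."""
    account = Account()
    for e in events:
        account.handle(e)
    return account


deposit = Event("e1", "Deposited", 100)
withdrawal = Event("e2", "Withdrawn", 30)

account = Account()
applied = [account.handle(e) for e in (deposit, withdrawal, deposit)]
rebuilt = replay(account.events)
print(applied, account.balance(), rebuilt.balance())  # [True, True, False] 70 70
```

The broker is free to redeliver `e1` as often as it likes; the balance stays at 70 because the second delivery is recognized and skipped.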

Common problems for both approaches

Service Discovery

One of the biggest challenges with microservices is service discovery. In a microservices architecture, there are many services that need to communicate with each other, and it can be difficult to keep track of which services are available and where they are located.

Use a service registry. A service registry is a centralized database that keeps track of all the services in your architecture. When a service needs to communicate with another service, it can look up the service in the registry to find its location.
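
A registry of this kind can be sketched as an in-memory table of instances that expire unless they keep re-registering. This is only an illustration: the `ServiceRegistry` class, the time-to-live value, and the addresses are assumptions, and a real deployment would typically rely on an existing registry product rather than a hand-written one.

```python
import time
from typing import Optional


class ServiceRegistry:
    """In-memory registry; an instance expires if it stops re-registering."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self.ttl = ttl_seconds
        # service name -> {address: time of last registration}
        self._instances: dict[str, dict[str, float]] = {}

    def register(self, name: str, address: str,
                 now: Optional[float] = None) -> None:
        """Register an instance, or refresh it (acts as a heartbeat)."""
        ts = time.monotonic() if now is None else now
        self._instances.setdefault(name, {})[address] = ts

    def lookup(self, name: str, now: Optional[float] = None) -> list[str]:
        """Return addresses whose last registration is within the TTL."""
        ts = time.monotonic() if now is None else now
        known = self._instances.get(name, {})
        return sorted(addr for addr, seen in known.items()
                      if ts - seen <= self.ttl)


registry = ServiceRegistry(ttl_seconds=30.0)
# Explicit timestamps keep the demo deterministic.
registry.register("inventory", "10.0.0.5:8080", now=0.0)
registry.register("inventory", "10.0.0.6:8080", now=20.0)

print(registry.lookup("inventory", now=25.0))  # both instances are live
print(registry.lookup("inventory", now=40.0))  # the first one has expired
```

The expiry rule matters as much as the lookup: without it, the registry would keep handing out addresses of instances that have crashed.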

Data Consistency

Another challenge with microservices is data consistency. In a microservices architecture, each service has its own database, which can make it difficult to ensure that data is consistent across all services.

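Use a distributed transaction coordinator. A distributed transaction coordinator is a service that manages transactions across multiple services. When a transaction involves multiple services, the coordinator ensures that all services commit or roll back the transaction together, typically with a two-phase commit protocol. This gives strong consistency, but the participants hold locks while they wait for the coordinator, so it costs availability and throughput; where eventual consistency is acceptable, the saga pattern described earlier is a common alternative.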
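
The commit-or-roll-back-together behavior can be illustrated with a simplified two-phase commit. The sketch assumes in-process participants and ignores timeouts, crashes, and durable logging, all of which a real coordinator has to handle; `Participant` and `two_phase_commit` are hypothetical names.

```python
class Participant:
    """A service taking part in the transaction (simulated in-process)."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote on whether the local work can be committed.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: list[Participant]) -> bool:
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(votes):
        for p in participants:                    # phase 2: commit
            p.commit()
        return True
    for p in participants:                        # phase 2: roll back
        p.rollback()
    return False


healthy = [Participant("orders"), Participant("payments")]
failing = [Participant("orders"), Participant("payments", can_commit=False)]

committed = two_phase_commit(healthy)
aborted = two_phase_commit(failing)
print(committed, [p.state for p in healthy])
print(aborted, [p.state for p in failing])
```

A single "no" vote is enough to roll everything back, which is what keeps the services consistent with one another.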

Service Resilience

Service resilience is another challenge with microservices. Because services call each other over the network, a failure or slowdown in one service can cascade to every service that depends on it, so it can be difficult to ensure that the system remains resilient in the face of failures.

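Use circuit breakers. A circuit breaker is a pattern that can be used to detect and recover from failures in a microservices architecture. It wraps calls to a service and counts failures. When a service keeps failing, the circuit breaker opens: it stops sending requests to the service and fails fast, optionally returning a fallback such as a cached value or a response from a backup service. After a timeout it lets a trial request through and closes again if that request succeeds.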
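
A minimal circuit breaker might look like the sketch below. The thresholds, the injectable `clock`, and the helper functions `flaky` and `cached` are assumptions made for the example; it is single-threaded and is not meant to replace a tested resilience library.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the circuit is open and no fallback was given."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half_open"  # allow one trial call
        return "open"

    def call(self, func: Callable[[], Any],
             fallback: Optional[Callable[[], Any]] = None) -> Any:
        if self.state == "open":
            if fallback is not None:
                return fallback()  # fail fast without calling the service
            raise CircuitOpenError("circuit is open")
        try:
            result = func()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = self.clock()
            if fallback is not None:
                return fallback()
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result


now = [0.0]  # fake clock so the demo does not have to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0,
                         clock=lambda: now[0])
attempts: list[str] = []


def flaky() -> str:
    attempts.append("call")
    raise ConnectionError("service down")


def cached() -> str:
    return "cached"


responses = [breaker.call(flaky, cached) for _ in range(3)]
state_after_failures = breaker.state

now[0] = 10.0                          # the reset timeout has elapsed
recovered = breaker.call(lambda: "ok")  # the trial call succeeds
print(responses, len(attempts), state_after_failures, recovered, breaker.state)
```

The third request never reaches the failing service: the breaker is already open, so the caller gets the fallback immediately instead of waiting for another timeout.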

Conclusion

There is no definitive answer as to which one to choose, as both approaches have their pros and cons depending on the context and requirements of your application. However, here are some general guidelines that can help you make an informed decision:

  • Use orchestration when you need a high level of control over the workflow logic, such as when you have complex business rules or validations that involve multiple services.
  • Use choreography when you need a high level of flexibility over the workflow execution, such as when you have dynamic or unpredictable scenarios that require adaptive behavior from services.
  • Use a hybrid approach when you need a balance between control and flexibility, such as when you have some parts of the workflow that are well-defined and stable, and some parts that are variable and evolving.
