Cloud Patterns

Containing twenty-four design patterns and ten related guidance topics, this guide articulates the benefit of applying patterns by showing how each piece can fit into the big picture of cloud application architectures. It also discusses the benefits and considerations for each pattern. Most of the patterns have code samples or snippets that show how to implement the patterns using the features of Microsoft Azure. However, the majority of topics described in this guide are equally relevant to a...

1. Availability Patterns and Guidance

1.1. Health Endpoint Monitoring Pattern

1.1.1. Implement functional checks within an application that external tools can access through exposed endpoints at regular intervals. This pattern can help to verify that applications and services are performing correctly.
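
The check logic behind such an endpoint can be sketched in Python as follows; the check names and callables are illustrative, and in practice the result would be served from an HTTP endpoint that an external monitoring tool polls at regular intervals:

```python
def run_health_checks(checks):
    """Run each named functional check and report per-check status.

    `checks` maps a check name to a zero-argument callable that raises
    an exception when the check fails (e.g. a database ping, a storage
    read, a call to a dependent service).
    """
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"failed: {exc}"
    healthy = all(status == "ok" for status in results.values())
    return healthy, results
```

A monitoring tool would treat a non-healthy result (or an HTTP error/timeout on the endpoint itself) as a failure of the application or one of its dependencies.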

1.1.2. When to Use this Pattern Monitoring websites and web applications to verify availability. Monitoring websites and web applications to check for correct operation. Monitoring middle-tier or shared services to detect and isolate a failure that could disrupt other applications. To complement existing instrumentation within the application, such as performance counters and error handlers. Health verification checking does not replace the requirement for logging and auditing in the application. Instrumentation can provide valuable information for an existing framework that monitors counters and error logs to detect failures or other issues. However, it cannot provide information if the application is unavailable.

1.1.3. Related Patterns and Guidance Instrumentation and Telemetry Guidance

1.2. Queue-based Load Leveling Pattern

1.2.1. Use a queue that acts as a buffer between a task and a service that it invokes in order to smooth intermittent heavy loads that may otherwise cause the service to fail or the task to time out. This pattern can help to minimize the impact of peaks in demand on availability and responsiveness for both the task and the service.
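
The buffering behavior can be sketched in Python with an in-process queue; in a real deployment the buffer would be a durable message queue (for example, a cloud queue service) so that bursts from the task do not overwhelm the service:

```python
import queue
import threading

def submit(work_queue, request):
    # Producer side: enqueue and return immediately; bursts simply
    # accumulate in the buffer instead of hitting the service directly.
    work_queue.put(request)

def service_worker(work_queue, handled, stop):
    # Consumer side: drains the queue at the service's own steady pace.
    while not stop.is_set() or not work_queue.empty():
        try:
            request = work_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        handled.append(request)  # a real service call would go here
        work_queue.task_done()
```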

1.2.2. When to Use this Pattern This pattern is ideally suited to any type of application that uses services that may be subject to overloading. NOT: This pattern might not be suitable if the application expects a response from the service with minimal latency.

1.2.3. Related Patterns and Guidance Asynchronous Messaging Primer Competing Consumers Pattern Throttling Pattern

1.3. Throttling Pattern

1.3.1. Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This pattern can allow the system to continue to function and meet service level agreements, even when an increase in demand places an extreme load on resources.
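
One common throttling strategy is a token bucket per application instance or tenant; the Python sketch below is illustrative, with the capacity and refill rate as assumed parameters:

```python
import time

class TokenBucket:
    """Token-bucket throttle: a caller may consume up to `capacity`
    requests, with tokens refilled at `rate` per second. One bucket
    would be kept per tenant or per application instance."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # reject or degrade the request rather than fail the service
```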

1.4. Multiple Datacenter Deployment Guidance

2. Data Management Patterns and Guidance

2.1. Cache-aside Pattern

2.1.1. Load data on demand into a cache from a data store. This pattern can improve performance and also helps to maintain consistency between data held in the cache and the data in the underlying data store.
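
The read and update paths can be sketched in Python as follows; `SimpleCache` is a stand-in for a shared cache client (such as Redis), and the store callables are assumptions:

```python
class SimpleCache:
    """Stand-in for a distributed cache client; TTL is accepted but ignored."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value, ttl_seconds):
        self._data[key] = value
    def invalidate(self, key):
        self._data.pop(key, None)

def get_with_cache_aside(key, cache, load_from_store):
    value = cache.get(key)
    if value is None:                      # cache miss
        value = load_from_store(key)       # the application does the read-through
        cache.set(key, value, ttl_seconds=300)
    return value

def update_with_cache_aside(key, value, cache, write_to_store):
    write_to_store(key, value)             # update the store first...
    cache.invalidate(key)                  # ...then evict so the next read reloads
```

Invalidating rather than updating the cache on writes is what keeps the cached copy consistent with the store: the next reader repopulates it from the authoritative data.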

2.1.2. When to Use this Pattern A cache does not provide native read-through and write-through operations. Resource demand is unpredictable. This pattern enables applications to load data on demand. It makes no assumptions about which data an application will require in advance. NOT: When the cached data set is static. If the data will fit into the available cache space, prime the cache with the data on startup and apply a policy that prevents the data from expiring. NOT: For caching session state information in a web application hosted in a web farm. In this environment, you should avoid introducing dependencies based on client-server affinity.

2.1.3. Related Patterns and Guidance Caching Guidance Data Consistency Primer

2.2. Command and Query Responsibility Segregation (CQRS) Pattern

2.2.1. Segregate operations that read data from operations that update data by using separate interfaces. This pattern can maximize performance, scalability, and security; support evolution of the system over time through higher flexibility; and prevent update commands from causing merge conflicts at the domain level.
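
The segregated interfaces can be sketched in Python as below; the product domain and in-memory storage are illustrative, and a full CQRS implementation would back the query side with a separately stored, eventually consistent read store:

```python
class ProductWriteModel:
    """Command side: commands go through validation and business logic
    before any state changes."""
    def __init__(self):
        self._products = {}

    def rename_product(self, product_id, new_name):   # a task-based command
        if not new_name:
            raise ValueError("name must not be empty")
        self._products[product_id] = new_name

class ProductReadModel:
    """Query side: no business logic or validation; just returns DTOs
    shaped for a view."""
    def __init__(self, write_model):
        # Reading the write model directly is a simplification; a real
        # system would query its own denormalized read store instead.
        self._write_model = write_model

    def product_list(self):
        return [{"id": pid, "name": name}
                for pid, name in self._write_model._products.items()]
```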

2.2.2. When to Use this Pattern Collaborative domains where multiple operations are performed in parallel on the same data. CQRS allows you to define commands with a sufficient granularity to minimize merge conflicts at the domain level (or any conflicts that do arise can be merged by the command), even when updating what appears to be the same type of data. Use with task-based user interfaces (where users are guided through a complex process as a series of steps), with complex domain models, and for teams already familiar with domain-driven design (DDD) techniques. The write model has a full command-processing stack with business logic, input validation, and business validation to ensure that everything is always consistent for each of the aggregates (each cluster of associated objects that are treated as a unit for the purpose of data changes) in the write model. The read model has no business logic or validation stack and just returns a DTO for use in a view model. The read model is eventually consistent with the write model. Scenarios where performance of data reads must be fine-tuned separately from performance of data writes, especially when the read/write ratio is very high, and when horizontal scaling is required. For example, in many systems the number of read operations is orders of magnitude greater than the number of write operations. To accommodate this, consider scaling out the read model, but running the write model on only one or a few instances. A small number of write model instances also helps to minimize the occurrence of merge conflicts. Scenarios where one team of developers can focus on the complex domain model that is part of the write model, and another less experienced team can focus on the read model and the user interfaces. Scenarios where the system is expected to evolve over time and may contain multiple versions of the model, or where business rules change regularly.
Integration with other systems, especially in combination with Event Sourcing, where the temporal failure of one subsystem should not affect the availability of the others. NOT: Where the domain or the business rules are simple. NOT: Where a simple CRUD-style user interface and the related data access operations are sufficient. NOT: For implementation across the whole system. There are specific components of an overall data management scenario where CQRS can be useful, but it can add considerable and often unnecessary complexity where it is not actually required.

2.2.3. Related Patterns and Guidance Data Consistency Primer Data Partitioning Guidance Event Sourcing Pattern Materialized View Pattern

2.3. Event Sourcing Pattern

2.3.1. Use an append-only store to record the full series of events that describe actions taken on data in a domain, rather than storing just the current state, so that the store can be used to materialize the domain objects. This pattern can simplify tasks in complex domains by avoiding the requirement to synchronize the data model and the business domain; improve performance, scalability, and responsiveness; provide consistency for transactional data; and maintain full audit trails and history that may enable compensating actions.
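
The mechanism can be sketched in Python as an append-only store plus a replay function; the account events are a hypothetical domain:

```python
class EventStore:
    """Append-only store: current state is materialized by replaying events."""
    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)        # events are never updated or deleted

    def replay(self, apply, initial):
        state = initial
        for event in self._events:
            state = apply(state, event)
        return state

# Hypothetical account domain: events capture intent ("deposited",
# "withdrawn") rather than overwriting a current-balance field.
def apply_account_event(balance, event):
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    return balance
```

Because the full series of events is retained, the same log also serves as an audit trail and can be replayed up to any point to reconstruct historical state.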

2.3.2. When to Use this Pattern When you want to capture “intent,” “purpose,” or “reason” in the data. For example, changes to a customer entity may be captured as a series of specific event types such as Moved home, Closed account, or Deceased. When it is vital to minimize or completely avoid the occurrence of conflicting updates to data. When you want to record events that occur, and be able to replay them to restore the state of a system; use them to roll back changes to a system; or simply as a history and audit log. For example, when a task involves multiple steps you may need to execute actions to revert updates and then replay some steps to bring the data back into a consistent state. When using events is a natural feature of the operation of the application, and requires little additional development or implementation effort. When you need to decouple the process of inputting or updating data from the tasks required to apply these actions. This may be to improve UI performance, or to distribute events to other listeners such as other applications or services that must take some action when the events occur. An example would be integrating a payroll system with an expenses submission website so that events raised by the event store in response to data updates made in the expenses submission website are consumed by both the website and the payroll system. When you want flexibility to be able to change the format of materialized models and entity data if requirements change, or—when used in conjunction with CQRS—you need to adapt a read model or the views that expose the data. When used in conjunction with CQRS, and eventual consistency is acceptable while a read model is updated or, alternatively, the performance impact incurred in rehydrating entities and data from an event stream is acceptable. 
NOT: Small or simple domains, systems that have little or no business logic, or non-domain systems that naturally work well with traditional CRUD data management mechanisms. NOT: Systems where consistency and real-time updates to the views of the data are required. NOT: Systems where audit trails, history, and capabilities to roll back and replay actions are not required. NOT: Systems where there is only a very low occurrence of conflicting updates to the underlying data. For example, systems that predominantly add data rather than updating it.

2.3.3. Related Patterns and Guidance Command and Query Responsibility Segregation (CQRS) Pattern Materialized View Pattern Compensating Transaction Pattern Data Consistency Primer Data Partitioning Guidance

2.4. Index Table Pattern

2.4.1. Create indexes over the fields in data stores that are frequently referenced by query criteria. This pattern can improve query performance by allowing applications to more quickly locate the data to retrieve from a data store.
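
An index table can be sketched in Python as a dictionary from a secondary-key value to the primary keys of matching records; the customer records are illustrative:

```python
def build_index_table(records, secondary_field):
    """Map each value of `secondary_field` to the primary keys of the
    records holding that value (`records` is primary_key -> record)."""
    index = {}
    for primary_key, record in records.items():
        index.setdefault(record[secondary_field], []).append(primary_key)
    return index

def query_by_secondary(records, index, value):
    # Look up primary keys via the index table, then fetch the full records.
    return [records[pk] for pk in index.get(value, [])]
```

When the underlying data changes, the index table must be maintained as well, which is the overhead the "When to Use" guidance above weighs against the query savings.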

2.4.2. When to Use this Pattern Use this pattern to improve query performance when an application frequently needs to retrieve data by using a key other than the primary (or shard) key. NOT: Data is volatile. An index table may become out of date very quickly, rendering it ineffective or making the overhead of maintaining the index table greater than any savings made by using it. NOT: A field selected as the secondary key for an index table is very non-discriminating and can only have a small set of values (for example, gender). NOT: The balance of the data values for a field selected as the secondary key for an index table are highly skewed. For example, if 90% of the records contain the same value in a field, then creating and maintaining an index table to look up data based on this field may exert more overhead than scanning sequentially through the data. However, if queries very frequently target values that lie in the remaining 10%, this index may be useful. You must understand the queries that your application is performing, and how frequently they are performed.

2.4.3. Related Patterns and Guidance Data Consistency Primer Sharding Pattern Materialized View Pattern

2.5. Materialized View Pattern

2.5.1. Generate prepopulated views over the data in one or more data stores when the data is formatted in a way that does not favor the required query operations. This pattern can help to support efficient querying and data extraction, and improve application performance.
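
As an illustration, the Python sketch below materializes a totals-per-product view from normalized order rows; the schema is an assumption, and the view would be regenerated (or incrementally updated) whenever the source data changes:

```python
def materialize_sales_by_product(orders):
    """Precompute a query-friendly view (totals per product) from
    normalized order rows, so reporting queries avoid re-aggregating
    the source data on every request."""
    view = {}
    for order in orders:
        totals = view.setdefault(order["product"], {"units": 0, "revenue": 0.0})
        totals["units"] += order["quantity"]
        totals["revenue"] += order["quantity"] * order["unit_price"]
    return view
```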

2.5.2. When to Use this Pattern Creating materialized views over data that is difficult to query directly, or where queries must be very complex in order to extract data that is stored in a normalized, semi-structured, or unstructured way. Creating temporary views that can dramatically improve query performance, or can act directly as source views or data transfer objects (DTOs) for the UI, for reporting, or for display. Supporting occasionally connected or disconnected scenarios where connection to the data store is not always available. The view may be cached locally in this case. Simplifying queries and exposing data for experimentation in a way that does not require knowledge of the source data format. For example, by joining different tables in one or more databases, or one or more domains in NoSQL stores, and then formatting the data to suit its eventual use. Providing access to specific subsets of the source data that, for security or privacy reasons, should not be generally accessible, open to modification, or fully exposed to users. Bridging the gap when using different data stores based on their individual capabilities. For example, by using a cloud store that is efficient for writing as the reference data store, and a relational database that offers good query and read performance to hold the materialized views. NOT: The source data is simple and easy to query. NOT: The source data changes very quickly, or can be accessed without using a view. The processing overhead of creating views may be avoidable in these cases. NOT: Consistency is a high priority. The views may not always be fully consistent with the original data.

2.5.3. Related Patterns and Guidance Data Consistency Primer Command and Query Responsibility Segregation (CQRS) Pattern Event Sourcing Pattern Index Table Pattern

2.6. Sharding Pattern

2.6.1. Divide a data store into a set of horizontal partitions or shards. This pattern can improve scalability when storing and accessing large volumes of data.
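
A common sharding strategy hashes the shard key to pick a partition; in the Python sketch below, md5 is used only to get a stable hash that is identical on every node, and the in-memory shards stand in for separate storage nodes:

```python
import hashlib

def shard_for(shard_key, shard_count):
    """Hash strategy: a stable hash of the shard key spreads data
    evenly across the available shards."""
    digest = hashlib.md5(str(shard_key).encode()).hexdigest()
    return int(digest, 16) % shard_count

class ShardedStore:
    """Routes every read and write to the shard owning the key."""
    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Note that the modulo mapping ties keys to the shard count; real systems often use a lookup or range strategy, or consistent hashing, so that shards can be added without rehoming most of the data.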

2.6.2. When to Use this Pattern When a data store is likely to need to scale beyond the limits of the resources available to a single storage node. To improve performance by reducing contention in a data store.

2.6.3. Related Patterns and Guidance Data Consistency Primer Data Partitioning Guidance Index Table Pattern Materialized View Pattern

2.7. Static Content Hosting Pattern

2.7.1. Deploy static content to a cloud-based storage service that can deliver it directly to the client. This pattern can reduce the requirement for potentially expensive compute instances.

2.7.2. When to Use this Pattern Minimizing the hosting cost for websites and applications that contain some static resources. Minimizing the hosting cost for websites that consist of only static content and resources. Depending on the capabilities of the hosting provider’s storage system, it might be possible to host a fully static website in its entirety within a storage account. Exposing static resources and content for applications running in other hosting environments or on-premises servers. Locating content in more than one geographical area by using a content delivery network that caches the contents of the storage account in multiple datacenters around the world. Monitoring costs and bandwidth usage. Using a separate storage account for some or all of the static content allows the costs to be more easily distinguished from hosting and runtime costs. NOT: The application needs to perform some processing on the static content before delivering it to the client. For example, it may be necessary to add a timestamp to a document. NOT: The volume of static content is very small. The overhead of retrieving this content from separate storage may outweigh the cost benefit of separating it out from the compute resources.

2.7.3. Related Patterns and Guidance Valet Key Pattern

2.8. Valet Key Pattern

2.8.1. Use a token or key that provides clients with restricted direct access to a specific resource or service in order to offload data transfer operations from the application code. This pattern is particularly useful in applications that use cloud-hosted storage systems or queues, and can minimize cost and maximize scalability and performance.
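
A valet key can be sketched in Python as a time-limited, permission-scoped claim signed with a server-held secret; the secret and claim format here are assumptions, and real storage services (for example, Azure shared access signatures) define their own token formats:

```python
import hashlib
import hmac

SERVER_SECRET = b"hypothetical-signing-key"   # held by the application and the
                                              # storage service, never by clients

def issue_valet_key(resource, permission, expires_at, secret=SERVER_SECRET):
    """Sign a short claim string; the storage service (sharing the secret)
    can verify it without calling back into the application."""
    claim = f"{resource}|{permission}|{expires_at}"
    signature = hmac.new(secret, claim.encode(), hashlib.sha256).hexdigest()
    return f"{claim}|{signature}"

def verify_valet_key(token, resource, permission, now, secret=SERVER_SECRET):
    claim, _, signature = token.rpartition("|")
    expected = hmac.new(secret, claim.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False                              # tampered token
    res, perm, expires_at = claim.split("|")
    return res == resource and perm == permission and now < float(expires_at)
```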

2.8.2. When to Use this Pattern To minimize resource loading and maximize performance and scalability. Using a valet key does not require the resource to be locked, no remote server call is required, there is no limit on the number of valet keys that can be issued, and it avoids a single point of failure that would arise from performing the data transfer through the application code. Creating a valet key is typically a simple cryptographic operation of signing a string with a key. To minimize operational cost. Enabling direct access to stores and queues is resource and cost efficient, can result in fewer network round trips, and may allow for a reduction in the number of compute resources required. When clients regularly upload or download data, particularly where there is a large volume or when each operation involves large files. When the application has limited compute resources available, either due to hosting limitations or cost considerations. In this scenario, the pattern is even more advantageous if there are many concurrent data uploads or downloads because it relieves the application from handling the data transfer. When the data is stored in a remote data store or a different datacenter. If the application was required to act as a gatekeeper, there may be a charge for the additional bandwidth of transferring the data between datacenters, or across public or private networks between the client and the application, and then between the application and the data store. NOT: If the application must perform some task on the data before it is stored or before it is sent to the client. For example, the application may need to perform validation, log access success, or execute a transformation on the data. However, some data stores and clients are able to negotiate and carry out simple transformations such as compression and decompression (for example, a web browser can usually handle GZip formats). 
NOT: If the design and implementation of an existing application makes it difficult and costly to implement. Using this pattern typically requires a different architectural approach for delivering and receiving data. NOT: If it is necessary to maintain audit trails or control the number of times a data transfer operation is executed, and the valet key mechanism in use does not support notifications that the server can use to manage these operations. NOT: If it is necessary to limit the size of the data, especially during upload operations. The only solution to this is for the application to check the data size after the operation is complete, or check the size of uploads after a specified period or on a scheduled basis.

2.8.3. Related Patterns and Guidance Gatekeeper Pattern Static Content Hosting Pattern

2.9. Caching Guidance

2.10. Data Consistency Primer

2.11. Data Partitioning Guidance

2.12. Data Replication and Synchronization Guidance

3. Design and Implementation Patterns and Guidance

3.1. Compute Resource Consolidation Pattern

3.1.1. Consolidate multiple tasks or operations into a single computational unit. This pattern can increase compute resource utilization, and reduce the costs and management overhead associated with performing compute processing in cloud-hosted applications.

3.1.2. When to Use this Pattern Use this pattern for tasks that are not cost effective if they run in their own computational units. If a task spends much of its time idle, running this task in a dedicated unit can be expensive. NOT: This pattern might not be suitable for tasks that perform critical fault-tolerant operations, or tasks that process highly-sensitive or private data and require their own security context. These tasks should run in their own isolated environment, in a separate computational unit.

3.1.3. Related Patterns and Guidance Autoscaling Guidance Compute Partitioning Guidance

3.2. Command and Query Responsibility Segregation (CQRS) Pattern

3.3. External Configuration Store Pattern

3.3.1. Move configuration information out of the application deployment package to a centralized location. This pattern can provide opportunities for easier management and control of configuration data, and for sharing configuration data across applications and application instances.

3.3.2. When to Use this Pattern Configuration settings that are shared between multiple applications and application instances, or where a standard configuration must be enforced across multiple applications and application instances. Where the standard configuration system does not support all of the required configuration settings, such as storing images or complex data types. As a complementary store for some of the settings for applications, perhaps allowing applications to override some or all of the centrally-stored settings. As a mechanism for simplifying administration of multiple applications, and optionally for monitoring use of configuration settings by logging some or all types of access to the configuration store.

3.3.3. Related Patterns and Guidance Runtime Reconfiguration Pattern

3.4. Leader Election Pattern

3.4.1. Coordinate the actions performed by a collection of collaborating task instances in a distributed application by electing one instance as the leader that assumes responsibility for managing the other instances. This pattern can help to ensure that task instances do not conflict with each other, cause contention for shared resources, or inadvertently interfere with the work that other task instances are performing.
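
One common implementation is lease-based election, sketched in Python below; `LeaseStore` stands in for a shared lease (such as a blob lease), whichever instance acquires the lease first becomes leader, and an expired lease models leader failure:

```python
import time

class LeaseStore:
    """Stand-in for a shared lease: at most one holder at a time, and the
    lease must be renewed before it expires."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.holder = None
        self.expires = 0.0

    def try_acquire(self, instance_id, duration):
        now = self.clock()
        if self.holder is None or now >= self.expires or self.holder == instance_id:
            self.holder, self.expires = instance_id, now + duration
            return True
        return False   # another instance holds the lease; act as a follower

def elect_leader(instance_ids, lease, duration=15.0):
    # Every instance races for the same lease; exactly one wins.
    return [i for i in instance_ids if lease.try_acquire(i, duration)]
```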

3.4.2. When to Use this Pattern Use this pattern when the tasks in a distributed application, such as a cloud-hosted solution, require careful coordination and there is no natural leader. NOT: If there is a natural leader or dedicated process that can always act as the leader. For example, it may be possible to implement a singleton process that coordinates the task instances. If this process fails or becomes unhealthy, the system can shut it down and restart it. NOT: If the coordination between tasks can be easily achieved by using a more lightweight mechanism. For example, if several task instances simply require coordinated access to a shared resource, a preferable solution might be to use optimistic or pessimistic locking to control access to that resource. NOT: If a third-party solution is more appropriate. For example, the Microsoft Azure HDInsight service (based on Apache Hadoop) uses the services provided by Apache Zookeeper to coordinate the map/reduce tasks that aggregate and summarize data.

3.4.3. Related Patterns and Guidance Autoscaling Guidance Compute Partitioning Guidance

3.5. Pipes and Filters Pattern

3.5.1. Decompose a task that performs complex processing into a series of discrete elements that can be reused. This pattern can improve performance, scalability, and reusability by allowing task elements that perform the processing to be deployed and scaled independently.
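
The pipe-and-filter structure can be sketched in Python using generators as filters; the text-cleaning filters are illustrative, and in a distributed version each filter would read from and write to message queues so it could be deployed and scaled independently:

```python
def to_lowercase(items):
    for item in items:
        yield item.lower()

def strip_whitespace(items):
    for item in items:
        yield item.strip()

def drop_empty(items):
    for item in items:
        if item:
            yield item

def pipeline(source, *filters):
    """Compose independent filters into a pipe; each filter can be
    reused, reordered, or removed without changing the others."""
    stream = iter(source)
    for f in filters:
        stream = f(stream)
    return stream
```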

3.5.2. When to Use this Pattern The processing required by an application can easily be decomposed into a set of discrete, independent steps. The processing steps performed by an application have different scalability requirements. Flexibility is required to allow reordering of the processing steps performed by an application, or the capability to add and remove steps. The system can benefit from distributing the processing for steps across different servers. A reliable solution is required that minimizes the effects of failure in a step while data is being processed. NOT: The processing steps performed by an application are not independent, or they must be performed together as part of the same transaction. NOT: The amount of context or state information required by a step makes this approach inefficient. It may be possible to persist state information to a database instead, but do not use this strategy if the additional load on the database causes excessive contention.

3.5.3. Related Patterns and Guidance Competing Consumers Pattern Compute Resource Consolidation Pattern Compensating Transaction Pattern

3.6. Runtime Reconfiguration Pattern

3.6.1. Design an application so that it can be reconfigured without requiring redeployment or restarting the application. This helps to maintain availability and minimize downtime.
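
The notification mechanism can be sketched in Python as below; the settings and listener shapes are assumptions, and the trigger would typically be an event raised by the hosting environment when the configuration source changes:

```python
class ReconfigurableApp:
    """Holds live settings; registered components are notified when
    settings change so they can apply new values without a restart."""
    def __init__(self, settings):
        self._settings = dict(settings)
        self._listeners = []

    def on_change(self, listener):
        # e.g. a component that re-reads its options when notified
        self._listeners.append(listener)

    def get(self, key):
        return self._settings[key]

    def apply_new_configuration(self, new_settings):
        # Called when the environment signals a configuration change.
        changed = {k: v for k, v in new_settings.items()
                   if self._settings.get(k) != v}
        self._settings.update(new_settings)
        for listener in self._listeners:
            listener(changed)
```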

3.6.2. When to Use this Pattern Applications for which you must avoid all unnecessary downtime, while still being able to apply changes to the application configuration. Environments that expose events raised automatically when the main configuration changes. Typically this is when a new configuration file is detected, or when changes are made to an existing configuration file. Applications where the configuration changes often and the changes can be applied to components without requiring the application to be restarted, or without requiring the hosting server to be rebooted. NOT: This pattern might not be suitable if the runtime components are designed so they can be configured only at initialization time, and the effort of updating those components cannot be justified in comparison to restarting the application and enduring a short downtime.

3.7. Static Content Hosting Pattern

3.8. Compute Partitioning Guidance

4. Management and Monitoring Patterns and Guidance

4.1. External Configuration Store Pattern

4.2. Health Endpoint Monitoring Pattern

4.3. Runtime Reconfiguration Pattern

4.4. Service Metering Guidance

5. Messaging Patterns and Guidance

5.1. Competing Consumers Pattern

5.1.1. Enable multiple concurrent consumers to process messages received on the same messaging channel. This pattern enables a system to process multiple messages concurrently to optimize throughput, to improve scalability and availability, and to balance the workload.
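
The mechanism can be sketched in Python with a pool of threads competing for messages on a single queue; the doubling step stands in for real message processing, and `None` sentinels shut the pool down:

```python
import queue
import threading

def consumer(work_queue, results, lock):
    while True:
        message = work_queue.get()
        if message is None:               # sentinel: no more work
            work_queue.task_done()
            return
        with lock:
            results.append(message * 2)   # stand-in for real processing
        work_queue.task_done()

def process_with_competing_consumers(messages, consumer_count=4):
    """All consumers read the same channel; whichever is free takes the
    next message, spreading bursts across the pool."""
    work_queue = queue.Queue()
    results, lock = [], threading.Lock()
    workers = [threading.Thread(target=consumer, args=(work_queue, results, lock))
               for _ in range(consumer_count)]
    for w in workers:
        w.start()
    for m in messages:
        work_queue.put(m)
    for _ in workers:
        work_queue.put(None)              # one sentinel per consumer
    for w in workers:
        w.join()
    return results
```

Note that the results arrive in no particular order, which is why the pattern suits independent, order-insensitive tasks.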

5.1.2. When to Use this Pattern The workload for an application is divided into tasks that can run asynchronously. Tasks are independent and can run in parallel. The volume of work is highly variable, requiring a scalable solution. The solution must provide high availability, and must be resilient if the processing for a task fails.

5.1.3. Related Patterns and Guidance Asynchronous Messaging Primer Autoscaling Guidance Compute Resource Consolidation Pattern Queue-based Load Leveling Pattern

5.2. Pipes and Filters Pattern

5.3. Priority Queue Pattern

5.3.1. Prioritize requests sent to services so that requests with a higher priority are received and processed more quickly than those of a lower priority. This pattern is useful in applications that offer different service level guarantees to individual clients.
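
The queue behavior can be sketched in Python with a heap; in practice this would be a message queue that supports priorities, or one queue per priority level with more consumers on the higher-priority queues:

```python
import heapq
import itertools

class PriorityQueue:
    """Lower number = higher priority; ties are served in arrival order."""
    def __init__(self):
        self._heap = []
        self._order = itertools.count()   # tie-breaker keeps FIFO within a priority

    def send(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._order), message))

    def receive(self):
        priority, _, message = heapq.heappop(self._heap)
        return message
```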

5.3.2. When to Use this Pattern The system must handle multiple tasks that might have different priorities. Different users or tenants should be served with different priority.

5.4. Queue-based Load Leveling Pattern

5.5. Scheduler Agent Supervisor Pattern

5.5.1. Coordinate a set of actions across a distributed set of services and other remote resources, attempt to transparently handle faults if any of these actions fail, or undo the effects of the work performed if the system cannot recover from a fault. This pattern can add resiliency to a distributed system by enabling it to recover and retry actions that fail due to transient exceptions, long-lasting faults, and process failures.

5.5.2. When to Use this Pattern Use this pattern when a process that runs in a distributed environment such as the cloud must be resilient to communications failure and/or operational failure. NOT: This pattern might not be suitable for tasks that do not invoke remote services or access remote resources.

5.5.3. Related Patterns and Guidance Retry Pattern Compensating Transaction Pattern Asynchronous Messaging Primer Leader Election Pattern

5.6. Asynchronous Messaging Primer

6. Performance and Scalability Patterns and Guidance

6.1. Cache-aside Pattern

6.2. Competing Consumers Pattern

6.3. Command and Query Responsibility Segregation (CQRS) Pattern

6.4. Event Sourcing Pattern

6.5. Index Table Pattern

6.6. Materialized View Pattern

6.7. Priority Queue Pattern

6.8. Queue-based Load Leveling Pattern

6.9. Sharding Pattern

6.10. Static Content Hosting Pattern

6.11. Throttling Pattern

6.12. Autoscaling Guidance

6.13. Caching Guidance

6.14. Data Consistency Primer

6.15. Data Partitioning Guidance

7. Resiliency Patterns and Guidance

7.1. Circuit Breaker Pattern

7.1.1. Handle faults that may take a variable amount of time to rectify when connecting to a remote service or resource. This pattern can improve the stability and resiliency of an application.
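
The state machine can be sketched in Python as below; the thresholds are illustrative, and a production breaker would also need to be thread-safe and to distinguish fault types:

```python
import time

class CircuitBreaker:
    """Closed -> Open after `failure_threshold` consecutive failures;
    while Open, calls fail fast without touching the service; after
    `reset_timeout` seconds one trial call (Half-Open) decides whether
    the circuit closes again."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None   # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            # Timeout elapsed: Half-Open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = self.clock()   # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None                   # success closes the circuit
        return result
```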

7.1.2. When to Use this Pattern To prevent an application from attempting to invoke a remote service or access a shared resource if this operation is highly likely to fail. NOT: For handling access to local private resources in an application, such as an in-memory data structure. In this environment, using a circuit breaker would simply add overhead to your system. NOT: As a substitute for handling exceptions in the business logic of your applications.

7.1.3. Related Patterns and Guidance Retry Pattern Health Endpoint Monitoring Pattern

7.2. Compensating Transaction Pattern

7.2.1. Undo the work performed by a series of steps, which together define an eventually consistent operation, if one or more of the steps fails. Operations that follow the eventual consistency model are commonly found in cloud-hosted applications that implement complex business processes and workflows.
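
The undo logic can be sketched in Python as below; the travel-booking steps are a hypothetical example, and in practice each compensation may itself need retries and idempotency:

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order; if an action fails,
    run the compensations for the completed steps in reverse order to
    restore eventual consistency, then re-raise the failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for _, undo in reversed(completed):
                undo()
            raise
        completed.append((action, compensation))
```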

7.2.2. When to Use this Pattern Use this pattern only for operations that must be undone if they fail. If possible, design solutions to avoid the complexity of requiring compensating transactions.

7.2.3. Related Patterns and Guidance Data Consistency Primer Scheduler Agent Supervisor Pattern Retry Pattern

7.3. Leader Election Pattern

7.4. Retry Pattern

7.4.1. Enable an application to handle temporary failures when connecting to a service or network resource by transparently retrying the operation in the expectation that the failure is transient. This pattern can improve the stability of the application.
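
The retry loop with exponential backoff can be sketched in Python as follows; `TransientError` is a hypothetical marker for faults worth retrying (timeouts, "server busy" responses), and the attempt counts and delays are illustrative:

```python
import time

class TransientError(Exception):
    """Hypothetical marker for short-lived faults worth retrying."""

def retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry on transient failures with exponential backoff; re-raise
    once max_attempts is reached. Non-transient exceptions propagate
    immediately, since retrying them would only waste time."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))   # 0.5s, 1s, 2s, ...
```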

7.4.2. When to Use this Pattern When an application could experience transient faults as it interacts with a remote service or accesses a remote resource. These faults are expected to be short lived, and repeating a request that has previously failed could succeed on a subsequent attempt. NOT: When a fault is likely to be long lasting, because this can affect the responsiveness of an application. The application may simply be wasting time and resources attempting to repeat a request that is most likely to fail. NOT: For handling failures that are not due to transient faults, such as internal exceptions caused by errors in the business logic of an application. NOT: As an alternative to addressing scalability issues in a system. If an application experiences frequent “busy” faults, it is often an indication that the service or resource being accessed should be scaled up.

7.4.3. Related Patterns and Guidance Circuit Breaker Pattern

7.5. Scheduler Agent Supervisor Pattern

8. Security Patterns and Guidance

8.1. Federated Identity Pattern

8.1.1. Delegate authentication to an external identity provider. This pattern can simplify development, minimize the requirement for user administration, and improve the user experience of the application.

8.1.2. When to Use this Pattern Single sign-on in the enterprise. Federated identity with multiple partners. Federated identity in SaaS applications. NOT: All users of the application can be authenticated by one identity provider, and there is no requirement to authenticate using any other identity provider. This is typical in business applications that use only a corporate directory for authentication, and access to this directory is available in the application directly, by using a VPN, or (in a cloud-hosted scenario) through a virtual network connection between the on-premises directory and the application. NOT: The application was originally built using a different authentication mechanism, perhaps with custom user stores, or does not have the capability to handle the negotiation standards used by claims-based technologies. Retrofitting claims-based authentication and access control into existing applications can be complex, and may not be cost effective.

8.1.3. Related Patterns and Guidance

8.2. Gatekeeper Pattern

8.2.1. Protect applications and services by using a dedicated host instance that acts as a broker between clients and the application or service, validates and sanitizes requests, and passes requests and data between them. This can provide an additional layer of security, and limit the attack surface of the system.

8.2.2. When to Use this Pattern Applications that handle sensitive information, expose services that must have a high degree of protection from malicious attacks, or perform mission-critical operations that must not be disrupted. Distributed applications where it is necessary to perform request validation separately from the main tasks, or to centralize this validation to simplify maintenance and administration.

8.2.3. Related Patterns and Guidance Valet Key Pattern

8.3. Valet Key Pattern
