In this talk, Richard discussed the AWS Well-Architected Framework and its associated design principles and best practices. His goal was to teach everyone how to conduct a Well-Architected review.
A Story About Risk
During Richard’s time in the military, he learned the importance of risk management in analyzing situations: how to assess the risks of an operation and develop plans, contingency plans, and further backup plans to deal with them.
So why is there no risk-centered plan for building applications in the cloud? For AWS, the Well-Architected Framework intends to fill that need.
The Well-Architected Framework contains practices on how to build, develop, and evolve within the cloud. As many now know, architecting applications for the cloud is wholly different from architecting them for the data center, and it calls for different approaches. The Well-Architected Framework will help you with this leap. Plus, it’s technology-agnostic, so it’s applicable to most situations.
Data Center vs. Cloud
Having a data center presence is like having a generator:
- There are upfront costs.
- You must ensure you have sufficient capacity.
- You must ensure you’re using enough capacity for it to be worthwhile.
- You need specialized knowledge about the generator and the system to keep it running.
In contrast, running in the cloud is like being on the power grid. In your house, you flip a switch and you have power. The cloud gives you that same level of flexibility.
There is some terminology that’s key to understanding cloud operations:
When designing a workload, start with the business value and work backwards to the people, processes, and technology. Better still, narrow the focus on business value: instead of looking at an entire web page, look at a specific modal or specific feature on the page.
Back in 2012, it was common for solution architects to carry around reams of practices and ideas to help ensure clients knew about and followed best practices. The Well-Architected Framework, released in 2016, replaced this homegrown set of documents. It has several pillars:
- Operational excellence is intertwined with the other pillars.
- Security is about how data is encrypted in transit and at rest, as well as how identities are secured. It also requires observability, so that you know the security controls work.
- Reliability is a workload operating as expected for the given situation. It should do so consistently.
- Performance efficiency is about knowing what needs to be fast and what needs to be optimized based on KPIs, rather than trying to make everything fast. Going fast has a cost.
- Cost optimization requires built-in cloud financial management within the organization to monitor and control costs.
Design Principles
Each pillar has its own unique principles. Plus, there are core principles.
The Well-Architected core principles that cut across all pillars are as follows:
- Don’t guess capacity needs. Rather, understand the needs.
- Test at production scale, not at the scale of a stripped-down test environment.
- Use automation to make time for important things like experimentation, instead of having people chase problems.
- Anticipate architectures changing in the future and account for that evolution over time.
- Make architectural decisions using data and not intuition.
- Structure processes and events to ensure effective training and preparation.
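The first and fifth principles above (understand capacity needs, and decide from data rather than intuition) can be illustrated with a minimal sketch. It is not from the talk: the function name, the sample load figures, the per-instance capacity, and the 25% headroom are all hypothetical assumptions chosen for illustration.

```python
import math
from statistics import quantiles


def instances_needed(samples_rps, per_instance_rps, headroom=0.25):
    """Estimate an instance count from observed load instead of a guess.

    samples_rps: observed requests-per-second measurements (hypothetical data).
    per_instance_rps: load a single instance can serve, e.g. from a load test
        (an assumed figure, not something the framework prescribes).
    headroom: extra fraction of capacity kept in reserve (assumed policy).
    """
    # quantiles(..., n=20) returns 19 cut points; the last one is the
    # 95th percentile of the observed samples.
    p95 = quantiles(samples_rps, n=20)[-1]
    return math.ceil(p95 * (1 + headroom) / per_instance_rps)


if __name__ == "__main__":
    # Hypothetical per-minute measurements of requests per second.
    observed = [80, 95, 110, 120, 90, 100, 130, 105, 98, 115]
    print(instances_needed(observed, per_instance_rps=50))
```

Sizing to a high percentile plus headroom is only one possible policy; the point is that the number comes from measurements that can be revisited as the workload changes.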
Key Pillar Principles
The most important question in the framework is from the operational excellence pillar and it’s this: how do you determine what your priorities are?
Related to the security pillar, the most critical question is how to classify, structure, store, and handle data.
Under reliability, the key question is the proper design of the workload service architecture: what are all the elements that must come together to deliver that service to the customer?
For performance efficiency, with a goal of optimizing in certain areas and leaving others alone, the question is how to manage tradeoffs to improve performance.
Cost optimization, another pillar, is often a function of how well a company turns off services that aren't needed. So its question reflects that central tenet: how do you ensure resources are decommissioned when they are not being used and, relatedly, how do you know when a workload can be retired entirely?
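As a rough illustration of the decommissioning question, the sketch below flags resources that have not been used for a given period. The `Resource` record, its fields, and the 30-day threshold are hypothetical; in practice the last-used data would have to come from whatever inventory or monitoring source your organization already has.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Resource:
    """Hypothetical inventory record; not an AWS data type."""
    name: str
    last_used: datetime


def decommission_candidates(resources, now, idle_after=timedelta(days=30)):
    """Return names of resources unused for at least `idle_after`."""
    return [r.name for r in resources if now - r.last_used >= idle_after]


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    inventory = [
        Resource("report-batch-server", datetime(2023, 12, 1)),
        Resource("web-frontend", datetime(2024, 1, 30)),
    ]
    # Only the resource idle for 30 days or more is listed.
    print(decommission_candidates(inventory, now))
```

A list like this is a prompt for a conversation with the workload owner, not an automatic shutdown decision.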
The pillars exist in tension with each other. For example, you can be faster, but it will cost more money. The Well-Architected Framework embraces these relationships. It keeps you informed about them and helps you make decisions based on what matters to you at that point in time.
Tradeoffs can change over time depending on the phase of the workload:
The Well-Architected Review
You should be running Well-Architected Reviews of your workloads all the time: before you build them, after you build them, and as you evolve them.
First, scope the workload. Then, identify the sponsor or owner for the workload. Lastly, use the Well-Architected Tool to walk through the review: answer the questions for each pillar, review the design principles, and document the decision points and tradeoffs that were considered and accepted.
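The steps above can be sketched as a simple record of what a review captures. This is a plain data model written for illustration, not the Well-Architected Tool's API; the class, method names, and example answers are all hypothetical.

```python
from dataclasses import dataclass, field

# The five pillars listed earlier in this summary.
PILLARS = (
    "operational excellence",
    "security",
    "reliability",
    "performance efficiency",
    "cost optimization",
)


@dataclass
class Review:
    """Hypothetical record of one review: scope, sponsor, answers, tradeoffs."""
    workload: str   # step 1: the scoped workload
    sponsor: str    # step 2: the sponsor or owner
    answers: dict = field(default_factory=dict)    # pillar -> [(question, answer)]
    tradeoffs: list = field(default_factory=list)  # accepted decisions, as notes

    def answer(self, pillar, question, response):
        """Record an answer to one question under a known pillar."""
        if pillar not in PILLARS:
            raise ValueError(f"unknown pillar: {pillar}")
        self.answers.setdefault(pillar, []).append((question, response))

    def record_tradeoff(self, note):
        """Document a tradeoff that was considered and accepted."""
        self.tradeoffs.append(note)

    def unanswered_pillars(self):
        """Pillars with no answers yet, in the order of PILLARS."""
        return [p for p in PILLARS if p not in self.answers]


if __name__ == "__main__":
    review = Review(workload="checkout modal", sponsor="payments team lead")
    review.answer(
        "operational excellence",
        "How do you determine what your priorities are?",
        "Quarterly planning with business stakeholders",
    )
    review.record_tradeoff("Accepted higher cost for lower checkout latency")
    print(review.unanswered_pillars())
```

Keeping the accepted tradeoffs next to the answers makes it easier to revisit them when the workload moves to a different phase.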
As you do Well-Architected Reviews, you’ll establish a feedback loop that carries you across all your workloads:
Conclusion
In this talk, Richard covered the Well-Architected Framework and its central principles and concepts, and he outlined how to get started with the framework immediately by doing Well-Architected Reviews.
This session was summarized by Daniel Longest. With over a decade in the software field, Daniel has worked in basically every possible role, from tester to project manager to development manager to enterprise architect. He has deep technical experience in .NET and database application development. And after several experiences with agile transformations and years spent coaching and mentoring developers, he's passionate about how organizational design, engineering fundamentals, and continuous improvement can be united in modern software development.