Scott McCarty
Contributor

How to identify and solve web-scale problems

feature
Nov 16, 2022 | 8 mins

The storied history of web-scale problems offers lessons for operators of increasingly complex IT environments.


Some problems are good to have... but they're still problems. A company that has web-scale problems is probably growing and innovating, but at a pace so rapid that the current infrastructure can't keep up. Adding to the challenge, companies don't always know that they even have a web-scale problem.

In this article I will discuss the origin and evolution of web-scale problems, how to determine whether you have a web-scale problem, and how container orchestration is the most elegant solution we've found to help organizations solve these problems.

Early warnings

We saw one of the first harbingers of web-scale worries in, of all places, the greeting card industry. For almost 100 years, greeting card companies in the United States hummed along, manufacturing and merchandising cards that would get taped to gifts, sent through the mail, and stuck on refrigerators. Then, in the mid-1990s, everything changed. It was the rise of the World Wide Web, and everyone wanted to be part of it. In 1996, Blue Mountain, American Greetings, and Hallmark all launched dot-com sites to serve e-cards, and a digital battle ensued.

I worked in the greeting card industry, and it was all about the holidays. Valentine's Day, Mother's Day, and Christmas are some of the happiest (and, not coincidentally, most lucrative) times of the year for greeting card companies. As business moved online, these major holidays became battlegrounds in the e-card space, where we blended the teachings of The Art of War (Sun Tzu) with The Mythical Man-Month (Fred Brooks) to craft state-of-the-art web infrastructure and win new digital business. (Today, we call this digital transformation.)

At first, e-cards were free. The goal was to attract users, not make money. For dot-coms, millions of users were worth millions of dollars in company valuations. Things were great for a while. Everyone was attracting new users. Soon enough, however, the dot-coms needed to make real money. This created both strife and opportunity.

When AmericanGreetings.com decided to start charging for e-cards, people didn't want to pay, so they flooded Hallmark.com. Hallmark couldn't handle the extra traffic, and its site crashed. People still wanted to send e-cards, so they went back to AmericanGreetings.com and paid to send them. This drove tremendous business for American Greetings, but, more importantly, it highlighted the competitive advantage of being able to handle not just web-scale traffic, but unpredictable web-scale traffic.

The business lesson we quickly learned was that web infrastructure could be an advantage in driving revenue.

The dawn of web-scale worries

Consumers at this time were warming to the idea of e-commerce, and servers powering small intranet and internet sites were being asked to perform web-based transactional processing at a scale no one had ever imagined. The servers, network equipment, storage devices, and internet pipes already in place couldn't handle the traffic, creating the first web-scale problems for companies doing business on the web.

At the time, there were no out-of-the-box solutions to solve these problems, so dot-coms had to build their own through, in my experience, lots of trial and error and a great deal of pain. Best practices for how to solve web-scale problems were collected and disseminated throughout the industry, as talented systems administrators and developers taught each other through social connections. Not every company had web-scale problems (it was mostly start-ups and dot-coms), but those that did started targeting this talent pool.

Web-scale problems go mainstream

Of course, purely transactional e-commerce is now table stakes. Companies have systems on premises, in the cloud, and at the edge, spread across multiple providers' platforms. And then there's the demand from customers for more powerful and more personalized applications, not to mention information in real time.

The scope and context of web-scale problems have changed, which, in many ways, makes them even more challenging to identify. Here is a list of questions to ask to determine if you have a web-scale problem in your business (and how big that problem really is):

  1. Do you have a double-sided marketplace with hundreds or thousands of users who purchase or consume resources, as well as tens or hundreds of IT professionals curating the services offered?
  2. Do you have scenarios where load on the system can change dramatically in a short period of time?
  3. Do you have hundreds or thousands of servers that are underutilized most of the time, but spike at other times?
  4. Do you collect data generated from thousands or millions of small devices or users?
  5. Do you have a workload that dramatically out-scales the capacity of a single box?
  6. Are you developing hundreds or thousands of services or microservices?

Did you say yes to any of these questions? Do you think you will say yes to any of these questions within the next three to five years?
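A rough way to put numbers on questions 2, 3, and 5 is to compare the servers a workload needs on an average hour with the servers it needs at its peak. The sketch below is only an illustration: the function name `capacity_profile`, the sample request rates, and the per-server capacity of 500 requests per second are all hypothetical values chosen for the example, not figures from any real system.

```python
import math

def capacity_profile(requests_per_second, per_server_capacity):
    """Summarize how many servers a load series needs on average and at peak."""
    average = sum(requests_per_second) / len(requests_per_second)
    peak = max(requests_per_second)
    return {
        # Servers needed if load were always at its average level.
        "servers_at_average": math.ceil(average / per_server_capacity),
        # Servers needed to survive the worst observed spike.
        "servers_at_peak": math.ceil(peak / per_server_capacity),
        # How spiky the workload is; a large ratio suggests idle capacity most of the time.
        "peak_to_average": peak / average,
    }

# Hypothetical samples: a quiet day with one holiday-style spike.
samples = [200, 250, 300, 4000, 350, 300]
profile = capacity_profile(samples, per_server_capacity=500)
print(profile)
```

If the peak figure is several times the average figure, or if either exceeds what a single box can serve, that is a hint that the answer to at least one of the questions above is yes.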

Solving web-scale problems elegantly

Back at American Greetings (and for years afterwards at other places), I solved web-scale problems with the software equivalent of shoestring and bubblegum. At the time, our team used a mix of open source and homegrown solutions to manage one of the largest websites on the internet. Using tools like Linux, Apache, and a homegrown CFEngine replica (yes, a replica), we were able to manage more than 1,000 servers and 70 applications with approximately three people (what most would call site reliability engineers nowadays).

These tools were great, and cutting-edge for the time, but the higher-level primitives we used to define clusters, network endpoints, and applications were all things we simply made up. We had to, because there was no standard way to imagine, define, and build web-scale applications in those days. Each company was left to invent primitives, and each team member had to learn them if they wanted to understand the system and build new applications or troubleshoot broken ones.

Early web scaling was akin to the earliest days of computers: You didn't learn a general-purpose operating system like Windows or Linux; you learned how to use one specific computer, such as Colossus or ENIAC. In those early days of web-scale computing, there wasn't much portability in the knowledge you had, although basic concepts (networking, load balancers, storage, web servers, and so on) applied.

After American Greetings, I worked at an ISP and web development company and solved similar problems for more than 70 different customers. That work helped me realize that there could and should be a standard way to solve web-scale problems. That's why I was excited beyond belief when I first saw Kubernetes. It changed everything: there was finally a standard way to solve web-scale problems.

A need for Kubernetes

At build time, Kubernetes and containers enable a standardized way to construct applications. Everyone can learn this way: Use Dockerfiles/Containerfiles, and commit them in Git. This standardized language for build management reduces cognitive load and makes the knowledge that SREs have portable to other systems within your organization and across organizations (making it easier to hire new people). It also makes it a lot easier to test applications before pushing them into production.

At run time, Kubernetes makes applications portable among different servers in the cluster, manages failover, handles the load balancers in the cluster, scales when traffic is heavy, and deploys pretty much anywhere, in the cloud or on premises. In fact, when people say they don't need Kubernetes, it's jarring for an e-commerce veteran like me to hear.
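To make "scales when traffic is heavy" concrete, here is a minimal sketch of the scaling rule described in the Kubernetes Horizontal Pod Autoscaler documentation: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The function below is a simplified illustration, not the real controller; the name `desired_replicas` and the `min_replicas`/`max_replicas` defaults are assumptions chosen for the example, and the real autoscaler also accounts for pod readiness, missing metrics, and stabilization windows.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified version of the documented HPA rule (illustrative only)."""
    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (hypothetical defaults here).
    return max(min_replicas, min(max_replicas, desired))

# Hypothetical readings: average CPU utilization in percent against a 50% target.
print(desired_replicas(4, 90, 50))   # heavy traffic: scale out
print(desired_replicas(4, 52, 50))   # within tolerance: no change
print(desired_replicas(4, 10, 50))   # quiet period: scale in
```

The point is less the arithmetic than the fact that the rule is standard: anyone who has learned it on one cluster can reason about scaling on another.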

My theory is that people who say they don't need Kubernetes don't realize they have web-scale problems. (And it's highly likely that they do.)

The Kubernetes project, in combination with the many open source tools designed to complement it, enables organizations to effectively meet web-scale needs. Notice I didn't say "easily meet." I'm not going to pretend Kubernetes is an easy lift, because it's not. But, remember, web-scale problems aren't easy, and almost everyone has one (or more) nowadays. Kubernetes has capabilities I never could have imagined when I was going crazy trying to prevent Valentine's Day from breaking my company's technological heart.

---

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Scott McCarty

At Red Hat, Scott McCarty is senior principal product manager for RHEL Server, arguably the largest open source software business in the world. Scott is a social media startup veteran, an e-commerce old timer, and a weathered government research technologist, with experience across a variety of companies and organizations, from seven-person startups to 12,000-employee technology companies. This experience has culminated in a unique perspective on open source software development, delivery, and maintenance.
