PWN Your Infrastructure: Behind Call of Duty: World at War - RailsConf

Below is the outline I’ve prepared for the talk.

In it, I cover two technologies, NFS and Monit, and how building our infrastructure around them from the ground up let us solve a number of scalability and convenience problems (specifically software installation, application deployment, server configuration, and server monitoring) on our virtualized network of several dozen machines.

I. Introduction

  A. Who is Agora?
  B. A Typical Rails Infrastructure
  C. Why Change It?

II. Using a Shared Filesystem (NFS)

  A. Local Software (Ruby, Gems, Administration Scripts)
  B. Application Directories (the entire app directory, not just shared/)
    1. Simplified Deploys
    2. Simplified Page Caching
  C. Configuring NFS
  D. Notes
    1. Web Server Should be the File Server
      i. Efficiency Concerns
      ii. Redundancy Concerns
    2. Servers/Clients Should Share a LAN
    3. Security Concerns
      i. File Ownership
      ii. Protocol Security

III. Deploys

  A. Capistrano Has Problems
    1. Synchronous Tasks (Across Machines)
    2. Failures Not Localized
    3. Limitations of Networking Create Failures (Especially as You Scale)
    4. Scaling Architecture Requires Scaling Other Services (SVN, etc.)
    5. Insecure By Default, Requires Shared User Privileges
  B. An Alternate Solution, Using a Shared Filesystem
    1. Simple Bash Deploy Script
    2. No Shared User Privileges Required
    3. Relies on Monitoring Daemon

IV. Monitoring

  A. Nagios/ZABBIX/Cacti/Monit/God/Munin Not Good Enough
    1. None do Everything You Need - Process Monitoring, Usage Graphing, Server Configuration
    2. Several Consume Excessive Resources
    3. Combining Solutions (Monit + Munin) Still Insufficient, and Hackish
  B. Overlord
    1. Rails App, Configures Monit, NFS
    2. Monit Controls Everything Else (Server/Process Monitoring, Alerting, etc.)
    3. Reading/Reporting Status From Monit (Undocumented XML Feeds)
    4. Graphing: RRDTool is Awesome

V. Conclusion, Questions