RSC Policy Brief: Three Myths about Copyright Law and Where to Start to Fix it

The RSC recently published a startlingly enlightened paper on fixing copyright law in the United States. However, that apparently didn’t go over too well with Big Media, and the paper disappeared from official sources earlier this month, not long after its original publication. Like many Americans, I think that’s just plain sick. So I’m joining the many people in the internet community who want to “back up” this piece of work. While I may not agree with everything in it, I think it’s a great starting place for repairing the obviously broken copyright system, and I stand in defiance of such ideas being suppressed at the behest of media companies by putting a “backup” here.

Here’s a link to the file:

rsc_policy_brief_–_three_myths_about_copyright_law_and_where_to_start_to_fix_it_–_november_16_2012

Gentoo Is Weird

So at Engine Yard we use Gentoo for our Linux distro. I find how it manages its init.d scripts a little weird.

Say you have a process, and /etc/init.d/script status reports that it’s not running. But ps says it is. Weird, right?

Gentoo keeps a directory at /var/lib/init.d/started that contains symlinks to each running service’s init.d script. So when you run /etc/init.d/<service> status, it checks for that symlink. If the symlink isn’t found for whatever reason, the service is reported as down even though it’s actually running.

You can just put the symlink back in place to fix it, but I thought this was odd enough to document somewhere for posterity.
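For what it’s worth, here’s a rough Ruby sketch of the check and the fix described above. The helper names and the “nginx” service are made up for illustration and aren’t part of Gentoo’s tooling. The directories are passed in as parameters (on a real box they’d be /etc/init.d and /var/lib/init.d/started), and the demo runs against a throwaway temp directory rather than the real thing.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: mirrors the status check described above by looking
# for a symlink named after the service in the "started" directory.
def marked_started?(service, started_dir)
  File.symlink?(File.join(started_dir, service))
end

# Hypothetical helper: puts the symlink back so that "status" would agree
# with ps again.
def mark_started(service, init_dir, started_dir)
  link = File.join(started_dir, service)
  File.symlink(File.join(init_dir, service), link) unless File.symlink?(link)
  link
end

# Demo against a temp directory instead of the real /var/lib/init.d/started.
DEMO_RESULT = Dir.mktmpdir do |root|
  init_dir    = File.join(root, "init.d")
  started_dir = File.join(root, "started")
  FileUtils.mkdir_p([init_dir, started_dir])
  FileUtils.touch(File.join(init_dir, "nginx")) # "nginx" is just an example name

  before = marked_started?("nginx", started_dir)
  mark_started("nginx", init_dir, started_dir)
  after = marked_started?("nginx", started_dir)
  [before, after]
end

puts DEMO_RESULT.inspect
```

The demo reports the service as not started before the symlink exists and as started once it has been put back.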

Technical Ethics

A recent article over at InfoWorld about various privacy concerns got me thinking today about differences between my (generalized) profession – Engineering – and that of medicine.

The aforementioned article details how some companies are using technology to essentially spy on unsuspecting citizens simply for the sake of profit. Among the ideas mentioned are “wellness incentives” programs to use technology to track whether or not you eat right, exercise, take your medication as prescribed, and so on. Another idea is that with a tracking device in your car, you could be eligible for an insurance “discount” if you drive “safely” in the eyes of an insurance company.

All these initiatives have one major flaw: it’s none of The Man’s goddamn business.

At the end of the day, however, this technology is being used for one reason alone: because other engineers are out there building it.

Profit-seeking entities, especially these days and especially in the United States, where we have an “anything goes” regulatory body in charge, can and will do anything they possibly can to get more money in the bank – ethics be damned.

This got me thinking about how different things are in the medical profession. In medicine, doctors not only take medical ethics courses in school, they also face ethics reviews and ethics panels that judge whether or not a physician acted in the best interest of patients. In other words, there are real laws, real consequences, and real enforcement if a physician fails to act in the best interest of his or her patients.

For example, look at the doctor who was recently convicted of basically killing Michael Jackson. Whether or not I agree with the ruling is irrelevant – the physician was prosecuted for acting in a medically unethical manner.

Imagine how the world would change if we could apply similar restrictions to engineering.

I’m not talking about the concept of “forbidden technology” so much as the idea that we, as a society, should have some system of technology ethics review that would hold engineers accountable for creating software or systems that could be used to cause harm to other people, invade their privacy, steal their money, or worse. We should also hold the companies who finance the development of such things accountable, demanding extremely harsh penalties of companies who even so much as ask an engineer to develop privacy-infringing or other negative technology.

Understand me here: I’m not saying that we have a slew of unethical engineers out there today. In fact, the opposite is true – most of the engineers I know and have met are extremely upstanding people who would do everything in their power to prevent giving their employer such “evil” levels of control over customers or other stakeholders.

However, the truth is that at the end of the day, we have no one to turn to when a company forces us, often against our will, to create technology that will be used for nefarious purposes.

I used to work at a small cell phone company. One day, I heard an employee who’d been there much longer than I had tell a story about how the owner of the company (it was small and privately held) asked the chief of network operations to enable technology to spy on his employees, so he could listen in on the phone calls that his employees were making with their company phones.

Some may say this is reasonable – company phone for company business only, right? Wrong. Those phones were provided as a perk of working at the company, and with the expectation that the individual could use the phone and service as his or her own primary personal and business phone (within reason: no six-day-long nonstop phone calls).

Thankfully, the gentleman who was told to do this had a spine: he told the company owner that he would absolutely not allow that, and didn’t care if he was fired because of it.

But let’s be frank: most engineers who get put in similar positions don’t have the available “fall back plan” or latitude to tell an employer NO when they’re commanded to create technology that will give that employer too much control, or too much information.

We may also not see, when certain technology is created, how that technology may be later used for nefarious purposes. For example, you may implement logging data in an application and not realize that later that data could be linked back to specific individuals through foreign keys in your database structure.

In addition to the need for a strong center of technical ethics upon which engineers generally agree, we need a regulatory body or system of law that can protect us from unemployment if we choose to stand up for those ethics.

Right now, if an employer comes to an engineer and says, “build me a system that will track a user’s every movement via the GPS in their phone whether it’s on or not, and if you tell anybody about this I’ll not only fire you, but sue you and possibly do ‘other bad things’”, what is that engineer going to do? Say no? And then what – look for another job when he’s already been blackballed in the industry for having a sense of honor?

We need a regulatory body that can, and will, stand up for the same ethics that most engineers already have and hold dear. We need some way to say “NO!” with impunity.

Imagine how insane it would be to ask a doctor to give a patient the wrong medicine so that the patient – who happens to have fantastic insurance – would be in the hospital longer, thus creating additional profit. The hospital administrator’s goal isn’t to harm the patient, but to use them as a cash cow. Wouldn’t this be far outside the realm of anything considered ethical? If the doctor said “no”, without medical ethics panels and laws surrounding that, he’d be fired and another doctor who was willing to compromise those principles would be hired, thus putting the patient’s health at risk (despite the fact that they aren’t seeking to harm the patient, just ‘use’ them).

Laws surrounding this give a physician a way to say “no” without risking his or her job. We, as engineers, need the same thing – some form of authority that can and will review and enforce technical ethics in corporate and governmental areas, where if we say “no”, that employer CAN’T hire someone else who will say yes – because there won’t be anyone else, and because we could call that employer out for clear ethics violations, which would carry some very heavy penalties.

What I’m suggesting here is that not only do we, as a global engineering community, need to agree on some basic ethics that we hold as sacred and never violate, but that we also need governmental assistance to fall back on, so that when an employer asks us to violate these ethics, they know that our answer has to be no, and therefore won’t toss us out for having a sense of honor.

What should these principles be? Well, that’s up to you, but here are mine:

  • I’m not going to keep any customer data that I don’t absolutely have to. This prevents the possibility that such data would be leaked, hacked, sold or used to surreptitiously track the user. (It also makes for more efficient systems in general.)
  • I’ll keep any feature ideas that seem likely to be twisted for nefarious means to myself.
  • I’ll do everything I can to protect user information from future misuse.
  • I will do everything I can to build secure, reliable systems that prevent tampering and/or hacking so that user data stays confidential.
  • I’ll keep a watchful eye for insecure processes within the company and point these out to management, and lobby for correcting those processes.
  • Perhaps most importantly, I currently, and always will, put the importance of individual privacy above the importance of a paycheck.

Password Rules SUCK!

Everyone and their dog has probably seen password rules before. The annoying little notice that says, “you must have …” and then goes on to list a litany of criteria, all in the name of making a password “more secure”.

Tonight I ran into this with GitHub, and I’m finally pissed off enough that I’m going to say something about it:

PASSWORD RULES SUCK.

There, I said it.  I’m sure some wannabe security “expert” is out there right now flipping his shit, but the above is simply true.  This post details why.

Password rules do nothing to enhance security.

The idea behind establishing a litany of criteria for a password, such as, “you must have at least one uppercase letter, one lowercase letter, one number, and one special character” is that with a non-dictionary based password (i.e., a bunch of gibberish), it wouldn’t be all that easy to brute-force and crack.

However, people who need this kind of “enforcement” in the first place are, by definition, ignorant about security.  If you understand security, you understand how brute-forcing a password works, and therefore don’t actually need this kind of hand-holding.  Those who don’t understand how security works, or even understand the concept of brute-forcing a password, are likely to then write down the gibberish random password they’d have to come up with anyway.  And the truth is, a password written down is every bit as insecure as a password that’s brute-forceable, because they can both be cracked through well known means (i.e. look at the paper or fire up your brute force cracker of choice).

On the other hand, those of us who don’t need this kind of hand holding get penalized.  They say the best password is the one that not even YOU can remember.  That’s bullshit - what’s the point of having a password if you can’t even remember it to use it?  I say, “the best password is the one ONLY you can remember.”

But then we have this strange problem: how do you create a password that adheres to all the rules, is secure, and is memorable?  Short answer: you don’t.  Instead, you wind up entering it into a password manager or writing it down, and as mentioned above, that’s not as secure as something you keep only in your brain (issues of kidnap, torture, or crazy trips on sodium pentothal notwithstanding).

How to create a GOOD password.

In my book, a good password meets these criteria:

  1. Far too long to brute force
  2. Wide variance of characters, including spaces and punctuation
  3. Multiple cases of letters
  4. Easy to remember

What meets all these criteria?  A sentence.  I’ve been a proponent of creating sentences and/or key phrases for passwords for several years, and I’ve never seen that philosophy steer me wrong.  However, password rules that disallow spaces, punctuation, etc. and force you to use letters, numbers, and !@#$%^&*() (which are not punctuation, but are considered “special characters” in most cases) inherently gimp your password’s security by limiting what you can input.  That creates a situation where you are forced to use some bass-ackwards crazy combination of stuff that makes no sense and, therefore, requires you to memorize it or write it down.

“What’s wrong with memorizing it? You’re just bein’ lazy!”

Not necessarily.  Consider, for the moment, the fact that most web applications have some form of password rules similar to what I’ve outlined here.  Also consider that for real security, you need a different password for every single account you own.  There’s no way any human being is going to be able to memorize several hundred weird crazy passwords like that.  So what happens is that they usually figure out a password that tends to meet most criteria, then use it everywhere.  The guys at LulzSec love people like this.

So, what’s the answer?  Create a SENTENCE for a password that has something to do with the service you’re logging into.  Rely heavily on personal interpretations and neuro-associations with the subject matter on said website, the more personal (and embarrassing) to you, the better.  The reason for this is that the more personal/embarrassing your neuro-association to the subject matter at hand, the less likely you’ve ever told any other person about the event, action, mishap, thought or idea that occurred.  And that’s what we want – total privacy that only you know.

In my experience, I can create a totally different password for each service I have an account with by creating a sentence-based password (when rules don’t bite me in the ass, that is) for each.  It allows me to memorize passwords easily because I base my phrase or sentence off the name of the company or product/service they offer.  I’m not copying their name into the password, or anything like it, but a concept that’s related.  I find this works well for me because my brain has some VERY weird neuro-associations.

“Well, companies limit password length so they can save space in their database.”

If they’re doing this, they’re storing passwords DEAD WRONG – another thing guys like LulzSec absolutely love.  Storing passwords using reversible encryption, or even worse, plain text, is one way to absolutely guarantee that said passwords will, with complete certainty, one day be divulged.

When storing a password in a database, the software doesn’t have any reason to try remembering what said password was, because it can calculate a cryptographic hash of that password – one that can’t ever be reversed – and simply store that hash.  Then, when the user logs in again later, the password they enter is put through that same hashing process again, and then compared with the hash on file.  If the hashes match, the login is correct.  Off by even the slightest bit, and it’s a total failure.
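Here’s a minimal sketch of that hash-then-compare flow, using only Ruby’s standard library. A few things in it are my own assumptions rather than anything prescribed above: I’ve picked PBKDF2 (via OpenSSL::PKCS5.pbkdf2_hmac) as the hash, added a per-user random salt, and chosen the iteration count more or less arbitrarily. The method names are hypothetical.

```ruby
require "openssl"
require "securerandom"

ITERATIONS = 20_000 # assumed work factor, chosen only for illustration
KEY_BYTES  = 32     # length of the derived hash in bytes

# Derive a one-way hash from the password and a per-user salt.
def hash_password(password, salt)
  OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, KEY_BYTES,
                             OpenSSL::Digest.new("SHA256"))
end

# What gets stored at signup: the salt and the hash, never the password.
def signup(password)
  salt = SecureRandom.random_bytes(16)
  { salt: salt, hash: hash_password(password, salt) }
end

# At login, hash the attempt the same way and compare it with what is on file.
def login?(record, attempt)
  hash_password(attempt, record[:salt]) == record[:hash]
end

record = signup("Now is the time for all good men.")
puts login?(record, "Now is the time for all good men.") # same sentence
puts login?(record, "Now is the time for all good men!") # one character off
```

The first check succeeds and the second fails, even though the two attempts differ by a single character. A real application would more likely lean on a vetted password-hashing library than on a hand-rolled helper like this one.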

The truth is that once these passwords are stored in hashed form, they’re all the same length.  A password of one character (the letter ‘a’ for example) takes up just as much room as a password comprised of the totality of War and Peace.  The only difference is that the longer the password is, the more CPU cycles are spent on calculating the cryptographic hash to store and compare.  However, in just about any case, a password the length of the average sentence (so, say, under 200 characters), is going to be just fine and won’t cause any serious impact with respect to application latency.
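The fixed-length claim is easy to check for yourself. The snippet below uses SHA-256 from Ruby’s standard library purely to show the length property; that choice is mine and isn’t a recommendation for how to store passwords. The long string is just repeated filler standing in for a novel-sized password.

```ruby
require "digest"

short_password = "a"
long_password  = "war and peace " * 200_000 # filler standing in for a very long text

short_hash = Digest::SHA256.hexdigest(short_password)
long_hash  = Digest::SHA256.hexdigest(long_password)

puts "input sizes: #{short_password.length} vs #{long_password.length} characters"
puts "hash sizes:  #{short_hash.length} vs #{long_hash.length} characters"
```

Both hashes come out at 64 hexadecimal characters, so the database column holding them never needs to grow with the password.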

Which is easier to brute force?

Consider the following passwords: which is easier to crack?

  • c%Xd98F2
  • Now is the time for all good men to come to the aid of their country.

The first password is 8 characters long.  If one assumes that each character can be anything UTF-8 is able to encode (1,112,064 valid code points), the total number of possible combinations is (1,112,064 ^ 8).

However, as experience has shown, that’s usually not the case.  Most of the time, passwords can consist of approximately these characters:

  • A-Z, a-z
  • 0-9
  • !@#$%^&*.

Depending on the service, others may be available as well, but in my experience, this seems to be the most common set.

Looking at this list, we see (26 * 2) + 10 + 9 = 71 characters.  26 (* 2 for both cases) for the English alphabet, 10 for digits 0-9, and 9 for the most common special characters.  Spaces are not included.

Now that we have this new data, we can say that the total number of possible combinations for the first password is (71 ^ 8), far less than if all characters were allowed.  Not only that, but chances are, a person forced into that kind of password system is going to have that password written down on a sticky note either attached to their monitor or in an adjacent desk drawer.

Look at the second password example.  That sentence was a phrase my grandfather used to teach me how to type (on a typewriter) when I was a school kid in junior high.  Assuming that the password can be anywhere from 1 to 200 characters long, and can use any series of characters even in just the narrow (by comparison to UTF-8) character set of ISO-8859-1, we come up with roughly (224 ^ 200) for the longest passwords alone (assuming the reference for the ISO-8859-1 character table I looked at was correct).  Even at the sentence’s actual length of 69 characters, that’s (224 ^ 69).  This goes even further – by far – if all characters under UTF-8 are fair game, which in my view, they should be.
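If you’d rather see the arithmetic than take my word for it, here’s a quick sketch. It uses the same figures as above (71 characters for the restricted set, 224 for ISO-8859-1) and only counts raw combinations at a fixed length, which is a simplification. The method name is made up for the example.

```ruby
RESTRICTED_CHARSET = (26 * 2) + 10 + 9 # the 71 characters counted above
LATIN1_CHARSET     = 224               # the ISO-8859-1 figure used above

sentence = "Now is the time for all good men to come to the aid of their country."

# Number of candidate passwords of exactly `length` characters.
def combinations(charset_size, length)
  charset_size**length
end

gibberish_space = combinations(RESTRICTED_CHARSET, 8)
sentence_space  = combinations(LATIN1_CHARSET, sentence.length)
maximum_space   = combinations(LATIN1_CHARSET, 200)

puts "8 characters from a set of 71: #{gibberish_space}"
puts "sentence length:               #{sentence.length}"
puts "digits in (71 ^ 8):            #{gibberish_space.to_s.length}"
puts "digits in (224 ^ length):      #{sentence_space.to_s.length}"
puts "digits in (224 ^ 200):         #{maximum_space.to_s.length}"
```

The gibberish password’s search space is a 15-digit number (645,753,531,245,761, to be exact). The sentence’s is a number with more than 160 digits. Ruby’s integers are arbitrary precision, so neither figure overflows.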

“But without password rules, people would get hacked ALL THE TIME!”

That’s right, they would.  Anyone who’s worked with end users (read: computer illiterates) knows that when it comes to security, the vast majority of them see it as a massive hurdle, thinking it’s just there as a formality to make their lives harder.  Some go so far as to say, “oh nobody wants any of my account details – I’m not on the radar.”

I’m sure that’s what these people thought at one point in time, too.

The bottom line is that it’s going to take a serious security breach for people with this mentality to shape up.  They don’t get it because they don’t want to get it – not because it’s all that hard.  They aren’t paying attention and think, “oh it can’t happen to me”.

By coddling these users with password rules, we’re just delaying the inevitable, and not even doing a good job at it. Better to get it over with, let the user learn their lesson (since they haven’t been paying attention anyway), and stop burdening the rest of us who know better with “security” models designed during the Reagan administration.

Introducing Vanities for Rails 3

When working on a recent “top secret” project, I decided that my users needed vanity URLs. So, a user would have a “vanity name” of “foo”, and then requests to http://example.com/foo would need to redirect to that user’s profile (i.e. http://example.com/users/1). So I built a simple interface for doing that.

Then I thought, “hey, somebody could find this useful as a gem.” So I made one.

Vanities is a Rails 3 gem that gives you a quick and easy way to set up a model-agnostic system for vanity URLs throughout your Rails app.

It’s available as a gem via rubygems, and open source through GitHub.

A Quick Walkthrough

Vanities is meant to be really simple and easy to use.  It’s not at all a groundbreaking, revolutionary concept or anything, merely a convenience for what could otherwise be a pain-in-the-ass kind of setup.  Ergo, using it is a piece of cake.

Let’s start by creating a new Rails 3 application.  This application is going to be a very (and by very I mean “extremely”) simple scaffold of users.  We won’t be bothering with authentication or encrypted passwords, etc. in this example, as it’s just an example.  But in reality, of course you’d have all that stuff.

Start by creating a new Rails application:

rails new mytestapp
cd mytestapp

The first thing we need to do is tell Bundler to use Vanities.  Open your Gemfile and add this code:

gem 'vanities'

Next, do

bundle install

Next, create a User scaffold.  This is what will allow us to see Vanities working.

rails g scaffold user

Now, open the database migration for User and make it look like this:

class CreateUsers < ActiveRecord::Migration
  def self.up
    create_table :users do |t|
      t.string        :name
      t.timestamps
    end
  end

  def self.down
    drop_table :users
  end
end

Next, open app/views/users/show.html.erb and make it look like this:

<p id="notice"><%= notice %></p>

<h1><%= @user.name %></h1>

<%= link_to 'Edit', edit_user_path(@user) %> |
<%= link_to 'Back', users_path %>

Now, type this in your terminal:

rails g vanities

This installs Vanities for you, including the route and controller responsible for making it all work.  There’s only one more minor bit of code left to write – the has_vanity method.  Open app/models/user.rb and make it look like this:

class User < ActiveRecord::Base
  has_vanity
end

Finally, go back to your terminal and migrate your database:

rake db:migrate

Setting Up a Fake User

Now that we have Vanities set up, let’s create a fake user to illustrate.  Issue the following commands in your terminal:

rails c
u = User.new(:name => "J. Austin Hughey") # your name is fine here, too
u.vanity = Vanity.new(:name => "jah") # you can use your initials here if you want
u.save

We just used the Rails console to create a user and a vanity for it, then saved it.  This is necessary for our next step:

rails s

Now, go to
http://localhost:3000/jah
in your browser.  You’ll be automatically redirected to the user’s view page.  Welcome to Vanities! :)

HDD – Hack Driven Development

The modern web applications development “industry” (if you will) seems to be a dizzying array of buzzwords like “Agile” or “XP” or “pair programming”, etc. “Test Driven Development” and “Behavior Driven Development” are all the rage these days, and for good reason – they’re excellent approaches to getting things done. But I’d like to expand on these notions by introducing another initialism – HDD, or “Hack Driven Development”.

For the purposes of this discussion, let’s clarify what I mean by “hacker” or “hack”. These days it’s common to refer to someone who may not employ all the latest and greatest “best practices” as a hack(er), implying that s/he doesn’t know what they’re doing. This is NOT what I’m getting at.

My use of the word “hack/er” is meant in the pseudo-original context of a malicious user looking for vulnerabilities to exploit for his or her own gain.

Embrace Your Ignorance

It’s been said that a wise man has more questions than answers. Ergo, the wise man recognizes his own ignorance and embraces it to make himself a wiser man. I recommend following a similar approach with respect to HDD.

You see, when developing a web application (or any application, really), it’s common to focus on what a coworker of mine once referred to as “the happy path” – when everything works as it’s expected to. But what happens when things don’t work as you expect?

In any logically designed system, multiple – possibly infinite, if one looks deeply enough – potential points of failure exist. This is true of everything from governmental policy all the way down to the simplest “Hello World” program written in high schools across the nation.

With all such systems, the origin of any such failure is always in some way related to its implementation by a human being. A fallible being cannot logically be expected to create an infallible system. Thus, we have to consider what would happen when things don’t go according to plan. This is where, I find, many traditional developers fail miserably in their thought process. Instead of recognizing that their class could have an implementation error, or recognizing that DoS attack vectors could exist when blindly passing input to an underlying library, they suddenly abandon logic in all forms and say, “awww heck, that can’t happen to me!”

By accepting the fact that there are infinite unknown unknowns within any given system – anything from the incredibly complex IRS tax code, all the way down to the word “hi” written on a sheet of paper – developers can begin to uncover possibilities that could cause their application to crash and burn. This lays the foundation required to re-train someone to think like a hacker.

Think Like a Hacker

The entire point of TDD/BDD is to express the behavior of the application via formalized, well documented and spelled-out tests, to ensure that the developer knows when s/he is done, does not do more than is required, and that the application can continue to behave as expected later on down the road, as more functionality is introduced. Re-running the tests at a later point can show where existing functionality is broken within the application, alerting the developer to incompatibilities between planned releases and existing application logic.

Hack Driven Development is meant as a supplement to this process. In addition to expressing the behavior of an application, class, etc. via tests, one should also express the behavior of that same unit when a hack is attempted.

To accomplish this, the developer has to learn how to think like a hacker. Most developers who have been in the industry for even a small amount of time know what OWASP and its Top 10 list are, but unfortunately, many of them have no idea how to make those things actually happen in a real world, live application.

For example, many developers know how to protect against SQL injection, and tools like ActiveRecord and its various equivalents in other languages help make things easier. But do they know how to actually make a SQL injection query successful against a vulnerable codebase? If not, how can they be reasonably sure that their application isn’t vulnerable?

“Well,” I can hear someone saying, “ActiveRecord protects against that by default.”

Not if you don’t use it right. Observe:

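# A very basic AR class in Ruby:
class Post < ActiveRecord::Base
end

# Looking up an existing record the vulnerable way: the raw request
# parameter is interpolated directly into the SQL conditions string.
@post = Post.find(:first, :conditions => "id = #{params[:id]}")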

This implementation is vulnerable to a very basic SQL injection query even though it uses ActiveRecord, because the find method has several different ways it can be used – this particular syntax happens to accept the string passed in literally. In this case, the programmer is literally injecting whatever is in the ID parameter of the request – be it GET/POST/PUT/DELETE or other – directly into the query string.

Although this is an overly simplistic example, it shows how one must learn to think like a hacker in order to identify these kinds of issues. If one was not actively looking for ways to break the application, the above query would likely be glossed over.
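To make that concrete without needing a database, here’s a small plain-Ruby sketch of what the attacker gets to do with an interpolated string, and of one way to shut the door. The method names are hypothetical and the integer check is just one option. Inside a Rails app, the more usual fix is to let ActiveRecord bind the value for you, e.g. Post.find(params[:id]) or Post.where("id = ?", params[:id]).

```ruby
# Vulnerable: the request parameter is pasted straight into the SQL fragment.
def unsafe_condition(id_param)
  "id = #{id_param}"
end

# Safer: insist that the parameter is a base-10 integer before it gets
# anywhere near SQL. Integer() raises ArgumentError for anything else.
def safe_condition(id_param)
  "id = #{Integer(id_param, 10)}"
end

attack = "1 OR 1=1"

# The attacker's SQL rides along inside the condition.
puts unsafe_condition(attack)

# Think like a hacker: try the same attack on the safer version and
# check that it does not get through.
begin
  safe_condition(attack)
  puts "attack got through"
rescue ArgumentError
  puts "attack rejected"
end
```

The unsafe version happily produces a condition that matches every row in the table. The safer version refuses to build a condition at all.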

A better example

Given the following model and controller setup in a Rails application, there are at least four gaping security holes that can be addressed rather easily.

class User < ActiveRecord::Base
  # Assume various authentication methods are inside this model
  # and are working correctly, simply for the sake of brevity and sanity
  # in explaining this concept.
end

class Post < ActiveRecord::Base
  belongs_to :user
end

class PostsController < ApplicationController

  def index
    @posts = Post.all
  end

  def new
    @post = Post.new
  end

  def create
    @post = Post.new(params[:post])

    if @post.save
      flash[:notice] = "Post saved!"
      redirect_to @post
    end
  end
end

Can you spot them all? If not, here they are:

  1. No authentication filter on the PostsController class. Needs a before_filter that specifies some authentication step that the user should be forced through on everything except possibly the index action. In case you’re thinking, “oh that wouldn’t happen in the real world with experienced developers!”, think again – this example is taken directly from a codebase the licensing of which costs millions of dollars, and the architects of which include some very talented and intelligent people, including speakers at various Ruby conferences and contributors to Rails itself. Even the best can make honest mistakes.
  2. The index action calls Post.all. This is potentially, especially when combined with point #1, a DoS vulnerability. Some may say, “oh that’s a performance problem” – I say “tomato”, you say “tah-mah-toe”. “Performance issue” is the Fisher-Price/Playskool way of saying “DoS vulnerability”. The fix is the same in both cases – properly scope the find statement via pagination or with some form of limit on the number of records returned.
  3. In the create action, the instance variable @post is being created blindly, based on the parameters passed in. Note that a post belongs_to a User object. In Rails’ convention(s), this means that the Posts table has a user_id foreign key column defined for each entry. This means that I can arbitrarily assign a post to belong to any user I want, effectively performing a pseudo-injection of code. It may not be as apparent or dangerous-looking on the surface as a raw SQL injection hole, but it can wreak just as much havoc when thought of in a legal context; “spoofing” someone’s words can get them in a lot of trouble, especially if what they’re saying is of particular threat to a given set of people (think about political speech in countries without a “Freedom of Speech” law, like China, or about how this could cost a politician an election by having racist remarks show up on his/her blog; whether it’s “real” or not doesn’t matter to a sensational press and media who will have a field day with things like this).
  4. Look at how the @post object is saved. It’s logically wrapped in an “if” block, meaning that if the save fails, the contents of the block aren’t executed by the interpreter. But nothing handles the failure case – there’s no “else” clause. This essentially “breaks” the application’s flow – it simply does nothing after the save call fails. Depending on how your server is configured, this could do anything from displaying an error – that could contain technical information, like the version of your Apache or Nginx installation, for example – to possibly nothing at all. Some wouldn’t consider this a security issue, but I do, because in the first hypothetical (error message with technical info), data about your server’s configuration is divulged to an attacker, who can use that information to more easily profile your server configuration. The second hypothetical – showing the user a blank page – isn’t as much of a security issue, but it’s still quite worthy of attention.
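Point 3 is probably the least obvious of the four, so here’s a rough plain-Ruby sketch of the idea behind the fix: only accept the attributes a visitor is supposed to set, and take the post’s owner from the logged-in session instead of from the request. The attribute names (title, body), the helper name, and the user IDs are all assumptions for the sake of the example. In a Rails 3 app you’d typically reach for attr_accessible on the model rather than rolling your own filter.

```ruby
# Attributes a visitor is allowed to set on a post (assumed for this sketch).
PERMITTED_POST_KEYS = %w[title body].freeze

# Build the attributes for a new post: keep only the whitelisted keys from
# the request, and take the owner from the logged-in user, never the request.
def post_attributes(params, current_user_id)
  params.slice(*PERMITTED_POST_KEYS).merge("user_id" => current_user_id)
end

# A forged request: logged in as user 42, but trying to attach the post
# to user 1 and sneak in an extra attribute along the way.
forged = { "title" => "Hello", "body" => "World", "user_id" => 1, "admin" => true }

puts post_attributes(forged, 42).inspect
```

The forged user_id and the stray admin key never make it into the resulting attributes. The post ends up owned by user 42, who actually made the request.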

Down the infinitely deep rabbit hole

If you look far enough, you can find potential security issues and “what ifs” in literally every portion of any application or logically designed system. The task, therefore, isn’t necessarily to define all of these “what if’s”, but the most important ones that are most likely to be vulnerabilities that invite attack. Thankfully, research by the Open Web Application Security Project (OWASP), called the “OWASP Top 10”, tells us what these most likely vulnerabilities are:


http://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project

So how does HDD play into this concept?

Developers are practicing (T|B)DD more and more with each passing day. I recommend that, in addition to defining an application’s behavior through tests, a developer also add to their existing programmatic tests attempts to hack their system by throwing all kinds of tainted input at it and accessing it in unusual ways.

The general rule of thumb I follow is, “if it occurs to me, it needs a test.” In other words, while writing code, my mind automatically tends to identify various potential ways to “break” things. This is likely a result of my focus on items like this for many years, but others can practice this by making an explicit effort to examine their code for ways to attack their system(s). After the developer identifies possible areas for attacks to occur, s/he should add functional/unit/integration/etc. tests to the project that attack the application as conceived, and then assert that the attack did not succeed.
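Here’s a bare-bones sketch of what such a test pair can look like: one test for the happy path, and one that performs an attack and asserts that it failed. Everything in it is hypothetical, including the render_comment method under test and the tiny assert helper. In a real project these would live in whatever test framework you already use. The escaping itself is done by ERB::Util.html_escape from Ruby’s standard library.

```ruby
require "erb"

# Code under test (hypothetical): prepares a user-supplied comment for display.
def render_comment(text)
  "<p>#{ERB::Util.html_escape(text)}</p>"
end

# Tiny assertion helper so the sketch has no test-framework dependency.
def assert(condition, message)
  raise "FAILED: #{message}" unless condition
  puts "ok: #{message}"
end

# Happy path: ordinary input comes through unchanged.
assert(render_comment("hello") == "<p>hello</p>", "plain text is preserved")

# Hack path: perform the attack, then assert that it did NOT succeed.
attack = "<script>alert('pwned')</script>"
assert(!render_comment(attack).include?("<script>"), "script tag is neutralized")
```

If someone later “simplifies” render_comment by dropping the escaping, the second assertion starts failing, which is exactly the point.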

This makes at least a reasonable effort at identifying security issues in an application before it’s released, and it preserves any security hardening through future releases, because the standard (T|B)DD process makes security testing part of your development lifecycle.
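To make the “if it occurs to me, it needs a test” idea concrete, here is a minimal sketch in plain Ruby. The escape_html helper and the list of attack strings are assumptions made up for this example; a real project would attack its own code paths instead. The check collects every attack input whose output still contains raw markup characters, and the test asserts that the collection is empty.

```ruby
# Hypothetical helper under test: escapes the characters HTML treats specially.
HTML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&#39;'
}.freeze

def escape_html(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

# Tainted inputs that occurred to the developer while writing the code.
ATTACK_INPUTS = [
  "<script>alert('xss')</script>",
  '"><img src=x onerror=alert(1)>',
  "'; DROP TABLE posts; --"
].freeze

# Returns the attack inputs whose escaped output still contains raw
# markup characters, i.e. the attacks that succeeded.
def surviving_attacks
  ATTACK_INPUTS.select { |input| escape_html(input) =~ /[<>"']/ }
end
```

In a real project the same check would live in the existing test suite and assert that surviving_attacks is empty, so a later change that weakens the escaping fails the build.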

Development, Quality Assurance, Security, and Performance Tuning go together.

In many organizations, it’s common to separate, at least in terms of duties, development, QA, security and possibly performance into different teams or separate areas of responsibility. I contend that they should all be wrapped up together as one common organizational entity (a department or team, or individual for smaller organizations), because issues that affect one of these concepts can often affect others.

Look at the example above, where Post.all was called. This effectively performs the following query against a SQL database:

“SELECT * FROM posts”

While this line of code, in and of itself, is rather innocuous, consider what it would do with a database containing potentially billions of posts. Not only would the query take quite a while to execute, especially if information isn’t properly indexed, it could also fill the memory on the system, slow other HTTP requests to the application to a crawl, drop connections, and hang other processes running on the machine at the same time. The question then becomes, “what was depending on this system that is now not functioning correctly?” Additionally, a fair management-style question would be, “what resources will I have to spend to triage this issue, and then to protect against it happening in the future?”

By implementing an HDD process of some form, along with QA and/or performance tuning, this entire issue could be avoided. Triage for it would never be required, and no additional resources (calling in a systems administrator at night, paying an attorney to prepare a legal defense, settling with a plaintiff before court, court costs, losing a lawsuit, etc.) would be spent on a problem that never occurred.
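One common guard against the unbounded query described above is to cap how many rows a single request can ask for. The sketch below is plain Ruby and assumes hypothetical names (page_window, MAX_PER_PAGE) and an arbitrary cap of 100; it turns untrusted paging parameters into a bounded limit and offset.

```ruby
# Assumed upper bound on rows per request; pick a value that suits the app.
MAX_PER_PAGE = 100

# Converts untrusted page parameters into a bounded limit/offset pair.
# Non-numeric, zero, or negative values fall back to the smallest legal value.
def page_window(page, per_page)
  page     = [page.to_i, 1].max
  per_page = per_page.to_i.clamp(1, MAX_PER_PAGE)
  { limit: per_page, offset: (page - 1) * per_page }
end
```

In a Rails application the result could be fed to something like Post.limit(window[:limit]).offset(window[:offset]) in place of Post.all, and a hack-style test would request an absurd page size and assert that the limit never exceeds the cap.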

Specifying different gems in Bundler’s Gemfile when using JRuby

UPDATE: Thanks to Nick Sieger (@nicksieger), who saw my tweet about this and suggested I take a look at Rails’ own Gemfile for an example of the “platforms” support in Bundler. It’s a much cleaner approach than the one below, and I recommend using it instead of my little hack here. A much better example than the original would be:

source 'http://rubygems.org'

# All environments:
gem 'rails', '3.0.0.rc'
gem 'authlogic'

# If running on JRuby, use the activerecord-jdbc-adapter
platforms :jruby do
  gem 'activerecord-jdbc-adapter', :require => false
end

group :development, :test do
  platforms :jruby do
    gem 'jdbc-sqlite3', :require => false
  end
  platforms :ruby do
    gem 'sqlite3-ruby', :require => 'sqlite3'
  end
end

# Gems specific to the test environment only
group :test do
  gem 'shoulda'
  gem 'factory_girl_rails'
end

# Production
group :production do
  platforms :jruby do
    gem 'jdbc-mysql'
  end
  platforms :ruby do
    gem 'mysql2'
  end
end

So today, in addition to packing boxes for my upcoming move to take on a new job, I’ve been playing with JRuby, just to get my feet wet.  Up until now, I’ve only worked with MRI and REE, and played briefly with Rubinius.  But I wanted to test out JRuby, and of course found that the “standard” database drivers for MySQL, SQLite3, etc. aren’t quite going to work under JRuby.  You need to use the JDBC versions of these database drivers instead.

So I got to thinking: I don’t want my Gemfile to be “locked down” to a certain version of Ruby, in case I want to work with REE on my machine, and deploy using JRuby, for example.

The obvious question then becomes: how do I tell Bundler to install a set of gems for JRuby, and a different set of gems for other Ruby platforms?

source 'http://rubygems.org'

# All environments:
gem 'rails', '3.0.0.rc'
gem 'authlogic'

# Development
group :development do
  if defined?(JRUBY_VERSION)
    gem 'activerecord-jdbc-adapter', :require => false
    gem 'jdbc-sqlite3', :require => false
  else
    gem 'sqlite3-ruby', :require => 'sqlite3'
  end
  # gem 'ruby-debug'
end

# Test
group :test do
  if defined?(JRUBY_VERSION)
    gem 'activerecord-jdbc-adapter', :require => false
    gem 'jdbc-sqlite3', :require => false
  else
    gem 'sqlite3-ruby', :require => 'sqlite3'
  end
  gem 'shoulda'
  gem 'factory_girl_rails'
end

# Production
group :production do
  if defined?(JRUBY_VERSION)
    gem 'jdbc-mysql'
  else
    gem 'mysql2'
  end
end

As you can see with this example Gemfile, the code uses defined? to check for the presence of the JRUBY_VERSION constant. MRI and REE don’t have a JRUBY_VERSION constant, so defined? returns nil (which Ruby treats as false), leading Ruby to execute the else block with the standard database gems. If JRUBY_VERSION is defined, Bundler will install activerecord-jdbc-adapter and jdbc-sqlite3 instead, in this case.
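A small sketch of how defined? behaves may help here. It returns nil for a constant that doesn’t exist and a description string for one that does, which is why it works as an if condition. NO_SUCH_CONSTANT_HERE is a deliberately undefined, made-up name, and jruby? is a hypothetical helper, not part of Bundler.

```ruby
# defined? returns nil for a missing constant...
MISSING_CHECK = defined?(NO_SUCH_CONSTANT_HERE)

# ...and a description string for a constant that exists.
PRESENT_CHECK = defined?(RUBY_VERSION)

# Hypothetical helper: true only when the JRUBY_VERSION constant exists.
def jruby?
  !defined?(JRUBY_VERSION).nil?
end
```

Because nil is falsy and any string is truthy, `if defined?(JRUBY_VERSION)` takes the JRuby branch only on JRuby.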

Another possible approach:

if RUBY_PLATFORM =~ /java/
  # JRuby based dependencies here
else
  # Regular ruby dependencies here
end

The only “difference” I care about right now is between JRuby and MRI/REE – I personally don’t really use Rubinius, IronRuby or any other implementation (and hadn’t really even messed with JRuby until today), and even so, those would probably get along just fine with the standard database drivers.  But using JRuby, you need JDBC-based database drivers instead of the standard C drivers for things like MySQL and SQLite3.  Nonetheless, this same idea can probably work for multiple Ruby platforms.
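If you did want to branch on more than two implementations, one option is to key a lookup on RUBY_ENGINE. That constant exists on Ruby 1.9 and later (“ruby” on MRI, “jruby” on JRuby) but may not be defined on 1.8-era interpreters such as REE. The sketch below is an assumption-laden illustration: DATABASE_GEMS, DEFAULT_DATABASE_GEMS, and database_gems_for are hypothetical names, and only the gem names come from the Gemfile above.

```ruby
# Hypothetical lookup table: development database gems per Ruby engine name.
DATABASE_GEMS = {
  'jruby' => ['activerecord-jdbc-adapter', 'jdbc-sqlite3']
}.freeze

# Engines not listed above are assumed to work with the standard C driver.
DEFAULT_DATABASE_GEMS = ['sqlite3-ruby'].freeze

# Returns the gem names to use for the given engine (defaults to the current one).
def database_gems_for(engine = RUBY_ENGINE)
  DATABASE_GEMS.fetch(engine, DEFAULT_DATABASE_GEMS)
end
```

Supporting another implementation then means adding one entry to the table rather than another if/else branch. That said, Bundler’s “platforms” blocks shown in the update at the top remain the cleaner choice inside a Gemfile.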