
October 31, 2012

Resque Java

Post moved to

The title does not contain a typo; if it seems like one, there's likely no need to read further. It's actually about Resque - the Ruby library for creating background jobs using Redis. There's been a similar post (almost 1.5 years ago!) about bringing Delayed::Job to Java. DJ has evolved since and still seems to be doing fine, while Resque received a fresh breeze of activity recently and the master branch shall soon spit out a new major version.

Resque and Delayed::Job are different (although some points are not very accurate when comparing with DJ 3.x) yet usually serve the same purpose. And that is why the JRuby world should be able to run Resque workers threaded alongside the (hopefully thread-safe) web application. Here's how you set it up for any Servlet container running JRuby-Rack using the Java deployment descriptor (WARNING: XML ahead!):

Do not forget to copy the jruby-rack-worker.jar into your WEB-INF/lib folder or declare gem 'jruby-rack-worker' in your Gemfile and Warbler should take care of it.

It really should be stressed that workers will run alongside your application and share resources such as memory and file descriptors with it. Thus you should not do this if your workers are chaotic and unpredictable (e.g. in memory consumption) or run jobs that take a long time to finish.

Besides, there's now a Trinidad extension and thus no excuse to keep on warbling forever. Here's what the configuration file might look like:

September 10, 2012

Spicing up Java with Ruby (a Rhino Story)

Post moved to

This post is certainly not a newcomer; there are plenty of posts and a wiki covering the topic of JRuby's Java integration. But since repetition is the mother of all wisdom, why not yet another? Besides, this one will be inspired by pieces of (simplified) use-cases out there, I promise.
These days more than ever, understanding the metal underneath JRuby might come in very handy, and there are two good reasons for that. First of all, as more Java shops adopt JRuby, they might need to integrate with their existing libraries. Second, as Rubyists try out JRuby as a deployment option, they might save a wheel from being re-invented by using third-party code beyond gems - by borrowing jars. It still seems unlikely to happen a lot though, 'cause somehow Java scares the shit out of a "traditional" Ruby monkey. I'm not going to argue here why it's needlessly counterproductive to ignore Java, esp. since there's still a lot to learn from programming concepts such as POJOs.

Rhino's Story

As promised, this is going to be a Rhino story. Rhino is a fairly old JavaScript implementation written in Java (and the very first successful polyglot JVM pioneer). Its childhood goes back to the days when JavaScript was considered nothing but pure evil. I remember a CS graduate complaining, during a lunch we shared, about Mozilla being so lame not to be able to make JavaScript work the same as Microsoft's "original" in the latest IE6. These were the dark days Rhino was growing up in. Luckily we've moved forward since then, to the new age of JavaScript renaissance.
Rhino has been gemified for quite some time and is known as the therubyrhino gem. He's actually therubyracer's older brother and allows us to integrate Ruby objects into the JavaScript world and the other way around: call JavaScript objects like they were Ruby instances. Let's take a closer look at what that means.

Java Integration in JRuby (yet again)

To make a JavaScript interpreter really useful from Ruby, we'll need to be able to call a (native) function seamlessly just like a Method. And, we'd also like to use Ruby objects from JavaScript and invoke their methods like vanilla functions. These two challenges will serve really well in demonstrating the power of Ruby meeting Java. For simplicity we assume that we've gone through a few quirks e.g. we're able to evaluate strings of JavaScript already.

From Java to Ruby

The above code retrieves bar from a context and, if it's a Function, calls it just like a Proc. To make sure this works, we should wrap all (Java) objects we get from Rhino, and pass to the Ruby world, into a class with Proc#call semantics. Turns out JRuby itself already decorates all Java instances to provide them with a taste of Ruby, thus additional wrapping seems redundant; we rather attach Ruby-ish behaviour to Rhino's Java classes. Just like in Ruby, we might open a Java class and all instances of the given class (no matter if they were instantiated inside Java or Ruby code) will respond to those methods.

We've seen the Function interface already, all functions in Rhino implement it, although concrete classes might vary. Opening all possible implementations is certainly an option, but since there's an abstract BaseFunction they all inherit from, it's sufficient to define behaviour in one place. JRuby, when dispatching methods of a concrete instance (e.g. a NativeFunction), will look for Ruby methods on its (Ruby "decorated") Java class and all its super classes, just like you would expect a Ruby object to resolve methods from its class chain.
Now the tricky part is that we have a name clash: there's the (Java) call() interface method prescribed already (which is actually the method we want to delegate to in order to execute the function, but with a bunch of Rhino specific arguments). To demonstrate how far JRuby takes its Java integration we used alias_method to alias a Java method. Now if you're paying attention you should immediately object that this won't work, as I've aliased a dummy method from a "base" class as __call__ that will eventually be overridden in an extending class. This is very true for Ruby, but Java has slightly different method semantics - there's no simple way to call an arbitrary super method that has been redefined (and JRuby respects that); even a casted Java invocation ((BaseFunction) bar).call(...) will not dispatch to BaseFunction#call if it has been overridden in bar's class. And since (so far) all our functions are coming from Java it will work as expected.
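
To see why the objection holds for plain Ruby, here's a minimal sketch that runs without JRuby or Rhino. Base and Derived are made-up stand-ins (they are not the Rhino classes); the point is only that a Ruby alias keeps pointing at the method body it was created from, even when a subclass overrides the method later:

```ruby
# Hypothetical plain-Ruby stand-ins, not Rhino's BaseFunction/NativeFunction.
class Base
  def call(*args)
    :base
  end
  # The alias captures Base#call as it exists at this point.
  alias_method :__call__, :call
end

class Derived < Base
  # Overriding call does not touch the alias defined in Base.
  def call(*args)
    :derived
  end
end

fn = Derived.new
fn.call     # dispatches to the override in Derived
fn.__call__ # still runs the body captured from Base
```

With Java classes under JRuby the aliased name ends up at the overriding Java method instead, which is exactly the difference the paragraph above relies on.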

From Ruby to Java

So far - so Ruby. Now it's time to get our hands dirty (and I'm talking Java dirty :)): as we pass a Ruby lambda into Rhino we need to write some Java code, or maybe not ...

Here, we're "exporting" a Ruby lambda into the context and later calling it as a JavaScript function, yet again, this will require some wrapping for callables.
Fortunately, JRuby allows us to create Java classes (inherit from them and implement interfaces) with Ruby code, let the beauty of this "Java" class speak for itself:

Although 100% Ruby it might as well be considered a Java class since it will leave JRuby-land as it gets passed to Rhino's typed Java API as a Function instance (at the time it gets assigned into the context as a property).

Let's walk down the code explaining it step by step:
  • first we create a module as a namespace for all the classes from the "org.mozilla.javascript" Java package
  • then the RubyFunction class is created by extending (a Java class) BaseFunction, the very same as mentioned previously; note that we used the JS module to access it
  • a Java interface Wrapper gets included and to play by its rules we introduce an unwrap method (this is a Rhino convention used when decorating non JavaScript objects)
  • our initialize method requires an argument that will be our actual "callable" Ruby object (Method or Proc); we're calling super with no args since one of BaseFunction's constructors does not take any and we'd like to make sure we call the correct one
  • we override some getters that we inherited; notice that we do not need to specify return types, JRuby will take care of things and make sure they're correct (it knows how, since it detects that we're overriding a method already present)
  • the most interesting part is defining the call method (with its Java argument semantics as explained above); we send a call message to the Ruby @callable and return the result
There's one last trick: we've declared a __call__ alias. Recall that we've done a similar alias previously to deal with the call (Java - Ruby) method name collision - we've expected methods with the same name to do different things based on whether they're being called from Ruby or Java. Thus to satisfy the convention we need to make sure __call__ points to the updated method in case it's called from Ruby. After all, we might end up calling a piece of JavaScript that in turn returns a RubyFunction that gets invoked (Ruby) call style; for such a case the code ain't complete.
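
The shape of such a class can be sketched in plain Ruby. This is not the actual therubyrhino source: the JS module below holds hypothetical pure-Ruby stand-ins for BaseFunction and Wrapper so the snippet runs without JRuby (under JRuby the namespace would be backed by the Java package instead), and getLength is just one plausible getter to override:

```ruby
# Stand-ins (assumptions) for the org.mozilla.javascript types.
module JS
  module Wrapper; end

  class BaseFunction
    def initialize; end

    def getLength
      0
    end

    def call(context, scope, this, args)
      nil
    end
  end
end

class RubyFunction < JS::BaseFunction
  include JS::Wrapper

  def initialize(callable)
    super() # explicit parens: pick the no-argument constructor
    @callable = callable
  end

  # Rhino convention for wrappers: hand back the decorated object.
  def unwrap
    @callable
  end

  # Number of required arguments; negative arities encode optional args.
  def getLength
    arity = @callable.arity
    arity < 0 ? ~arity : arity
  end

  # Java-style signature, delegating to the wrapped Ruby callable.
  def call(context, scope, this, args)
    @callable.call(*args)
  end
  # Keep the alias pointing at the method defined right above.
  alias_method :__call__, :call
end

sum = RubyFunction.new(lambda { |a, b| a + b })
result = sum.call(nil, nil, nil, [1, 2]) # context, scope and this are unused here
```

The nil arguments stand in for Rhino's context, scope and this objects, which this simplified version ignores.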

As mentioned in the beginning, we did a few simplifications: our BaseFunction#call needs exception handling and RubyFunction#call should perform JavaScript style argument slicing or filling. Feel free to tune into the actual sources (and specs) if you're interested in how the wheels turn. Also do not forget to check out JRuby's wiki page on the topic.

April 9, 2012

More server dependencies (with JRuby)

Post moved to

Previously we've touched on the topic of managing server dependencies with Bundler in general. Now we're going to look into some specifics of JRuby servers, particularly Trinidad, with outcomes applicable to non-JRuby deployments as well.

Traditionally in the Java world deployment has been separated from development (at least in theory); often developers had no say whatsoever in picking a production server. This might sound weird to a rubyist but that is how the enterprise world usually rolls. And there's nothing wrong with that either: if you're in a team of a few people, or you run multiple apps simultaneously, or just happen to have an operations guy, you most probably end up about the same - after all, the server is the kind of detail your app should not know about. But let's consider the server being part of the bundle for a while. As it happens, sometimes you might need to restrict your server's dependencies (e.g. there's a regression) or use extensions. In the spirit of the previous post - do not forget to not auto-require them (as well):

The server loads all its dependencies and extensions when you start it up, e.g. using bundle exec trinidad; there's no need to have them auto-loaded otherwise.
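
A Gemfile along these lines might look like the sketch below. The gem 'rails' line and the source URL are assumptions for illustration; the relevant parts are the :server group and the :require => false option, which keeps Bundler from requiring the server when the application boots:

```ruby
# Gemfile (a sketch, not the original from the post)
source 'https://rubygems.org'

gem 'rails'

group :server do
  # installed with the bundle, but never auto-required by Bundler
  gem 'trinidad', :require => false
  # any Trinidad extension gems would be declared here the same way
end
```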

Now, let's throw in another common requirement to make things more realistic - we'd like to run our server as a daemon - and look into Trinidad's specifics. Since JRuby runs on the JVM and "fork-exec"-ing the JVM ain't safe due to its multi-threaded nature (the VM might perform work such as GC in a thread parallel with the main thread between fork and exec), there are platform specific libraries to "daemonize" Java, such as Akuma.
The first approach, very similar to daemons (fork and move the process to the background), is achievable with a Trinidad extension on UNIX. In this scenario it's fine for the server to be part of the bundle:

Then, during deployment you would bundle exec, say with Capistrano:

But what if your machine gets restarted or the server process dies unexpectedly? Wouldn't it be much nicer if the server was kinda part of the OS, just like everything else you're using, e.g. the web server? Trinidad has an answer for that as well, enabling itself to be part of the underlying system services. This might smell enterprise-y at first but has certain advantages if you think about it (and works on Windows :)); of course it assumes you're in control of the machine.
So where does Bundler come in here? Surprise, it does not, at least not necessarily! Well, you can hack the generated init.d script on Unix, but what's the point? Instead you've reached a point where your server knows the bare minimum - the base directory. Trinidad boots up from JRuby's gems, without Bundler, and the application then loads Bundler and sets up all your dependencies. Server management became essentially a separate concern, and it's certainly worth rethinking whether any server specific configuration files should still be changeable while delivering application code during deployment (you might want to extract config/trinidad.yml out to #{shared_path}/config and update the daemon script).
You can still keep gem 'trinidad' and related gems for testing, but I would put them in a separate group, e.g. group :server as seen above, and tell Capistrano to bundle without it:
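
Assuming Capistrano 2 with the recipes shipped by Bundler, excluding the group could be as small as the following sketch (the list of other groups is an assumption, adjust it to whatever your Gemfile declares):

```ruby
# config/deploy.rb
require 'bundler/capistrano'

# skip the :server group (next to the usual ones) when bundling on deploy
set :bundle_without, [:development, :test, :server]
```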

Every time you feel the need to update the server, you do so "without" the application. It sure will require some discipline as servers are usually just plain-old gems, thus only the latest version you desire to run with should be installed. And since it's a production machine, not a local playground, a slice of discipline sounds about fine.

March 7, 2012

Managing server dependencies with Bundler

Post moved to

Ever since Bundler popped in, rubyists have felt a huge relief. Esp. with Rails, gem install rails alone installs 27 gems by default (a clean install of 3.2.0 on MRI) and one most likely adds a whole bunch of others as the application evolves. Effectively managing those dependencies has become a bit of rocket science, and Bundler is undoubtedly doing a great job safely landing on the moon every time we change a dependency ingredient.

And one day you decide to bring in a web server for development. Maybe you're even confident about the production server requirements already. So you uncomment the gem 'unicorn', hey it's already there! Add gem 'thin' for development and bundle install.

Seems about right, but wait, let's think about that for a second. We've specified Unicorn as a dependency for our application, so every time we run a rake task, console, thin or any other bundle exec it gets loaded (with a require). If you don't believe me just rails c and see if the Unicorn constant is defined; in development Thin should be present as well. Feels a bit wrong (unless of course your code depends on the server API for whatever reason). Let's blame Bundler, but wait, turns out it's its job to go through your gem dependencies and load them. Maybe there's someone else to blame after all; sure, it was really easy to uncomment a line and hit bundle install in the same second. Moral lesson taken: we should think (at least) twice whenever editing the Gemfile.
Let's roll right away, a quick second thought and we're at:

Much better. See, we're telling Bundler that it is a dependency but should not be required; servers usually come with their very own commands to be run (which load them).
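
For the Unicorn and Thin case discussed here, the relevant Gemfile lines might read as follows (a sketch; the rest of the Gemfile is left out):

```ruby
# production server: part of the bundle, but not required on boot
gem 'unicorn', :require => false

group :development do
  # same for the development server
  gem 'thin', :require => false
end
```

With this in place a rails c session should no longer have the Unicorn constant defined, unless something else requires it.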

This is also crucial when declaring JRuby servers such as gem 'trinidad' as your dependencies (otherwise you might end up with a scary error such as: java.lang.NoClassDefFoundError: javax/servlet/ServletContext). And actually, for more complicated deployments declaring a server dependency may just be a little unnecessary and counterproductive. But I'd rather try to explain all that in a following post.

February 2, 2012

Preventing Recursion in Ruby (without Thread.current)

Post moved to

This post is inspired by Preventing Recursion in Ruby.

First a confession: every time I see Thread.current[] I start thinking of ways to eliminate it. Thus I believe there's at least one thing wrong with the approach presented in the post. Let's try rebuilding prevent_recursion:

It seems to serve almost the same purpose and using an object's singleton class just feels right. Please note that this does not behave exactly the same way as the threaded solution - we prevented a method from recursing once (or "forever" if we remove the (class << self) line), however this should most likely suffice.
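
One way such a singleton-class based prevent_recursion could be written is sketched below. It is not the code from either post: the PreventRecursion module and the Counter class are made up for illustration. While the guarded method runs, a no-op with the same name is parked on the object's singleton class, so a nested call on the same object hits the no-op; the no-op is removed again once the outer call returns:

```ruby
# Hypothetical helper, written for this example only.
module PreventRecursion
  def prevent_recursion(name)
    original = instance_method(name)
    define_method(name) do |*args, &block|
      # shadow the method on this object's singleton class with a dummy
      singleton_class.send(:define_method, name) { |*| nil }
      begin
        original.bind(self).call(*args, &block)
      ensure
        # restore normal dispatch for the next top-level call
        singleton_class.send(:remove_method, name)
      end
    end
  end
end

class Counter
  extend PreventRecursion
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def touch
    @calls += 1
    touch # would recurse forever without the guard
  end
  prevent_recursion :touch
end

counter = Counter.new
counter.touch
counter.touch
counter.calls # one increment per top-level call
```

Note that the guard is per object, not per thread: two threads calling touch on the same instance at the same time would interfere with each other.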

Of course, I only got "smart" by failing previously, but since then I always think twice about storing anything in Thread.current, which is one of the most abused patterns in Ruby, and no matter how uniquely you name those variables it's still a smell. Maybe it's about time thread-safe frameworks, such as Rails, provided us with an abstraction for safely storing thread local data.

Now, I should probably end this right here - point proven. But since the post mentioned infinite recursion with ActiveRecord::Callbacks I immediately remembered my very own case where I would be out of luck using the above approach. Try the following sample:

A real world scenario would be keeping an array-like index for items in a basket. Each basket has its items ordered and changing an index of an item "swaps" it with the other item at the given position. One way of doing it, which requires recursion prevention, would be:

Notice how the method gets redefined with a dummy one to prevent recursion.

I'm a huge fan of doing things DRY; on the other hand sometimes it might be counterproductive, especially when it can't cover all cases or is only used once or twice. Not to mention how much easier it would be for a newcomer to read a method such as assign_index and understand the trick immediately than having to dive deep into another module's meta-stuff.

UPDATE: As @avdi pointed out to me, I'm actually promoting non-thread-safe code. I narrowed my recursion prevention domain to short lived AR instances that are never accessed beyond a single request, however this is clearly not a completely valid assumption to make. Especially since the original prevent_recursion targets a much wider audience.
Notice the difference in behavior if the prevented (thread-safe) instance method or a shared object is accessed by two concurrent threads: Thread.current will guarantee a correct result for each caller, however my instance_eval approach will not. And so the moral of this story seems to be that a slice of Thread.current[] "pollution" to prevent recursion is inevitable after all.