Teixo: from Rails 2.3 to Rails 4.2
We are TEIMAS, one of the few software companies worldwide specialized in digitalizing the waste value chain. We help companies in their transition to the circular economy and decarbonisation in order to protect resources, the environment and people's health. Our software solutions favour the reintegration of waste into the production chain to reduce the consumption of new resources.
One of those solutions is Teixo, a SaaS product designed for waste management companies. Some of the features Teixo has are:
Teixo is the leading tool in Spain, implemented in more than 700 waste management facilities, with more than 3000 active users per day.
This series of posts covers the project of updating our main product, Teixo, without downtime and while guaranteeing service to all of our customers.
This project involved not only technical staff but also people from the Customer Success department and, of course, many users and clients. So these posts will cover not only technical issues but also customer management and the procedures used to guarantee Teixo’s quality.
The first lines of code in Teixo were written in 2008 using the leading technology of the time, Rails 2.3 on Ruby 1.8.3. Teixo works with a MySQL database that performs very well despite its large size.
Teixo evolved at a very high rate for more than twelve years after those first lines of code, mostly adding features and adapting to new infrastructure needs. And it did so well.
The main parts of Teixo are:
Those parts are built together as a modular ‘monolith’.
Teixo currently runs on AWS infrastructure using two different auto scaling groups, one for web and API capabilities and the other for background processes.
Also, and this is an important point, we updated the Ruby version from 1.8.3 to 2.7.3. For this update we relied on a special LTS version of Rails 2.3 (https://railslts.com/) with support for Ruby 2.7.3.
But despite having an updated Ruby version, Teixo had many problems, such as outdated gems, an obsolete ActiveRecord API and deployment difficulties. As a result, new developments became more complex and error-prone as the years passed.
In 2021 we wanted to reduce our technical debt, so we decided to move forward and schedule a project to guarantee the viability of Teixo for at least 10 more years. We decided to update Teixo’s Rails version in different steps or phases, one for each major version of Rails:
At this point we still didn’t know whether it would be better to split those steps into smaller ones (one for each minor version, i.e. Rails 3.0, then Rails 3.1 and finally 3.2) or to do one major change at a time.
Rails has good documentation, especially about updating apps from one version to the next (the Rails Guides), but even with this help the project we faced was highly complex.
To help understand the difficulty of the project, here is some data about Teixo:
Declared Gems 59, Total Gems 134.
By the end of 2021 we began to specify ‘The Plan’. It had some key requirements:
We also started to search for companies that could help us in this process, a ‘Guide’ for our journey. We found a Philadelphia company specialized in updating Rails projects. We started working with them in an early phase in which they would review Teixo’s codebase and prepare a report about the steps and possible problems we would find in this project.
When the analysis was finished they proposed a step-by-step update process, one step for every minor Rails version. That is, Rails 3.0, 3.1, 3.2, 4.0, and so on. This approach would take years to complete, so we preferred to take bigger steps in order to achieve the update in a reasonable time. Fortunately we had an ace up our sleeve: a great QA and Customer Success department that would help us test and validate the update steps quickly and reliably (even with big platform changes).
We knew, and the analysis confirmed, that we had to prepare for this update process. For several months we reinforced QA on Teixo, developing many new automated tests in our CI environment and expanding the existing ones. We also strengthened our Customer Success department’s protocols for checking Teixo’s quality.
The analysis also proposed how to handle the update process: having one codebase able to boot on two versions of Rails (a dual boot mechanism). This would mean filling the codebase with ifs to separate the code for the different Rails versions wherever it had to differ. Later the ifs would be removed, leaving only the modern Rails code. This would have to be repeated at each step of the project. This was the company’s usual procedure for this kind of project. We thought this approach would be very hard to handle and would increase the overhead of an already very complex project.
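To give an idea of what such version guards look like, here is a minimal sketch. The helper names are hypothetical, and `rails_major` stands in for `Rails::VERSION::MAJOR` so the snippet runs without Rails. The two API differences it switches on (named_scope versus scope, save(false) versus save(:validate => false)) are ones described later in this post.

```ruby
# Sketch of dual-boot style guards. `rails_major` is a stand-in for
# Rails::VERSION::MAJOR so the example runs without Rails loaded.
# Both helper names are invented for illustration.

# Name of the class macro used to declare a scope on a model.
def scope_macro_for(rails_major)
  rails_major >= 3 ? :scope : :named_scope
end

# Argument to pass to `save` in order to skip validations.
def skip_validation_argument(rails_major)
  if rails_major >= 3
    { :validate => false }   # Rails 3 style: save(:validate => false)
  else
    false                    # Rails 2.3 style: save(false)
  end
end

p scope_macro_for(2)
p scope_macro_for(3)
```

Every branch of this kind has to be written, tested on both versions and deleted later, which is where the extra overhead comes from.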
Finally, the report included a summary of the changes needed in Teixo at each Rails step; it was an adapted version of the Rails upgrade guides.
We proposed approaching the project by handling two versions of Teixo at the same time at each step of the process, i.e. one version on Rails 2.3 and another on Rails 3.2 for the first phase.
In the staging and production environments both versions should work with the same database schema. Only one version should handle database migrations, but both versions should be ‘compatible’ with the schema. This means being especially careful with migrations: having the same migrations in both versions of the code, or at least not deleting attributes/columns in only one version.
The update process had to run alongside the normal development workflow, so we had to tweak our development process a bit.
On each step of the update process there are two phases of work to be done.
If we had chosen to take smaller steps (minor Rails version changes), the overhead of the second phase would have been excessive, especially for the Customer Success department.
Meanwhile Teixo’s normal development workflow continued, so in this second phase we had to change our usual procedure in the following aspects:
Our process for new Teixo releases was also modified in order to handle this duality.
In April 2022 we started working on the first and (by far) most complex of the steps: updating from Rails 2.3 to 3.2 without intermediate steps.
We hired an expert developer from our ‘Guide’ company to team up with two senior Teimas developers. But this collaboration didn’t help the project. Many obstacles made collaboration difficult:
In the end the collaboration ended amicably and by mutual agreement before the end of the first step, so we finished the first update without external help.
Before starting work on a new branch for Rails 3.2 we made a great effort to increase Teixo’s automated test coverage. We reached nearly 70%. This work was one of the most relevant for the success of the project.
We also worked on fixing some deprecation warnings, such as ‘ActiveRecord::Base#class_name is deprecated’, and on other improvements such as better code organization and removing unused parts of the project.
This part details the technical changes, problems and issues we faced when migrating from Rails 2.3 to 3.2. Most of this work was done in the first part of this update step, but some was done in the ‘reactive work’ phase, solving previously undetected issues and errors that the Customer Success department, and even some trusted customers, brought to light.
As mentioned, we upgraded to a Rails 3.2 LTS version supported by Makandra, who define themselves as a team of veteran Rails developers and operations engineers.
We use Bundler, and once we’d updated the Gemfile to use:
gem 'rails', '~> 3.2.22.27'
we had to fix some Gem issues.
gem 'mysql2', git: 'https://github.com/makandra/mysql2', branch: 'master'
require 'active_record/connection_adapters/abstract_mysql_adapter'
gem 'mysql2', '> 0.3.10'
require 'mysql2'
module ActiveRecord
class Base
....
Configuration files and many others loaded their dependencies through File API calls whose behavior had changed, so we had to update them in several places, for example:
require File.dirname(__FILE__) + '/../config/boot'
To
require File.expand_path('../../config/boot', __FILE__)
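As an illustration, both forms point at the same file, but the second one produces a normalized absolute path. The script location used here is a made-up example, not an actual Teixo path, the helper names are hypothetical, and a Unix-like filesystem is assumed.

```ruby
# Hypothetical helpers comparing the two ways of building the path.
# `file` plays the role of __FILE__ in the original snippets.

def old_style_boot_path(file)
  File.dirname(file) + '/../config/boot'
end

def new_style_boot_path(file)
  File.expand_path('../../config/boot', file)
end

script = '/srv/teixo/script/server'   # invented example location

puts old_style_boot_path(script)  # keeps the ".." segment as written
puts new_style_boot_path(script)  # normalized absolute path
```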
We also had to update several boot and configuration files along with many initializers.
Config: config.ru, for example, changed in several lines, mainly from:
run ActionController::Dispatcher.new
to
run Teixo::Application
Application: We had to build a new config/application.rb with the app configuration (extracted from the config/environment.rb file) declaring Teixo app as:
module Teixo
class Application < Rails::Application
Boot: config/boot.rb was reduced to loading the Gemfile and Bundler.
require 'rubygems'
# Set up gems listed in the Gemfile.
ENV['BUNDLE_GEMFILE'] ||= File.expand_path('../../Gemfile', __FILE__)
require 'bundler/setup' if File.exist?(ENV['BUNDLE_GEMFILE'])
Rakefile: it has to define how to load the configuration and the application:
require File.expand_path('../config/application', __FILE__)
...
Teixo::Application.load_tasks
Routes: The most time-consuming change was rewriting the config/routes.rb file. Route declarations changed a lot from Rails 2 to 3. Although there are tools for translating routes from Rails 2 to 3, our routes.rb file was not very well defined, so we had to port many of the routes to the new format manually.
It was very handy to try route helpers from the console (once we got it to work), for example:
irb(main):002:0> app.admin_users_path
=> "/admin/users"
Despite the gem updates and configuration changes, the project was still unable to launch a Rails console or the server. We needed to make several more changes.
RAILS_ENV was no longer supported; we had to change it to Rails.env throughout the project.
Several references under the ActionController:: namespace had to change to ActionDispatch:: (for example ActionController::Routing::Routes).
All scope declarations had to change from ‘named_scope’ to simply ‘scope’. Also, some scopes defined on superclasses didn’t work well in subclasses, so we had to rewrite some of them using ‘scoped’. For example:
named_scope :not_draft, lambda { {:conditions => ["#{table_name}.state != ?", DOCUMENT_STATES[:draft]]} }
Became:
def self.not_draft
scoped(:conditions => ["#{table_name}.state != ?", DOCUMENT_STATES[:draft]])
end
We replaced the few uses of attr_accessor_with_default in Teixo because it was no longer supported. We fixed each case depending on the need, mostly using attr_writer to declare the field and/or defining a getter method.
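A minimal sketch of that kind of replacement in plain Ruby, assuming a hypothetical Document class with a state field whose default is 'draft' (neither the class nor the default value comes from Teixo’s code):

```ruby
# Hypothetical model used only to illustrate the pattern.
class Document
  DEFAULT_STATE = 'draft'

  # Setter only; the getter below supplies the default.
  attr_writer :state

  # Returns the default until a value has been assigned explicitly.
  def state
    instance_variable_defined?(:@state) ? @state : DEFAULT_STATE
  end
end

doc = Document.new
doc.state            # => "draft"
doc.state = 'sent'
doc.state            # => "sent"
```

Checking whether the instance variable is defined, rather than whether it is nil, keeps an explicit assignment of nil distinguishable from "never set".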
Saving without validations was no longer available through a ‘save(false)’ call; it is now save(:validate => false).
Several helper methods used in Haml (and Erb) partials no longer worked with the - operator; we had to change it to the = operator.
- form_tag session_path, :id => 'login_form' do
Changed to
= form_tag session_path, :id => 'login_form' do
The flash method in Rails 2 returned an object that responded to the Hash API; in Rails 3 this is no longer true, so we had to call to_h on every use:
if flash.values.empty?
Changed to
f_hash = flash.to_h
if f_hash.values.empty?
The use of @template is no longer supported inside controllers. The recommendation is to use the view_context method instead of @template (Rails 3 introduced the new AbstractController).
Once the app was running we continued to make many changes, most of them in the first phase of this update while trying to get to 0 errors in our CI environment. But many other problems were detected and fixed once the app was released to our staging environment, thanks to the Customer Experience department.
Sometimes we faced problems that would require a lot of work to fix, so we first worked around them with monkey patches and scheduled the proper fix for later, so we could bypass the error and carry on with the process. Later we tried to do the proper fix, but not before assessing whether it was worth the work (sometimes it wasn’t).
These are many of the problems we faced:
html_safe?
Teixo has a lot of helpers that return strings with embedded HTML, CSS, or even JS. Since the arrival of XSS protection in Rails 3, the output of those helpers and other variables in the Erb and Haml templates was HTML-escaped, making Teixo unusable. The right fix would have been to review all the partials and call .html_safe on every ‘unsafe’ string. This approach was not feasible due to the size of the project. We knew that our helpers and the rest of the code in the partials were safe (we filter user input), so we worked around this with a monkey patch:
We defined a module CustomHtmlSafe:
module CustomHtmlSafe
def html_safe?
true
end
end
And we monkey patched several classes. ActiveSupport::SafeBuffer also had to override to_s to avoid rendering problems:
class ActionView::OutputBuffer
include CustomHtmlSafe
end
class ActiveSupport::SafeBuffer
include CustomHtmlSafe
def to_s
"#{self}"
end
end
class ActionView::SafeBuffer
include CustomHtmlSafe
end
class String
include CustomHtmlSafe
end
form_for, remote_form_for and link_to_remote
The form_for method no longer accepted a symbol as the first param; we had to change every form_for from this:
=form_for :document, @document, :url => url do |f|
To this
=form_for @document, :url => url do |f|
Also, the symbol that was the first param of the old form_for had to be passed as an :as option (when the name of the form param doesn’t match the name of the object).
=form_for @document, :as => :document, :url => url do |f|
Teixo had a lot of partials that used remote_form_for, which no longer exists. The Rails guide recommends using form_for with the :remote option instead. We had to rewrite all those remote forms.
Routes helper method changes
Several route helper methods changed from Rails 2, so we had to review and rewrite them, especially for create and update actions:
update_api_v2_devices_waste_collection_path(wc.id)
To:
api_v2_devices_waste_collection_path(wc.id)
The same applied to remote route helper methods like create_xxxx_remote_path or update_xxxx_remote_path.
After initialize
In Rails 2 the after_initialize callback was declared as a method; in Rails 3 this changed to the standard macro style:
def after_initialize
self.effective_company_type ||= DEFAULT_COMPANY_TYPE
end
To:
after_initialize do |init_object|
init_object.effective_company_type ||= DEFAULT_COMPANY_TYPE
end
Replace_html and render :update
This was one of the most complex changes we had to make. The replace_html method is no longer supported in Rails 3. Most of the dynamic behavior in Teixo’s frontend comes from this Rails feature, so we could not go without it. We couldn’t afford to change Teixo’s frontend behavior at that moment (it will be a huge project in a later phase of the update), so we implemented our own replace_html based on jQuery (the JS library we already had in Teixo).
Our definition was included in jquery_helper.rb and looks like this:
def replace_html(element_id, html)
insert_html(:html, element_id, html)
end
def insert_html(position, element_id, html)
insertion = position.to_s.downcase
insertion = 'append' if insertion == 'bottom'
insertion = 'prepend' if insertion == 'top'
# Adds immediate timeout to execute complete and success callbacks
%Q(
setTimeout(function () {
jQuery('##{element_id}').#{insertion}('#{escape_javascript(html)}');
$(document).trigger('ajax:replaced', jQuery('##{element_id}'));
});
)
end
So replace_html receives the same parameters as before: a DOM element id and the HTML to replace the element’s content with. But we also had to change how this response was sent to the browser. render :update was no longer usable, so we had to develop a new way, our generate_js_response, which simply renders the JS passed as a parameter:
def generate_js_response(&block)
render "shared/js_response.js", :locals => {:js_content => block}
end
And the shared/js_response.js template:
<% content = [] %>
<% self.instance_exec(content, &js_content)%>
<%= content.join %>
So, finally we changed our uses of replace_html from this:
render :update do |page|
page.replace_html('link_add_bank_account', render(:partial => 'remote_form'))
end
To this:
generate_js_response do |page|
page << replace_html('link_add_bank_account', render(:partial => 'remote_form'))
end
Only render :update had to be changed, not the replace_html calls, so the work of changing the 300 occurrences of replace_html was more affordable.
Mailers
Rails 3 made several changes to the mailing system, so we had to make some relevant changes, and not all of them were in the documentation:
def set_multipart_structure(mixed_mail)
if attachments.any?
# Set the message content-type to be 'multipart/mixed'
mixed_mail.content_type 'multipart/mixed'
mixed_mail.header['content-type'].parameters[:boundary] = mixed_mail.body.boundary
# Set Content-Disposition to nil to remove it - fixes iOS attachment viewing
mixed_mail.content_disposition = nil
end
end
Clone and dup
The Rails 2 behavior of clone on records is no longer available in Rails 3 (since Rails 3.1, clone makes a plain shallow copy and dup returns a new, unsaved record), so we replaced our calls to clone with dup.
errors.add_to_base
Errors objects no longer support add_to_base. We had to change these calls to errors.add :base.
Boolean params
Params holding boolean values were handled magically in Rails 2: the string ‘true’ was automatically parsed to true, and ‘false’ to false. With the arrival of Rails 3 this behaviour changed, so we had to define a parser and use it on the params that needed it:
def parse_boolean(str_bool)
ActiveRecord::ConnectionAdapters::Column.value_to_boolean(str_bool)
end
Another option would have been to review all the forms using boolean values, but this was an easier way.
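For readers without that Rails helper at hand, the idea can be sketched in plain Ruby. The method name and the list of accepted values below are assumptions modeled on common Rails conventions, not a copy of the Rails implementation:

```ruby
# Values treated as true (an assumed list, chosen for illustration).
TRUE_PARAM_VALUES = [true, 1, '1', 't', 'T', 'true', 'TRUE'].freeze

# Converts a request param such as "true" or "false" into a boolean.
# Blank strings become nil so "not sent" can be told apart from false.
def parse_boolean_param(value)
  return nil if value.is_a?(String) && value.strip.empty?
  TRUE_PARAM_VALUES.include?(value)
end

parse_boolean_param('true')   # => true
parse_boolean_param('false')  # => false
parse_boolean_param('')       # => nil
```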
Link_to_remote
In Rails 2 it was possible to use “link_to_remote ... :update => 'id'” to replace the content of $('#id') automatically. This is not possible in Rails 3, so we had to adapt our wrapper of link_to_remote in actions_link_helper.rb:
def link_to_remote(name, options = {}, html_options = nil)
# .... (custom implementation)
# Relevant part that makes the HTML replacement work:
data_replace = options.delete(:update)
html_options = html_options.merge(:"data-replace" => "##{data_replace}") if data_replace.present?
data_complete = options.delete(:complete)
html_options = html_options.merge(:"data-complete" => "#{data_complete}") if data_complete.present?
data_before = options.delete(:before)
html_options = html_options.merge(:"data-before" => "#{data_before}") if data_before.present?
data_error = options.delete(:error)
html_options = html_options.merge(:"data-error" => "#{data_error}") if data_error.present?
link_to(name, path, options.merge(:remote => true).merge(html_options))
end
and we defined this handler in our application.js to make it work:
$('[data-remote][data-replace]')
.data('type', 'html')
.live('ajax:success', function(event, data) {
var $this = $(this);
$($this.data('replace')).html(data);
$this.trigger('ajax:replaced');
});
Finder_sql
Associations with a finder_sql had to change because in Rails 3 the SQL has to be given as a proc.
From:
has_many :formations, :class_name => "Formation",
:finder_sql => %q(SELECT DISTINCT ...)
To:
has_many :formations, :class_name => "Formation",
:finder_sql => proc {"SELECT DISTINCT ..."}
attributes=
The behavior of this method changed from Rails 2 to 3. We had several uses of this method where the hash contained virtual params or non-existent attributes, which raises an error in Rails 3. So we had to monkey patch this method to filter the received hash and allow only existing attributes. The monkey patch looks like this:
class ActiveRecord::Base
alias_method :super_attributes=, :attributes=
def attributes=(hash = {})
hash ||= {}
self.super_attributes = hash.select{|k,v| self.class.column_names.member?(k.to_s) || k.to_s.match(/_attributes\z/) || self.respond_to?(:"#{k}=")}
end
end
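The filtering rule in that patch can be tried outside Rails with a small stand-in. FakeRecord, its column list and its virtual attribute are invented for this sketch; only the three conditions (real column, nested `_attributes` key, existing setter) mirror the patch above.

```ruby
# Hypothetical stand-in for an ActiveRecord model.
class FakeRecord
  def self.column_names
    %w[name state]
  end

  attr_writer :virtual_note   # a "virtual" attribute with a setter
end

# Keeps only the keys the record could actually accept.
def assignable_attributes(record, hash)
  (hash || {}).select do |key, _value|
    name = key.to_s
    record.class.column_names.include?(name) ||
      name.end_with?('_attributes') ||
      record.respond_to?(:"#{name}=")
  end
end

params = { 'name' => 'Acme', :virtual_note => 'hi', 'lines_attributes' => [], 'bogus' => 1 }
p assignable_attributes(FakeRecord.new, params).keys
```

In this example the 'bogus' key is dropped while the other three are kept.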
Submit_to_remote
The helper method submit_to_remote is no longer available in Rails 3, so we had to define our own in application_helper.rb:
def submit_to_remote(name, value, options = {})
html_options = options.delete(:html) || {}
submit_tag value, options.merge(html_options).merge(:id => name)
end
In some cases this was not enough so we had to replace it with a specific link_to_remote.
Errors.full_message
Teixo uses errors.full_message to display the problems in a page form (required fields, wrong format, length issues, etc.). But the behavior changed in Rails 3 and errors on nested objects are not included in the full_message. So we had to monkey patch it, as you can see:
class ActiveModel::Errors
alias_method :old_full_message, :full_message
def full_message(attribute, message)
if (splitted_attribute = attribute.to_s.split(".")).count > 1
translated_attribute = if @base.send(splitted_attribute.first).respond_to?(:any?)
@base.send(splitted_attribute.first).first.class.human_attribute_name(splitted_attribute.second)
else
@base.send(splitted_attribute.first).class.human_attribute_name(splitted_attribute.second)
end
old_full_message(translated_attribute, message)
else
old_full_message(attribute, message)
end
end
end
Caching views
The use of cache in views changed slightly and we struggled a lot to find out what was happening. We had some partials that used a cache defined in a special controller, with code like this (action_cache_key is a method that returns the cache key of the current partial):
Rails.cache.fetch(action_cache_key(opts)) do
render opts[:action]
end
If the partial was not cached, the result was not rendered, but the second time we accessed this page/partial it was cached and rendered fine.
Finally we noticed that we always have to return a String from the Rails.cache.fetch block:
Rails.cache.fetch(action_cache_key(opts)) do
block.call if block_given?
render(opts[:action]).join("")
end
readonly(false)
Many object relations in Teixo were loaded through chained attribute calls and later updated. With the update to Rails 3 those loads were marked as readonly by default, so further attempts to update those related objects failed. We had to check those relations manually and mark the loads as readonly(false). For example:
outgoing_line.outgoing.update_me
Changed to
outgoing_line.outgoing(:readonly => false).update_me
Reload
In Rails 2 the reload method simply returned nil if called on a deleted object; in Rails 3 it raises an exception. We had to review some callbacks that called reload and failed when run after a delete.
render_optional_error_file
This method no longer exists, so we changed part of the error handling to use config/routes.rb instead of this mechanism.
Flash errors
The behavior of flash messages changed between Rails 2.3 and 3.2. The messages were no longer available across redirects, so we had to call flash.keep in certain callbacks and filters.
Log_error
Rails 2 has a default mechanism to handle errors in controllers: whenever an error is raised, Rails 2 controllers call a method named log_error. Since Rails 3 that is no longer true. We had to configure an explicit call to this method in our base controller whenever an error is raised:
class ApplicationController < ActionController::Base
rescue_from StandardError, with: :log_error
Time.zone.parse
In some places throughout the app (mostly in reports) users can filter data by dates, usually for full months, so we used Time.zone.parse to parse partial date param strings and get the time boundaries. Users who wanted a report for the month of July sent params like date1_str: "/07/2022" and date2_str: "/08/2022". In Rails 2 it worked like this:
> date1 = Time.zone.parse("/07/2022")
Fri, 01 Jul 2022 00:00:00 CEST +02:00
But when we changed to Rails 3 the behavior was:
> date1 = Time.zone.parse("/07/2022")
Sun, 31 Jul 2022 00:00:00 CEST +02:00
So in Rails 3 users got data for the month of August. We fixed it by adding a call to beginning_of_month where needed.
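An alternative that avoids relying on how partial strings are parsed is to read the month and year explicitly. This is a different technique from the beginning_of_month fix described above, shown only as a sketch; the helper name is hypothetical and the "/MM/YYYY" format is taken from the example params.

```ruby
require 'date'

# Parses a partial date such as "/07/2022" and returns the first day
# of that month. Raises if the string does not match "/MM/YYYY".
def month_start(partial)
  match = %r{\A/(\d{1,2})/(\d{4})\z}.match(partial)
  raise ArgumentError, "unexpected date format: #{partial.inspect}" unless match
  Date.new(match[2].to_i, match[1].to_i, 1)
end

from = month_start('/07/2022')   # first day of July 2022
to   = month_start('/08/2022')   # first day of August 2022
puts "#{from} .. #{to}"
```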
Respond_to
In Rails 3, respond_to works differently than in Rails 2, so we had to review and test the different occurrences.
This behavior also changes when no Accept header is attached to the request, and this was affecting several clients. In Rails 2 the default response type in this case was the same as the content type of the request. We had to implement a before_filter on API controllers to add a default Accept header if none was declared, so that Teixo would work as in Rails 2.
def add_accept_header_if_necesary
if request.headers['HTTP_ACCEPT'].blank?
Rails.logger.info("ApplicationController: Accept header empty for #{request.host}/#{request.path}")
if request.headers['CONTENT_TYPE'].present?
new_format = Mime::Type.lookup(request.headers['CONTENT_TYPE'])
if new_format.present?
request.format = new_format.ref
end
end
end
end
Render with a proc {}
In Rails 3.2 it is not possible to call render :text with a proc as an argument. Fortunately we had only a few uses of this behavior.
Disabled form fields
On Rails 2 disabled form fields acted as readonly fields, so their data was sent to the server when the form was submitted. From Rails 3 onwards disabled fields are not sent to the server. So we had to review those disabled fields and either mark them as readonly or keep them disabled, depending on the case.
BigDecimal gem
We had to add the BigDecimal gem (to continue using this type). We also had to keep it at a version compatible with Rails 3, because Rails 3 uses BigDecimal.new to initialize attributes of this class and recent versions of the gem do not support it.
The upgrade to Rails 3 also came with the need to change the servers we were using with Rails 2. We had to configure and deploy new servers with an updated OS and libraries. We also had to update all the scripts and be aware of the problems that came from having two staging and two production environments with different configurations.
Nevertheless, the most relevant change came from switching the server from Unicorn to Puma. Although Puma is not required for Rails 3, we decided to go ahead with this change to get it out of the way before the next Rails upgrade we would face (Rails 4.2).
This was the biggest headache, because we faced a loss of performance that we did not understand at first. With Rails 2 we configured several Unicorn processes on each server, depending on the CPU and RAM of the server. With Puma we wanted to change this behavior and take advantage of Puma’s threads. So initially we configured only one Puma process with several threads on every frontend server, and our puma.rb looked like this:
puts "1 workers and 7 threads"
workers 1
# Min and Max threads per worker
threads 1, 7
At this point we didn’t know that our Rails 3.2 app was not serving requests concurrently (Rails 3.2 wraps each request in a global lock unless thread-safe mode is enabled). So Puma with 1 worker and 7 threads behaved the same as 1 worker and 1 thread. With this configuration each server could only handle one request at a time; the other requests waited in Puma’s queue. We noticed it thanks to New Relic’s monitoring tool.
When we noticed it we changed our Puma configuration to this:
puts "5 workers and 1 threads"
workers 5
# Min and Max threads per worker
threads 1, 1
And performance returned to normal levels. We also had to fine-tune memory usage and the number of processes. Finally we adopted a gem called puma_worker_killer to keep the behavior we had with Unicorn, where we used a gem called unicorn_worker_killer to avoid memory problems. Those problems seem to have disappeared on Rails 3, so we plan to remove puma_worker_killer in the future.
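The capacity reasoning behind the two Puma configurations can be written down as a small calculation. This is a simplified model with a hypothetical helper name; it assumes that an app that is not thread-safe handles one request at a time per worker process, regardless of the thread count.

```ruby
# Rough upper bound on requests a server can process at the same time.
# `thread_safe` says whether the app can run requests in parallel threads.
def concurrent_requests(workers, threads_per_worker, thread_safe)
  per_worker = thread_safe ? threads_per_worker : 1
  workers * per_worker
end

concurrent_requests(1, 7, false)  # => 1  (first configuration)
concurrent_requests(5, 1, false)  # => 5  (second configuration)
```

Under this model extra threads only add capacity once the app itself can run requests in parallel, which is why adding workers was the change that helped.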
At this point we had two Teixos in the staging environment, one on Rails 2.3 (Stable) and another on Rails 3.2 (Edge), both working with the same database and the same Memcached server. Each version had its own set of DelayedJob workers, but they all worked against the same database searching for jobs to run. User sessions were also stored in the database, and all those things together led to some new problems we had to deal with.
Sessions
User sessions were stored in the database for both channels. This led to deserialization issues when a user with a session stored on Edge (Rails 3.2) attempted to use the Stable version (Rails 2.3), because Rails 2.3 didn’t know how to deserialize Rails 3 objects. There is no single solution for this; it depends on the data attached to the session and how deserialization works for each object. In our case we simply added this in config/initializers/a_config.rb:
ActionController::ParamsHashWithIndifferentAccess = ActionDispatch::Http::ParamsHashWithIndifferentAccess
So Rails 3 could unmarshal Rails 2 objects of the class ParamsHashWithIndifferentAccess and vice versa.
We also defined a SessionStore initializer (config/initializers/session_store.rb) to deal with flash messages stored in the session:
module ActionController
module Flash
class FlashHash < Hash
def method_missing(m, *a, &b)
end
end
end
end
Cache problems
Another similar problem was deserialization issues in the Rails cache. In this case we changed Teixo to add an environment variable on both channels, Stable and Edge. The Rails 3.2 version of Teixo prepended this variable to every cache key. This makes the cache entries for Rails 3 different from those for Rails 2 (the keys are similar but Rails 3 keys have a prefix), so the Stable and Edge channels do not share cache entries.
Be aware that this can lead to some inconsistencies between environments. For example, a cached partial showing an object on Stable is not invalidated when this object is modified on Edge, and vice versa. So be mindful of cache problems if users work on both environments at the same time.
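The key prefixing can be sketched as follows. The environment variable name CACHE_KEY_PREFIX, the helper name and the ":" separator are assumptions made for this example; the post does not say what Teixo actually uses.

```ruby
# Builds the cache key used by a channel. With no prefix configured
# (the Stable case in this sketch) the key is returned unchanged.
def namespaced_cache_key(key, prefix = ENV['CACHE_KEY_PREFIX'])
  prefix.to_s.empty? ? key : "#{prefix}:#{key}"
end

namespaced_cache_key('documents/42/show', 'edge')  # => "edge:documents/42/show"
namespaced_cache_key('documents/42/show', nil)     # => "documents/42/show"
```

Because the two channels compute different keys for the same object, expiring one key never touches the other, which is the source of the inconsistency described above.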
DelayedJobs, workers and offsets
As mentioned previously, Teixo uses the DelayedJob gem. It defines tasks as different job classes extending Delayed::Backend::ActiveRecord::Job. We also simulate different job categories by grouping job types by priority, i.e. priorities 0 to 4 mean job category A, priorities 5 to 9 mean category B, and so on. In the production environment we have a different number of workers to run those jobs depending on the category: two workers for category A, one for category B, etc.
When we planned to have two channels we needed a way to ensure that workers on Stable would run only jobs created by the Rails 2 version, and likewise for Rails 3.
We solved it with a priority offset. Depending on the environment we have a DelayedJob offset parameter. This parameter was used during Job creation and added to the base priority of the job. On the Stable environment the offset was 0, on Edge it was 100. So jobs on Stable had priorities between 0 and 99, and Edge jobs had priorities between 100 and 199. We needed to adapt Workers initialization to add this offset when launching.
The worker for category A jobs on Stable was launched like this:
/script/delayed_job -n 2 --min-priority 0 --max-priority 4 run
The same worker on Edge:
/script/delayed_job -n 2 --min-priority 100 --max-priority 104 run
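The offset arithmetic behind those two commands can be sketched like this. The category ranges and the offsets 0 and 100 come from the text above; the constant and method names are hypothetical.

```ruby
# Base priority range of each job category (from the description above).
CATEGORY_PRIORITIES = { :a => 0..4, :b => 5..9 }.freeze

# Offset added on each channel.
CHANNEL_OFFSETS = { :stable => 0, :edge => 100 }.freeze

# Priority stored on a job when it is created on a given channel.
def job_priority(base_priority, channel)
  base_priority + CHANNEL_OFFSETS.fetch(channel)
end

# Priority window a worker must be launched with, i.e. the values for
# --min-priority and --max-priority.
def worker_priority_range(category, channel)
  base   = CATEGORY_PRIORITIES.fetch(category)
  offset = CHANNEL_OFFSETS.fetch(channel)
  (base.begin + offset)..(base.end + offset)
end

worker_priority_range(:a, :stable)  # => 0..4
worker_priority_range(:a, :edge)    # => 100..104
job_priority(3, :edge)              # => 103
```

Since the Stable and Edge windows never overlap, a worker on one channel cannot pick up a job enqueued by the other.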
QA had a huge impact on the migration process, ensuring that automated testing covered most of Teixo’s features. But not all features and user workflows are covered by automated testing. Some features, procedures and/or behaviors are complex and specific to certain customers, and thus are not covered by tests.
This is where the Customer Success department comes into play, ensuring that all those specific points work fine, following a previously defined test plan. They performed several rounds of manual validation, detecting various issues and, most valuably, sources of issues even in untested parts of the application. All this testing was done in our staging environment on the Edge channel.
When all the (known) issues were fixed, we deployed our brand-new Teixo in production, on the Edge channel, accessible via a specific URL. At this point we had two different versions of Teixo working with the same database instance:
We started working on this edge version internally for a few days.
Meanwhile Customer Success department started to talk to a selected group of customers asking them if they would like to try the new Teixo Release before it was fully open to every customer.
The advantage for them was that if there were any problems with their workflows in Teixo, we could fix them very soon. Many customers agreed and we started to give them access to the Edge version. Each week more customers started to work on the new version until we had enough customers working on the platform.
This double production environment caused extra work, as explained earlier (section Double Trouble), but it was worth the effort.
When we considered the edge version was stable enough the final step was planned. It would require several stages:
When everything was ready and the date arrived, the change was relatively easy. We still had to face some issues when all the customers started using the new Teixo version, but they were easily handled and in a few days the situation was stable; the more complex problems had already been solved.