Ruby on Rails
Nikolai Markov, 2015-05-25 00:58:25

How can I painlessly implement bulk database operations?

How can I implement bulk inserts/updates of database records without duplicating all of the validation and callback logic in yet another place? I don't want to maintain the same logic in two places in the code.
Or what alternative solutions to this problem are there?
UPD: The problem is that doing 1000 inserts one at a time is slower than doing everything in a single query, so I am looking for a way to do this with minimal losses. One idea is to use Ruby Object Mapper (an implementation of the data mapper pattern) and write an adapter for it that keeps everything in memory and then builds an SQL query from that for a bulk insert/update.
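
The idea in the update can be sketched in plain Ruby, without any ORM or database connection. This is only an illustration under stated assumptions: `BulkInsertBuffer`, the `users` table, and its columns are hypothetical names, and the validator block stands in for whatever shared validation logic the models already have. Records are validated one by one, kept in memory, and then turned into a single multi-row INSERT with PostgreSQL-style numbered placeholders.

```ruby
# Hypothetical sketch: BulkInsertBuffer is an illustrative name, not a library API.
class BulkInsertBuffer
  attr_reader :rejected

  def initialize(table, columns, &validator)
    @table = table
    @columns = columns
    @validator = validator # the single place where the validation rules live
    @rows = []
    @rejected = []
  end

  # Validate one record (a Hash) and keep it in memory instead of hitting the database.
  def add(record)
    if @validator.call(record)
      @rows << @columns.map { |col| record[col] }
    else
      @rejected << record
    end
    self
  end

  # Build one multi-row INSERT with numbered placeholders.
  # Returns [sql, params], or nil when nothing is buffered.
  def to_sql
    return nil if @rows.empty?

    index = 0
    tuples = @rows.map do |row|
      "(" + row.map { "$#{index += 1}" }.join(", ") + ")"
    end
    sql = "INSERT INTO #{@table} (#{@columns.join(', ')}) VALUES #{tuples.join(', ')}"
    [sql, @rows.flatten(1)]
  end
end

# Assumed table and columns, for illustration only.
buffer = BulkInsertBuffer.new("users", [:name, :age]) do |record|
  !record[:name].to_s.empty? && record[:age].to_i >= 0
end

buffer.add({ name: "Ann", age: 30 })
      .add({ name: "", age: 5 })
      .add({ name: "Bob", age: 41 })

sql, params = buffer.to_sql
puts sql
```

Keeping the values in a separate parameter list avoids hand-quoting them into the SQL string. With the pg gem, such a pair could be passed to `exec_params`; callbacks are not covered by this sketch.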

2 answers
vsuhachev, 2015-05-25
@manameiz

There is activerecord-import, which does pretty much what you want, only without Ruby Object Mapper.
If you are on PostgreSQL, it also has a COPY ... FROM STDIN command that accepts CSV; you can use it something like this:

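require 'csv'

task :ips, [:filename] => :environment do |_, args|
  filename = args[:filename] || Rails.root.join(*%w(tmp geoip cidr_optim.txt))
  file = File.new(filename, 'r', encoding: 'windows-1251')

  puts "Import #{filename}"

  # Tab-separated input with '$' set as the quote character
  csv = CSV.new(file,
                col_sep: "\t",
                quote_char: '$',
                headers: [:range_begin, :range_end, :title, :country, :city_id])

  # Note: the COPY below runs on the separately checked-out connection,
  # so this transaction (opened on the thread's own connection) does not wrap it.
  GeoIp.transaction do
    connection = ActiveRecord::Base.connection_pool.checkout

    begin
      sql = <<-SQL
      COPY geo_ips(ip_range, country, city_id)
      FROM STDIN
      WITH CSV NULL '-'
      SQL

      # The block returns the next CSV line to send, or nil when the input is exhausted
      pg_exec_with_stdin(connection, sql) do
        line = csv.shift
        if line
          # Convert the inclusive [range_begin, range_end] pair into a half-open range literal
          line = CSV::Row.new(
            [],
            [
              "[#{line[:range_begin]},#{line[:range_end].to_i + 1})",
              line[:country],
              line[:city_id]
            ]
          )
        end
        line.try(:to_s)
      end
    ensure
      # Always return the connection to the pool
      ActiveRecord::Base.connection_pool.checkin(connection)
    end
  end
end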

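# Runs a COPY ... FROM STDIN statement, sending each line returned by the block
# until the block returns nil.
def pg_exec_with_stdin(conn, sql)
  raw = conn.raw_connection # the underlying PG::Connection
  raw.exec(sql)             # puts the connection into COPY IN mode

  while (line = yield)
    raw.put_copy_data(line)
  end

  raw.put_copy_end
  while raw.get_result do; end # very important to do this after a copy
end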

UPD: added pg_exec_with_stdin
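
To make the row transformation in the task above easier to follow, here is a small standalone sketch of what one line sent to COPY looks like. `copy_line` is a hypothetical helper name, the sample values are made up, and it uses `CSV.generate_line` from the standard library instead of `CSV::Row#to_s`, which should give the same kind of output for a plain list of fields.

```ruby
require 'csv'

# Hypothetical helper: mirrors the row transformation done inside the COPY block.
# Turns an inclusive [range_begin, range_end] pair into a half-open range literal
# and serializes the three COPY columns as one CSV line.
def copy_line(range_begin, range_end, country, city_id)
  ip_range = "[#{range_begin},#{range_end.to_i + 1})"
  CSV.generate_line([ip_range, country, city_id])
end

# Made-up sample values; '-' is the marker that COPY is told to read as NULL.
line = copy_line("16777216", "16777471", "AU", "-")
puts line
```

The range field contains a comma, so the CSV writer quotes it, while the unquoted `-` in the last column is what `NULL '-'` in the COPY statement matches.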

kkrieger, 2015-05-25
@kkrieger

If I understood you correctly, take a look at concerns in Rails 4: they let you move a model's shared logic into a separate module and include it in whichever models need it. Here is an article: artemeff.com/2013/04/21/concerns-v-rails-4.html
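
A rough plain-Ruby sketch of that idea, using an ordinary module as a stand-in for an `ActiveSupport::Concern` so that it runs without Rails. All names here (`RangeValidation`, `GeoIpRecord`, the attributes and the rules) are hypothetical; the point is only that the per-record path and the bulk path call the same rules.

```ruby
# Hypothetical names throughout; a plain module standing in for a Rails concern.
module RangeValidation
  # Returns a list of error messages; an empty list means the attributes are acceptable.
  def self.errors_for(attrs)
    errors = []
    errors << "country is required" if attrs[:country].to_s.empty?
    errors << "range is inverted" if attrs[:range_begin].to_i > attrs[:range_end].to_i
    errors
  end

  # Instance-level entry point for the usual one-record-at-a-time path.
  def valid?
    RangeValidation.errors_for(to_h).empty?
  end
end

# A Struct plays the role of the model; it includes the shared module.
GeoIpRecord = Struct.new(:range_begin, :range_end, :country) do
  include RangeValidation
end

# Bulk path: filter raw hashes with the same rules before building one statement.
rows = [
  { range_begin: 1, range_end: 10, country: "RU" },
  { range_begin: 20, range_end: 5, country: "US" }
]
insertable = rows.select { |attrs| RangeValidation.errors_for(attrs).empty? }
puts insertable.size
```

In a real Rails model the module would more likely be a concern that declares ActiveRecord validations, which this sketch does not attempt to reproduce.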
