Tuesday 15 February 2011

CSV read vs. temp-table read from the database, loop optimization, and ActiveRecord usage (Ruby)


Parsing the CSV file was slow, so I tried loading the file into a temporary table in the database and reading the rows from there instead.

Before the change, it took 13 minutes to add the entries using the method below:

  CSV.foreach(filename) do |line|
    completePath = line[0]
    num_of_bps = line[1]
    completePath = cluster_path + '/' + completePath
    inode = FileOrFolder.find_by_fullpath(completePath, :select => "id")
    metric_instance = MetricInstance.find(:first, :conditions => ["file_or_folder_id = ? AND dataset_id = ?", inode.id, dataset_id])
    add_entry(metric_instance.id, num_of_bps, num_of_bp_tests)
  end

  def self.add_entry(metaid, num_of_bps, num_of_bp_tests)
    entry = Bp.new
    entry.metric_instance_id = metaid
    entry.num_of_bps = num_of_bps
    entry.num_of_bp_tests = num_of_bp_tests
    entry.save
    return entry
  end

Now I've changed the method to read from the database instead, and it takes 52 minutes:

  @bps = TempTable.all
  @bps.each do |bp|
    completePath = bp.first_column
    num_of_bps = bp.second_column
    num_of_bps3 = bp.third_column
    completePath = cluster_path + '/' + completePath
    inode = FileOrFolder.find_by_fullpath(completePath, :select => "id")
    num_of_bp_tests = 0
    unless inode.nil?
      if num_of_bps != '0'
        num_of_bp_tests = 1
      end
      metric_instance = MetricInstance.find(:first, :conditions => ["file_or_folder_id = ? AND dataset_id = ?", inode.id, dataset_id])
      add_entry(metric_instance.id, num_of_bps, num_of_bp_tests)
    end
  end

Please help me optimize this code, or let me know if you think CSV.each is simply faster than reading from the database!
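Whichever source is read, both loops issue database queries per row (the FileOrFolder lookup and the MetricInstance find), and that is usually where the time goes. Below is a minimal sketch, in plain Ruby, of memoizing repeated lookups, assuming the same path can appear on several rows. The `resolve` lambda and the Tempfile data are hypothetical stand-ins for the real `FileOrFolder.find_by_fullpath` call and input file:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical two-column data: complete_path, num_of_bps.
file = Tempfile.new(['bps', '.csv'])
file.write("lib/a.rb,3\nlib/b.rb,0\nlib/a.rb,5\n")
file.rewind

# Memoize lookups so each distinct path is resolved only once,
# instead of issuing one query per CSV row.
lookup_cache = {}
expensive_lookups = 0
resolve = lambda do |path|
  lookup_cache[path] ||= begin
    expensive_lookups += 1     # stands in for FileOrFolder.find_by_fullpath
    "id-for-#{path}"           # stands in for the returned id
  end
end

rows = 0
CSV.foreach(file.path) do |line|
  inode_id = resolve.call(line[0])
  rows += 1
end

puts rows               # 3 rows processed
puts expensive_lookups  # but only 2 lookups performed
```

How much this helps depends entirely on how often paths repeat in the file; if every path is unique, the cache buys nothing and the per-row queries themselves are the thing to attack.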

When you load the CSV into the database, you:

  • load the N CSV lines
  • insert N records into the db
  • instantiate N ActiveRecord models
  • iterate over them

When you work with the raw CSV only, you:

  • load the N CSV lines
  • iterate

Of course this is faster.
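The raw-CSV path can be sketched with nothing but the standard library; the Tempfile and its two columns below are made-up stand-ins for the real input file:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical input with the same two columns as the original loop.
file = Tempfile.new(['entries', '.csv'])
file.write("lib/foo.rb,4\nlib/bar.rb,0\n")
file.rewind

entries = []
CSV.foreach(file.path) do |line|
  complete_path = line[0]                  # column indexing as in the question
  num_of_bps    = line[1]
  num_of_bp_tests = (num_of_bps == '0' ? 0 : 1)
  entries << [complete_path, num_of_bps, num_of_bp_tests]
end

puts entries.length  # 2
```

Pushing the same rows through a temp table means every one of them also becomes a database insert plus an ActiveRecord model instance, which is the extra work the timings reflect.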
