This is a Ruby design problem. How can I make a reusable flat file parser that can perform different data scrubbing operations per call, return the emitted results from each scrubbing operation to the caller and perform bulk SQL insertions?
Now, before anyone gets narky/concerned, I have already written this code, but in a very un-DRY fashion, which is why I am asking any Ruby rockstars out there for some assistance.
Basically, every time I want to perform this logic, I create two nested loops with custom processing in between, buffer each processed line to an array, and output to the DB as a bulk insert when the buffer size limit is reached.
Although I have written lots of helpers, the main pattern is being copy-pasted every time. Not very DRY!
Here is a Ruby/pseudocode example of what I am repeating.
lines_from_file.each do |line|
  # String#scan returns every match as an array (MatchData has no #each)
  line.scan(/some regex/).each do |sub_str|
    # Process substring into useful format
    # EG1: Simple gsub() call
    # EG2: Custom function call to do complex scrubbing
    #      and matching, emitting results to array
    # EG3: Loop to match opening/closing/nested brackets
    #      or other delimiters and emit results to array
  end
  # Add processed lines to a buffer as SQL insert statement
  @buffer << PREPARED INSERT STATEMENT
  # Flush buffer when "buffer size limit reached" or "end of file"
  if sql_buffer_full || last_line_reached
    @dbc.insert(SQL INSERTS FROM BUFFER)
    @buffer = [] # reset to an empty array so the next << still works
  end
end
I am familiar with Procs and lambdas. However, because I want to pass two separate procs to a single function, I am not sure how to proceed. I have some ideas about how to solve this, but I would really like to see what the real Rubyists suggest.
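To make the question concrete, here is a rough sketch of one shape this could take: a single method that receives both callables as keyword arguments. Everything here is hypothetical and only for illustration: the method name `parse_flat_file`, the `scrub` and `flush` parameters, and the `batch_size` default are assumptions, and the `flush` lambda merely stands in for whatever wraps the real bulk insert.

```ruby
# Hypothetical sketch: all names below are illustrative, not existing code.
#   pattern - regex handed to String#scan for each line
#   scrub   - callable that cleans one matched substring
#   flush   - callable that receives a batch of rows (e.g. to bulk insert them)
def parse_flat_file(lines, pattern:, scrub:, flush:, batch_size: 500)
  buffer  = []
  results = []
  lines.each do |line|
    # Scrub every match on the line and collect the cleaned values as one row
    row = line.scan(pattern).map { |sub_str| scrub.call(sub_str) }
    results << row
    buffer << row
    # Hand the batch over once the buffer size limit is reached
    if buffer.size >= batch_size
      flush.call(buffer)
      buffer = []
    end
  end
  # End of file: flush whatever is left over
  flush.call(buffer) unless buffer.empty?
  results # emitted rows go back to the caller
end

# Usage with stand-in data; a real flush lambda might wrap @dbc.insert.
batches = []
results = parse_flat_file(
  ["a=1 b=2", "c=3", "d=4"],
  pattern: /\w=\d/,
  scrub: ->(sub_str) { sub_str.tr("=", ":") },
  flush: ->(rows) { batches << rows },
  batch_size: 2
)
```

Keyword arguments keep the two lambdas distinguishable at the call site, which a single block cannot do. Returning every row keeps the sketch short but holds the whole result set in memory, so yielding each row to a block may suit large files better.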
Over to you. Thanks in advance :D