Combining CSV Files and Verifying Line Count

Situation:
1500 .csv files are bogging down an import system because of the sheer number of files.

Solution:
Combine files into 1 file.

Problem #1:
All files have a header row with field names.

Solution #1:
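Combine all the files while stripping off the header row from each one. The output file is excluded from the search so that it cannot be read back into itself, and -print0 / -0 keep file names with spaces intact:

find . -name "*.csv" ! -name "big.csv" -print0 | xargs -0 -n 1 tail -n +2 > big.csv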

Problem #2:
I’m not sure if I have all the lines I should.

Solution #2:
Count the lines in the combined file (sed prints the number of the last line) and compare the result to the row count from the database.

sed -n '$=' big.csv
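
If the database count is not at hand, the expected number can also be derived from the source files themselves: total lines across all of them, minus one header per file. The following is only a sketch under assumptions. The scratch directory and the three sample files (one.csv, two.csv, three.csv) are made up for the demonstration, and it assumes every source file ends with a newline.

```shell
# Hypothetical demo data: three small CSV files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'id,name\n1,a\n2,b\n' > one.csv
printf 'id,name\n3,c\n' > two.csv
printf 'id,name\n4,d\n5,e\n6,f\n' > three.csv

# Combine the files, skipping the header row of each one.
find . -name "*.csv" ! -name "big.csv" -print0 | xargs -0 -n 1 tail -n +2 > big.csv

# Total lines in the source files (headers included).
total=$(find . -name "*.csv" ! -name "big.csv" -exec cat {} + | sed -n '$=')
# Number of source files, which equals the number of header rows removed.
files=$(find . -name "*.csv" ! -name "big.csv" | wc -l)

expected=$((total - files))
actual=$(sed -n '$=' big.csv)

echo "expected=$expected actual=$actual"
```

If the two numbers differ, a likely suspect is a source file without a trailing newline, since its last row gets glued onto the first row of the next file.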

Problem #3:
This removed the header row that I need at the very top of the file.

Solution #3:
Create a file (header.csv) with a single line containing the field names that we stripped off earlier.
Then combine header.csv and big.csv into our final csv file:

cat header.csv big.csv > final.csv
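
Rather than typing the field names by hand, header.csv can be copied from the first line of any one of the original files. This is a minimal sketch, assuming all the files share the same header. The scratch directory, the sample file one.csv, and the stand-in big.csv are hypothetical.

```shell
# Hypothetical demo data in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'id,name\n1,a\n2,b\n' > one.csv   # one of the original source files
printf '1,a\n2,b\n3,c\n' > big.csv       # stand-in for the combined, header-less file

# Take the header row from one source file.
head -n 1 one.csv > header.csv

# Put the header on top of the combined rows.
cat header.csv big.csv > final.csv

# Sanity check: the first line is the header, and the line count
# is one more than big.csv.
head -n 1 final.csv
sed -n '$=' final.csv
```

The final line count should be exactly one higher than the count checked earlier for big.csv.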
