Combining CSV Files and Verifying Line Count

Situation:
An import system is bogged down by having to process 1500 separate .csv files.

Solution:
Combine files into 1 file.

Problem #1:
All files have a header row with field names.

Solution #1:
Combine all the files while stripping off the header row from each:

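# big.csv is excluded so the output is not read back into itself; -print0/-0 keep file names with spaces intact
find . -name "*.csv" ! -name big.csv -print0 | xargs -0 -n 1 tail -n +2 > big.csv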
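
Stripping the first line of every file only makes sense if every file really does start with the same header. A minimal sketch of a sanity check, assuming all source files sit under the current directory; the helper name count_headers is made up for illustration:

```shell
# count_headers: hypothetical helper. Prints how many distinct first lines
# the .csv files under the current directory have (big.csv is skipped).
# A result of 1 means every file shares the same header.
count_headers() {
    find . -name "*.csv" ! -name big.csv -exec head -n 1 {} \; | sort -u | wc -l
}
```

If the result is greater than 1, some files have a different header (or none at all), and blindly dropping their first line could lose data.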

Problem #2:
I’m not sure if I have all the lines I should.

Solution #2:
Count the lines in the combined file and compare the result to the row count reported by the database.

sed -n '$=' big.csv
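
If the database count is not at hand, the source files themselves give a second number to compare against: all lines in all files, minus one header per file. A rough sketch, assuming every source file ends with a newline and that header.csv and final.csv do not exist yet; check_rows is a hypothetical helper name:

```shell
# check_rows FILE: hypothetical helper. FILE is the bare name of the combined
# file (e.g. big.csv). Succeeds when FILE has as many lines as the source
# files have data rows.
check_rows() {
    combined=$1
    # Number of source files, leaving out the combined file itself.
    files=$(find . -name "*.csv" ! -name "$combined" | wc -l)
    # Every line in those files, headers included.
    lines=$(find . -name "*.csv" ! -name "$combined" -exec cat {} + | wc -l)
    # Each source file contributes exactly one header line.
    expected=$((lines - files))
    actual=$(( $(wc -l < "$combined") ))
    echo "expected=$expected actual=$actual"
    [ "$expected" -eq "$actual" ]
}
```

Running check_rows big.csv prints both numbers and returns a non-zero status when they differ.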

Problem #3:
Stripping the headers also removed the one header row that I need at the very top of the file.

Solution #3:
Create a file (header.csv) with a single line containing the field names that we stripped off earlier.
Then combine header.csv and big.csv into our final csv file:

cat header.csv big.csv > final.csv
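
Rather than typing the field names by hand, header.csv can be copied from the first line of any one source file. A small sketch of that idea, assuming big.csv already exists in the current directory; build_final is a made-up helper name and its argument is whichever source file you pick:

```shell
# build_final SAMPLE: hypothetical helper. Copies the header from SAMPLE,
# builds final.csv, and checks that it is exactly one line longer than big.csv.
build_final() {
    sample=$1
    # The first line of any source file is the header row.
    head -n 1 "$sample" > header.csv
    cat header.csv big.csv > final.csv
    big_lines=$(( $(wc -l < big.csv) ))
    final_lines=$(( $(wc -l < final.csv) ))
    [ "$final_lines" -eq $((big_lines + 1)) ]
}
```

Call it as build_final followed by the path of one of the original files; a non-zero exit status means final.csv did not come out one line longer than big.csv.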
