By Volkan TUNALI, November 4, 2010 11:34 pm

Last time I used awk to split the single cisi.all file into small files such as cisi.1, cisi.2, and so on. Now I needed to join these small files back into a single file in a kind of XML format. I read some awk tutorials but could not find a way to loop over many text files from within awk, so I used another solution based on Windows batch file scripting. I wrote a little awk program that formats the content of a given file and appends it to an output file. Then, in a batch file, I loop over the files in a directory and run the awk program on each one.

Here’s the batch file named JOIN.BAT:

if exist output.xml del output.xml
for /r %%X in (dataset\*.*) do (awk -f join.awk "%%X")

Here’s the awk file named JOIN.AWK:

BEGIN { print "<DOC>\n<BODY>" >> "output.xml" }
{ print $0 >> "output.xml" }
END { print "</BODY>\n</DOC>\n" >> "output.xml" }

As you can see in the awk program, the content of each file is wrapped in <DOC> and <BODY> tags and appended to the file output.xml. On Unix-like systems, you can write a similar shell script instead of a batch file.
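For the Unix-like case, a minimal sketch of such a shell script might look like the one below. It is an illustration, not something I have used on the real data set: the function name join_files and its two arguments are made up for this example, and the awk program is written inline and prints to standard output instead of naming output.xml itself. Unlike for /r, this loop does not descend into subdirectories.

```shell
#!/bin/sh
# Sketch of a Unix counterpart to JOIN.BAT. The function name join_files
# and its two arguments are illustrative, not part of the original scripts.

join_files() {
    src_dir=$1     # directory holding the small files (cisi.1, cisi.2, ...)
    out_file=$2    # combined output, e.g. output.xml

    rm -f "$out_file"              # like "del output.xml", silent if absent

    for f in "$src_dir"/*; do
        [ -f "$f" ] || continue    # skip subdirectories and unmatched globs
        # Same wrapping as JOIN.AWK, but awk writes to standard output
        # and the shell appends that output to the combined file.
        awk 'BEGIN { print "<DOC>\n<BODY>" }
             { print }
             END   { print "</BODY>\n</DOC>\n" }' "$f" >> "$out_file"
    done
}

# Run on the dataset directory from the batch file, if it exists here.
if [ -d dataset ]; then
    join_files dataset output.xml
fi
```

Note that the shell expands the glob in lexical order, so cisi.10 is processed before cisi.2. If the document order matters, the file list would need to be sorted numerically first.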

