Daily wordcount script · 3:57am Apr 4th, 2016
Disclaimer: This is mainly for me to keep this at hand and not lose it.
This is a small, simple script that records the total wordcount of a given folder, one line per day. The output format is a space-delimited CSV file:
64487 04/04/2016
The full script, which for now is a .bat file, is:
find [FOLDER] -name "*.txt" -print0 | wc -w --files0-from=- | gawk '{print $1}' | tail -n 1 >>data.txt
date /T | gawk '{print $2}'>>data.txt
paste -d " " - - < data.txt >>dataproc.txt
del data.txt
Which is a freaking mess but it works. It requires Cygwin, but it works. For those who don't know, "|" chains commands together, feeding the output of each one to the next as its input.
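The first command, "find [FOLDER] -name "*.txt" -print0", searches for all the .txt files in a folder and its subfolders. By itself, find simply outputs a list of files. " -print0 " ends each file name in that list with a NUL character instead of a newline, so names with spaces in them reach the next command intact.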
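To see what -print0 actually produces, here is a small sketch, assuming a bash-like shell with find and tr available (the Cygwin shell should do). The name list_txt_files is made up for this example and is not part of the script; tr turns the NUL separators back into newlines so the list is readable on screen.

```shell
# list_txt_files is a hypothetical helper, not part of the original script.
# find -print0 ends each path with a NUL byte instead of a newline,
# so a name like "my story.txt" cannot be mistaken for two names.
list_txt_files() {
    find "$1" -name "*.txt" -print0 | tr '\0' '\n'
}
```

Calling list_txt_files with the writing folder as its only argument should print one .txt path per line, subfolders included.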
The second command could be said to be the guts of the script: wc -w takes the file names that find passes along ( --files0-from=- means "read NUL-separated file names from standard input") and counts how many words each file has. Unfiltered, the output of this is rather messy and filled with information that isn't strictly necessary for the purposes of the script. Which is where
gawk '{print $1}' | tail -n 1
come into play.
Gawk prints out only the first column (the counts), and tail keeps only the last line of that column, which is the grand total. The result is a nice, tidy,
64487
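The counting step can be tried on its own. This is only a sketch, assuming a bash-like shell with GNU wc (the --files0-from option is a GNU extension); total_words is a made-up name, and plain awk stands in for gawk.

```shell
# total_words is a hypothetical name; the pipeline is the one from the script,
# with awk used in place of gawk.
total_words() {
    find "$1" -name "*.txt" -print0 |
        wc -w --files0-from=- |   # one "count name" line per file, then "N total"
        awk '{print $1}' |        # keep only the count column
        tail -n 1                 # the last line holds the grand total
}
```

wc only prints the "total" line when it gets more than one file. With a single file the last line is that file's own count, so tail -n 1 still returns the right number.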
But it lacks context. When did I have a total of 64487 words in my writing folder?
For that I use date /T:
Mon 04/04/2016
A bit of adjusting and gawk magic...
04/04/2016
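The "gawk magic" is nothing more than "print the second space-separated field". A minimal sketch, assuming a bash-like shell; strip_weekday and today_stamp are made-up names, and awk stands in for gawk. As far as I know, what date /T prints depends on the Windows regional settings, so on some machines there may be no weekday to strip. The second function is an alternative that only applies when running under a Cygwin shell rather than a .bat file: the date command there can print the wanted format directly.

```shell
# strip_weekday is a hypothetical helper: it keeps the second field of each line,
# turning "Mon 04/04/2016" into "04/04/2016".
strip_weekday() {
    awk '{print $2}'
}

# today_stamp is a hypothetical alternative for a shell script:
# ask the date command for MM/DD/YYYY directly, no filtering needed.
today_stamp() {
    date +%m/%d/%Y
}
```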
Voila!
Now I have a text file that looks like
64487
04/04/2016
"Paste" then reads that file two lines at a time, joins each pair with a space, and appends the nicer version to dataproc.txt:
64487 04/04/2016
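The two dashes are what do the joining. A small sketch, assuming a bash-like shell; join_pairs is a made-up name for illustration.

```shell
# join_pairs is a hypothetical name. Each "-" tells paste to take one line
# from standard input, so two of them glue lines 1+2, 3+4, ... together,
# separated by the delimiter given with -d (a space here).
join_pairs() {
    paste -d " " - -
}

printf '64487\n04/04/2016\n' | join_pairs
# prints: 64487 04/04/2016
```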
So I can just check the change in wordcount per day. This, however, has a few faults:
A) It only reads TXT files.
Reason: Good luck finding good command-line software to give a wordcount of odt, lyx, ltx, doc, docx, and who knows what other formats.
B) It doesn't care about words deleted or added, it just gives a plain number.
Reason: Good luck finding a program that actually registers the words you write/delete/edit per day. This is the least bad solution I have, sadly.
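The net change from one logged day to the next can at least be pulled out of dataproc.txt. This is a rough sketch, assuming a bash-like shell and the "count date" line format shown above; daily_change is a made-up name. It does not fix fault B: words added and words deleted on the same day still cancel each other out.

```shell
# daily_change is a hypothetical helper. It reads "count date" lines
# (the dataproc.txt format) and, for every line after the first,
# prints the date followed by the signed difference from the previous count.
daily_change() {
    awk 'NR > 1 { printf "%s %+d\n", $2, $1 - prev } { prev = $1 }' "$1"
}
```

Run as daily_change dataproc.txt, a jump from 64487 to 65012 words would show up as +525 next to the second date.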
But I think that, overall, this works, if only in a pretty spartan way.
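For whenever this stops being a .bat file, the whole thing could probably shrink to one shell function with no temporary file. This is only a sketch, assuming a bash-like shell (Cygwin's, for instance) with GNU wc; log_wordcount and its two arguments are made up for illustration, and awk stands in for gawk.

```shell
# log_wordcount is a hypothetical helper:
#   $1 = folder to count, $2 = log file to append "count date" to.
log_wordcount() {
    # Same pipeline as the .bat version: count every .txt file, keep the total.
    count=$(find "$1" -name "*.txt" -print0 |
        wc -w --files0-from=- | awk '{print $1}' | tail -n 1)
    # Write count and date on one line, so no paste step is needed.
    printf '%s %s\n' "$count" "$(date +%m/%d/%Y)" >> "$2"
}
```

The log file should live outside the counted folder or use an extension other than .txt, otherwise its own contents end up in the total.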