Datareign

Also known as “how to”. There are more tips on some of the software pages…

Reformatting Text Files With a Word Processor

Sometimes you have a stream of text that has been created with a hard carriage return at the end of each line. If you view such a file on a display that is narrower than the file's lines, every line wraps and leaves a short 'hanging' line end, which looks horrible.

The solution is quite simple once you know that virtually all word processors now in use regard two hard carriage returns together as a paragraph break…

  • Read the text file into your word processor
  • Do a global search and replace: find ^P^P and replace it with AMANAPLANACANALPANAMA (that is, search for two adjacent carriage returns and replace them with a “silly string” that shouldn't otherwise occur in the text)
  • Do a global search and replace: find ^P and replace it with a single space. That zaps the remaining line endings inside each paragraph.
  • Do a global search and replace: find AMANAPLANACANALPANAMA and replace it with ^P^P. That puts back the double carriage returns.
  • Save the file.
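
If you'd rather not open a word processor at all, the same reformatting can be sketched on Unix/Linux with awk. This is only a minimal sketch, not a drop-in tool: the function name unwrap_paragraphs is made up for illustration, and it assumes the file uses Unix line endings and that a blank line marks each paragraph break.

```shell
# Hypothetical helper: join the hard-wrapped lines inside each paragraph,
# keeping blank lines as paragraph breaks. Assumes Unix (LF) line endings.
unwrap_paragraphs() {
    # RS="" puts awk in paragraph mode: records are separated by blank lines.
    # ORS="\n\n" writes a blank line after every paragraph on output.
    awk 'BEGIN { RS = ""; ORS = "\n\n" }
         { gsub(/\n/, " "); print }' "$@"
}

# Example: two paragraphs, the first wrapped over two lines.
printf 'first line\nsecond line\n\nnext paragraph\n' | unwrap_paragraphs
```

Because awk splits the input on blank lines itself, no “silly string” placeholder is needed here. To use it on a real file, pass the file name as an argument and redirect the output to a new file.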

Finding Multiple Instances When Only One is Expected

This is a solution for Unix/Linux…

cat FileToCheck | awk -F\| '{print $1}' | sort | uniq -c | awk '$1 > 1'

We dump the file into a pipe, then use awk to extract the field we're interested in (in this case, the first field delimited by a pipe '|' character). Next, we sort the stream, because uniq only compares adjacent lines, and run it through uniq, telling it to count the instances of each value. Finally, a second awk keeps only the lines whose count is greater than 1, leaving just the duplicated keys, each preceded by its count.
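
A variation on the same idea, as a minimal sketch: awk can do the counting itself, so the order of the input lines does not matter. The function name find_duplicate_keys and the sample data are made up for illustration. As above, the key is assumed to be the first '|'-delimited field.

```shell
# Hypothetical helper: print "count key" for every first-field value that
# appears more than once. Counting is done in an awk array, so the input
# does not need to be sorted.
find_duplicate_keys() {
    awk -F'|' '{ count[$1]++ }
         END { for (key in count) if (count[key] > 1) print count[key], key }' "$@"
}

# Example with made-up data: key 1001 appears twice, so it is reported.
printf '1001|alice\n1002|bob\n1001|carol\n' | find_duplicate_keys
```

With the sample data this prints "2 1001". When there are several duplicated keys, awk lists them in no particular order, so pipe the result through sort if you want a tidy list.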

And For Lots More Tips...

Do a web search for 'Unix One Liners'. You'll be amazed by how many helpful people there are out on the Web!

Last modified: 2009/11/30 12:34