Posted: . At: 9:32 AM. This was 6 years ago. Post ID: 12083


Very cool Linux text processing tricks.


Right-align all text. With \{1,80\}, sed keeps prepending a space until each line is 81 characters wide; use \{1,79\} for exactly 80 columns.

jason@jason-Lenovo-H50-55:~/Documents$ ls -hula | sed -e :a -e 's/^.\{1,80\}$/ &/;ta'
                                                                       total 872K
                                    drwxr-xr-x  2 jason jason 4.0K May  4 08:48 .
                                   drwxr-xr-x 23 jason jason 4.0K May  3 20:51 ..
           -rw-r--r--  1 jason jason 848K Apr 21 13:01 altis_insurgency_altis.pbo
                  -rwxrwxr-x  1 jason jason  166 Apr 22 12:22 rhsafrf.0.4.5.bikey
                  -rwxrwxr-x  1 jason jason  166 Apr 22 12:22 rhsgref.0.4.5.bikey
                   -rwxrwxr-x  1 jason jason  165 Apr 22 12:22 rhssaf.0.4.5.bikey
                  -rwxrwxr-x  1 jason jason  166 Apr 22 12:22 rhsusaf.0.4.5.bikey
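
For a fixed width, awk's printf can do the padding in a single pass instead of looping. This is only a sketch: the function name right_align and its optional width argument are made up for this example.

```shell
# right_align [WIDTH]: right-align each line of stdin in a field of WIDTH
# columns (default 80). Hypothetical helper, not a standard tool.
right_align() {
    # Build the format string (for example "%80s\n") from the width variable.
    awk -v w="${1:-80}" '{ printf "%" w "s\n", $0 }'
}

# Usage: ls -hula | right_align 80
```

Lines longer than the width pass through unchanged. Unlike the sed version, empty lines are padded with spaces too.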

Add a blank line after every 5 lines of output, which can make a long listing easier to read. The 0~5 address (every 5th line) is a GNU sed extension.

jason@jason-Lenovo-H50-55:~/Documents$ ls -hula | sed '0~5G'
total 872K
drwxr-xr-x  2 jason jason 4.0K May  4 08:48 .
drwxr-xr-x 23 jason jason 4.0K May  3 20:51 ..
-rw-r--r--  1 jason jason 848K Apr 21 13:01 altis_insurgency_altis.pbo
-rwxrwxr-x  1 jason jason  166 Apr 22 12:22 rhsafrf.0.4.5.bikey
 
-rwxrwxr-x  1 jason jason  166 Apr 22 12:22 rhsgref.0.4.5.bikey
-rwxrwxr-x  1 jason jason  165 Apr 22 12:22 rhssaf.0.4.5.bikey
-rwxrwxr-x  1 jason jason  166 Apr 22 12:22 rhsusaf.0.4.5.bikey
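
Where GNU sed is not available, awk can do the same job with a line counter. A minimal sketch; the name blank_every is invented for this example.

```shell
# blank_every [N]: print stdin, adding an empty line after every Nth line
# (default 5). Hypothetical helper name.
blank_every() {
    awk -v n="${1:-5}" '{ print } NR % n == 0 { print "" }'
}

# Usage: ls -hula | blank_every 5
```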

Add thousands separators (commas) to a long number to make it more readable.

jason@jason-Lenovo-H50-55:/usr/share/backgrounds$ echo 1423765347 | sed ':a;s/\B[0-9]\{3\}\>/,&/;ta'
1,423,765,347
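
The \B and \> in that expression may not be understood by every sed. A sketch of the same idea using only basic regular expressions, wrapped in a function whose name (add_commas) is made up here. It is meant for plain integers; digits after a decimal point would get commas too.

```shell
# add_commas: insert a comma before each group of three trailing digits.
# Hypothetical helper; loops until no run of four or more digits is left.
add_commas() {
    sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}

# Usage: echo 1423765347 | add_commas
```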

Print the 20 most-used words in a text file. Each word is preceded by the number of times it appeared.

jason@jason-Lenovo-H50-55:~/Documents$ awk '{for(w=1;w<=NF;w++) print $w}' pg768.txt | sort | uniq -c | sort -nr | head -n 20
   3983 the
   3885 and
   3225 to
   2976 I
   2168
   2141 of
   2089 a
   1407 he
   1310 in
   1223 his
   1180 you
   1163 her
    993 was
    983 that
    982 she
    926 my
    851 as
    796 not
    770 with
    761 it

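This will also do the same thing. This is one hell of a one-liner. It gives different values though. One likely cause, assuming the file has Windows (CRLF) line endings: awk's default field splitting does not treat the carriage return as whitespace, so a word at the end of a line is counted separately from the same word in the middle of a line, while Perl's \s+ strips it. Note that the output here is sorted ascending, so the most frequent word comes last.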

jason@jason-Lenovo-H50-55:~/Documents$ perl -n -e 'foreach ${k} (split(/\s+/)){++$h{$k}};END{foreach $l (keys(%h)){print "$h{$l}: ${l}\n"}}' pg768.txt | sort -n -k 1 | tail -n 20
813: for
816: it
824: with
869: not
896: as
1023: my
1058: that
1063: she
1075: was
1265: her
1267: you
1353: his
1391: in
1518: he
2297: a
2312: of
3215: I
3506: to
4255: and
4431: the
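
Both pipelines above count raw whitespace-separated tokens, so "The", "the" and "the," are all different words. A sketch that normalises the text first, using only tr, sort, uniq and head; the function name word_freq is an assumption for this example, and splitting on non-letters means "don't" is counted as "don" and "t".

```shell
# word_freq [N]: print the N most frequent words on stdin (default 20),
# ignoring case and punctuation. Hypothetical helper name.
word_freq() {
    tr '[:upper:]' '[:lower:]' |       # fold case so "The" and "the" match
    tr -cs '[:alpha:]' '\n' |          # each run of non-letters (CR included) becomes one newline
    sed '/^$/d' |                      # drop the empty line left by leading non-letters
    sort | uniq -c | sort -rn |        # count each word, highest count first
    head -n "${1:-20}"
}

# Usage: word_freq 20 < pg768.txt
```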

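According to this simple one-liner, 4 lines in the Bash history contain "ssh". Note that grep -c counts matching lines, not runs of the ssh command, so a line such as ssh-keygen would be counted as well.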

jason@jason-Lenovo-H50-55:~$ grep -c "ssh" .bash_history
4
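
To count only the lines where the command itself was run, the pattern can be anchored to the start of the line. A rough sketch: count_cmd is a made-up name, the command name is pasted into the regular expression as-is, and lines like "sudo ssh ..." are not counted.

```shell
# count_cmd NAME: count lines on stdin that start with the command NAME,
# followed by whitespace or end of line. Hypothetical helper.
count_cmd() {
    grep -cE "^[[:space:]]*$1([[:space:]]|\$)"
}

# Usage: count_cmd ssh < ~/.bash_history
```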

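Translate each run of spaces in a text file into a single backspace. The terminal moves the cursor back one position at every backspace, so the last character of each word gets overwritten by the start of the next one.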

jason@jason-Lenovo-H50-55:~/Documents$ tr -s ' ' '\b' < pg768.txt  | head -n 10
ThProjecGutenbereBookWutherinHeightsbEmilBronte
 
 
ThieBooifothusoanyonanywherancosanwith
almosnrestrictionwhatsoeverYomacopitgiviawaor
re-usiundethtermothProjecGutenberLicensincluded
witthieBoooonlinawww.gutenberg.org
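
The backspaces are still in the data; only the terminal hides them. A small sketch that makes them visible with cat -v, which shows a backspace as ^H on GNU and BSD systems (the function name show_backspaces is invented for this example).

```shell
# show_backspaces: do the same translation, then print control characters
# in visible form so each backspace shows up as ^H. Hypothetical helper.
show_backspaces() {
    tr -s ' ' '\b' | cat -v
}

# Usage: show_backspaces < pg768.txt | head -n 10
```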

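This looks like it changes all instances of ‘or’ to ‘co’, but tr works on single characters: every ‘o’ becomes ‘c’ and every ‘r’ becomes ‘o’, and -s then squeezes repeated output characters into one (which is why ‘eBook’ turns into ‘eBck’). This makes it barely readable.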

jason@jason-Lenovo-H50-55:~/Documents$ tr -s 'or' 'co' < pg768.txt  | head -n 10
The Pocject Gutenbeog eBck, Wutheoing Heights, by Emily Bocnte
 
 
This eBck is fco the use cf anycne anywheoe at nc cst and with
almcst nc oestoicticns whatsceveo.  Ycu may cpy it, give it away co
oe-use it undeo the teoms cf the Pocject Gutenbeog License included
with this eBck co cnline at www.gutenbeog.cog
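
To replace the two-letter string rather than the individual letters, a sed substitution is the usual tool. A minimal sketch; the function name replace_or is made up for this example.

```shell
# replace_or: replace every occurrence of the string "or" with "co".
# Hypothetical helper; letters outside an "or" pair are left alone.
replace_or() {
    sed 's/or/co/g'
}

# Usage: replace_or < pg768.txt | head -n 10
```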
