



Linux Advanced Text Processing Tools
Create an ASCII "banner" with the width of 79 characters. The output is sent to the file marie.txt. A funny, old-fashioned tool. Another utility for ASCII text art is figlet. E.g., figlet "Funny!" produces this in my terminal (I always use a fixed-width font to display ASCII art):
 _____                        _
|  ___|   _ _ __  _ __  _   _| |
| |_ | | | | '_ \| '_ \| | | | |
|  _|| |_| | | | | | | | |_| |_|
|_|   \__,_|_| |_|_| |_|\__, (_)
                        |___/
script
Start logging my current session in the text terminal into a text file typescript (this is the default filename). The logging finishes when I type exit or press <Ctrl>d. Then, I can rename, email (or whatever I want to do with it) the file typescript.
emacs&
(in X-terminal) The emacs text editor. Advanced and sophisticated text
editor. Seems for gurus only: "emacs is not just an editor, it is a way
of living". Emacs surely seems rich or bloated, depending on your point
of view. There are likely 3 versions of emacs installed on your system:
(1) text-only: type emacs in a text (not X-windows) terminal (I avoid this like fire); (2) graphical-mode: type emacs
in an X-windows terminal (fairly usable even for a newbie if you take
some time to learn it); and (3) X-windows mode: type "xemacs" in an
X-windows terminal.
vi
The famous (notorious?) "vi" text editor (definitely not recommended for newbies). To exit "vi" (no changes saved) use these five keystrokes: <ESC>:q!<Enter>
I use the "kate&" (under X) or "pico" (command line) or "nano" (command line) text editors and don't ever need vi (well, unless I have to unmount the /usr subsystem and modify/edit some configuration files, then vi is the only editor available). To be fair, modern Linux distributions use vim (="vi improved") in place of vi, and vim is somewhat better than the original vi. A GUI version of vi is also available (type gvim in an X terminal). Here is one response I have seen to the criticism of the vi interface being not "intuitive": "The only intuitive interface is the nipple. The rest must be learned." (Well, so much for MS Windows being an "intuitive" interface.)
Experts do like vi, but vi is definitely difficult unless you use it very often. Here is a non-newbie opinion on vi (http://linuxtoday.com/stories/16620.html):
"I was first introduced to vi in 1988 and I hated it. I was a freshman in college... VI seemed archaic, complicated and unforgiving... It is now 12 years later and I love vi, in fact it is almost the only editor I use. Why the change? I actually learned to use vi... Now I see vi for what it really is, a powerful, full featured, and flexible editor..."
Brief introduction to vim (="vi improved"), which is a modern Linux version of vi. The main reason why a newbie like myself ever needs vi is for rescue--sometimes it is the only editor available. The most important thing to understand about vi is that it is a "modal" editor, i.e., it has a few modes of operation between which the user must switch. The quick reference is below.
The commands to switch modes:
The key    Enters the mode          Remarks
<ESC>      command mode             get back to the command mode from any editing mode
i          "insert" editing mode    start inserting before the current position of the cursor
DO NOT PRESS ANY OTHER KEYS IN THE COMMAND MODE. THERE ARE MANY MORE COMMANDS AND MODES AVAILABLE FROM THE COMMAND MODE!
Copying, cutting and pasting (in the command mode):
v    start highlighting text. Then, move the cursor to highlight text
y    copy highlighted text
x    cut highlighted text
p    paste text that has been cut/copied
Saving and quitting (from the command mode):
:w write (=save)
:w filename write the contents to the file "filename"
:x save and exit
:q quit (it won't let you if changes not saved)
:q! quit discarding changes (you will not be prompted if changes not saved)
nano
This is a brand new (March 2001) GNU replacement for pico. Works and
looks like pico, but it is smaller, better, and licenced as expected
for a decent piece of Linux software (i.e., General Public Licence,
GPL). Not included with RH7.0 or MDK7.2, but expect it soon.
khexedit
(in X terminal) Simple hexadecimal editor. Another hexadecimal editor is hexedit (text based, less user friendly). Hex editors are used for editing binary (non-ASCII) files.
diff file1 file2 > patchfile
Compare contents of two files and list any differences. Save the output to the file patchfile.
sdiff file1 file2
Side-by-side comparison of two text files. Output goes to the "standard output" which normally is the screen.
patch file_to_patch patchfile
Apply the patch (a file produced by diff, which lists differences between two files) called patchfile to the file file_to_patch. If the patch was created using the previous command, I would use: patch file1 patchfile to change file1 to file2.
grep filter
Search content of text files for matching patterns. It is definitely worth learning at least the basics of this command.
A simple example. The command:
cat * | grep my_word | more
will search all the files in the current working directory (except files starting with a dot) and print the lines which contain the string "my_word".
A shorter form to achieve the same may be:
grep my_word * |more
The patterns are specified using a powerful and standard notation called "regular expressions".
There is also a "recursive" version of grep called rgrep. This will search all the files in the current directory and all its subdirectories for my_word and print the names of the files and the matching line:
rgrep -r my_word . | more
Regular expressions are used for "pattern" matching in search, replace, etc. They are often used with utilities (e.g., grep, sed) and programming languages (e.g., perl). Shell filename patterns (as used with commands such as dir) follow a slightly modified flavour of regular expressions (the two main differences are noted below). This brief writeup covers almost all the features of standard regular expressions; they are not as complicated as they might seem at first. Definitely worth a closer look.
In regular expressions, most characters just match themselves. So to search for string "peter", I would just use a searchstring "peter". The exceptions are so-called "special characters" ("metacharacters"), which have special meaning.
The regular expression special characters are: "\" (backslash), "." (dot), "*" (asterisk), "[" (bracket), "^" (caret, special only at the beginning of a string), "$" (dollar sign, special only at the end of a string). A character terminating a pattern string is also special for this string.
The backslash, "\" is used as an "escape" character, i.e., to quote a subsequent special character.
Thus, "\\" searches for a backslash, "\." searches for a dot, "\*" searches for the asterisk, "\[" searches for the bracket, "\^" searches for the caret even at the beginning of the string, "\$" searches for the dollar sign even at the end of the string.
Backslash followed by a regular (non-special) character may gain a special meaning. Thus, the symbols \< and \> match an empty string at the beginning and the end of a word, respectively. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it's not at the edge of a word.
The dot, ".", matches any single character. [The dir command uses "?" in this place.] Thus, "m.a" matches "mpa" and "mea" but not "ma" or "mppa".
Any string is matched by ".*" (dot and asterisk). [The dir command uses "*" instead.] In general, any pattern followed by "*" matches zero or more occurrences of this pattern. Thus, "m*" matches zero or more occurrences of "m". To search for one or more "m", I could use "mm*".
The * is a repetition operator. Other repetition operators are used less often--here is the full list:
* the preceding item is to be matched zero or more times;
\+ the preceding item is to be matched one or more times;
\? the preceding item is optional and matched at most once;
\{n} the preceding item is to be matched exactly n times;
\{n,} the preceding item is to be matched n or more times;
\{n,m} the preceding item is to be matched at least n times, but not more than m times.
The caret, "^", means "the beginning of the line". So "^a" means "find a line starting with an "a"".
The dollar sign, "$", means "the end of the line". So "a$" means "find a line ending with an "a"".
Example. This command searches the file myfile for lines starting with an "s" and ending with an "n", and prints them to the standard output (screen):
cat myfile | grep '^s.*n$'
Any character terminating the pattern string is special; precede it with a backslash if you want to use it within this string.
The bracket, "[", introduces a set. Thus [abD] means: either a or b or D. [a-zA-C] means any character from a to z or from A to C.
Attention with some characters inside sets. Within a set, the only special characters are "[", "]", "-", and "^", and the combinations "[:", "[=", and "[.". The backslash is not special within a set.
Useful categories of characters are (as defined by the POSIX standard): [:upper:] =upper-case letters, [:lower:] =lower-case letters, [:alpha:] =alphabetic (letters) meaning upper+lower, [:digit:] =0 to 9, [:alnum:] =alphanumeric meaning alpha+digits, [:space:] =whitespace meaning <Space>+<Tab>+<Newline> and similar, [:graph:] =graphically printable characters except space, [:print:] =printable characters including space, [:punct:] =punctuation characters meaning graphical characters minus alpha and digits, [:cntrl:] =control characters meaning non-printable characters, [:xdigit:] =characters that are hexadecimal digits.
Example. This command scans the output of the dir command, and prints lines containing a capital letter followed by a digit:
dir -l | grep '[[:upper:]][[:digit:]]'
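The two grep examples can also be tried out in Python, which is handy for experimenting with a pattern before using it on real files. This is only a sketch under a few assumptions: the sample strings are made up, Python's re module writes "one or more" as "+" (not "\+"), and it has no POSIX classes such as [:upper:], so the ASCII ranges [A-Z] and [0-9] stand in for them here.

```python
import re

lines = ["sun", "son of a gun", "moon", "Stan", "a sun"]

# grep '^s.*n$' : lines that start with "s" and end with "n"
starts_s_ends_n = [line for line in lines if re.search(r"^s.*n$", line)]
print(starts_s_ends_n)

# grep '[[:upper:]][[:digit:]]' : a capital letter followed by a digit
# (ASCII ranges used instead of the POSIX classes)
names = ["fileA1.txt", "notes.txt", "B2", "a1"]
upper_then_digit = [name for name in names if re.search(r"[A-Z][0-9]", name)]
print(upper_then_digit)

# one or more "m": grep writes "mm*" or "m\+", Python writes "m+"
runs_of_m = re.findall(r"m+", "hammer time, mmm")
print(runs_of_m)
```

Note that "Stan" is not matched by the first pattern because matching is case sensitive, just like in grep without the -i option.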
tr
(=translation) A filter useful to replace all instances of characters in a text file or "squeeze" the white space.
Example:
cat my_file | tr 1 2 > new_file
This command takes the content of the file my_file, pipes it to the translation utility tr, the tr utility replaces all instances of the character "1" with "2", the output from the process is directed to the file new_file.
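To see what the character translation does without touching any files, here is a rough Python equivalent. It is a sketch only: the sample text is invented, str.translate plays the role of "tr 1 2", and a regular expression stands in for the squeezing that "tr -s ' '" performs.

```python
import re

text = "line 1 of 11\nroom  101\n"

# tr 1 2 : replace every character "1" with "2"
one_to_two = str.maketrans("1", "2")
translated = text.translate(one_to_two)
print(translated, end="")

# tr -s ' ' : squeeze each run of spaces into a single space
squeezed = re.sub(r" +", " ", "too    many   spaces")
print(squeezed)
```

Like tr, the translation works character by character, so it cannot replace a longer string with another; that is a job for sed.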
sed
(=stream editor) I use sed to filter text files. The pattern to match is typically included between a pair of slashes // and quoted.
For example, to print lines containing the string "1024", I may use:
cat filename | sed -n '/1024/p'
Here, sed filters the output from the cat command. The option "-n" tells sed to block all the incoming lines but those explicitly matching my expression. The sed action on a match is "p"= print.
Another example, this time for deleting selected lines:
cat filename | sed '/.*o$/d' > new_file
In this example, lines ending with an "o" will be deleted. I used a regular expression for matching any string followed by an "o" at the end of the line. The output (i.e., all lines but those ending with "o") is directed to new_file.
Another example. To search and replace, I use the sed 's' action, which comes in front of two expressions:
cat filename | sed 's/string_old/string_new/' > newfile
Only the first match on each line is replaced; append "g" after the last slash to replace all matches.
A shorter form for the last command is:
sed 's/string_old/string_new/' filename > newfile
To insert a text from a text file into an html file called "index_master_file.html", I may use a script containing:
sed '/text_which_is_a_placeholder_in__my_html_file/r text_file_to_insert.txt' index_master_file.html > index.html
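The three sed actions used above (print on match, delete on match, substitute) can be mimicked in a few lines of Python, which may help to check what a sed expression is expected to do. This is a sketch, not a sed replacement: the sample lines are invented, and count=1 in re.sub corresponds to sed's 's' action without the "g" flag.

```python
import re

lines = ["block size 1024", "hello", "size 2048", "potato", "size size"]

# sed -n '/1024/p' : print only the lines matching the pattern
printed = [line for line in lines if re.search(r"1024", line)]

# sed '/.*o$/d' : delete the lines ending with an "o", keep the rest
kept = [line for line in lines if not re.search(r"o$", line)]

# sed 's/size/SIZE/' : replace the first match on each line
replaced = [re.sub(r"size", "SIZE", line, count=1) for line in lines]

print(printed)
print(kept)
print(replaced)
```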
gawk
(=GNU awk. The awk command is a traditional UNIX tool.) A tool for processing text files, in many respects similar to sed, but more powerful. Perl can do all that gawk can, and more, so I don't bother with gawk too much. For simple tasks, I use sed, for more complicated tasks, I use perl. In some instances, however, awk scripts can be much shorter, easier to understand and maintain, and faster than an equivalent perl program.
gawk is particularly suitable for processing text-based tables. A table consists of records (each line is normally one record). The records contain fields separated by a delimiter. Often used delimiters are whitespace (gawk default), comma, or colon. All gawk expressions have the form: gawk 'pattern {action}' my_file. You can omit the pattern or the action: the default pattern is "match everything" and the default action is "print the line". gawk can also be used as a filter (to process the output from another command, as used in our examples).
Example. To print lines containing the string "1024", I may use:
cat filename | gawk '/1024/ {print}'
Like in sed, the patterns to match are enclosed in a pair of "/ /".
What makes gawk more powerful than sed is the operations on fields. $1 means "the first field", $2 means "the second field", etc. $0 means "the entire line". The next example extracts fields 3 and 2 from lines containing "1024" and prints them with added labels "Name" and "ID". The printing goes to a file called "newfile":
cat filename | gawk '/1024/ {print "Name: " $3 " ID: " $2}' > newfile
The third example finds and prints lines with the third field equal to "peter" or containing the string "marie":
cat filename | gawk '$3 == "peter" || $3 ~ /marie/ '
To understand the last command, here is the list of logical tests in gawk: == equal, != not equal, < less than, > greater than, <= less than or equal to, >= greater than or equal to, ~ matching a regular expression, !~ not matching a regular expression, || logical OR, && logical AND, ! logical NOT.
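The field handling in the gawk examples may be easier to follow when spelled out in Python. The sketch below works on an invented whitespace-delimited table; the only assumption beyond that is the indexing: gawk's $1, $2, $3 become fields[0], fields[1], fields[2] because Python counts from zero.

```python
import re

table = """\
1024 17 peter
2048 18 marie
1024 19 stan
4096 20 rosemarie
"""

# each record is one line; fields are separated by whitespace (the gawk default)
records = [(line, line.split()) for line in table.splitlines()]

# gawk '/1024/ {print "Name: " $3 " ID: " $2}'
labelled = [
    f"Name: {fields[2]} ID: {fields[1]}"
    for line, fields in records
    if re.search(r"1024", line)
]

# gawk '$3 == "peter" || $3 ~ /marie/'
selected = [
    line
    for line, fields in records
    if fields[2] == "peter" or re.search(r"marie", fields[2])
]

print(labelled)
print(selected)
```

The last record is selected because "rosemarie" contains "marie": the ~ test matches anywhere in the field unless the pattern is anchored with ^ and $.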
cvs
Concurrent versions system. Try: info cvs for more information. Useful to keep the "source code repository" when several programmers are working on the same computer program.
cervisia
(in X-terminal). A GUI front-end to the cvs versioning system.
file -z filename
Determine the type of the file filename. The option -z makes file
look also inside compressed files to determine what the compressed file
is (instead of just telling you that this is a compressed file).
To determine the type of content, file looks inside the file to find particular patterns in contents ("magic numbers")--it does not just look at the filename extension like MS Windows does. The "magic numbers" are stored in the text file /usr/share/magic--a really impressive database of filetypes.
touch filename
Change the date/time stamp of the file filename to the current time. Create an empty file if the file does not exist. You can change the stamp to any date using touch -t 200201311759.30 filename (year 2002 January day 31 time 17:59:30).
There are three date/time values associated with every file on an ext2 filesystem:
- the time of last access to the file (atime)
- the time of last modification to the file (mtime)
- the time of last change to the file's inode (ctime).
Touch will change the first two to the value specified, and the last one always to the current system time. They can all be read using the stat command (see the next entry).
stat filename
Print general info about a file (the contents of the so-called inode).
strings filename | more
Display the strings contained in the binary file called filename. "strings" could, for example, be a useful first step to a close examination of an unknown executable.
od
(=octal dump). Display contents as octal numbers. This can be useful
when the output contains non-printable characters. For example, a
filename may contain non-printable characters and be a real pain. This
can also be handy to view binary files.
Examples:
dir | od -c | more
(I would probably rather do: ls -b to see any non-printable characters in filenames).
cat my_file | od -c |more
od my_file |more
Comparison of different outputs:
Show the first 16 bytes of a binary (/bin/sh) as ASCII characters or backslash escapes (octal):
od -N 16 -c /bin/sh
output:
0000000 177 E L F 001 001 001 \0 \0 \0 \0 \0 \0 \0 \0 \0
Show the same binary as named ASCII characters:
od -N 16 -a /bin/sh
output:
0000000 del E L F soh soh soh nul nul nul nul nul nul nul nul nul
Show the same bytes as one-byte hexadecimal numbers:
od -N 16 -t x1 /bin/sh
output:
0000000 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Show the same binary as octal numbers:
od -N 16 /bin/sh
output:
0000000 042577 043114 000401 000001 000000 000000 000000 000000
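The default octal output may look unrelated to the hexadecimal one, but it is the same 16 bytes grouped differently: od without options prints two-byte words. The Python sketch below rebuilds both lines from the bytes listed above. It assumes the octal output was produced on a little-endian machine (such as a PC), where the second byte of each pair is the more significant one.

```python
import struct

# the 16 bytes shown by "od -N 16 -t x1 /bin/sh" above
header = bytes([0x7F, 0x45, 0x4C, 0x46, 0x01, 0x01, 0x01, 0x00] + [0x00] * 8)

# od -t x1 : one hexadecimal number per byte
hex_bytes = " ".join(f"{byte:02x}" for byte in header)
print(hex_bytes)

# od (default) : 16-bit words printed in octal; "<8H" reads eight
# little-endian unsigned 16-bit integers (byte order is an assumption)
words = struct.unpack("<8H", header)
octal_words = " ".join(f"{word:06o}" for word in words)
print(octal_words)
```

For example, the bytes 7f and 45 form the word 0x457f, which is 042577 in octal.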
wc
(=word count) Print the number of lines, words, and bytes in the file.
Examples:
dir | wc
cat my_file | wc
wc myfile
cksum filename
Compute the CRC (="cyclic redundancy check") for file filename to verify its integrity.
md5sum filename
Compute an MD5 checksum (128-bit) for file filename to verify its integrity.
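The idea behind both checksum commands can be shown in Python: the same data always gives the same checksum, and even a one-character change gives a different one. This is a sketch with invented sample data. The MD5 digest from hashlib is what md5sum would print for a file with exactly these bytes, but zlib.crc32 uses a different CRC variant than cksum, so its number will not match the cksum output.

```python
import hashlib
import zlib

data = b"hello world"
tampered = b"hello worle"   # a single byte changed

# 128-bit MD5 digest, printed as 32 hexadecimal digits like md5sum does
md5_hex = hashlib.md5(data).hexdigest()
print(md5_hex)

# a 32-bit CRC (not the same CRC variant that cksum computes)
crc = zlib.crc32(data)
print(crc)

# any change to the data shows up in both checksums
print(hashlib.md5(tampered).hexdigest() != md5_hex)
print(zlib.crc32(tampered) != crc)
```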
mkpasswd -l 10
Make a hard-to-guess, random password of the length of 10 characters.
sort -f filename
Arrange the lines in filename according to the ascii order. The option -f tells sort to ignore the upper and lower character case. The ascii character set is (see man ascii):
Dec Hex Char      Dec Hex Char   Dec Hex Char    Dec Hex Char
-------------------------------------------------------------
  0 00  NUL '\0'   32 20  SPACE   64 40  @        96 60  `
  1 01  SOH        33 21  !       65 41  A        97 61  a
  2 02  STX        34 22  "       66 42  B        98 62  b
  3 03  ETX        35 23  #       67 43  C        99 63  c
  4 04  EOT        36 24  $       68 44  D       100 64  d
  5 05  ENQ        37 25  %       69 45  E       101 65  e
  6 06  ACK        38 26  &       70 46  F       102 66  f
  7 07  BEL '\a'   39 27  '       71 47  G       103 67  g
  8 08  BS  '\b'   40 28  (       72 48  H       104 68  h
  9 09  HT  '\t'   41 29  )       73 49  I       105 69  i
 10 0A  LF  '\n'   42 2A  *       74 4A  J       106 6A  j
 11 0B  VT  '\v'   43 2B  +       75 4B  K       107 6B  k
 12 0C  FF  '\f'   44 2C  ,       76 4C  L       108 6C  l
 13 0D  CR  '\r'   45 2D  -       77 4D  M       109 6D  m
 14 0E  SO         46 2E  .       78 4E  N       110 6E  n
 15 0F  SI         47 2F  /       79 4F  O       111 6F  o
 16 10  DLE        48 30  0       80 50  P       112 70  p
 17 11  DC1        49 31  1       81 51  Q       113 71  q
 18 12  DC2        50 32  2       82 52  R       114 72  r
 19 13  DC3        51 33  3       83 53  S       115 73  s
 20 14  DC4        52 34  4       84 54  T       116 74  t
 21 15  NAK        53 35  5       85 55  U       117 75  u
 22 16  SYN        54 36  6       86 56  V       118 76  v
 23 17  ETB        55 37  7       87 57  W       119 77  w
 24 18  CAN        56 38  8       88 58  X       120 78  x
 25 19  EM         57 39  9       89 59  Y       121 79  y
 26 1A  SUB        58 3A  :       90 5A  Z       122 7A  z
 27 1B  ESC        59 3B  ;       91 5B  [       123 7B  {
 28 1C  FS         60 3C  <       92 5C  \ '\\'  124 7C  |
 29 1D  GS         61 3D  =       93 5D  ]       125 7D  }
 30 1E  RS         62 3E  >       94 5E  ^       126 7E  ~
 31 1F  US         63 3F  ?       95 5F  _       127 7F  DEL
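The effect of the ASCII order on sorting, and of the -f option, can be illustrated with a short Python sketch. The word list is made up. Plain sorted() compares character codes, so all capital letters come before all small letters; folding the case with str.upper is meant to mirror what -f does, but the real sort also depends on the locale settings, so treat this only as an approximation.

```python
words = ["banana", "Apple", "apple", "Cherry", "_tmp", "42"]

# plain ASCII order: digits, then upper case, then "_", then lower case
ascii_order = sorted(words)
print(ascii_order)

# case folded before comparing, roughly what "sort -f" does
folded_order = sorted(words, key=str.upper)
print(folded_order)
```

Notice where "_tmp" lands: the underscore (code 95) sits between the upper-case and the lower-case letters in the table, so it sorts after "Cherry" but before "apple" in the plain ASCII order.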
If you wondered about the control characters, here is the meaning of some of them on the console (Source: man console_codes). Each line below gives the code mnemonic, its ASCII decimal number, the key combination to produce the code on the console, and a short description:
BEL (7, <Ctrl>G) bell (=alarm, beep).
BS (8, <Ctrl>H) backspaces one column (but not past the beginning of the line).
HT (9, <Ctrl>I) horizontal tab, goes to the next tab stop or to the end of the line if there is no earlier tab stop.
LF (10, <Ctrl>J), VT (11, <Ctrl>K) and FF (12, <Ctrl>L) all three give a linefeed.
CR (13, <Ctrl>M) gives a carriage return.
SO (14, <Ctrl>N) activates the G1 character set, and if LF/NL (new line mode) is set also a carriage return.
SI (15, <Ctrl>O) activates the G0 character set.
CAN (24, <Ctrl>X) and SUB (26, <Ctrl>Z) interrupt escape sequences.
ESC (27, <Ctrl>[) starts an escape sequence.
DEL (127) is ignored.
CSI (155) control sequence introducer.
uniq
(=unique) Eliminate duplicate lines in sorted input. Example: sort myfile | uniq
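Why the input should be sorted first becomes clear from a small Python sketch: uniq only compares each line with the one just before it. The helper name uniq and the sample lines are my own; itertools.groupby groups adjacent equal items, which is the same idea.

```python
from itertools import groupby

def uniq(lines):
    """Collapse each run of adjacent identical lines into one line."""
    return [line for line, _run in groupby(lines)]

lines = ["b", "a", "b", "a", "a"]

print(uniq(lines))          # only the adjacent duplicates are removed
print(uniq(sorted(lines)))  # like: sort myfile | uniq
```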
fold -w 30 -s my_file.txt > new_file.txt
Wrap the lines in the text file my_file.txt so that there are at most 30 characters per line. Break the lines on spaces. Output goes to new_file.txt.
fmt -w 75 my_file.txt > new_file.txt
Format the lines in the text file to the width of 75 characters. Break
long lines and join short lines as required, but don't remove empty
lines.
nl myfile > myfile_lines_numbered
Number the lines in the file myfile. Put the output to the file myfile_lines_numbered.
indent -kr -i8 -ts8 -sob -l80 -ss -bs -psl "$@" *.c
Change the appearance of "C" source code by inserting or deleting white
space. The formatting options in the above example conform to the style
used in the Linux kernel source code (script /usr/src/linux/scripts/Lindent). See man indent for the description of the meaning of the options. The existing files are backed up and then replaced with the formatted ones.
rev filename > filename1
Print the file filename, each line in reversed order. In the example above, the output is directed to the file filename1.
shred filename
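Repeatedly overwrite the contents of the file filename with garbage, so that its original contents are very hard to recover. (See man shred for the cases, such as journaling filesystems, where overwriting in place is not guaranteed to work.)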
paste file1 file2 > file3
Merge two or more text files on lines using <Tab> as delimiter (use the option -d to specify your own delimiter(s)).
Example. If the content of file1 was:
1
2
3
and file2 was:
a
b
c
d
the resulting file3 would be:
1 a
2 b
3 c
d
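The same line-by-line merge can be sketched in Python, which also shows what happens when the files have different lengths. The function name paste and its delimiter parameter are my own choices for this illustration; zip_longest pads the shorter file with empty strings, which is why the last output line starts with the delimiter.

```python
from itertools import zip_longest

def paste(*files, delimiter="\t"):
    """Merge corresponding lines of several files (given as lists of lines)."""
    return [delimiter.join(row) for row in zip_longest(*files, fillvalue="")]

file1 = ["1", "2", "3"]
file2 = ["a", "b", "c", "d"]

file3 = paste(file1, file2)
for line in file3:
    print(line)
```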
join file1 file2 > file3
Join lines of two files on a common field. join parallels the database operation "join tables", but works on text tables. The default is to join on the first field of each table, and the default delimiter is white space. Both files are expected to be sorted on the join field. (To adjust the defaults, I use options which I find using man join.)
Example. if the content of file1 was:
1 Barbara
2 Peter
3 Stan
4 Marie
and file2 was:
2 Dog
4 Car
7 Cat
the resulting file3 would be:
2 Peter Dog
4 Marie Car
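For those who think more easily in code than in database terms, here is a Python sketch of the same join on the first field. The function name is hypothetical, and the sketch assumes that every key appears at most once in each table; the real join command handles repeated keys and has many more options.

```python
def join_on_first_field(left, right):
    """Join two text tables (lists of lines) on their first field."""
    right_by_key = {}
    for line in right:
        key, *rest = line.split()
        right_by_key[key] = rest

    joined = []
    for line in left:
        key, *rest = line.split()
        # only the keys present in both tables make it to the output
        if key in right_by_key:
            joined.append(" ".join([key, *rest, *right_by_key[key]]))
    return joined

file1 = ["1 Barbara", "2 Peter", "3 Stan", "4 Marie"]
file2 = ["2 Dog", "4 Car", "7 Cat"]

file3 = join_on_first_field(file1, file2)
for line in file3:
    print(line)
```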
des -e plain_file encrypted_file
(="Data Encryption Standard") Encrypt plain_file. You will be asked for a key that the program will use for encryption. Output goes to encrypted_file. To decrypt use:
des -d encrypted_file decrypted_file
gpg
"Gnu Privacy Guard"--a free equivalent of PGP ("Pretty Good Privacy"). gpg is more secure than PGP and does not use any patented algorithms. gpg
is mostly used for signing your e-mail messages and checking signatures
of others. You can also use it to encrypt/decrypt messages. http://www.gnupg.org/ contains all the details, including a legible, detailed manual.
To start, I needed a pair of keys: private and public. The private key is used for signing my messages. The public key I give away so that others can use it to verify my signatures. [One can also use a public key to encrypt a message so it can only be read using my private key.] I generated my keypair using this command:
gpg --gen-key
My keys are stored in the directory ~/.gnupg (encrypted using a passphrase I supplied during the key generation). To export my public key to a plain-text file, I use:
gpg --armor --export my_email_address > public_key_stan.gpg
which created a file public_key_stan.gpg containing something like this:
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: For info see http://www.gnupg.org

mQGiBDmnzEYRBACoN438rxANaMfCy5bfj6KWM0/TR6x6HZ0gpmhGeuouM/SOR2IU
/G30NdCuzHeFs93BhtY0IdzoEMtMyZHnvdhZC2bx/jhgaaMbEaSsXwRhVB0xVPYx
rHbsgSULHYzRFF34MS3/Lse3QWfWxzA7I0lbXB7nLwZKZqaNONRFRR42owCg60hV
TDPEB2N0llMyt12R4ZByFSsEAJ1tE7pb9b6TP7cw21vkIjc+BI2uSzn/B4vNlCWK
TTuZHVv0w0jFcbd8DB0/1tlZUOrIzLSqJyQDGiNn258+7LetQ+LKG/1YKbiAcosz
4QirBuLIeF2M9GuXYCwZypE3Dwv+4YupvybR31CgLTJ8p4sKqC5n0eSr2oSrtdHZ
yuJtA/9v2HcebOncfCNOK+cVRmcTB1Frl/Gh/vNCfeZyXaJxlqDfCU2vJHtBemiE
AtcfZHB/iHy0DM68LfRJSAIFAa5um9iWHh5/vWCGZLqtpwZ7kyMw+2D6CFkWATsy
wQA1g1VcGkNc14Crrd36qf60bI+b8pn2zDhwZtLsELsXyXkNhbQmU3RhbiBKIEts
aW1hcyA8U3RhbktsaW1hc0B3ZWJoYXJ0Lm5ldD6IVgQTEQIAFgUCOafMRgQLCgQD
AxUDAgMWAgECF4AACgkQt+ZBooH8bHd2kwCghAt9aKIk0mRJv+g7YcRPotVtrwkA
n1a4xEVEyaKgKoMaJnopf69K9+vouQENBDmnzH4QBADgFpLP+tWZPnVYg47cn+9b
XQRjdOtNsDE6BYH872/sR1oCrdH6k+gXFOiZxRZ3PElK2/olo59kh5xa9aBxNdEC
FuXJN0UelmhOFbDtqVksIqVWyYfXnLz+wtcXg0Q0L0q8vY4IuTzw2WkV6EkM+/x8
6UhA2XVaMJKBdRKFSVilbwADBQP+JCzLj5HDgpRvf+KM72nzSg7sp8Tki7nF9wNA
PODK0SeQgI3dwXYyF6AVenlETE/3xRWoYQN1bxVZsOex9vzqPrQC3dR0NBljd74r
kfXwUTl2fNQX4N9iuVCo2gCGbi5+gfEk1GhsWDsq0z40f+18k+XBdWmY8sCNiolT
tnvm1QeIRgQYEQIABgUCOafMfgAKCRC35kGigfxsd9SGAJ9/FWSkEfgbE/Yc46d8
Ef1gYg3I1ACff3oLeAMeGGO79gW6UGp9RJ6mRao=
=X1k2
-----END PGP PUBLIC KEY BLOCK-----
Now, I can e-mail my public key to the people with whom I want to communicate securely. They can store it on their pgp system using:
gpg --import public_key_stan.gpg
Even better, I can submit my public key to a public key server. To find a server near me, I used:
host -l pgp.net | grep wwwkeys
and to submit the key, I did (this can take a couple of minutes, and I must be connected to the Internet):
gpg --keyserver wwwkeys.pgp.net --send-keys linux_nag@canada.com
The "wwwkeys.pgp.net" is the key server I selected, and "linux_nag@canada.com" is my email address that identifies me on my local key ring. I need to submit myself only to one public key server (they all synchronize).
Now, I can start using gpg. To manually sign a plain text file my_message, I could use:
gpg --clearsign my_message
This created the file my_message.asc which may contain something like:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello World!
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.1 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE5p9+3t+ZBooH8bHcRApn/AJ9kx9+pU3GJBuvJN9Bo3bW3ku/5PwCgquht
mfrPrt7PQtdmGox72jkY0lo=
=rtK0
-----END PGP SIGNATURE-----
To verify a signed message, I could do:
gpg --verify my_message.asc
If the contents of the signed section in my_message.asc was even slightly modified, the signature will not check.
Manual signing can be awkward. But, for example, kmail can sign my outgoing messages automatically for me.
"docbook" tools
Docbook is an emerging standard for "document depository". The docbook tools are included with RH6.2 (and later) in the package "jade" and include the following converters: db2ps, db2pdf, db2dvi, db2html, db2rtf, which convert docbook files to: PostScript (*.ps), Adobe Portable Document Format (*.pdf), device independent file format (*.dvi), HyperText Markup Language (*.html), and Rich Text Format (*.rtf), respectively.
"Document depository" means the document is in a format that can be automatically translated into other useful formats. For example, consider a document (under development) which may, in the future, need to be published as a report, a journal paper, a newspaper article, a webpage, perhaps a book; I (the author) am still uncertain. Formatting the document using "hard codes" (fonts, font sizes, page breaks, line centering, etc.) is rather a waste of time--styles vary very much between the particular document types and are publisher-dependent. The solution is to format the document using "logical" layout elements which may include the document title, chapter titles, subchapters, emphasis style, picture filenames, caption text, tables, etc. That's what "docbook" does: it describes the logical structure of the document using SGML or XML (XML is a simplified subset of SGML, and HTML is a close relative). The logical layout is rendered to a physical appearance, using a so-called stylesheet, when the document is being published.
This section will be expanded in the future as we learn to use docbook.
7.2 Simple Programming under Linux
perl
Powerful and widely used scripting language, very popular among gurus. Perl looks cryptic yet it is quite straightforward if you need to achieve simple tasks. Think of perl as a swiss-army knife for simple programming. Perl's syntax parallels that of the "C" language. An excellent implementation of the perl interpreter is available for MS Windows, so your code can be cross-platform. Here is how Eric Raymond (famous Linux guru) describes perl: "Perl, of course, is the 800-pound gorilla of modern scripting languages. It has largely replaced shell as the scripting language of choice for system administrators, thanks partly to its comprehensive set of UNIX library and system calls, and partly to the huge collection of Perl modules built by a very active Perl community. The language is commonly estimated to be the CGI language behind about 85% of the ``live'' content on the Net. Larry Wall, its creator, is rightly considered one of the most important leaders in the Open Source community, and often ranks third behind Linus Torvalds and Richard Stallman in the current pantheon of hacker demigods."
How do I write a simple perl script?
I may use pico (or any other text editor of my choice) to type in a simple perl script:
pico try_perl
The example script below does nothing useful, except illustrates some features of perl:
#!/usr/bin/perl -w
# a stupid example perl program
# the lines starting with # are comments except for the first line
# names of scalar variables start with $
$a=2;
$b=3;
# each instruction ends with a semicolon, like in "c"
print $a**$b,"\n";
$hello_world='Hello World';
print $hello_world,"\n";
system "ls";
The first line tells the shell how to execute my text file. The option "-w" causes perl to print some additional warnings, etc. that may be useful for debugging your script. The next 3 lines (starting with #) are comments. The following lines are almost self explanatory: I assign some values to two variables ($a and $b), put $a to power $b and print the result. The "\n" prints a new line, just like in the "c" programming language. Then I define another variable to contain the string "Hello World" and, in the next line, I print it to the screen. Finally, I execute the local operating system command "ls", which on Linux prints the listing of the current directory content. Really stupid script.
After saving the file, I make it executable:
chmod a+x try_perl
Now, I can run the script by typing:
./try_perl
Here is a somewhat longer script that does something very simple yet useful to me. I take a long text file which is generated by a data acquisition system. I need to erase every other line (or so) so that the file can be crammed into MS Excel (as required):
#!/usr/bin/perl -w
# Create a text file containing a selection of lines from an original file. This is needed
# so that data for manual postprocessing are fewer.
#
# Prompt the user for the filename, and the selection of lines to preserve in the output.
print STDOUT "Enter the filename: ";
chomp($infile=<STDIN>);
open(INFILE,"<$infile") or die "Cannot open $infile: $!"; # open the file for reading.
print STDOUT "Enter the number of initial lines to preserve: ";
chomp($iskip=<STDIN>); # the first lines may contain column headings etc
print STDOUT "Enter the skip: ";
chomp($skip=<STDIN>);
#
# The name of the output file is created automatically on the basis of the
# input file and the selection of lines. It is always of type CSV, so keep it so.
$outfile=$infile.'-pro'.$iskip.'-'.$skip.'.csv'; # glue strings together using the dot operator.
open(OUTFILE,">$outfile") or die "Cannot open $outfile: $!"; # open file for writing.
#
# write the "initial" lines to the output file.
for($a=0;$a<$iskip;$a++) {
    $line=<INFILE>;
    print OUTFILE $line;
}
#
# do the rest of the file: after each line written, skip the next $skip lines
$c=0;$w=0;$skip++;
while($line=<INFILE>){
    $c++;
    if(!($c%$skip)) { # use % for remainder of integer division
        print OUTFILE $line;
        $w++;
    }
}
#
close(INFILE);
close(OUTFILE);
print STDOUT "Read Lines: ", $c+$iskip," Wrote lines: ", $w+$iskip,"\n";
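The line selection is the only tricky part of the perl script, so here it is once more as a small Python function that can be tried without any files. The function name thin_lines and the sample data are hypothetical; the logic is meant to follow the perl loop: keep the initial lines, then count the remaining lines and keep those whose count is divisible by skip+1.

```python
def thin_lines(lines, keep_initial, skip):
    """Keep the first keep_initial lines, then one line out of every skip+1."""
    head = lines[:keep_initial]
    rest = lines[keep_initial:]
    step = skip + 1  # the perl script does $skip++ before the loop
    kept = [line for count, line in enumerate(rest, start=1) if count % step == 0]
    return head + kept

data = ["heading 1", "heading 2", "a", "b", "c", "d", "e"]

# preserve two heading lines, then skip one line between the lines written
print(thin_lines(data, keep_initial=2, skip=1))
```

With skip set to 1, the lines "b" and "d" survive: the first line after the headings is dropped and the second one is written, exactly as in the perl loop.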
python
Modern and very elegant object-oriented interpreter. Powerful and (arguably) more legible than perl. Very good (and large) free handbooks by G. van Rossum (the Python creator) are available on the net (try: http://www.python.org/doc/ for browsing or ftp://ftp.python.org for downloading).
How do I write a simple Python program?
Edit a text file that will contain your Python program. I can use the kde "kate" editor to do it (under X):
kate try_python.py &
Type in some simple python code to see if it works:
#!/usr/bin/env python
print(2+2)
The first line (starting with the "pound-bang") tells the shell how to execute this text file--it must be there (always as the first line) for Linux to know that this particular text file is a Python script. The second line is a simple Python expression.
After saving the file, I make it executable:
chmod a+x try_python.py
after which I can run it by typing:
./try_python.py
Python is an excellent and very modern programming language. Give it a try, particularly if you like object-oriented programming. There are numerous libraries/extensions available on the Internet. For example, scientific python (http://starship.python.net/crew/hinsen/scientific.html) and numeric python (http://sourceforge.net/projects/numpy) are popular libraries used in engineering.
Here is a slightly longer, but still (hopefully) self-explanatory python code. A quick note: python flow control depends on the code indentation--it makes it very natural looking and forcing legibility, but takes an hour to get used to.
#!/usr/bin/env python
# All comments start with the character "#"
# This program converts human years to dog years
# get the original age (as a whole number)
age = int(input("Enter your age (in human years): "))
print("") # print a blank line
# check if the age is valid using a simple if statement
if age < 0:
    print("A negative age is not possible.")
elif age < 3 or age > 110:
    print("Frankly, I don't believe you.")
else:
    print("That's the same as a %d year old dog." % (age // 7))
tcl
A simple tcl script:
#!/usr/bin/tclsh
puts stdout {Hello World!}
wish
(type in an X terminal) A front-end to Tk, an X-windows extension of tcl. Often used for building front-ends of a program.
How do I write a simple GUI program (using Tk)?
Tk is a GUI extension of the easy yet powerful tcl programming language. For example, I may use pico to create a text file that will contain a simple tk program:
pico try_tk
and type in a simple example of tk code to see if it works:
#!/usr/bin/wish
button .my_button -text "Hello World" -command exit
pack .my_button
The first line (starting with the "#!" pound-bang) tells the shell what utility to use to execute my text file. The next two lines are an example of a simple tk program. First, I created a button called "my_button" and placed it at the root of my widget hierarchy (the dot in front of "my_button"). To the button, I tied the text "Hello World" and a command that exits the program (when the button is pressed). The last line makes my program's window adjust its size to just big enough to contain my button.
After saving the file, I make it executable:
chmod a+x try_tk
after which I can run it by typing (in the X-terminal, because it requires X-windows to run):
./try_tk
Tk is very popular for building GUI front ends.
ruby
A purely object-oriented scripting language. This language is a relative newcomer, but it is rapidly gaining popularity, and may well be the flavour of the future of programming.
To write a simple program in ruby, I open my favorite text editor and start a program with the following first line:
#!/usr/bin/ruby
Here is an example of a program that I wrote to help me understand the basics of the ruby language:
#!/usr/bin/ruby
#This is a comment
a = Array.new
print "Please enter a few words (type EXIT to stop):\n"
i = 0
while enterWord = STDIN.gets
  enterWord.chomp!   # remove the trailing newline
  if enterWord == "EXIT"
    break
  end
  a[i] = enterWord
  i += 1
end
#sort the array
for i in 0...a.length-1 do
  for j in i+1...a.length do
    if a[j] < a[i]
      tmp = a[i]
      a[i] = a[j]
      a[j] = tmp
    end
  end
end
#Output the results
print "You entered " + a.length.to_s + " entries.\n\n"
for i in 0...a.length do
  print "Entry " + (i+1).to_s + ": "+ a[i] + "\n"
end
I save my ruby script to the file "myprogram" and make it executable (chmod a+x myprogram). To execute it, I type on the command line:
./myprogram













