Macro Magic: M4 Complete Guide

By Jerry Peek
May 9, 2019

A macro processor scans input text for defined symbols -- the macros -- and replaces that text with other text, or possibly with other symbols. For instance, a macro processor can convert one language into another.

If you're a C programmer, you know cpp, the C preprocessor, a simple macro processor. m4 is a powerful macro processor that's been part of Unix for some 30 years, but it's almost unknown -- except for special purposes, such as generating the sendmail.cf file. It's worth knowing because you can do things with m4 that are hard to do any other way.

The GNU version of m4 has some extensions to the original V7 version. (You'll see some of them.) As of this writing, the latest GNU version was 1.4.2, released in August 2004. Version 2.0 is under development.

You won't become an m4 wizard in three pages (or in six, as the discussion of m4 continues next month), but you can master the basics. So, let's dig in.

Simple Macro Processing

A simple way to do macro substitution is with tools like sed and cpp. For instance, the command sed 's/XPRESIDENTX/President Bush/' reads lines of text, changing the first occurrence of XPRESIDENTX on each line to President Bush. (Add the g flag to change every occurrence.) sed can also test and branch, for some rudimentary decision-making.

As another example, here's a cpp macro from a C program. It's named ABSDIFF() and accepts two arguments, a and b.

    #define ABSDIFF(a, b) ((a)>(b) ? (a)-(b) : (b)-(a))

Given that definition, cpp will replace the code...

    diff = ABSDIFF(v1, v2);

... with

    diff = ((v1)>(v2) ?
        (v1)-(v2) : (v2)-(v1));

v1 replaces a everywhere, and v2 replaces b. ABSDIFF() saves typing -- and reduces the chance for error.

Introducing m4

Unlike sed and other languages, m4 is designed specifically for macro processing. m4 manipulates files, performs arithmetic, has functions for handling strings, and can do much more.

m4 copies its input (from files or standard input) to standard output. It checks each token (a name, a quoted string, or any single character that's not a part of either a name or a string) to see if it's the name of a macro. If so, the token is replaced by the macro's value, and then that text is pushed back onto the input to be rescanned. (If you're new to m4, this repeated scanning may surprise you, but it's one key to m4's power.) Quoting text, like `text', prevents expansion. (See the section on "Quoting.")

m4 comes with a number of predefined macros, or you can write your own macros by calling the define() function. A macro can have multiple arguments -- up to 9 in original m4, and an unlimited number in GNU m4. Macro arguments are substituted before the resulting text is rescanned.

Here's a simple example (saved in a file named foo.m4):

    one
    define(`one', `ONE')dnl
    one
    define(`ONE', `two')dnl
    one ONE oneONE
    `one'

The file defines two macros named one and ONE. It also has four lines of text. If you feed the file to m4 using m4 foo.m4, m4 produces:

    one
    ONE
    two two oneONE
    one

Here's what's happening:

* Line 1 of the input, which is simply the characters one and a newline, doesn't match any macro (so far), so it's copied to the output as-is.
* Line 2 defines a macro named one(). (The opening parenthesis before the arguments must come just after define with no whitespace between.) From this point on, any input token one will be replaced with ONE. (The dnl is explained below.)
* Line 3, which is again the characters one and a newline, is affected by the just-defined macro one(). So, the text one is converted to the text ONE and a newline.
* Line 4 defines a new macro named ONE(). Macro names are case-sensitive.
* Line 5 has three space-separated tokens. The first two are one and ONE. The first is converted to ONE by the macro named one(), then both are converted to two by the macro named ONE(). Rescanning doesn't find any additional matches (there's no macro named two()), so the first two words are output as two two. The rest of line 5 (a space, oneONE, and a newline) doesn't match a macro, so it's output as-is. In other words, a macro name is only recognized when it's surrounded by non-alphanumerics.
* Line 6 contains the text one inside a pair of quotes, then a newline. (As you've seen, the opening quote is a backquote or grave accent; the closing quote is a single quote or acute accent.) Quoted text doesn't match any macros, so it's output as-is: one. Next comes the final newline.

Input text is copied to the output as-is, and that includes newlines. The built-in dnl function, which stands for "delete to new line," reads and discards all characters up to and including the next newline. (One of its uses is to put comments into an m4 file.) Without dnl, the newline after each of our calls to define would be output as-is. We could demonstrate that by editing foo.m4 to remove the two dnls. But, to stretch things a bit, let's use sed to remove those two calls from the file and pipe the result to m4:

    $ sed 's/dnl//' foo.m4 | m4
    one

    ONE

    two two oneONE
    one

If you compare this example to the previous one, you'll see that there are two extra newlines at the places where dnl used to be.

Let's summarize. You've seen that input is read from the first character to the last. Macros affect input text only after they're defined. Input tokens are compared to macro names and, if they match, replaced by the macro's value. Any input modified by a macro is pushed back onto the input and is rescanned for possible modification. Other text (that isn't modified by a macro) is passed to the output as-is.
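
The scan-and-rescan loop summarized above can be modeled in a few lines of Python. This is only a toy sketch, not how m4 itself is implemented: the function name expand and the macros dictionary are made up for illustration, the definitions are supplied up front rather than read from define() calls, and arguments, quoting, and dnl are left out entirely.

```python
import re

# A token is either a name (letters, digits, underscores, not starting
# with a digit) or any other single character, newlines included.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|.", re.DOTALL)


def expand(text, macros):
    """Expand names found in `macros`, rescanning every replacement.

    Hypothetical helper: `macros` maps a macro name to its replacement text.
    A macro that expands to its own name would loop forever here.
    """
    pending = TOKEN.findall(text)[::-1]  # stack with the next token on top
    output = []
    while pending:
        token = pending.pop()
        if token in macros:
            # Push the replacement back onto the input so it is rescanned.
            pending.extend(TOKEN.findall(macros[token])[::-1])
        else:
            # Anything that is not a macro name is copied through as-is.
            output.append(token)
    return "".join(output)


macros = {"one": "ONE", "ONE": "two"}
print(expand("one ONE oneONE", macros))  # prints: two two oneONE
```

Fed line 5 of foo.m4, the sketch gives the same result as m4 did: one becomes ONE, the rescan turns ONE into two, and oneONE passes through because the whole name matches no macro.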
Quoting

Any text surrounded by `' (a grave accent and an acute accent) isn't expanded immediately. Whenever m4 evaluates something, it strips off one level of quotes. When you define a macro, you'll often want to quote the arguments -- but not always. Listing One has a demo. It uses m4 interactively, typing text to its standard input.

Listing One: Quoting demonstration

    $ m4
    define(A, 100)dnl
    define(B, A)dnl
    define(C, `A')dnl
    dumpdef(`A', `B', `C')dnl
    A: 100
    B: 100
    C: A
    dumpdef(A, B, C)dnl
    stdin:5: m4: Undefined name 100
    stdin:5: m4: Undefined name 100
    stdin:5: m4: Undefined name 100
    A B C
    100 100 100
    CTRL-D
    $

The listing starts by defining three macros A, B, and C. A has the value 100. So does B: because its argument A isn't quoted, m4 replaces A with 100 before assigning that value to B. While defining C, though, quoting the argument means that its value becomes a literal A.

You can see the values of macros by calling the built-in function dumpdef with the names of the macros. As expected, A and B have the value 100, but C has A.

In the second call to dumpdef, the names are not quoted, so each name is expanded to 100 before dumpdef sees it. That explains the error messages, because there's no macro named 100. In the same way, if we simply enter the macro names, the three tokens are scanned repeatedly, and they all end up as 100.

You can change the quoting characters at any time by calling changequote. For instance, in text containing lots of quote marks, you could call changequote({,})dnl to change the quoting characters to curly braces. To restore the defaults, simply call changequote with no arguments.

In general, for safety, it's a good idea to quote all input text that isn't a macro call. This avoids m4 interpreting a literal word as a call to a macro. Another way to avoid this problem is by using the GNU m4 option --prefix-builtins or -P. It changes all built-in macro names to be prefixed by m4_. (The option doesn't affect user-defined macros.)
So, under this option, you'd write m4_dnl and m4_define instead of dnl and define, respectively.

Keep quoting and rescanning in mind as you use m4. Not to be tedious, but remember that m4 does rescan its input. For some in-depth tips, see "Web Paging: Tips and Hints on m4 Quoting" by R.K. Owen, Ph.D., at http://owen.sj.ca.us/rkowen/howto/webpaging/m4tipsquote.html.

Decisions and Math

m4 can do arithmetic with its built-in functions eval, incr, and decr. m4 doesn't support loops directly, but you can combine recursion and the decision macro ifelse to write loops.

Let's start with an example adapted from the file /usr/share/doc/m4/examples/debug.m4 (on a Debian system). It defines the macro countdown(). Evaluating the macro with an argument of 5 -- as in countdown(5) -- outputs the text 5, 4, 3, 2, 1, 0, Liftoff!.

    $ cat countdown.m4
    define(`countdown', `$1, ifelse(eval($1 > 0), 1,
        `countdown(decr($1))', `Liftoff!')')dnl
    countdown(5)
    $ m4 countdown.m4
    5, 4, 3, 2, 1, 0, Liftoff!

The countdown() macro takes a single argument. Its definition (the second argument to define) is broken across two lines. That's fine in m4 because macro arguments are delimited by parentheses, which don't have to be on the same line. Here's that definition without its surrounding quotes:

    $1, ifelse(eval($1 > 0), 1, `countdown(decr($1))', `Liftoff!')

$1 expands to the macro's first argument. When m4 evaluates the countdown macro with an argument of 5, the result is:

    5, ifelse(eval(5 > 0), 1, `countdown(decr(5))', `Liftoff!')

The leading "5, " is plain text that's output as-is as the first number in the countdown. The rest of the definition is a call to ifelse. ifelse compares its first two arguments. If they're equal, the third argument is evaluated; otherwise, the (optional) fourth argument is evaluated.

Here, the first argument to ifelse, eval(5 > 0), evaluates to 1 (logical "true") because 5 is greater than 0. So the first two arguments are equal, and m4 evaluates countdown(decr(5)).
This starts the recursion by calling countdown(4). Once we reach the base condition of countdown(0), the test eval(0 > 0) evaluates to 0, which doesn't equal 1, so the ifelse call evaluates `Liftoff!'. (If recursion is new to you, you can read about it in books on computer science and programming techniques.)

Note that, with more than four arguments, ifelse can work like a case or switch in other languages. For instance, in ifelse(a,b,c,d,e,f,g), if a matches b, then c; else if d matches e, then f; else g.

The m4 info file shows more looping and decision techniques, including a macro named forloop() that implements a nestable for-loop.

This section showed some basic math operations. (The info file shows more.) You've seen that you can quote a single macro argument that contains a completely separate string (in this case, a string that prints a number, then runs ifelse to do some more work). This one-line example (broken onto two lines here) is a good hint of m4's power. It's a minimalist language, for sure, and you'd be right to complain about its tricky evaluation in a global environment, leaving lots of room for trouble if you aren't careful. But you might find this expressive little language to be challenging enough that it's addictive.

Building Web Pages

Let's wrap up this m4 introduction with a typical use: feeding an input file to a set of macros to generate an output file. Here, the macro file html.m4 defines three macros: _startpage(), _ul(), and _endpage(). (The names start with underscore characters to help prevent false matches with non-macro text. For instance, _ul() won't match the HTML tag <ul>.)
Last change: Fri Jan 14 15:32:06 MST 2005
In Listing Four, both _startpage() and _endpage() are straightforward. The esyscmd macro is one of the many m4 macros we haven't covered -- it runs a Linux command line, then uses the command's output as input to m4. The _ul() macro outputs opening and closing HTML <ul> tags.

    Last change: esyscmd(date)