nawk(1) man page - 윈디하나의 솔라나라

Name
     nawk - pattern scanning and processing language

Synopsis
     /usr/bin/nawk [-F ERE] [-v assignment] 'program' | -f progfile...
          [argument]...


     /usr/xpg4/bin/awk [-F ERE] [-v assignment]... 'program' | -f progfile...
          [argument]...

Description
     The /usr/bin/nawk and  /usr/xpg4/bin/awk  utilities  execute
     programs  written in the nawk programming language, which is
     specialized for textual data manipulation. A nawk program is
     a sequence of patterns and corresponding actions. The string
     specifying program must be enclosed in single quotes (')  to
     protect it from interpretation by the shell. The sequence of
     pattern-action statements can be specified on the command
     line as program or in one or more files specified by the
     -f progfile option. When input is read that matches a
     pattern, the action associated with the pattern is performed.


     Input is interpreted as a sequence of records. By default, a
     record  is  a  line, but this can be changed by using the RS
     built-in variable. Each record of input is matched  to  each
     pattern  in the program. For each pattern matched, the asso-
     ciated action is executed.


     The nawk utility interprets each input record as a  sequence
     of  fields  where,  by  default, a field is a string of non-
     blank characters. This default white-space  field  delimiter
     (blanks and/or tabs) can be changed by using the FS built-in
     variable or the -F ERE option. The nawk utility denotes the
     first field in a record $1, the second $2, and so forth. The
     symbol $0 refers to the entire  record;  setting  any  other
     field  causes the reevaluation of $0. Assigning to $0 resets
     the values of all fields and the NF built-in variable.
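
     The field handling described above can be sketched as follows.
     This is an illustration only: the input lines are made up, and
     awk is assumed to be a POSIX-compatible awk on the PATH (on the
     systems this page describes, nawk or /usr/xpg4/bin/awk).

```shell
# Hypothetical sample record; not taken from this manual page.
line='alice 30 admin'

# Default field splitting on blanks: $2 is the second field.
second=$(printf '%s\n' "$line" | awk '{ print $2 }')

# -F changes the field separator to ":".
swapped=$(printf 'alice:30\n' | awk -F: '{ print $2, $1 }')

# Assigning to $0 re-splits the record and resets NF.
nf=$(printf '%s\n' "$line" | awk '{ $0 = "x y"; print NF }')

printf '%s\n' "$second" "$swapped" "$nf"
```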

Options
     The following options are supported:

     -F ERE
                      Define the input field separator to be  the
                      extended regular expression ERE, before any
                      input is read (can be a character).


     -f progfile
                      Specifies the pathname of the file progfile
                      containing  a  nawk  program.  If  multiple
                      instances of this option are specified, the
                      concatenation  of  the  files  specified as
                      progfile in the order specified is the nawk
                      program. The nawk program can alternatively
                      be specified in the command line as a  sin-
                      gle argument.


     -v assignment
                      The assignment argument must be in the same
                      form  as an assignment operand. The assign-
                      ment is of the form var=value, where var is
                      the  name of one of the variables described
                      below.  The  specified  assignment   occurs
                      before  executing the nawk program, includ-
                      ing the actions associated with BEGIN  pat-
                      terns  (if  any).  Multiple  occurrences of
                      this option can be specified.
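
     A minimal sketch of the -v option, assuming a POSIX-compatible
     awk on the PATH. The variable names n, who, and punct are
     hypothetical. Because the assignment happens before the program
     starts, the values are already visible in BEGIN actions.

```shell
# One -v assignment, used inside a BEGIN action.
doubled=$(awk -v n=5 'BEGIN { print n * 2 }')

# Multiple -v options can be given.
greeting=$(awk -v who=world -v punct='!' 'BEGIN { print "hello " who punct }')

printf '%s\n' "$doubled" "$greeting"
```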

Operands
     The following operands are supported:

     program
                 If no -f option is specified, the first  operand
                 to  nawk  is  the  text of the nawk program. The
                 application supplies the program  operand  as  a
                 single  argument  to  nawk. If the text does not
                 end in a newline character, nawk interprets  the
                 text as if it did.


     argument
                 Either of the following two  types  of  argument
                 can be intermixed:

                 file
                               A pathname of a file that contains
                               the  input  to  be  read, which is
                               matched against the  set  of  pat-
                               terns  in  the program. If no file
                               operands are specified,  or  if  a
                               file  operand  is  -, the standard
                               input is used.


                 assignment
                               An operand  that  begins  with  an
                               underscore or alphabetic character
                               from the portable  character  set,
                               followed  by  a sequence of under-
                               scores,  digits  and   alphabetics
                               from  the  portable character set,
                               followed by the = character speci-
                               fies  a variable assignment rather
                               than a  pathname.  The  characters
                               before the = represent the name of
                               a nawk variable. If that name is a
                               nawk  reserved  word, the behavior
                               is undefined. The characters following
                               the equal sign are interpreted as if
                               they appeared in the nawk program
                               preceded and followed by a double-quote
                               (") character, as a STRING token,
                               except that if the last character is an
                               unescaped backslash, it is interpreted
                               as a literal backslash rather than as
                               the first character of the sequence \".
                               The variable is assigned the value of
                               that STRING token. If the value is
                               considered a numeric string, the
                               variable is assigned its numeric value.
                               Each
                               such  variable  assignment is per-
                               formed just before the  processing
                               of  the  following  file,  if any.
                               Thus,  an  assignment  before  the
                               first  file  argument  is executed
                               after the BEGIN actions (if  any),
                               while an assignment after the last
                               file argument is  executed  before
                               the  END  actions  (if  any).   If
                               there  are  no   file   arguments,
                               assignments  are  executed  before
                               processing the standard input.
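
     The timing of assignment operands can be illustrated with a
     short sketch, assuming a POSIX-compatible awk on the PATH. The
     variable v is hypothetical, and - stands for standard input.

```shell
# v=1 precedes the file operand "-", so it is applied after BEGIN
# but before the first record; v=2 follows the last file operand,
# so it is applied before the END action.
out=$(printf 'x\n' | awk '
BEGIN { print "begin:" v }
      { print "main:" v }
END   { print "end:" v }
' v=1 - v=2)

printf '%s\n' "$out"
# prints:
# begin:
# main:1
# end:2
```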

Input Files
     Input files to the nawk program from any  of  the  following
     sources:

         o    any file operands or their equivalents, achieved by
              modifying the nawk variables ARGV and ARGC

         o    standard input in the absence of any file operands

         o    arguments to the getline function


     must be text files. Whether the variable  RS  is  set  to  a
     value  other  than  a  newline  character  or not, for these
     files, implementations support records terminated  with  the
     specified  separator  up to {LINE_MAX} bytes and can support
     longer records.

     If -f progfile is specified, the files named by each of  the
     progfile  option-arguments  must be text files containing an
     nawk program.


     The standard input is used only if no file operands are
     specified, or if a file operand is -.

Extended Description
     A nawk program is composed of pairs of the form:

       pattern { action }



     Either the pattern or the action  (including  the  enclosing
     brace  characters) can be omitted. Pattern-action statements
     are separated by a semicolon or by a newline.


     A missing pattern matches any record of input, and a missing
     action  is  equivalent  to an action that writes the matched
     record of input to standard output.


     Execution of the nawk program starts by first executing  the
     actions associated with all BEGIN patterns in the order they
     occur in the program. Then each file  operand  (or  standard
     input  if  no  files were specified) is processed by reading
     data from the file until a record separator is seen (a  new-
     line  character  by  default),  splitting the current record
     into fields using the current value of FS,  evaluating  each
     pattern  in the program in the order of occurrence, and exe-
     cuting the action associated with each pattern that  matches
     the  current  record.  The  action for a matching pattern is
     executed before evaluating subsequent  patterns.  Last,  the
     actions associated with all END patterns are executed in the
     order they occur in the program.
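
     The execution order described above (BEGIN actions, then one
     pass over the records, then END actions) can be sketched as
     follows. The input numbers are made up and awk is assumed to be
     a POSIX-compatible awk on the PATH.

```shell
out=$(printf '3\n4\n5\n' | awk '
BEGIN  { print "start" }      # runs before any input is read
$1 > 3 { sum += $1 }          # runs for each record that matches
END    { print "sum=" sum }   # runs after the last record
')

# A pattern with no action prints each matching record.
matched=$(printf '3\n4\n5\n' | awk '/4/')

printf '%s\n' "$out" "$matched"
```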

  Expressions in nawk
     Expressions  describe  computations  used  in  patterns  and
     actions. In the following table, valid expression operations
     are given in groups from highest precedence first to  lowest
     precedence  last,  with  equal-precedence  operators grouped
     between horizontal lines. In  expression  evaluation,  where
     the  grammar is formally ambiguous, higher precedence opera-
     tors are evaluated before lower  precedence  operators.   In
     this  table  expr,  expr1,  expr2,  and  expr3 represent any
     expression, while lvalue represents any entity that  can  be
     assigned  to  (that  is,  on  the left side of an assignment
     operator).
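     Syntax                  Name                        Type of Result    Associativity
     -----------------------------------------------------------------------------------
     ( expr )                Grouping                    type of expr      n/a
     -----------------------------------------------------------------------------------
     $expr                   Field reference             string            n/a
     -----------------------------------------------------------------------------------
     ++ lvalue               Pre-increment               numeric           n/a
     -- lvalue               Pre-decrement               numeric           n/a
     lvalue ++               Post-increment              numeric           n/a
     lvalue --               Post-decrement              numeric           n/a
     -----------------------------------------------------------------------------------
     expr ^ expr             Exponentiation              numeric           right
     -----------------------------------------------------------------------------------
     ! expr                  Logical not                 numeric           n/a
     + expr                  Unary plus                  numeric           n/a
     - expr                  Unary minus                 numeric           n/a
     -----------------------------------------------------------------------------------
     expr * expr             Multiplication              numeric           left
     expr / expr             Division                    numeric           left
     expr % expr             Modulus                     numeric           left
     -----------------------------------------------------------------------------------
     expr + expr             Addition                    numeric           left
     expr - expr             Subtraction                 numeric           left
     -----------------------------------------------------------------------------------
     expr expr               String concatenation        string            left
     -----------------------------------------------------------------------------------
     expr < expr             Less than                   numeric           none
     expr <= expr            Less than or equal to       numeric           none
     expr != expr            Not equal to                numeric           none
     expr == expr            Equal to                    numeric           none
     expr > expr             Greater than                numeric           none
     expr >= expr            Greater than or equal to    numeric           none
     -----------------------------------------------------------------------------------
     expr ~ expr             ERE match                   numeric           none
     expr !~ expr            ERE non-match               numeric           none
     -----------------------------------------------------------------------------------
     expr in array           Array membership            numeric           left
     ( index ) in array      Multi-dimension array       numeric           left
                             membership
     -----------------------------------------------------------------------------------
     expr && expr            Logical AND                 numeric           left
     -----------------------------------------------------------------------------------
     expr || expr            Logical OR                  numeric           left
     -----------------------------------------------------------------------------------
     expr1 ? expr2 : expr3   Conditional expression      type of selected  right
                                                         expr2 or expr3
     -----------------------------------------------------------------------------------
     lvalue ^= expr          Exponentiation assignment   numeric           right
     lvalue %= expr          Modulus assignment          numeric           right
     lvalue *= expr          Multiplication assignment   numeric           right
     lvalue /= expr          Division assignment         numeric           right
     lvalue += expr          Addition assignment         numeric           right
     lvalue -= expr          Subtraction assignment      numeric           right
     lvalue = expr           Assignment                  type of expr      right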
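
     A few of the precedence and associativity rules from the table
     can be checked with a small sketch, assuming a POSIX-compatible
     awk on the PATH. The expressions are illustrative only.

```shell
out=$(awk 'BEGIN {
    print 2 ^ 3 ^ 2         # right-associative: 2 ^ (3 ^ 2)
    print 2 + 3 * 4         # multiplication binds tighter than addition
    print 1 + 2 " " 3 + 4   # addition binds tighter than concatenation
}')

printf '%s\n' "$out"
# prints:
# 512
# 14
# 3 7
```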



     Each expression has either a string value, a  numeric  value
     or  both.  Except as stated for specific contexts, the value
     of an expression is implicitly converted to the type  needed
     for the context in which it is used.  A string value is con-
     verted to a numeric value by the equivalent of the following
     calls:

       setlocale(LC_NUMERIC, "");
       numeric_value = atof(string_value);



     A numeric value that is exactly equal to  the  value  of  an
     integer is converted to a string by the equivalent of a call
     to the sprintf function with the string %d as the fmt  argu-
     ment  and the numeric value being converted as the first and
     only expr argument.  Any other numeric value is converted to
     a string by the equivalent of a call to the sprintf function
     with the value of the variable CONVFMT as the  fmt  argument
     and  the numeric value being converted as the first and only
     expr argument.
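
     The two number-to-string conversion rules can be sketched as
     follows. This assumes an awk that supports CONVFMT (per this
     page, /usr/xpg4/bin/awk; most POSIX awks do). The variable
     names whole and frac are hypothetical.

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%.2f"
    whole = 17
    frac = 3.14159
    print (whole "")   # integer value: converted as if by %d
    print (frac "")    # any other value: converted using CONVFMT
}')

printf '%s\n' "$out"
# prints:
# 17
# 3.14
```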


     A string value is considered to be a numeric string  in  the
     following case:

         1.   Any leading and trailing blank characters are
              ignored.

         2.   If the first unignored character is a + or -, it is
              ignored.

         3.   If the remaining unignored characters would be lex-
              ically  recognized as a NUMBER token, the string is
              considered a numeric string.


     If a - character is ignored in the above steps, the  numeric
     value  of  the numeric string is the negation of the numeric
     value of the recognized NUMBER token. Otherwise the  numeric
     value  of  the  numeric  string  is the numeric value of the
     recognized NUMBER token.  Whether  or  not  a  string  is  a
     numeric  string is relevant only in contexts where that term
     is used in this section.


     When an expression is used in a Boolean context, if it has a
     numeric  value,  a value of zero is treated as false and any
     other value is treated as true. Otherwise, a string value of
     the  null  string is treated as false and any other value is
     treated as true. A Boolean context is one of the following:

         o    the first subexpression of  a  conditional  expres-
              sion.

         o    an expression operated on by logical  NOT,  logical
              AND, or logical OR.

         o    the second expression of a for statement.

         o    the expression of an if statement.

         o    the expression of the  while  clause  in  either  a
              while or do ... while statement.

         o    an expression used as a pattern (as in Overall Pro-
              gram Structure).
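
     The truth rules above can be sketched with the conditional
     operator, whose first subexpression is a Boolean context. This
     assumes a POSIX-compatible awk on the PATH; the variable names
     are hypothetical.

```shell
out=$(awk 'BEGIN {
    a = 0   ? "true" : "false"   # numeric zero
    b = 42  ? "true" : "false"   # non-zero number
    c = ""  ? "true" : "false"   # null string
    d = "x" ? "true" : "false"   # non-null string
    print a, b, c, d
}')

printf '%s\n' "$out"
# prints: false true false true
```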

     The nawk language supplies arrays that are used for  storing
     numbers  or  strings.  Arrays need not be declared. They are
     initially empty, and their sizes change dynamically. The
     subscripts, or element identifiers, are strings, providing a
     type of associative array capability. An array name followed
     by  a  subscript  within  square  brackets can be used as an
     lvalue and as an expression, as described  in  the  grammar.
     Unsubscripted  array  names  are  used in only the following
     contexts:

         o    a parameter in a function  definition  or  function
              call.

         o    the NAME token following any use of the keyword in.


     A valid array index consists of one or more  comma-separated
     expressions,  similar  to the way in which multi-dimensional
     arrays are indexed in some  programming  languages.  Because
     nawk  arrays  are  really  one-dimensional,  such  a  comma-
     separated list is converted  to  a  single  string  by  con-
     catenating  the  string  values of the separate expressions,
     each separated from the other by the  value  of  the  SUBSEP
     variable.


     Thus, the following two index operations are equivalent:

       var[expr1, expr2, ... exprn]
       var[expr1 SUBSEP expr2 SUBSEP ... SUBSEP exprn]



     A multi-dimensioned index used with the in operator must  be
     put  in  parentheses.  The  in operator, which tests for the
     existence of a particular array element, does not create the
     element  if  it  does  not  exist.  Any other reference to a
     non-existent array element automatically creates it.
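
     The SUBSEP equivalence and the behavior of the in operator can
     be sketched as follows, assuming a POSIX-compatible awk on the
     PATH. The array name grid and its keys are hypothetical.

```shell
out=$(awk 'BEGIN {
    grid["x", "y"] = 1
    key = "x" SUBSEP "y"                  # same element as grid["x", "y"]
    print ((key in grid) ? "found" : "missing")
    print ((("x", "y") in grid) ? "found" : "missing")
    print (("q" in grid) ? "present" : "absent")   # in does not create grid["q"]
    tmp = grid["q"]                                # a plain reference creates it
    print (("q" in grid) ? "present" : "absent")
}')

printf '%s\n' "$out"
# prints:
# found
# found
# absent
# present
```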

  Variables and Special Variables
     Variables can be used in  an  nawk  program  by  referencing
     them.  With  the  exception of function parameters, they are
     not explicitly declared. Uninitialized scalar variables  and
     array  elements  have  both  a  numeric  value of zero and a
     string value of the empty string.


     Field variables are designated by a $ followed by  a  number
     or  numerical  expression.  The  effect  of the field number
     expression evaluating to anything other than a  non-negative
     integer  is  unspecified.  Uninitialized variables or string
     values need not be  converted  to  numeric  values  in  this
     context.  New  field  variables  are  created by assigning a
     value to them. References to non-existent fields  (that  is,
     fields after $NF) produce the null string. However, assigning
     to a non-existent field (for example, $(NF+2) = 5) increases
     the value of NF, creates any intervening fields with the null
     string as their values, and causes the value of $0 to be
     recomputed, with the fields being separated by the value of
     OFS. Each field variable has a string value when
     created.  If the string, with any occurrence of the decimal-
     point character from the current locale changed to a  period
     character,  is  considered a numeric string (see Expressions
     in nawk above), the field  variable  also  has  the  numeric
     value of the numeric string.
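
     Assigning past the last field can be sketched as follows,
     assuming a POSIX-compatible awk on the PATH. The input record
     and the choice of "-" for OFS are made up for illustration.

```shell
out=$(printf 'a b\n' | awk '
BEGIN { OFS = "-" }
{
    $4 = "d"            # NF becomes 4; $3 is created as the null string
    print NF
    print $0            # $0 is rebuilt with OFS between fields
    print "[" $3 "]"
}')

printf '%s\n' "$out"
# prints:
# 4
# a-b--d
# []
```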

  /usr/bin/nawk, /usr/xpg4/bin/awk
     nawk sets the following special variables that are supported
     by both /usr/bin/nawk and /usr/xpg4/bin/awk:

     ARGC
                 The number of elements in the ARGV array.


     ARGV
                 An array of command  line  arguments,  excluding
                 options  and the program argument, numbered from
                 zero to ARGC-1.

                 The arguments in ARGV can be modified  or  added
                 to;  ARGC  can  be  altered.  As each input file
                 ends, nawk treats the next non-null  element  of
                 ARGV,   up  to  the  current  value  of  ARGC-1,
                 inclusive, as the name of the next  input  file.
                 Setting an element of ARGV to null means that it
                 is not treated as an  input  file.  The  name  -
                 indicates  the  standard  input.  If an argument
                 matches the format  of  an  assignment  operand,
                 this argument is treated as an assignment rather
                 than a file argument.


     ENVIRON
                 The variable ENVIRON is  an  array  representing
                 the value of the environment. The indices of the
                 array are strings consisting of the names of the
                 environment  variables,  and  the  value of each
                 array element is  a  string  consisting  of  the
                 value  of  that  variable.  If  the  value of an
                 environment variable  is  considered  a  numeric
                 string,  the  array element also has its numeric
                 value.

                 In all cases where nawk behavior is affected  by
                 environment variables (including the environment
                 of any commands that nawk executes via the  sys-
                 tem  function  or via pipeline redirections with
                 the print statement, the  printf  statement,  or
                 the  getline  function), the environment used is
                 the environment at the time nawk  began  execut-
                 ing.


     FILENAME
                 A pathname of the current input file.  Inside  a
                 BEGIN  action  the value is undefined. Inside an
                 END action the value is the  name  of  the  last
                 input file processed.


     FNR
                 The ordinal number of the current record in  the
                 current file. Inside a BEGIN action the value is
                 zero. Inside an END  action  the  value  is  the
                 number  of the last record processed in the last
                 file processed.


     FS
                 Input  field  separator  regular  expression;  a
                 space character by default.


     NF
                 The number of  fields  in  the  current  record.
                 Inside  a  BEGIN  action, the use of NF is unde-
                 fined unless a getline function  without  a  var
                 argument  is  executed previously. Inside an END
                 action, NF retains the value it had for the last
                 record  read,  unless  a subsequent, redirected,
                 getline function without a var argument is  per-
                 formed prior to entering the END action.


     NR
                 The ordinal number of the  current  record  from
                 the  start  of  input. Inside a BEGIN action the
                 value is zero. Inside an END action the value is
                 the number of the last record processed.


     OFMT
                 The printf  format  for  converting  numbers  to
                 strings in output statements; "%.6g" by default.
                 The result of the conversion is  unspecified  if
                 the value of OFMT is not a floating-point format
                 specification.


     OFS
                 The print statement output  field  separator;  a
                 space character by default.


     ORS
                 The print output  record  separator;  a  newline
                 character by default.

     RLENGTH
                 The length of the string matched  by  the  match
                 function.


     RS
                 The first character of the string value of RS is
                 the  input record separator; a newline character
                 by default. If RS contains more than one charac-
                 ter, the results are unspecified. If RS is null,
                 then records are separated by sequences  of  one
                 or  more  blank lines. Leading or trailing blank
                 lines do not produce empty records at the begin-
                 ning or end of input, and the field separator is
                 always newline, no matter what the value of FS.


     RSTART
                 The starting position of the string  matched  by
                 the  match  function,  numbering from 1. This is
                 always equivalent to the  return  value  of  the
                 match function.


     SUBSEP
                 The  subscript  separator  string   for   multi-
                 dimensional arrays. The default value is \034.
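
     Several of the special variables above can be exercised with a
     short sketch, assuming a POSIX-compatible awk on the PATH. The
     operands foo and bar and all input text are hypothetical; the
     BEGIN-only program never opens the named operands.

```shell
# ARGC and ARGV: list the operands that follow the program text.
args=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) print i, ARGV[i] }' foo bar)

# NR and NF: record number and field count for each record.
counts=$(printf 'a b c\nd e\n' | awk '{ print NR, NF }')

# Null RS: records are separated by blank lines, and newline
# also separates fields.
paras=$(printf 'l1\nl2\n\n\nl3\n' | awk 'BEGIN { RS = "" } { print NR ": " $1 "," $2 }')

# RSTART and RLENGTH are set by the match function.
m=$(awk 'BEGIN { if (match("foobar", /o+/)) print RSTART, RLENGTH }')

printf '%s\n' "$args" "$counts" "$paras" "$m"
```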


  /usr/xpg4/bin/awk
     The following variable is  supported  for  /usr/xpg4/bin/awk
     only:

     CONVFMT
                The  printf  format  for  converting  numbers  to
                strings (except for output statements, where OFMT
                is used). The default is %.6g.


  Regular Expressions
     The nawk utility makes use of the extended
     regular  expression  notation  (see regex(5)) except that it
     allows the use of C-language conventions to  escape  special
     characters  within  the EREs, namely \\, \a, \b, \f, \n, \r,
     \t, \v, and those specified in the following  table.   These
     escape  sequences  are  recognized  both  inside and outside
     bracket expressions. Records need not be separated  by  new-
     line  characters  and  string  constants can contain newline
     characters, so even the \n sequence is valid in  nawk  EREs.
     Using  a  slash  character  within  the  regular  expression
     requires escaping as shown in the table below:



     Escape
     Sequence    Description                     Meaning
     ------------------------------------------------------------------------
     \"          Backslash quotation-mark        Quotation-mark character
     ------------------------------------------------------------------------
     \/          Backslash slash                 Slash character
     ------------------------------------------------------------------------
     \ddd        A backslash character           The character encoded by the
                 followed by the longest         one-, two- or three-digit
                 sequence of one, two, or        octal integer. Multi-byte
                 three octal-digit characters    characters require multiple,
                 (01234567). If all of the       concatenated escape
                 digits are 0 (that is,          sequences, including the
                 representation of the NULL      leading \ for each byte.
                 character), the behavior is
                 undefined.
     ------------------------------------------------------------------------
     \c          A backslash character           Undefined
                 followed by any character
                 not described in this table
                 or special characters (\\,
                 \a, \b, \f, \n, \r, \t, \v).



     A regular expression can be matched against a specific field
     or  string by using one of the two regular expression match-
     ing operators, ~ and !~.  These  operators  interpret  their
     right-hand  operand  as a regular expression and their left-
     hand operand as a string. If the regular expression  matches
     the  string,  the ~ expression evaluates to the value 1, and
     the !~ expression evaluates to the value 0. If  the  regular
     expression  does  not  match  the  string,  the ~ expression
     evaluates to the value 0, and the !~ expression evaluates to
     the  value  1.  If  the right-hand operand is any expression
     other than the lexical token ERE, the string  value  of  the
     expression is interpreted as an extended regular expression,
     including the escape  conventions  described  above.  Notice
     that these same escape conventions are also applied in
     determining the value of a string literal (the lexical token
     STRING), and are applied a second time when a string literal
     is used in this context.
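
     The matching operators, and the double application of escapes
     to string literals, can be sketched as follows. This assumes a
     POSIX-compatible awk on the PATH; the strings and the variable
     names s and re are hypothetical.

```shell
out=$(awk 'BEGIN {
    s = "a.b"
    print (s ~ /a\.b/)      # ERE token: \. matches a literal dot
    re = "a\\.b"            # string literal: \\ yields \, so the ERE is a\.b
    print (s ~ re)
    print ("axb" ~ re)      # the escaped dot does not match x
    print ("axb" !~ re)
}')

printf '%s\n' "$out"
# prints:
# 1
# 1
# 0
# 1
```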


     When an ERE token appears as an expression  in  any  context
     other than as the right-hand side of the ~ or !~ operator or as
     one of the built-in function arguments described below,  the
     value of the resulting expression is the equivalent of:

       $0 ~ /ere/



     The ere argument to the gsub, match, and sub functions, and the
     fs argument to the split function (see String Functions) are
     interpreted as extended regular expressions. These can be
     either  ERE  tokens or arbitrary expressions, and are inter-
     preted in the same manner as the right-hand side of the ~ or
     !~ operator.


     An extended regular  expression  can  be  used  to  separate
     fields  by  using the -F ERE option or by assigning a string
     containing the expression to the built-in variable  FS.  The
     default  value  of the FS variable is a single space charac-
     ter. The following describes FS behavior:

         1.   If FS is a single character:

             o    If FS is the space character, skip leading  and
                  trailing blank characters; fields are delimited
                  by sets of one or more blank characters.

             o    Otherwise, if FS  is  any  other  character  c,
                  fields  are delimited by each single occurrence
                  of c.

         2.   Otherwise, the string value of FS is considered  to
              be  an extended regular expression. Each occurrence
              of a sequence matching the extended regular expres-
              sion delimits fields.
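
     The three FS cases above can be sketched as follows, assuming a
     POSIX-compatible awk on the PATH. The input lines are made up
     for illustration.

```shell
# FS is the default space: leading and trailing blanks are skipped
# and runs of blanks delimit fields.
blanks=$(printf '  a   b  \n' | awk '{ print NF }')

# FS is another single character: every comma delimits a field,
# so the empty field between the two commas is counted.
commas=$(printf 'a,,b\n' | awk -F, '{ print NF }')

# FS is an ERE: each run of digits delimits fields.
digits=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{ print $1 $2 $3 }')

printf '%s\n' "$blanks" "$commas" "$digits"
```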


     Except in the gsub, match, split,  and  sub  built-in  func-
     tions,   regular  expression  matching  is  based  on  input
     records. That is, record  separator  characters  (the  first
     character of the value of the variable RS, a newline charac-
     ter by default) cannot be embedded in the expression, and no
     expression  matches  the  record separator character. If the
     record separator is not a newline character, newline charac-
     ters  embedded  in  the  expression can be matched. In those
     four built-in functions, regular expression matching is
     based on text strings. So, any character (including the new-
     line character and the record separator) can be embedded  in
     the  pattern  and an appropriate pattern matches any charac-
     ter. However, in all nawk regular expression  matching,  the
     use  of  one  or  more NULL characters in the pattern, input
     record or text string produces undefined results.

  Patterns
     A pattern is any valid expression, a range specified by  two
     expressions  separated  by  comma, or one of the two special
     patterns BEGIN or END.

  Special Patterns
     The nawk utility recognizes two special patterns, BEGIN  and
     END.  Each  BEGIN pattern is matched once and its associated
     action executed before the first record  of  input  is  read
     (except  possibly  by use of the getline function in a prior
     BEGIN action) and before command line  assignment  is  done.
     Each  END  pattern is matched once and its associated action
     executed after the last record of input has been read. Each of
     these two patterns must have an associated action.

     BEGIN and END do not combine with other patterns.   Multiple
     BEGIN  and  END patterns are allowed. The actions associated
     with the BEGIN patterns are executed in the order  specified
     in  the  program, as are the END actions. An END pattern can
     precede a BEGIN pattern in a program.


     If an nawk program consists of only actions with the pattern
     BEGIN,  and  the  BEGIN action contains no getline function,
     nawk exits without reading its input when the last statement
     in  the  last  BEGIN  action is executed. If an nawk program
     consists of only  actions  with  the  pattern  END  or  only
     actions  with  the patterns BEGIN and END, the input is read
     before the statements in the END actions are executed.
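
     The ordering rules for BEGIN and END can be sketched as
     follows, assuming a POSIX-compatible awk on the PATH. The
     printed strings are hypothetical.

```shell
# BEGIN actions run in program order even when an END pattern
# appears first in the program text.
order=$(awk 'END { print "end" } BEGIN { print "first" } BEGIN { print "second" }' </dev/null)

# A program with only an END action still reads all of its input.
count=$(printf 'x\ny\n' | awk 'END { print NR }')

printf '%s\n' "$order" "$count"
```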

  Expression Patterns
     An expression pattern is evaluated as if it were an  expres-
     sion  in  a Boolean context. If the result is true, the pat-
     tern is considered to match, and the associated  action  (if
     any)  is executed. If the result is false, the action is not
     executed.

  Pattern Ranges
     A pattern range consists of two expressions separated  by  a
     comma. In this case, the action is performed for all records
     between a match of the first expression  and  the  following
     match  of  the  second expression, inclusive. At this point,
     the pattern range can be repeated starting at input  records
     subsequent to the end of the matched range.
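
     A pattern range can be sketched as follows, assuming a
     POSIX-compatible awk on the PATH. The markers START and END in
     the input are hypothetical data, not the END special pattern.

```shell
# Prints each START..END block inclusively; the range restarts
# when START matches again after a block has closed.
out=$(printf 'a\nSTART\nb\nEND\nc\nSTART\nd\nEND\ne\n' | awk '/START/,/END/')

printf '%s\n' "$out"
```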

  Actions
     An action is a sequence of statements. A  statement  can  be
     one of the following:

       if ( expression ) statement [ else statement ]
       while ( expression ) statement
       do statement while ( expression )
       for ( expression ; expression ; expression ) statement
       for ( var in array ) statement
       delete array[subscript]   # delete an array element
       break
       continue
       { [ statement ] ... }
       expression        # commonly variable = expression
       print [ expression-list ] [ >expression ]
       printf format [ ,expression-list ] [ >expression ]
       next              # skip remaining patterns on this input line
       exit [expr] # skip the rest of the input; exit status is expr
       return [expr]

     Any single statement can be replaced  by  a  statement  list
     enclosed  in  braces.  The statements are terminated by new-
     line characters or semicolons, and are executed sequentially
     in the order that they appear.


     The next statement causes  all  further  processing  of  the
     current  input record to be abandoned. The behavior is unde-
     fined if a next statement appears or is invoked in  a  BEGIN
     or END action.


     The exit statement invokes all END actions in the  order  in
     which they occur in the program source and then terminates
     the program without reading further input. An exit statement
     inside  an END action terminates the program without further
     execution of END actions.  If an expression is specified  in
     an  exit  statement, its numeric value is the exit status of
     nawk, unless subsequent errors are encountered or  a  subse-
     quent exit statement with an expression is executed.
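
     The interaction between exit and END can be sketched as
     follows, assuming a POSIX awk is installed as awk. The
     three-line input and the shell variable names are made up
     for illustration.

```shell
exit_status=0
exit_out=$(awk '
    NR == 2 { exit 3 }                  # stop reading input at the second record
    { print "read " $0 }
    END { print "END ran after " NR " records" }
' <<'EOF'
one
two
three
EOF
) || exit_status=$?
printf '%s\nstatus=%s\n' "$exit_out" "$exit_status"
```

     The END action still runs after exit, the third record is
     never read, and the expression given to exit becomes the
     exit status seen by the shell.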

  Output Statements
     Both print and printf statements write to standard output by
     default.  The output is written to the location specified by
     output_redirection if one is supplied, as follows:

       > expression
       >> expression
       | expression



     In all cases, the  expression  is  evaluated  to  produce  a
     string  that is used as a full pathname to write into (for >
     or >>) or as a command to be executed (for |). With the first
     two  forms,  if the file of that name is not currently open,
     it is opened and created if necessary; with the first  form,
     the file is also truncated. The output is then appended to
     the file. As long as the file remains open, subsequent calls
     in which expression evaluates to the same string value simply
     append output to the file. The file remains open until the
     close function is called with an expression that evaluates
     to the same string value.


     The third form writes output onto  a  stream  piped  to  the
     input  of  a  command. The stream is created if no stream is
     currently open with the value of expression as  its  command
     name.   The stream created is equivalent to one created by a
     call to the popen(3C) function with the value of  expression
     as  the  command argument and a value of w as the mode argu-
     ment.  As long as the stream remains open, subsequent  calls
     in  which  expression  evaluates  to  the  same string value
     write output to the existing  stream.  The  stream  remains
     open  until  the close function is called with an expression
     that evaluates to the same string value.  At that time,  the
     stream is closed as if by a call to the pclose function.
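
     A minimal sketch of the piped form, assuming a POSIX awk is
     installed as awk and that sort(1) is on the PATH. The input
     words and the variable name pipe_out are illustrative only.

```shell
pipe_out=$(printf '%s\n' pear apple fig |
    awk '{ print $0 | "sort" }          # every record goes to one "sort" stream
         END {
             close("sort")              # same string value closes that stream
             print "sorted " NR " lines"
         }')
printf '%s\n' "$pipe_out"
```

     Calling close with the same string before the final print
     waits for the command to finish, so its output appears ahead
     of the summary line. Without the close, the relative order
     of the two outputs should not be relied on.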


     These output  statements  take  a  comma-separated  list  of
     expressions referred to in the grammar by  the  non-terminal
     symbols expr_list, print_expr_list  or  print_expr_list_opt.
     This  list  is  referred to here as the expression list, and
     each member is referred to as an expression argument.


     The print statement writes  the  value  of  each  expression
     argument  onto  the indicated output stream separated by the
     current output field separator (see variable OFS above), and
     terminated  by the output record separator (see variable ORS
     above). All expression arguments are taken as strings, being
     converted  if  necessary; with the exception that the printf
     format in OFMT is used instead of the value in  CONVFMT.  An
     empty  expression  list  stands  for  the whole input record
     ($0).
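
     The following sketch shows how OFS, ORS, OFMT and CONVFMT
     affect print. It assumes a POSIX awk is installed as awk;
     the separator and format values are arbitrary choices made
     for the illustration.

```shell
print_out=$(echo 'a b c' | awk '
BEGIN { OFS = "-"; ORS = "|\n"; OFMT = "%.2f"; CONVFMT = "%.4f" }
{
    x = 3.14159265
    print $1, $3, x     # separate arguments: OFS between them, OFMT for x
    print $1 $3 x       # one concatenated argument: CONVFMT for x
}')
printf '%s\n' "$print_out"
```

     The first print yields a-c-3.14| and the second ac3.1416|,
     because a number passed directly to print is formatted with
     OFMT, while a number converted during concatenation uses
     CONVFMT.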


     The printf statement produces output  based  on  a  notation
     similar  to  the  File Format Notation used to describe file
     formats in this document. Output is  produced  as  specified
     with  the first expression argument as the string format and
     subsequent expression arguments as the strings arg1 to argn,
     inclusive, with the following exceptions:

         1.   The format is an  actual  character  string  rather
              than a graphical representation. Therefore, it can-
              not contain empty character  positions.  The  space
              character  in  the  format  string,  in any context
              other than a flag of a conversion specification, is
              treated  as an ordinary character that is copied to
              the output.

         2.   If the character set contains a Delta character and
              that  character appears in the format string, it is
              treated as an ordinary character that is copied  to
              the output.

         3.   The escape sequences  beginning  with  a  backslash
              character are treated as sequences of ordinary char-
              acters that are copied to  the  output.  Note  that
              these  same  sequences are interpreted lexically by
              nawk when they appear in literal strings, but  they
              are not treated specially by the printf statement.

         4.   A field width or precision can be specified as  the
              * character instead of a digit string. In this case
              the next  argument  from  the  expression  list  is
              fetched  and  its  numeric value taken as the field
              width or precision.

         5.   The implementation does not precede or follow  out-
              put  from the d or u conversion specifications with
              blank  characters  not  specified  by  the   format
              string.

         6.   The implementation does not precede output from the
              o  conversion  specification with leading zeros not
              specified by the format string.

         7.   For the c conversion specification: if the argument
              has  a  numeric value, the character whose encoding
              is that value is output.  If the value is  zero  or
              is not the encoding of any character in the charac-
              ter set, the behavior is undefined.  If  the  argu-
              ment does not have a numeric value, the first char-
              acter of the string value is output; if the  string
              does  not  contain  any  characters the behavior is
              undefined.

         8.   For each conversion specification that consumes  an
              argument,   the   next   expression   argument   is
              evaluated. With the exception of the c  conversion,
              the  value is converted to the appropriate type for
              the conversion specification.

         9.   If there are insufficient expression  arguments  to
              satisfy  all  the  conversion specifications in the
              format string, the behavior is undefined.

         10.  If any character  sequence  in  the  format  string
              begins  with  a  %  character,  but does not form a
              valid conversion  specification,  the  behavior  is
              unspecified.


     Both print and printf can output at least {LINE_MAX} bytes.
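
     Items 4 and 7 of the list above can be sketched as follows.
     This assumes a POSIX awk installed as awk and an
     ASCII-compatible character set (so that 65 encodes A); the
     widths and values are arbitrary.

```shell
printf_out=$(awk 'BEGIN {
    printf "[%*d]\n", 5, 42          # field width taken from the argument list
    printf "[%-*s]\n", 6, "ab"       # same, combined with the - flag
    printf "[%.*f]\n", 2, 3.14159    # precision taken from the argument list
    printf "[%c%c]\n", 65, "hello"   # numeric argument, then string argument
}')
printf '%s\n' "$printf_out"
```

     Each * consumes one extra expression argument before the
     value being converted, so the argument count must account
     for it.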

  Functions
     The nawk language  has  a  variety  of  built-in  functions:
     arithmetic, string, input/output and general.

  Arithmetic Functions
     The arithmetic functions, except for int, are based  on  the
     ISO C standard. The behavior is undefined in cases where the
     ISO C standard specifies that an error be returned  or  that
     the  behavior  is  undefined.  Although  the grammar permits
     built-in  functions  to  appear   with   no   arguments   or
     parentheses,   unless   the   argument  or  parentheses  are
     indicated as optional in the following list  (by  displaying
     them within the [ ] brackets), such use is undefined.

     atan2(y,x)
                      Return arctangent of y/x.


     cos(x)
                      Return cosine of x, where x is in radians.


     sin(x)
                      Return sine of x, where x is in radians.


     exp(x)
                      Return the exponential function of x.


     log(x)
                      Return the natural logarithm of x.


     sqrt(x)
                      Return the square root of x.


     int(x)
                      Truncate its argument to an integer. It  is
                      truncated toward 0 when x > 0.


     rand()
                      Return a random number n, such that 0 <=  n
                      < 1.


     srand([expr])
                      Set the seed value for rand to expr or  use
                      the  time  of  day  if expr is omitted. The
                      previous seed value is returned.
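
     A short sketch exercising the arithmetic functions, assuming
     a POSIX awk installed as awk. The seed values 7 and 9 are
     arbitrary.

```shell
arith_out=$(awk 'BEGIN {
    print int(3.9)                  # truncated toward 0
    printf "%.4f\n", atan2(0, -1)   # arctangent of 0/-1, i.e. pi
    print sqrt(16), exp(0), log(1)
    srand(7)                        # seed explicitly ...
    print srand(9)                  # ... so this call returns the previous seed
    r = rand()
    print (r >= 0 && r < 1)         # 1 when r is at least 0 and below 1
}')
printf '%s\n' "$arith_out"
```

     Seeding with a fixed value makes the sequence returned by
     rand repeatable, which is convenient for testing; omit the
     argument to seed from the time of day instead.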


  String Functions
     The string functions in the following  list  shall  be  sup-
     ported.  Although  the grammar permits built-in functions to
     appear with no arguments or parentheses, unless the argument
     or  parentheses  are  indicated as optional in the following
     list (by displaying them within the [ ] brackets), such  use
     is undefined.

     gsub(ere,repl[,in])
         Behave like sub (see below), except that it replaces all
         occurrences of the regular expression (like the ed util-
         ity global substitute) in $0 or in the in argument, when
         specified.


     index(s,t)
         Return the position, in characters, numbering from 1, in
         string s where string t first occurs, or zero if it does
         not occur at all.


     length[([s])]
         Return the length, in characters, of its argument  taken
         as  a string, or of the whole record, $0, if there is no
         argument.


     match(s,ere)
         Return the position, in characters, numbering from 1, in
         string  s  where  the  extended  regular  expression ere
         occurs, or zero if it does not occur at all.  RSTART  is
         set  to  the starting position (which is the same as the
         returned value), zero if no match is found;  RLENGTH  is
         set  to the length of the matched string, -1 if no match
         is found.


     split(s,a[,fs])
         Split the string s into array elements a[1], a[2],  ...,
         a[n],  and  return  n.  The  separation is done with the
         extended regular expression fs or with the field separa-
         tor  FS  if  fs  is  not given. Each array element has a
         string value when created. If the string assigned to any
         array  element, with any occurrence of the decimal-point
         character from the current locale changed  to  a  period
         character,  would  be  considered  a numeric string; the
         array element also has the numeric value of the  numeric
         string.  The  effect of a null string as the value of fs
         is unspecified.


     sprintf(fmt,expr,expr,...)
         Format the expressions according to  the  printf  format
         given by fmt and return the resulting string.


     sub(ere,repl[,in])
         Substitute  the  string  repl  in  place  of  the  first
         instance  of  the  extended  regular  expression  ERE in
         string in and return the  number  of  substitutions.  An
         ampersand ( & ) appearing in the string repl is replaced
         by the string from in that matches the  regular  expres-
         sion.  An  ampersand  preceded with a backslash ( \ ) is
         interpreted  as  the  literal  ampersand  character.  An
         occurrence of two consecutive backslashes is interpreted
         as just a single literal backslash character. Any  other
         occurrence  of  a  backslash (for example, preceding any
         other character) is treated as a literal backslash char-
         acter.  If repl is a string literal, the handling of the
         ampersand character occurs after any lexical processing,
         including any lexical backslash escape sequence process-
         ing. If in is specified and it  is  not  an  lvalue  the
         behavior  is  undefined. If in is omitted, nawk uses the
         current record ($0) in its place.


     substr(s,m[,n])
         Return the at  most  n-character  substring  of  s  that
         begins at position m, numbering from 1. If n is missing,
         the length of the substring is limited by the length  of
         the string s.


     tolower(s)
         Return a string based on the string s. Each character in
         s  that  is  an  upper-case  letter  specified to have a
         tolower mapping by the LC_CTYPE category of the  current
         locale  is replaced in the returned string by the lower-
         case letter specified by the mapping.  Other  characters
         in s are unchanged in the returned string.


     toupper(s)
         Return a string based on the string s. Each character in
         s  that  is  a  lower-case  letter  specified  to have a
         toupper mapping by the LC_CTYPE category of the  current
         locale  is replaced in the returned string by the upper-
         case letter specified by the mapping.  Other  characters
         in s are unchanged in the returned string.



     All of the preceding functions that take ERE as a  parameter
     expect  a  pattern  or  a string valued expression that is a
     regular expression as defined below.
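
     The string functions can be tried out with a sketch like the
     following, assuming a POSIX awk installed as awk. The sample
     string and the variable names are illustrative only.

```shell
str_out=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "o"), length(s), substr(s, 7), substr(s, 1, 5)
    print match(s, /o+/), RSTART, RLENGTH
    n = split("a:b:c", parts, ":"); print n, parts[3]
    t = s; k = sub(/o/, "[&]", t);  print k, t    # & stands for the matched text
    t = s; k = gsub(/o/, "0", t);   print k, t    # all occurrences replaced
    print toupper(s), sprintf("%03d", 7)
}')
printf '%s\n' "$str_out"
```

     Note that sub and gsub modify their third argument in place
     and return the number of substitutions, which is why the
     sketch works on a copy t of the original string.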

  Input/Output and General Functions
     The input/output and general functions are:

     close(expression)
                                Close the file or pipe opened  by
                                a  print or printf statement or a
                                call to  getline  with  the  same
                                string-valued  expression. If the
                                close was successful,  the  func-
                                tion  returns  0;  otherwise,  it
                                returns non-zero.

     expression|getline[var]
                                Read a record  of  input  from  a
                                stream piped from the output of a
                                command. The stream is created if
                                no  stream is currently open with
                                the value of  expression  as  its
                                command  name. The stream created
                                is equivalent to one created by a
                                call  to  the popen function with
                                the value of  expression  as  the
                                command argument and a value of r
                                as the mode argument. As long  as
                                the  stream  remains open, subse-
                                quent calls in  which  expression
                                evaluates   to  the  same  string
                                value  read  subsequent   records
                                from  the   stream.   The  stream
                                remains
                                open until the close function  is
                                called  with  an  expression that
                                evaluates  to  the  same   string
                                value.  At  that time, the stream
                                is closed as if by a call to  the
                                pclose  function. If var is miss-
                                ing,  $0  and  NF  are set.
                                Otherwise, var is set.

                                The  getline  operator  can  form
                                ambiguous  constructs  when there
                                are operators  that  are  not  in
                                parentheses  (including concaten-
                                ate) to the left of the | (to the
                                beginning  of the expression con-
                                taining getline). In the  context
                                of  the  $ operator, | behaves as
                                if it had a lower precedence than
                                $. The result of evaluating other
                                operators  is  unspecified,   and
                                portable    applications     must
                                parenthesize   all   such    uses
                                properly.


     getline
                                   Set  $0  to  the  next   input
                                   record  from the current input
                                   file.  This  form  of  getline
                                   sets the NF, NR, and FNR vari-
                                   ables.


     getline var
                                   Set variable var to  the  next
                                   input  record from the current
                                   input file. This form of  get-
                                   line   sets  the  FNR  and  NR
                                   variables.


     getline [var] < expression
                                   Read the next record of  input
                                   from a named file. The expres-
                                   sion is evaluated to produce a
                                   string  that is used as a full
                                   pathname. If the file of  that
                                   name is not currently open, it
                                   is  opened.  As  long  as  the
                                   stream  remains  open,  subse-
                                   quent calls in  which  expres-
                                   sion  evaluates  to  the  same
                                   string value read   subsequent
                                   records  from  the  file.  The
                                   file remains  open  until  the
                                   close  function is called with
                                   an expression  that  evaluates
                                   to  the  same string value. If
                                   var is missing, $0 and NF  are
                                   set. Otherwise, var is set.

                                   The getline operator can  form
                                   ambiguous    constructs   when
                                   there  are  binary   operators
                                   that  are  not  in parentheses
                                   (including concatenate) to the
                                   right  of the < (up to the end
                                   of the  expression  containing
                                   the  getline).  The  result of
                                   evaluating such a construct is
                                   unspecified,   and    portable
                                   applications must parenthesize
                                   all such uses properly.


     system(expression)
                                   Execute the command  given  by
                                   expression    in   a   manner
                                   equivalent to  the  system(3C)
                                   function  and  return the exit
                                   status of the command.



     All forms of getline return 1 for successful  input,  0  for
     end of file, and -1 for an error.


     Where strings are used as the name of a  file  or  pipeline,
     the  strings  must  be  textually identical. The terminology
     ``same string value'' implies that  ``equivalent  strings'',
     even  those  that differ only by space characters, represent
     different files.
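
     The piped form of getline, its return value, and the role of
     close can be sketched as follows. This assumes a POSIX awk
     installed as awk; the command string is a made-up example.

```shell
getline_out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    # Parenthesize the getline expression before comparing its result.
    while ((cmd | getline line) > 0)
        print "got " line
    close(cmd)
    # Reopening after close runs the command again from the start.
    cmd | getline first
    print "again " first
    close(cmd)
}')
printf '%s\n' "$getline_out"
```

     Testing for > 0 rather than for a non-zero value keeps the
     loop from spinning forever if getline returns -1 on error.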

  User-defined Functions
     The nawk language also provides user-defined functions. Such
     functions can be defined as:

       function name(args,...) { statements }



     A function can be referred to anywhere in an  nawk  program;
     in particular, its use can precede its definition. The scope
     of a function is global.


     Function arguments can be  either  scalars  or  arrays;  the
     behavior is undefined if an array name is passed as an argu-
     ment that the function uses as a  scalar,  or  if  a  scalar
     expression  is  passed as an argument that the function uses
     as an array. Function  arguments  are  passed  by  value  if
     scalar  and  by  reference if array name. Argument names are
     local to the function; all other variable names are  global.
     The  same  name  must not be used as both an argument name and as
     the name of a function or a special nawk variable. The  same
     name  must  not  be used both as a variable name with global
     scope and as the name of a function. The same name must  not
     be  used within the same scope both as a scalar variable and
     as an array.


     The number of parameters in the function definition need not
     match  the number of parameters in the function call. Excess
     formal parameters can be used as local variables.  If  fewer
     arguments  are  supplied  in a function call than are in the
     function definition, the extra parameters that are  used  in
     the  function  body as scalars are initialized with a string
     value of the null string and a numeric value  of  zero,  and
     the  extra  parameters that are used in the function body as
     arrays are initialized as empty arrays.  If  more  arguments
     are  supplied  in  a  function call than are in the function
     definition, the behavior is undefined.


     When invoking a function,  no  white  space  can  be  placed
     between the function name and the opening parenthesis. Func-
     tion calls can be nested and recursive  calls  can  be  made
     upon  functions.  Upon  return  from any nested or recursive
     function call, the values of all of the  calling  function's
     parameters are unchanged, except for array parameters passed
     by reference. The return statement can be used to  return  a
     value.  If  a return statement appears outside of a function
     definition, the behavior is undefined.


     In the function definition, newline characters are  optional
     before  the opening brace and after the closing brace. Func-
     tion definitions can appear anywhere in the program where  a
     pattern-action pair is allowed.
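
     The argument-passing rules can be sketched as follows,
     assuming a POSIX awk installed as awk. The function names
     fill and fact, and all variable names, are hypothetical.

```shell
func_out=$(awk '
function fill(arr, n,    i) {   # i is an extra formal parameter used as a local
    for (i = 1; i <= n; i++)
        arr[i] = i * i          # arrays are passed by reference
    n = 0                       # scalars are passed by value
}
function fact(k) { return k <= 1 ? 1 : k * fact(k - 1) }   # recursive call
BEGIN {
    i = "global"; count = 3
    fill(sq, count)
    print sq[3], count, i       # array changed; count and global i untouched
    print fact(5)
}')
printf '%s\n' "$func_out"
```

     Separating the extra parameter from the real ones with
     additional spaces is only a common convention for marking
     locals; the language does not require it.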

Usage
     The index, length, match, and substr functions should not be
     confused  with  similar functions in the ISO C standard; the
     nawk versions deal with characters, while the ISO C standard
     deals with bytes.


     Because the concatenation operation is represented by  adja-
     cent  expressions  rather  than  an explicit operator, it is
     often necessary to use parentheses  to  enforce  the  proper
     evaluation precedence.
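
     A two-line sketch of why the parentheses matter, assuming a
     POSIX awk installed as awk:

```shell
concat_out=$(awk 'BEGIN {
    print "a" 1 + 2       # + binds tighter than concatenation: "a" (1 + 2)
    print ("a" 1) + 2     # concatenate first, then add: "a1" is 0 as a number
}')
printf '%s\n' "$concat_out"
```

     The first statement prints a3 and the second prints 2.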


     See largefile(5) for the description of the behavior of nawk
     when  encountering  files  greater  than or equal to 2 Gbyte
     (2^31 bytes).

Examples
     The nawk program specified  in  the  command  line  is  most
     easily specified within single-quotes (for example, 'program')
     for applications using sh, because nawk programs commonly
     contain  characters  that  are  special to the shell,
     including double-quotes. In the cases where a  nawk  program
     contains  single-quote  characters, it is usually easiest to
     specify most of the program as strings within  single-quotes
     concatenated  by  the shell with quoted single-quote charac-
     ters. For example:

       nawk '/'\''/ { print "quote:", $0 }'



     prints all  lines  from  the  standard  input  containing  a
     single-quote character, prefixed with quote:.


     The following are examples of simple nawk programs:

     Example 1 Write to the standard output all input  lines  for
     which field 3 is greater than 5:

       $3 > 5

     Example 2 Write every tenth line:

       (NR % 10) == 0



     Example 3 Write any line with a substring matching the regu-
     lar expression:

       /(G|D)(2[0-9][[:alpha:]]*)/



     Example 4 Print any line with a substring containing a G  or
     D, followed by a sequence of digits and characters:


     This example uses character classes digit and alpha to match
     language-independent   digit   and   alphabetic  characters,
     respectively.


       /(G|D)([[:digit:][:alpha:]]*)/



     Example 5 Write any line in which the second  field  matches
     the regular expression and the fourth field does not:

       $2 ~ /xyz/ && $4 !~ /xyz/



     Example 6 Write any line in which the second field  contains
     a backslash:

       $2 ~ /\\/



     Example 7 Write any line in which the second field  contains
     a backslash (alternate method):


     Notice that backslash escapes are interpreted twice, once in
     lexical  processing of the string and once in processing the
     regular expression.


       $2 ~ "\\\\"

     Example 8 Write the second to the last and the last field in
     each line, separating the fields by a colon:

       {OFS=":";print $(NF-1), $NF}



     Example 9 Write the line number and number of fields in each
     line:


     The three strings representing the line  number,  the  colon
     and the number of fields are concatenated and that string is
     written to standard output.


       {print NR ":" NF}



     Example 10 Write lines longer than 72 characters:

       length($0) > 72



     Example  11  Write  first  two  fields  in  opposite   order
     separated by the OFS:

       { print $2, $1 }



     Example 12 Same, with input fields  separated  by  comma  or
     space and tab characters, or both:

       BEGIN { FS = ",[ \t]*|[ \t]+" }
             { print $2, $1 }



     Example 13 Add up first column, print sum and average:

       {s += $1 }
       END {print "sum is ", s, " average is", s/NR}



     Example 14 Write fields in reverse order, one per line (many
     lines out for each line in):

       { for (i = NF; i > 0; --i) print $i }

     Example 15  Write  all  lines  between  occurrences  of  the
     strings "start" and "stop":

       /start/, /stop/



     Example 16 Write all lines whose first  field  is  different
     from the previous one:

       $1 != prev { print; prev = $1 }



     Example 17 Simulate the echo command:

       BEGIN  {
              for (i = 1; i < ARGC; ++i)
                    printf "%s%s", ARGV[i], i==ARGC-1?"\n":" "
              }



     Example 18 Write the path prefixes  contained  in  the  PATH
     environment variable, one per line:

       BEGIN  {
              n = split (ENVIRON["PATH"], path, ":")
              for (i = 1; i <= n; ++i)
                     print path[i]
              }



     Example 19 Print the file "input", filling in  page  numbers
     starting at 5:


     If there is a file named input containing  page  headers  of
     the form


       Page#



     and a file named program that contains


       /Page/{ $2 = n++; }
       { print }

     then the command line


       nawk -f program n=5 input




     prints the file input, filling in page numbers  starting  at
     5.

Environment Variables
     See environ(5) for descriptions of the following environment
     variables   that  affect  execution:  LC_COLLATE,  LC_CTYPE,
     LC_MESSAGES, and NLSPATH.

     LC_NUMERIC
                   Determine the radix character used when inter-
                   preting  numeric input, performing conversions
                   between numeric and string values and  format-
                   ting numeric output. Regardless of locale, the
                   period character (the decimal-point  character
                   of  the  POSIX  locale)  is  the decimal-point
                   character recognized in  processing  awk  pro-
                   grams  (including  assignments in command-line
                   arguments).

Exit Status
     The following exit values are returned:

     0
           All input files were processed successfully.


     >0
           An error occurred.



     The exit status can be altered within the program  by  using
     an exit expression.

Attributes
     See attributes(5) for descriptions of the  following  attri-
     butes:

  /usr/bin/nawk
     ATTRIBUTE TYPE           ATTRIBUTE VALUE
     Availability             system/core-os


  /usr/xpg4/bin/awk
     ATTRIBUTE TYPE           ATTRIBUTE VALUE
     Availability             system/xopen/xcu4

See Also
     awk(1), ed(1), egrep(1), grep(1), lex(1), sed(1), popen(3C),
     printf(3C),  system(3C),  attributes(5),  environ(5), largefile(5),
     regex(5), XPG4(5)


     Aho, A. V., B. W. Kernighan, and P. J. Weinberger,  The  AWK
     Programming Language, Addison-Wesley, 1988.

Diagnostics
     If any file operand is specified and the named  file  cannot
     be  accessed,  nawk  writes a diagnostic message to standard
     error and terminates without any further action.


     If the program specified by either the program operand or  a
     progfile  operand  is not a valid nawk program (as specified
     in EXTENDED DESCRIPTION), the behavior is undefined.

Notes
     Input white space is not preserved on output if  fields  are
     involved.


     There  are  no  explicit  conversions  between  numbers  and
     strings.  To  force  an expression to be treated as a number
     add 0 to it; to force it to be treated as a string concaten-
     ate the null string ("") to it.
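
     The two idioms can be sketched as follows, assuming a POSIX
     awk installed as awk. The variable names are illustrative.

```shell
conv_out=$(awk 'BEGIN {
    a = "10"; b = "9"                   # string values
    print (a < b), (a + 0 < b + 0)      # string compare, then forced numeric
    x = 10; y = 9                       # numeric values
    print (x < y), (x "" < y "")        # numeric compare, then forced string
}')
printf '%s\n' "$conv_out"
```

     As strings, "10" sorts before "9" because the comparison is
     character by character, so the sketch prints 1 0 followed by
     0 1.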