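When running a program, we usually configure its behavior through command-line arguments. Command-line options and arguments control UNIX programs and tell them how to act.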

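By the time the startup code of a program built with gcc calls our entry function main(int argc, char *argv[]), the command line has already been split into arguments. argc holds the number of arguments (including the program name in argv[0]), and argv is an array of pointers to those arguments.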

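A program's arguments fall into three kinds: options, values associated with options, and non-option arguments.

For example:

$ gcc getopt_test.c -o testopt
getopt_test.c is a non-option argument, -o is an option, and testopt is the value associated with -o.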

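By Linux convention, an option starts with a single hyphen followed by a single letter or digit. An option either takes no value, requires a value, or takes an optional value. Options that take no value can be combined after one hyphen, as in ls -al.

There are also long options, introduced by two hyphens. For example, -o filename and --output filename could both specify the output file name. What follows is compiled from several English-language resources.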


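getopt(): handling short options
getopt() is declared in the system header unistd.h. Its prototype is:
int getopt(int argc, char *const argv[], const char *optstring);
getopt takes main's argc and argv as its first two parameters. optstring is a list of characters, each standing for a single-character option. A character followed by a colon (:) takes a value, supplied as the next argument. A character followed by two colons (::) takes an optional value; this form is a GNU extension. Each call returns the next option character found in argv, and optind records the index of the next argv element to scan. When all options have been processed, getopt returns -1. When it meets an unrecognized option, it returns a question mark (?) and stores the offending character in optopt.

If an option requires a value but none was supplied, getopt also returns a question mark (?). If the first character of optstring is a colon (:), getopt returns a colon instead of a question mark in that case.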

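Once the options have been processed, optind points at the remaining non-option arguments at the end of argv.

In fact, the GNU implementation of getopt reorders argv as it runs, moving the non-option arguments to the end of the array.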

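The global variables set by getopt() (declared in unistd.h) include:
optarg: a pointer to the value of the current option, if it has one.
optind: the index in argv of the next argument getopt() will process.
optopt: the option character that caused the most recent error (an unknown option or a missing value).

Here is a simple example using getopt: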
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int opt = 0;
    int i = 0;
    // The leading ':' makes getopt return ':' for a missing value; V and h take a value.
    const char *optstring = ":vV:h:";

    for (i = 0; i < argc; i++)
        printf("%d:%s\n", i, argv[i]);

    // Handle each option in turn.
    while ((opt = getopt(argc, argv, optstring)) != -1) {
        switch (opt) {
        case 'v':
            printf("verbose\n");
            break;
        case 'V':
            printf("option %c:the Version is %s\n", opt, optarg);
            break;
        case 'h':
            printf("The option %c  is %s...\n", opt, optarg);
            break;
        case ':':
            printf("Option %c requires a value\n", optopt);
            break;
        case '?':
            printf("Unknown option %c\n", optopt);
            break;
        }
    }

    // optind ends up pointing at the non-option arguments.
    printf("After getopt the optind = %d \n", optind);

    // Print the argv array again after getopt has run.
    for (i = 0; i < argc; i++)
        printf("%d:%s\n", i, argv[i]);

    return 0;
}
Output:
X:\1.KEEP MOVING\3.C\MyCodes\GetOpt\Debug\GetOpt.exe: invalid option -- x
0:X:\1.KEEP MOVING\3.C\MyCodes\GetOpt\Debug\GetOpt.exe
1:arg1
2:-v
3:-V
4:2.1
5:-h
6:help
7:-x
8:arg2
verbose
option V:the Version is 2.1
The option h  is help...
Unknown option x
After getopt the optind = 7
0:X:\1.KEEP MOVING\3.C\MyCodes\GetOpt\Debug\GetOpt.exe
1:-v
2:-V
3:2.1
4:-h
5:help
6:-x
7:arg1
8:arg2

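As the output shows, once getopt has finished, the non-option arguments have all been moved to the end of argv, and optind points at the first of them.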


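getopt_long(): handling long options
Prototype: int getopt_long(int argc, char *const *argv, const char *shortopts, const struct option *longopts, int *indexptr)
Here is a fairly clear description of this function, quoted from the GNU C Library manual (reference 1):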

Decode options from the vector argv (whose length is argc). The argument shortopts describes the short options to accept, just as it does in getopt. The argument longopts describes the long options to accept (see above).

When getopt_long encounters a short option, it does the same thing that getopt would do: it returns the character code for the option, and stores the option's argument (if it has one) in optarg.

When getopt_long encounters a long option, it takes actions based on the flag and val fields of the definition of that option.

If flag is a null pointer, then getopt_long returns the contents of val to indicate which option it found. You should arrange distinct values in the val field for options with different meanings, so you can decode these values after getopt_long returns. If the long option is equivalent to a short option, you can use the short option's character code in val.

If flag is not a null pointer, that means this option should just set a flag in the program. The flag is a variable of type int that you define. Put the address of the flag in the flag field. Put in the val field the value you would like this option to store in the flag. In this case, getopt_long returns 0.

For any long option, getopt_long tells you the index in the array longopts of the option's definition, by storing it into *indexptr. You can get the name of the option with longopts[*indexptr].name. So you can distinguish among long options either by the values in their val fields or by their indices. You can also distinguish in this way among long options that set flags.

When a long option has an argument, getopt_long puts the argument value in the variable optarg before returning. When the option has no argument, the value in optarg is a null pointer. This is how you can tell whether an optional argument was supplied.

When getopt_long has no more options to handle, it returns -1, and leaves in the variable optind the index in argv of the next remaining argument.

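The options accepted by getopt_long are defined with struct option:
struct option {
    const char *name;  // name of the long option
    int has_arg;       // no_argument, required_argument or optional_argument
    int *flag;         // described in detail above; usually NULL
    int val;           // value to return, or to store in *flag
};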
This structure describes a single long option name for the sake of getopt_long. The argument longopts must be an array of these structures, one for each long option. Terminate the array with an element containing all zeros.

The struct option structure has these fields:
name - This field is the name of the option. It is a string. 
has_arg - This field says whether the option takes an argument. It is an integer, and there are three legitimate values: no_argument, required_argument and optional_argument.
flag, val - These fields control how to report or act on the option when it occurs.
If flag is a null pointer, then the val is a value which identifies this option. Often these values are chosen to uniquely identify particular long options.
If flag is not a null pointer, it should be the address of an int variable which is the flag for this option. The value in val is the value to store in the flag to indicate that the option was seen.
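The flag and indexptr mechanics described above can be sketched as follows. The option names --verbose and --brief, the verbose_flag variable and the parse_flags() function are illustrative choices for this sketch, not part of any particular program.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

static int verbose_flag;  /* set to 1 by --verbose and to 0 by --brief */

/* Returns the name of the last flag-setting long option seen, or NULL. */
const char *parse_flags(int argc, char *argv[])
{
    static const struct option long_options[] = {
        /* flag is non-NULL, so getopt_long stores val in verbose_flag. */
        {"verbose", no_argument, &verbose_flag, 1},
        {"brief",   no_argument, &verbose_flag, 0},
        {NULL, 0, NULL, 0}  /* terminator: all zeros */
    };
    const char *last_name = NULL;
    int index = 0;
    int c;

    assert(argv != NULL);
    verbose_flag = 0;
    optind = 1;  /* restart scanning so the function can be called repeatedly */
    opterr = 0;  /* keep getopt_long from printing its own error messages */

    while ((c = getopt_long(argc, argv, "", long_options, &index)) != -1) {
        if (c == 0) {
            /* A flag-setting option matched; index says which table entry. */
            last_name = long_options[index].name;
        }
    }
    return last_name;
}
```

Because both entries share one flag variable, the option that appears last on the command line wins, and the return value of 0 tells the caller that nothing else needs to be done for that option.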

The English explanation above is quite clear. Here is a simple example using getopt_long:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>

int main(int argc, char **argv) {
    const char *short_options = "vhVo:";

    const struct option long_options[] = {
        { "verbose", optional_argument, NULL, 'v' },
        { "help", no_argument, NULL, 'h' },
        { "version", no_argument, NULL, 'V' },
        { "output", required_argument, NULL, 'o' },
        { NULL, 0, NULL, 0 },  /* Required at end of array. */
    };

    for (;;) {
        int c;
        c = getopt_long(argc, argv, short_options, long_options, NULL);
        if (c == -1) {
            break;
        }
        switch (c) {
        case 'h':
            printf("The usage of this program...\n");
            break;
        case 'v':
            printf("set the program's log verbose...\n");
            break;
        case 'V':
            printf("The version is 0.1 ...\n");
            break;
        case 'o':
            printf("The output file is %s.\n", optarg);
            break;
        case '?':
            printf("Invalid option , abort the program.\n");
            exit(-1);
        default:  // unexpected
            abort();
        }
    }

    return 0;
}

Arguments: (shown as an image in the original post; the image is not available here)
Output:
The usage of this program...
set the program's log verbose...
The version is 0.1 ...
The output file is outputfile.
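The table above declares --verbose with optional_argument, and the manual excerpt notes that a null optarg is how a program tells whether the optional value was supplied. Here is a minimal sketch of that check. The verbosity_level() name and the choice of 1 as the default level are assumptions made for this example. With GNU getopt_long, an optional value must be attached to the option, as in --verbose=3 or -v3.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/*
 * Returns 0 when --verbose/-v is absent, 1 when it is given without a value
 * (an assumed default), and the numeric value when one is attached.
 */
int verbosity_level(int argc, char *argv[])
{
    static const struct option long_options[] = {
        {"verbose", optional_argument, NULL, 'v'},
        {NULL, 0, NULL, 0}
    };
    int level = 0;
    int c;

    assert(argv != NULL);
    optind = 1;  /* restart scanning so the function can be called repeatedly */
    opterr = 0;  /* keep getopt_long from printing its own error messages */

    /* "v::" gives the short form the same optional value as the long form. */
    while ((c = getopt_long(argc, argv, "v::", long_options, NULL)) != -1) {
        if (c == 'v') {
            /* optarg is NULL when no value was attached to the option. */
            level = (optarg != NULL) ? atoi(optarg) : 1;
        }
    }
    return level;
}
```

Note that the short option string uses "v::" here, so that -v and --verbose accept the same optional value.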

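A real-world use case
In the openvswitch source code, every component parses command-line arguments during startup, and they all follow a similar approach. Below is the relevant code that I extracted from ovsdb-client, to make clear what this process does.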
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>
#include <limits.h>


void out_of_memory( void ){
      printf( "virtual memory exhausted\n" );
      abort();
}

// xmalloc ultimately calls the standard C malloc; it is only a thin wrapper
// that guarantees the allocation succeeds, and otherwise aborts the program.
void *xmalloc( size_t size){
    void *p = malloc (size ? size : 1);
    if (p == NULL) {
        out_of_memory();
    }
    return p;
}

char *xmemdup0( const char *p_, size_t length){
    char *p = xmalloc(length + 1);
    memcpy(p, p_, length);
    p[length] = '\0';
    return p;
}

//Duplicates a character string without fail, using xmalloc to obtain memory.
char *xstrdup( const char *s){
    return xmemdup0(s, strlen (s));
}

/* Given the GNU-style long options in 'options', returns a string that may be
 * passed to getopt() with the corresponding short options.  The caller is
 * responsible for freeing the string. */
char *long_options_to_short_options( const struct option options[]){
    char short_options[UCHAR_MAX * 3 + 1];
    char *p = short_options;

    for (; options-> name; options++) {
        const struct option *o = options;
        if (o->flag == NULL && o-> val > 0 && o-> val <= UCHAR_MAX) {
            *p++ = o-> val;
            if (o->has_arg == required_argument) {
                *p++ = ':';
            } else if (o->has_arg == optional_argument) {
                *p++ = ':';
                *p++ = ':';
            }
        }
    }
    *p = '\0';
    // A local char array cannot be returned directly; copy it to the heap and return that pointer.
    return xstrdup(short_options);
}

static void
parse_options( int argc, char *argv[])
{
    enum {
        OPT_BOOTSTRAP_CA_CERT = UCHAR_MAX + 1,
        OPT_TIMESTAMP ,
        DAEMON_OPTION_ENUMS ,
        TABLE_OPTION_ENUMS
    };
    static struct option long_options[] = {
        { "verbose", optional_argument, NULL, 'v' },
        { "help", no_argument, NULL, 'h' },
        { "version", no_argument, NULL, 'V' },
        { "timestamp", no_argument, NULL, OPT_TIMESTAMP },
        { NULL, 0, NULL, 0 },
    };

    char *short_options = long_options_to_short_options(long_options);
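    // Once the short options have been derived from the long ones, parsing follows the same routine as above.
    // Here we only print the short options.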
    printf( "%s\n" ,short_options);

    free(short_options);
}

int main( int argc, char **argv) {
     parse_options(argc, argv);

      return 0;
}
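For the table in parse_options above, the derived string should be "v::hV": --verbose takes an optional value, --help and --version take none, and --timestamp is skipped because its val lies above UCHAR_MAX. As a variation on the same idea, the sketch below writes into a caller-supplied buffer instead of allocating. build_short_options() is a hypothetical helper written for this illustration and is not taken from the OVS sources.

```c
#include <assert.h>
#include <getopt.h>
#include <limits.h>
#include <stddef.h>

/*
 * Writes the short-option string for 'options' into buf (size bytes).
 * Returns 0 on success, -1 if buf is too small.
 */
int build_short_options(const struct option *options, char *buf, size_t size)
{
    size_t n = 0;

    assert(options != NULL && buf != NULL);
    for (; options->name != NULL; options++) {
        size_t colons;

        if (options->flag != NULL || options->val <= 0
            || options->val > UCHAR_MAX) {
            continue;  /* long-only option: it has no short equivalent */
        }
        colons = options->has_arg == required_argument ? 1
               : options->has_arg == optional_argument ? 2 : 0;
        if (n + 1 + colons + 1 > size) {
            return -1;  /* no room for the letter, its colons and the NUL */
        }
        buf[n++] = (char)options->val;
        while (colons-- > 0) {
            buf[n++] = ':';
        }
    }
    if (n >= size) {
        return -1;  /* only reachable when size is 0 */
    }
    buf[n] = '\0';
    return 0;
}
```

The resulting buffer can be passed as the shortopts argument of getopt_long together with the same long option table, so the two descriptions never drift apart.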

References:
1. http://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html
2. http://www.ibm.com/developerworks/cn/aix/library/au-unix-getopt.html
3. http://www.cppblog.com/cuijixin/archive/2010/06/13/117788.html
4. OVS source code