Thursday 15 April 2010

parsing - How should I parse bundled command line options with ambiguities?


I am making a command line parser and want to support option bundling. However, I'm not sure how to handle the ambiguities and conflicts that can arise. Consider the following three cases:

1.

  -I accepts a string
  "-Iinclude" -> would be parsed as "-I include"

2.

  -I accepts a string
  -n accepts an integer
  "-Iincluden10" -> would be parsed as "-I includen10", because the text following the first occurrence of 'n' cannot be parsed as an integer

3.

  -I accepts a string
  -n accepts an integer
  -c accepts a string
  "-Iin10clude" -> ??? What now ???

How do I handle the last string? There are several ways to parse it, so do I just throw an error informing the user about the ambiguity, or do I choose the parse that yields the most options, i.e. "-I i -n 10 -c lude"?

I could not find any detailed conventions online, but personally, I'd flag this as an ambiguity error.

As far as I know, there is no standard for command line parameter parsing, nor even a cross-platform consensus. So the best we can do is appeal to common sense.

POSIX recommends some guidelines for parsing command line parameters. They are just guidelines; as the section that states them indicates, some standard shell utilities do not conform. And while GNU utilities are expected to conform to the POSIX guidelines, they commonly deviate in some respects, including the use of "long" parameters.

In any event, what POSIX says about grouping is:

One or more options without option-arguments, followed by at most one option that takes an option-argument, should be accepted when grouped behind one '-' delimiter.

Note that POSIX options are all single-character options. Note also that the guideline is clear that only the last option in an option group may be an option that accepts an argument.
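To make the grouping rule concrete, here is a minimal sketch in Python of how one option word could be split under that guideline. The function name `parse_short_group` and its parameters are made up for this illustration and are not part of any library; the sketch also follows the common getopt behaviour of treating the rest of the word as the argument once an argument-taking option is seen.

```python
def parse_short_group(word, flags, takes_arg, following):
    """Split one '-abc' word into (option, argument) pairs.

    flags:     letters that take no option-argument
    takes_arg: letters that require an option-argument
    following: list of words after `word`; the first one is consumed
               if the argument is not attached to the option letter
    """
    if len(word) < 2 or not word.startswith("-"):
        raise ValueError(f"not an option group: {word!r}")
    body = word[1:]
    parsed = []
    for i, letter in enumerate(body):
        if letter in takes_arg:
            attached = body[i + 1:]
            if attached:
                # Everything after the letter is the argument,
                # even if it contains other option letters.
                parsed.append((letter, attached))
            elif following:
                parsed.append((letter, following.pop(0)))
            else:
                raise ValueError(f"option -{letter} requires an argument")
            return parsed  # an argument-taking option always ends the group
        if letter not in flags:
            raise ValueError(f"unknown option -{letter}")
        parsed.append((letter, None))
    return parsed
```

Under these assumptions, `parse_short_group("-Iin10clude", {"v"}, {"I", "n", "c"}, [])` returns `[("I", "in10clude")]`: the first argument-taking option swallows the remainder of the word, so the third case from the question has exactly one reading.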

Regarding GNU-style long options, I do not know of a standard other than the behaviour of getopt_long. It implements POSIX style for single-character options, including the option grouping syntax mentioned above; it allows a single-character option that takes an argument either to be followed immediately by the argument, or to sit at the end of a (possibly singular) option group with the argument being the following word.

For long options, grouping is not allowed, regardless of whether the option accepts an argument. If the option does accept an argument, two styles are allowed: either the option is followed immediately by = and then the argument, or the argument is the following word.

In GNU style, long options cannot be confused with single-character options, because long options must be specified with two dashes (--).
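As an illustration of these conventions, the sketch below uses Python's standard `getopt` module, which follows the Unix getopt conventions and also supports GNU-like long options. It is not getopt_long itself, and the option names (`-v`, `-I`, `--verbose`, `--include`) are chosen only for this example.

```python
import getopt

# "vI:" declares -v as a flag and -I as taking an argument;
# "include=" declares --include as taking an argument.
argv = ["-vIfoo", "-I", "bar", "--include=baz", "--include", "qux", "file.c"]
opts, rest = getopt.getopt(argv, "vI:", ["verbose", "include="])

for name, value in opts:
    print(name, repr(value))
print("operands:", rest)
```

Here `-vIfoo` is a group whose last option takes the attached argument `foo`, `-I bar` takes its argument from the following word, and the long option appears in both the `=` form and the following-word form. Parsing stops at `file.c`, the first word that is not an option.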

By contrast, many Tcl/Tk-based utilities (and some other command line parsers) allow long options with only a single -, but do not allow option grouping.

In all of these styles, options are divided into two disjoint sets: those that take arguments, and those that do not.

None of these systems is ambiguous, although a random mix of styles, such as the one you seem to be proposing, would be. Even with formal disambiguation rules, ambiguity is dangerous, particularly in console applications, where a command line can be irreversible. Furthermore, contextual disambiguation can change meaning (even silently) when the set of available options is extended in the future, which would be a source of hard-to-predict errors in scripts.
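The effect of extending the option set can be seen with a small enumeration. The sketch below is only an illustration of the mixed style discussed in the question, under the assumption that an argument may be any non-empty prefix of the remaining text; `all_parses` and the `spec` mapping are hypothetical names, not an existing API.

```python
def all_parses(body, spec):
    """Return every way to read `body` (the text after '-') as options.

    spec maps an option letter to "flag", "str", or "int".
    Assumed rule: an argument is any non-empty prefix of what remains,
    and an "int" argument must consist of ASCII digits only.
    """
    if not body:
        return [[]]
    letter, rest = body[0], body[1:]
    kind = spec.get(letter)
    if kind is None:
        return []  # unknown letter: this reading is a dead end
    if kind == "flag":
        return [[(letter, None)] + tail for tail in all_parses(rest, spec)]
    results = []
    for cut in range(1, len(rest) + 1):
        arg = rest[:cut]
        if kind == "int" and not (arg.isascii() and arg.isdigit()):
            continue
        for tail in all_parses(rest[cut:], spec):
            results.append([(letter, arg)] + tail)
    return results


before = all_parses("Iin10clude", {"I": "str", "n": "int"})
after = all_parses("Iin10clude", {"I": "str", "n": "int", "c": "str"})
print(len(before), len(after))
```

With only -I and -n defined, the word has a single reading, `-I in10clude`. Once -c is added, the same word has three readings (`-I i -n 10 -c lude`, `-I in10 -c lude`, and `-I in10clude`), so a script written against the older option set could change meaning without any change to the script itself.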

Consequently, I'd recommend sticking to a simple existing practice such as GNU's, and not trying too hard to interpret incorrect command lines that do not conform.
