My software could read a sort of limited version of PGN files.
To improve its handling of game files, I took a look at the PGN specification and adapted my software so that it can parse "Recursive Annotations" without falling over, cope with "Numeric Annotation Glyphs", etc.
I ran into a few little problems, as I may have taken the specification too literally.
In one place it says that files are parsed as a sequence of tokens, and that there are various types of token, e.g.
[ left bracket, introducing a tag pair, is a token
( left parenthesis, introducing a recursive annotation, is a token
symbols: an alphanumeric sequence of characters can be a token
* asterisk, used as one of the game termination markers, is also defined as a token.
The characters that can occur in a token of type "Symbol" are strictly defined. I think the first character has to be alphanumeric, and then
symbol continuation characters can include underscores, hyphens and some other marks like maybe #, = and +.
I then ran into a few hiccups with my changed software, e.g. with 1/2-1/2.
My software wanted to treat it as a symbol 1, a slash, a symbol 2-1, a slash, and a symbol 2.
That was because I had implemented the PGN specification's definition of "Symbol" perhaps too literally.
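To make the problem concrete, here is a minimal sketch of a tokenizer that follows the "Symbol" definition literally. The continuation character set and the names SYMBOL, TOKEN and tokenize are my own assumptions for illustration, not text from the specification.

```python
import re

# "Symbol" read literally: a letter or digit, then any number of
# continuation characters. The exact continuation set below is an
# assumption; check it against the specification.
SYMBOL = r"[A-Za-z0-9][A-Za-z0-9_+#=:\-]*"

# Hypothetical minimal tokenizer: a symbol, or any single non-space character.
TOKEN = re.compile(SYMBOL + r"|\S")

def tokenize(text):
    """Split text into symbols and single-character tokens."""
    return TOKEN.findall(text)

print(tokenize("1-0"))      # ['1-0']
print(tokenize("1/2-1/2"))  # ['1', '/', '2-1', '/', '2']
```

Because the slash is not a continuation character, the draw marker falls apart into five tokens, while 1-0 and 0-1 survive as single symbols.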
Later in the specification, it says that a game termination marker is one of the four symbols: 1-0, 0-1, 1/2-1/2, or *.
However, that contradicts the earlier text, which says that 1-0 and 0-1 are tokens of type "Symbol" but that * is a separate type of token (asterisk).
Is there a newer version of the PGN specification that clears things up?
Or do people just implement a work-around, e.g.
1. accepting slashes in a Symbol if it is 1/2-1/2, or
2. handling the symbol sequence 1, 2-1, and 2 specially when separated by slashes to form 1/2-1/2, or
3. ignoring the PGN specification and parsing the string "1/2-1/2" outside of software that conforms to the specification?
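A variation on work-around 1 can be sketched as below: the tokenizer tries the four game termination markers before it tries the general Symbol rule. This is only one possible approach, and the names CONT, RESULT and tokenize_with_results, as well as the continuation character set, are assumptions of mine.

```python
import re

# Assumed symbol continuation characters (check against the specification).
CONT = r"A-Za-z0-9_+#=:\-"
SYMBOL = "[A-Za-z0-9][" + CONT + "]*"

# Termination markers are tried first. The lookahead stops a marker from
# matching when it is only the start of a longer symbol.
RESULT = r"(?:1/2-1/2|1-0|0-1)(?![" + CONT + r"])|\*"

TOKEN = re.compile(
    "(?P<result>" + RESULT + ")"
    "|(?P<symbol>" + SYMBOL + ")"
    r"|(?P<other>\S)"
)

def tokenize_with_results(text):
    """Return (kind, text) pairs; kind is 'result', 'symbol' or 'other'."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]

print(tokenize_with_results("Nf3 1/2-1/2"))
# [('symbol', 'Nf3'), ('result', '1/2-1/2')]
```

With this ordering all four markers come out as the same kind of token, which also sidesteps the question of whether * is a symbol or a token type of its own.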
I don't think it matters much. I think it is not difficult to parse a correctly formatted PGN file [ha ha: the fact I am writing this proves that I am having some problems!] and if there are errors in a PGN file there is no great need for error recovery. If there had to be a standard form of error recovery, consistent across applications, then maybe it would matter.
E.g. if someone mistakenly writes 1Nf3 in a PGN file, there don't seem to be any rules about whether it should be treated as
a missing dot after a move number (error reported, but accepted if the move number is right), or
an invalid move number (possibly reported and ignored), or
an invalid move (possibly with the entire game skipped, or even the file skipped or the application aborted).
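As an illustration of the first two policies only, an application could check whether a symbol is a move number fused to a move. This is a hypothetical sketch, not anything the specification prescribes; FUSED, split_fused and the status strings are made-up names.

```python
import re

# Hypothetical pattern: digits fused directly to something that starts
# like a move (piece letter, file letter, or O for castling).
FUSED = re.compile(r"(\d+)([KQRBNa-hO].*)")

def split_fused(symbol, expected_number):
    """Classify a symbol such as '1Nf3' and return (status, move text)."""
    m = FUSED.fullmatch(symbol)
    if m is None:
        return ("not_fused", symbol)
    number, move = int(m.group(1)), m.group(2)
    if number == expected_number:
        return ("missing_dot", move)      # report, but accept the move
    return ("bad_move_number", move)      # report; the caller decides what to do

print(split_fused("1Nf3", 1))  # ('missing_dot', 'Nf3')
print(split_fused("7Nf3", 1))  # ('bad_move_number', 'Nf3')
```

Whether to accept, skip the game, or abort after either status is exactly the part that no rule seems to cover.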
Another observation about PGN is that, for pawn promotion, it requires a notation like e8=Q (with an equals sign) instead of Standard Algebraic Notation (as defined by FIDE).
http://en.wikipedia.org/wiki/Algebraic_ ... _promotion
When a pawn moves to the last rank and promotes, the piece promoted to is indicated at the end of the move notation, for example: e8Q (promoting to queen). Sometimes an equals sign (=) or parentheses are used: e8=Q or e8(Q), but neither format is a FIDE standard. (An equals sign is also sometimes used to indicate the offer of a draw when written on the scoresheet next to a move, but this is not part of algebraic notation.) In Portable Game Notation (PGN), pawn promotion is always indicated using the equals sign format (e8=Q).
In older books, pawn promotions can be found using a forward slash: e8/Q.
http://www.fide.com/FIDE/handbook/LawsOfChess.pdf
C.12 In the case of the promotion of a pawn, the actual pawn move is indicated, followed immediately by the first letter of the new piece. Examples: d8Q, f8N, b1B, g1R.
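For software that has to accept the other promotion styles, a small normalising step could rewrite them into the PGN form. The following is a sketch under my own assumptions: PROMO and to_pgn_promotion are made-up names, and the function expects a single move string (for example from a scoresheet), taken before any PGN tokenizing, since parentheses and slashes would otherwise be split off as separate tokens.

```python
import re

# Pawn move to the first or last rank, optionally a capture (dxe8), then
# the promotion piece written as Q, =Q, /Q or (Q), then optional + or #.
PROMO = re.compile(
    r"((?:[a-h]x)?[a-h][18])"
    r"(?:[=/]?([QRBN])|\(([QRBN])\))"
    r"([+#]?)"
)

def to_pgn_promotion(move):
    """Rewrite e8Q, e8/Q or e8(Q) as e8=Q; leave other moves unchanged."""
    m = PROMO.fullmatch(move)
    if m is None:
        return move
    piece = m.group(2) or m.group(3)
    return m.group(1) + "=" + piece + m.group(4)

for text in ["e8Q", "e8(Q)", "e8/Q", "dxe8Q+", "Nf3"]:
    print(text, "->", to_pgn_promotion(text))
```

Going the other way, from e8=Q to the FIDE form e8Q, only requires dropping the equals sign.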