PGN for dummies

Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: PGN for dummies

Post by Sopel »

hgm wrote: Sat Nov 13, 2021 10:41 pm Again, that has nothing to do with parsing. It is a specific processing task on the games. And again, it doesn't make the slightest difference whether the moves of the game are written as SAN or in long algebraic notation when you want to perform that task.

In fact I think this is 'out of bounds' for a format specification. I will decide myself whatever I want to encode in this format, thank you very much! As a user I should be able to decide whether I want to reject games with illegal moves (or on the contrary want to select those...). As a developer I would make sure the user has this choice. Who says I am going to feed it games with invalid moves? The games might already have been checked elsewhere. Like during creation, if they were engine-engine games. Then it would be a pure waste of time to check them again.

Even when there could be games with illegal moves in the input, I would prefer the positions in those games to go into my database, rather than be rejected. Most, if not all of these positions would be perfectly OK (just not reachable in the specified way). Chances a position from those games would ever match my position search are very slim (as they are for any individual game), and if I have reason to reject positions from games with illegal moves, I could judge that whenever I get a match, and trigger an action I deem appropriate at the time (depending on the purpose for which I retrieve it). You really think I would make my software less useful because a specification dictates to me how I should handle violations???
I hope you will never write a programming language parser that I'll have to use. Your argument is basically "I don't like the standard so I'll do whatever I think is right and I don't care what others think".
dangi12012 wrote:No one wants to touch anything you have posted. That proves you now have negative reputations since everyone knows already you are a forum troll.

Maybe you copied your stockfish commits from someone else too?
I will look into that.
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: PGN for dummies

Post by hgm »

I think we just disagree on what the standard is. There is a PGN document, for sure, and it does describe the rules a sequence of characters has to adhere to in order to count as PGN.

But unfortunately that document also makes other demands, which basically disqualify it as the pure specification of a game-notation format. Such as how characters would have to be encoded (which is the prerogative of the OS), or how a parser would have to handle input that violates the standard (which makes it the description of a particular piece of software). It might as well have required that PGN archives can only be interchanged on double-density 3.5" floppy discs...

I do not recognize such recommendations (which are often silly and counter-productive in a contemporary environment) as part of the PGN standard.
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: PGN for dummies

Post by Sopel »

hgm wrote: Sun Nov 14, 2021 12:01 pm I think we just disagree on what the standard is. There is a PGN document, for sure, and it does describe the rules a sequence of characters has to adhere to in order to count as PGN.

But unfortunately that document also makes other demands, which basically disqualify it as the pure specification of a game-notation format. Such as how characters would have to be encoded (which is the prerogative of the OS), or how a parser would have to handle input that violates the standard (which makes it the description of a particular piece of software). It might as well have required that PGN archives can only be interchanged on double-density 3.5" floppy discs...

I do not recognize such recommendations (which are often silly and counter-productive in a contemporary environment) as part of the PGN standard.
So what PGN standard do you recognize, and why is there more than one?
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: PGN for dummies

Post by Fulvio »

hgm wrote: Sun Nov 14, 2021 12:01 pm Such as how characters would have to be encoded (which is the prerogative of the OS)
This is unbelievable nonsense. The definition of a portable thing is precisely that of not being dependent on the O.S. Imagine if to read a PGN it was necessary to know the characteristics of the computers on which it was created!
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: PGN for dummies

Post by hgm »

Fulvio wrote: Sun Nov 14, 2021 1:25 pm
hgm wrote: Sun Nov 14, 2021 12:01 pm Such as how characters would have to be encoded (which is the prerogative of the OS)
This is unbelievable nonsense. The definition of a portable thing is precisely that of not being dependent on the O.S. Imagine if to read a PGN it was necessary to know the characteristics of the computers on which it was created!
PGN is a text format, and computer systems should know how to import text files from competing OSes. Portability of text files is a much more general problem than interchanging chess games. A standard for encoding chess games should not interfere with solutions for such a general problem. That is outside its jurisdiction.

Unless you want PGN to be a binary format. But the consequence of that is that there no longer is a guarantee that text editors will be able to handle them. That would indeed be the case if you separated lines by just LF and opened the file on Windows in Notepad. And if you used WordPad, which does understand LF-only lines, you would 'corrupt' the file on saving, because WordPad would add the CR everywhere.
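
For what it is worth, a reader can sidestep the line-ending question by accepting every convention on input. A minimal Python sketch of that idea follows; the function name `split_pgn_lines` is hypothetical, and decoding as ASCII with replacement is only an assumption made to keep the example short:

```python
def split_pgn_lines(raw: bytes) -> list[str]:
    """Split raw PGN bytes into lines, accepting LF, CRLF, or lone CR."""
    # Assumption for this sketch: treat the input as ASCII and replace
    # anything else, so the example is only about line endings.
    text = raw.decode("ascii", errors="replace")
    # Normalise CRLF first, then any remaining lone CR, to plain LF.
    return text.replace("\r\n", "\n").replace("\r", "\n").split("\n")


unix_style = b'[Event "Test"]\n\n1. e4 e5 *\n'
windows_style = b'[Event "Test"]\r\n\r\n1. e4 e5 *\r\n'

# Both files yield the same logical lines.
assert split_pgn_lines(unix_style) == split_pgn_lines(windows_style)
```

This only covers the reading side. What a writer should emit is exactly the point under dispute here.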
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: PGN for dummies

Post by hgm »

Sopel wrote: Sun Nov 14, 2021 12:26 pm So what PGN standard do you recognize, and why is there more than one?
Not sure what you are asking. Why there is an import format and an export format? Probably because the import form offers some leeway, and it was considered desirable to define a preference for how to use that in practice.

It is obvious that there is no benefit at all in complying with the export standard if all existing software is able to read every file that complies with the import standard. So effectively the import standard is the only one that matters. When you do not comply with the import standard, you will run into problems no one else can be blamed for. If you comply with the export standard when creating PGN, you will destroy compatibility with other text-processing software, and have no one to blame for it but yourself. If you comply with the import standard when creating, and fully implement it on reading, no problems can occur unless someone else is at fault.

I know which situation I would prefer...
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: PGN for dummies

Post by Fulvio »

hgm wrote: Sun Nov 14, 2021 1:49 pm Unless you want PGN to be a binary format.
A text file is a text file. Most Linux locales are now UTF-8 by default. If you write a text file in Linux, it is encoded in UTF-8, and still remains a text file.

What you are saying is that if I create a PGN in Linux with Winboard and there is a comment with an accented character, it is then encoded in UTF-8. And then if I open that PGN on my Windows laptop, the comment is not displayed correctly? And this would be your idea of improvement?

The PGNs created by SCID, Chessbase, lichess, chess.com ... are now all UTF-8 encoded and are displayed correctly regardless of the user's computer.
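
To make the scenario concrete: what a reader sees depends on which encoding it assumes when turning the bytes back into characters. A small Python sketch, using a made-up comment string as the example:

```python
# A made-up PGN comment containing one accented character.
comment = "{Très joli coup}"

# What ends up on disk when the writer uses UTF-8:
# 'è' becomes the two bytes C3 A8.
utf8_bytes = comment.encode("utf-8")

# A reader that assumes UTF-8 recovers the original text.
assert utf8_bytes.decode("utf-8") == comment

# A reader that assumes Latin-1 shows each of the two bytes as its own character.
misread = utf8_bytes.decode("latin-1")
print(misread)  # {TrÃ¨s joli coup}
```

So the file itself is identical in both cases. The difference lies entirely in the assumption made by the program that opens it.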
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: PGN for dummies

Post by hgm »

Whether it displays correctly on Windows depends on whether it contains the appropriate BOM. Without that it would not display correctly, for how is the program you opened it in supposed to know that it is UTF-8? The default code page for my locale is Latin-1, which for codes 128-255 displays something completely different from the UTF-8 meaning.

I don't think Linux text files ever contain a BOM. Why should they? UTF-8 is the only text format there. The PGN standard also does not mention a BOM (again, why should it, since it forbids use of non-ASCII), so including one would be a PGN violation to start with.

So indeed, a Linux text file with non-ASCII would not display correctly when you transfer it to a Windows laptop as a binary. Either the UTF-8 codes would have to be recoded as Latin-1 (if possible), or a BOM would have to be prefixed to flag the contents as UTF-8.
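
As an illustration of how a reading program could guess the encoding, here is a minimal Python sketch of one common heuristic: honour a UTF-8 BOM if present, otherwise try strict UTF-8, and only then fall back to a legacy code page. The function name `decode_pgn_bytes` is hypothetical, and Latin-1 as the fallback is an assumption for this example, not something the PGN document prescribes:

```python
import codecs


def decode_pgn_bytes(raw: bytes) -> str:
    """Guess the text encoding of raw PGN bytes (heuristic sketch)."""
    # 1. An explicit UTF-8 BOM (bytes EF BB BF) settles the question.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # 2. Without a BOM, try strict UTF-8. Text in a legacy 8-bit code page
    #    that uses bytes above 127 is rarely valid UTF-8 by accident.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # 3. Assumed fallback: Latin-1, which maps every byte to a character.
        return raw.decode("latin-1")


# The same comment stored three ways decodes to the same text.
assert decode_pgn_bytes(b"\xef\xbb\xbf{caf\xc3\xa9}") == "{café}"  # UTF-8 with BOM
assert decode_pgn_bytes(b"{caf\xc3\xa9}") == "{café}"              # UTF-8, no BOM
assert decode_pgn_bytes(b"{caf\xe9}") == "{café}"                  # Latin-1
```

Pure ASCII input decodes identically under all three branches, so a file that stays within ASCII is unaffected by the guess.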
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: PGN for dummies

Post by Fulvio »

hgm wrote: Sun Nov 14, 2021 4:43 pm So indeed, a Linux text file with non-ASCII would not display correctly when you transfer it to a Windows laptop as a binary. Either the UTF-8 codes would have to be recoded as Latin-1 (if possible), or a BOM would have to be prefixed to flag the contents as UTF-8.
This shows that you do not know the subject at all and explains all the nonsense statements.
Create a UTF-8 text file (without a BOM, which is used to identify endianness and is neither needed nor recommended for UTF-8) and then open it with Windows Notepad and you will see that it displays correctly.

Generally speaking, when you are convinced that you are the only one who is right, and the rest of the world is wrong, it is worth taking some time to think about it better.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: PGN for dummies

Post by dangi12012 »

Fulvio wrote: Sun Nov 14, 2021 5:21 pm
hgm wrote: Sun Nov 14, 2021 4:43 pm So indeed, a Linux text file with non-ASCII would not display correctly when you transfer it to a Windows laptop as a binary. Either the UTF-8 codes would have to be recoded as Latin-1 (if possible), or a BOM would have to be prefixed to flag the contents as UTF-8.
This shows that you do not know the subject at all and explains all the nonsense statements.
Create a UTF-8 text file (without a BOM, which is used to identify endianness and is neither needed nor recommended for UTF-8) and then open it with Windows Notepad and you will see that it displays correctly.

Generally speaking, when you are convinced that you are the only one who is right, and the rest of the world is wrong, it is worth taking some time to think about it better.
Personal insinuations are very unconstructive. This community should be positive, Fulvio...
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer