Is there a tool which reliably finds part-doubles in a pgn?

PostPosted: 11 Jun 2005, 09:57
by Guenther Simon
I have tried CB8 and SCID so far, but the first seems buggy in this
respect, while the latter lacks an option to specify a minimum number
of moves (and does not work as it should either) for what I will
define as a part-double.

For clarification of my needs:
I need a tool which finds all games that have identical moves for
at least a certain number of moves, which I want to specify myself
(normally something between 15 and 30).

Maybe Yace does this already? Are there any other tools out there
which are capable of doing this?

Guenther

Re: Is there a tool which reliably finds part-doubles in a p

PostPosted: 11 Jun 2005, 15:21
by Norm Pollock
You could try fooling around with pgn-extract.

It has a "-b" option to set boundaries for the number of moves.
It has a "-d" option to form a duplicate pgn file.
It has a "-u" option to form a unique pgn file.

http://www.cs.kent.ac.uk/people/staff/djb/extract.html

-Norm

Re: Is there a tool which reliably finds part-doubles in a p

PostPosted: 11 Jun 2005, 16:07
by Guenther Simon
This is exactly what I had tried before you posted it ;)

Sadly pgn-extract can't do it either :(
The -b argument only sets boundaries for the game length, not for the
length of the move sequences to be compared as doubles :(

It seems no tool exists which can do what I want,
and moreover I now believe the task is not trivial at all.
The tool must first sort all games by ECO or something similar
and then compare each game, as a move sequence of the given length,
to the next game.
Without sorting it would need to compare each game against all
other games, and that would slow it down a lot.
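
The sort-then-compare idea can be sketched roughly as follows. This is
only a minimal illustration, not how any of the tools mentioned here
work. It assumes the games have already been read from the PGN into
plain lists of SAN moves, it sorts by the move prefix itself rather
than by ECO, and it counts the length in half moves. The function name
find_part_doubles and the sample data are made up for the example.

```python
from typing import List, Set


def find_part_doubles(games: List[List[str]], min_plies: int) -> Set[int]:
    """Return indices of games whose first `min_plies` half moves
    are identical to those of at least one other game."""
    # Games shorter than the requested length cannot qualify.
    candidates = [i for i, g in enumerate(games) if len(g) >= min_plies]
    # After sorting by prefix, games with identical prefixes are neighbours.
    candidates.sort(key=lambda i: games[i][:min_plies])
    doubles: Set[int] = set()
    # One pass over adjacent pairs replaces the all-against-all comparison.
    for prev, cur in zip(candidates, candidates[1:]):
        if games[prev][:min_plies] == games[cur][:min_plies]:
            doubles.update((prev, cur))
    return doubles


# Hypothetical sample data: each game is a list of half moves.
games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"],
    ["d4", "d5", "c4", "e6"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4", "Bc5"],
    ["e4", "c5", "Nf3", "d6"],
]
print(sorted(find_part_doubles(games, 4)))  # games 0 and 2 share 4 half moves
```

Sorting costs O(n log n) comparisons, against O(n^2) for comparing
every game with every other one.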

I have found a manual solution so far, with the help of CB8,
using a kind of pattern recognition.
First I sorted all games by ECO, then I displayed all games
as move lists only and looked at the region around the move
number I wanted as the *double margin*, e.g. move 15.
I noticed that the human eye is very good at recognizing patterns
across the move lists, because identical sequences in several
consecutive lines create a pattern in the sea of characters
(identical spacing etc.).
This method is very exhausting for the eyes, but I needed
just 5 minutes to find 241 games whose opening sequence up to
move 15 occurred twice or more.

Guenther

Re: Is there a tool which reliably finds part-doubles in a p

PostPosted: 11 Jun 2005, 16:54
by Norm Pollock
pgn-extract has a "-cfile" option that might be helpful. It says "Use file as a list of check files for duplicates."

Finally found an existing tool! *Spikes book builder*

PostPosted: 23 Jun 2005, 19:59
by Guenther Simon
Spikes book builder is perfectly capable of doing what I was looking for!
I can import a PGN file into it and specify the game length
in half moves for what I consider a part-double.
Then I can set the minimum number of games in which the same position
must have occurred.
And voilà, it generates a file with all FENs which fulfill my conditions.
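
For illustration only, the two settings described above (length in
half moves, minimum number of games) could be modelled like this. It
is a rough sketch and not the book builder's actual implementation.
It groups games by their move sequence instead of by FEN, so unlike a
position-based tool it would miss transpositions. The names
shared_lines, plies and min_games are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_lines(
    games: List[List[str]], plies: int, min_games: int
) -> Dict[Tuple[str, ...], List[int]]:
    """Map each opening line of `plies` half moves to the indices of the
    games that played it, keeping only lines seen in >= min_games games."""
    groups: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
    for idx, moves in enumerate(games):
        if len(moves) >= plies:  # skip games that end before the cut-off
            groups[tuple(moves[:plies])].append(idx)
    return {line: ids for line, ids in groups.items() if len(ids) >= min_games}


# Hypothetical sample data: each game is a list of half moves.
sample = [
    ["e4", "e5", "Nf3", "Nc6"],
    ["e4", "e5", "Nf3", "Nf6"],
    ["e4", "e5", "Bc4"],
    ["d4", "d5"],
]
for line, ids in shared_lines(sample, 2, 3).items():
    print(" ".join(line), "-> games", ids)
```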

My thanks to Volker and Ralf for this nice tool.

Guenther

Spikes book builder

PostPosted: 07 Jul 2005, 21:30
by Roy Harper
Where is the link for this?

I did a Google search for it but came up with nothing.

Re: Is there a tool which reliably finds part-doubles in a p

PostPosted: 07 Jul 2005, 21:37
by Ralf Schäfer
Hi Roy,

you can download the Spike Bookbuilder from

http://spike.lazypics.de/dl_tools_en.html

Best
Ralf

Re: Is there a tool which reliably finds part-doubles in a p

PostPosted: 07 Jul 2005, 21:38
by Roy Harper
Thank you very much!!