Re: Building your own Crafty

Post by Tord Romstad » 13 Apr 2004, 13:51

Posted by: Tord Romstad at 13 April 2004 14:51:00
In reply to: Re: Building your own Crafty, posted by: Bryan Hofmann at 13 April 2004 13:12:49
> Just as an FYI, you can greatly reduce your batch file and achieve a faster Crafty by using the following:
> gcc -c -DNT_i386 -O3 crafty.c
> gcc -c -DNT_i386 -O3 egtb.cpp
> gcc -o crafty.exe *.o
> By doing this you compile crafty.c as one large object, which allows for better optimization.
Why is the optimization better with a single input file? I just tried with
my own engine, and didn't notice any improvement.
Also, are you sure -O3 is best? For Gothmog -O is better than -O2 and -O3.
I would expect the same to be true for Crafty, which is very much bigger.
Tord
Tord Romstad
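
Whether -O, -O2, or -O3 is fastest for a given engine is easiest to settle by timing the same code built at each level. Below is a minimal sketch in C of such a timing harness. The names busy_work and time_busy_work are hypothetical, and the loop is only a stand-in for an engine's search; it is not code from Gothmog or Crafty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for an engine's hot loop: a xorshift-style
   mixing of a 64-bit state, summing the low byte of each state. */
uint64_t busy_work(uint64_t iterations)
{
    uint64_t state = 0x9E3779B97F4A7C15ULL;
    uint64_t sum = 0;
    uint64_t i;

    for (i = 0; i < iterations; i++) {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        sum += state & 0xFF;
    }
    return sum;
}

/* Returns the CPU seconds used by busy_work(iterations). The checksum is
   stored in *out so the compiler cannot discard the loop as dead code. */
double time_busy_work(uint64_t iterations, uint64_t *out)
{
    clock_t start, end;

    assert(out != NULL);
    start = clock();
    *out = busy_work(iterations);
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Compile the file once per optimization level, call time_busy_work with the same iteration count in each build, and compare the times over several runs. The checksum should be identical across builds; only the time should change.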
 

Re: Building your own Crafty

Post by Bryan Hofmann » 13 Apr 2004, 17:14

Posted by: Bryan Hofmann at 13 April 2004 18:14:02
In reply to: Re: Building your own Crafty, posted by: Tord Romstad at 13 April 2004 14:51:00
Compiling Crafty into one large object file gives the compiler a better chance to optimize the program. When I ran some compile tests back in version 19.10, I found this to be the case. I will say, however, that I used my own makefile and other option flags, as well as profile-guided optimization (a two-stage compile). I found that using one large object instead of several small ones gave about a 2-4% increase in NPS in the Crafty bench.
Below is the makefile I used to get the fastest 19.10 compile for an AMD XP 3000+ system. As Dann likes to say, YMMV.

# This makefile is to be used for compilation of Crafty with
# gcc under Windows.
# Tested with Dev-C++ 4.9.8.5, which includes gcc/g++ 3.2

all: profile

# Each quoted flag list is kept on one line so the shell passes it on intact.

# Stage 1: instrumented build (-fprofile-arcs) that records branch counts.
NT_i386_1:
	$(MAKE) target=NT_i386 \
		CC=gcc CXX=g++ CYY=g++ \
		CFLAGS='$(CFLAGS) -pipe -D_REENTRANT -fomit-frame-pointer -O3 -msse -fforce-mem -fno-gcse -m3dnow -fprofile-arcs -funroll-loops -march=athlon-xp' \
		CXFLAGS='$(CFLAGS) -pipe -D_REENTRANT -fomit-frame-pointer -O3 -msse -fforce-mem -fno-gcse -m3dnow -fprofile-arcs -funroll-loops -march=athlon-xp' \
		LDFLAGS=$(LDFLAGS) \
		opt='$(opt) -DUSE_ASSEMBLY -DINLINE_ASM -DFAST -DFUTILITY' \
		asm='X86.o' \
		crafty-make

# Stage 2: final build that uses the recorded counts (-fbranch-probabilities).
NT_i386_2:
	$(MAKE) target=NT_i386 \
		CC=gcc CXX=g++ CYY=g++ \
		CFLAGS='$(CFLAGS) -pipe -D_REENTRANT -fomit-frame-pointer -O3 -msse -fforce-mem -fno-gcse -m3dnow -fbranch-probabilities -funroll-loops -march=athlon-xp' \
		CXFLAGS='$(CFLAGS) -pipe -D_REENTRANT -fomit-frame-pointer -O3 -msse -fforce-mem -fno-gcse -m3dnow -fbranch-probabilities -funroll-loops -march=athlon-xp' \
		LDFLAGS=$(LDFLAGS) \
		opt='$(opt) -DUSE_ASSEMBLY -DINLINE_ASM -DFAST -DFUTILITY' \
		asm='X86.o' \
		crafty-make

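# Two-stage build: compile instrumented, run Crafty on the "prof" input to
# collect profile data, remove the stage 1 binaries, then rebuild.
profile:
	$(MAKE) NT_i386_1
	@crafty < prof
	@rm crafty.exe
	@rm *.o
	$(MAKE) NT_i386_2
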
opts = $(opt) -D$(target)
#objects = searchr.o search.o singular.o thread.o searchmp.o repeat.o next.o \
# nexte.o nextr.o history.o quiesce.o evaluate.o movgen.o make.o unmake.o \
# hash.o attacks.o swap.o boolean.o utility.o valid.o probe.o book.o \
# data.o drawn.o edit.o epd.o epdglue.o init.o input.o interupt.o \
# iterate.o main.o option.o output.o phase.o ponder.o preeval.o resign.o \
# root.o learn.o setboard.o test.o time.o validate.o annotate.o analyze.o \
# evtest.o bench.o egtb.o dgt.o $(asm)
objects = crafty.o egtb.o $(asm)

includes = data.h chess.h
epdincludes = epd.h epddefs.h epdglue.h
eval_users = data.o evaluate.o preeval.o

crafty-make:
	@$(MAKE) \
		opt='$(opt)' asm='$(asm)' \
		crafty

crafty: $(objects)
	$(CYY) $(LDFLAGS) -o crafty $(objects) -lm $(LIBS)

dgt: dgtdrv.o
	@cc -O -o dgt dgtdrv.c

egtb.o: egtb.cpp
	$(CXX) -c $(CXFLAGS) $(opts) egtb.cpp

clean:
	del *.o;del crafty.exe

$(objects): $(includes)
$(eval_users): evaluate.h
epd.o epdglue.o option.o init.o : $(epdincludes)

.c.o:
	$(CC) $(CFLAGS) $(opts) -c $*.c

.s.o:
	$(AS) $(AFLAGS) -o $*.o $*.s

Bryan Hofmann
 

Re: Building your own Crafty

Post by Paul Hunter » 13 Apr 2004, 17:30

Posted by: Paul Hunter at 13 April 2004 18:30:18
In reply to: Re: Building your own Crafty, posted by: Tord Romstad at 13 April 2004 14:51:00

The optimization that is made possible is 'inlining'. Since everything is in one source file, the compiler is able to see the bodies of functions that would otherwise reside in other files. Instead of making a function call, it can expand the body of the function in the caller. The trade-off is a bigger binary.
Paul Hunter
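
To make the inlining point concrete, here is a minimal sketch in C. The function names count_bits and mobility_score are hypothetical and are not taken from Crafty. The assumption is a compiler that optimizes one translation unit at a time: if count_bits lived in a separate source file, the caller would see only a prototype and would have to emit a real call. With both definitions in the same file, as in a single-object build, the compiler is free to expand the helper at the call site.

```c
#include <stdint.h>

/* Hypothetical helper. In a multi-file build this would sit in its own
   source file and be visible to callers only through a prototype. */
static int count_bits(uint64_t board)
{
    int count = 0;

    while (board) {
        board &= board - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical caller, as it might appear in an evaluation routine.
   Because count_bits is defined above in the same translation unit, the
   compiler may replace this call with the loop body itself. */
int mobility_score(uint64_t attacks, uint64_t own_pieces)
{
    return 4 * count_bits(attacks & ~own_pieces);
}
```

Whether the compiler actually inlines the call depends on the optimization level and its size heuristics, which may be one reason the gain differs from engine to engine.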
 

Re: Building your own Crafty

Post by Bryan Hofmann » 13 Apr 2004, 17:59

Posted by: Bryan Hofmann at 13 April 2004 18:59:53
In reply to: Re: Building your own Crafty, posted by: Paul Hunter at 13 April 2004 18:30:18
Well, I just had to prove I was right, so with Crafty 19.12 and gcc version 3.2.3 (mingw special 20030504-1) I got the following results (about a 4% increase in NPS):
One object file:
    Crafty v19.12
    White(1): bench
    Running benchmark. . .
    ......
    Total nodes: 94189491
    Raw nodes per second: 914461
    Total elapsed time: 103
    SMP time-to-ply measurement: 6.213592

Several small objects:
    Crafty v19.12
    White(1): bench
    Running benchmark. . .
    ......
    Total nodes: 94189491
    Raw nodes per second: 880275
    Total elapsed time: 107
    SMP time-to-ply measurement: 5.981308
Bryan Hofmann
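
The arithmetic behind the "4%" figure can be checked from the two bench runs above. The sketch below is in C, and the helper names nodes_per_second and percent_gain are hypothetical. It assumes the reported NPS is total nodes divided by whole elapsed seconds with the remainder dropped, which is consistent with both sets of numbers but is not confirmed from Crafty's source.

```c
#include <assert.h>

/* Whole nodes per second, with the fractional part dropped. */
long long nodes_per_second(long long nodes, long long seconds)
{
    assert(seconds > 0);
    return nodes / seconds;
}

/* Relative gain of `fast` over `slow`, in percent. */
double percent_gain(long long fast, long long slow)
{
    assert(slow > 0);
    return 100.0 * (double)(fast - slow) / (double)slow;
}
```

With the figures above, nodes_per_second(94189491, 103) gives 914461 and nodes_per_second(94189491, 107) gives 880275, matching the reported values, and percent_gain(914461, 880275) comes to roughly 3.9%. Since both runs search the same number of nodes, the NPS gain is the same thing as the drop in elapsed time from 107 to 103 seconds.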
 

