Error when running split_by_player.py #1
Comments
I have discovered that if I leave only 1 game in the pgn then I get this output instead:
@DavidAAbbott Did you manage to resolve this? I'm running into the same issue here.
I just figured it out. The first issue arises from double empty lines between games in your downloaded PGN file. The official Lichess archive used in this project only contains single empty lines between games. If you have downloaded your own games via the Lichess API, you would have to get rid of the double lines. To convert to single empty lines, simply run:
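One way to do this conversion, sketched in Python on the assumption that the PGN file fits in memory: collapse every run of two or more blank lines into a single blank line. The function name `collapse_blank_lines` and the script name in the usage comment are illustrative and not part of the project.

```python
import re
import sys


def collapse_blank_lines(text: str) -> str:
    """Collapse runs of two or more blank lines into a single blank line."""
    # Normalize line endings so the pattern below only has to handle "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline followed by two or more (possibly whitespace-only) empty lines
    # is a run of at least two blank lines; keep exactly one.
    return re.sub(r"\n(?:[ \t]*\n){2,}", "\n\n", text)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python collapse_blank_lines.py input.pgn output.pgn
    src, dst = sys.argv[1], sys.argv[2]
    with open(src, encoding="utf-8") as f:
        cleaned = collapse_blank_lines(f.read())
    with open(dst, "w", encoding="utf-8") as f:
        f.write(cleaned)
```

The single blank line between a game's headers and its movetext is left untouched, since only runs of two or more blank lines are rewritten.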
The second issue comes from what appears to be a bad path in 1-data_generation/9-pgn_to_training_data.sh. My fixed version:
#!/bin/bash
set -e
#args input_path output_dir player
player_file=${1}
p_dir=${2}
p_name=${3}
train_frac=90
val_frac=10
split_dir=$p_dir/split
mkdir -p ${p_dir}
mkdir -p ${split_dir}
echo "${p_name} to ${p_dir}"
python split_by_player.py $player_file $p_name $split_dir/games
for c in "white" "black"; do
    python pgn_fractional_split.py $split_dir/games_$c.pgn.bz2 $split_dir/train_$c.pgn.bz2 $split_dir/validate_$c.pgn.bz2 --ratios $train_frac $val_frac
    cd $p_dir
    mkdir -p pgns
    for s in "train" "validate"; do
        mkdir -p $s
        mkdir -p $s/$c
        #using tool from:
        #https://www.cs.kent.ac.uk/people/staff/djb/pgn-extract/
        bzcat split/${s}_${c}.pgn.bz2 | pgn-extract -7 -C -N -#1000
        cat *.pgn > pgns/${s}_${c}.pgn
        rm -v *.pgn
        #using tool from:
        #https://github.com/DanielUranga/trainingdata-tool
        screen -S "${p_name}-${c}-${s}" -dm bash -c "cd ${s}/${c}; trainingdata-tool -v ../../pgns/${s}_${c}.pgn"
    done
    cd -
done
After changing this file, simply run:
For anyone running into issues compiling trainingdata-tool with the error message:
/home/paul/dev/chess/trainingdata-tool/lc0/src/neural/writer.h:39:3: error: ‘uint32_t’ does not name a type
   39 |   uint32_t version;
      |   ^~~~~~~~
/home/paul/dev/chess/trainingdata-tool/lc0/src/neural/writer.h:31:1: note: ‘uint32_t’ is defined in header ‘<cstdint>’; did you forget to ‘#include <cstdint>’?
   30 | #include "utils/cppattributes.h"
  +++ |+#include <cstdint>
Simply add
No matter which .pgn file I try, I seem to get various errors after running "9-pgn_to_training_data.sh" which in turn runs "split_by_player.py".
Here is the error output when trying to use my own Lichess games:
2023-04-07 06:38:39 split_by_player.py finndave.pgn finndave output/split/games
2023-04-07 06:38:39 Starting split_by_player
2023-04-07 06:38:39 Error encounteredlayers from finndave.pgn
Traceback (most recent call last):
  File "split_by_player.py", line 48, in <module>
    main()
  File "/home/david/anaconda3/envs/transfer_chess/lib/python3.7/site-packages/backend-1.0.0-py3.7.egg/backend/utils.py", line 112, in wrapped_main
    val = mainFunc(*args, **kwds)
  File "split_by_player.py", line 25, in main
    for i, (d, l) in enumerate(games):
  File "/home/david/anaconda3/envs/transfer_chess/lib/python3.7/site-packages/backend-1.0.0-py3.7.egg/backend/pgn_parsering.py", line 20, in __iter__
    yield next(self)
  File "/home/david/anaconda3/envs/transfer_chess/lib/python3.7/site-packages/backend-1.0.0-py3.7.egg/backend/pgn_parsering.py", line 41, in __next__
    raise RuntimeError(l)
RuntimeError: