In an earlier thread about inserting brackets around "comments" in a chess pgn-like string, I got excellent help finishing a regex that matches move lists and comments separately.
Here is the current regex:
((?:\s?[\(\)]?\s?[\(\)]?\s?[0-9]{1,3}\.{1,3}\s[NBRQK]?[a-h1-8]?x?[a-hO][1-8-][O-]{0,3}[!?+#=]{0,2}[NBRQ]?[!?+#]{0,2}(?:\s[NBRQK]?[a-h1-8]?x?[a-hO][1-8-][O-]{0,3}[!?+#=]{0,2}[NBRQ]?[!?+#]{0,2})?\s?[()]?\s?[()]?\s?)+)|((?:(?!\s?[\(\)]?\s?[\(\)]?\s?[0-9]{1,3}\.{1,3}\s[NBRQK]?[a-h1-8]?x?[a-hO][1-8-][O-]{0,3}[!?+#=]{0,2}[NBRQ]?[!?+#]{0,2}).)+)
The three capture groups are:
- "e4 e5 2. f4 exf4 3.Nf3" etc -- i.e. lists of moves
- "Blah blah blah" -- i.e. "comments"
- comment ") (" comment -- i.e. close and begin parens, when a chess variation with a comment at the end "completes", and another chess variation with a comment at the beginning "starts"
In action here: http://regex101.com/r/dQ9lY5
Everything works correctly when "Your regular expression in" is set to PCRE (PHP): all three groups match as expected. When I switch it to JavaScript, however, everything is matched as capture group 1. Is there something in my regex that the JavaScript regex engine doesn't support? I have tried to research this and have already spent hours on it, but there is so much information on the topic that I haven't been able to solve it.
I know one solution is to use the regex as-is, and pass it to PHP through AJAX, etc, but I don't know how to do that yet (it's on my list to learn).
Question 1: What is it in this regex that doesn't work in the JavaScript regex engine?
Also, here is my JavaScript CleanPgnText function. I am most interested in the while loop, but if anything else seems wrong, I would appreciate any help.
function CleanPgnText(pgn) {
    var pgnTextEdited = '';
    var str;
    var pgnInputTextArea = document.getElementById("pgnTextArea");
    var pgnOutputArea = document.getElementById("pgnOutputText");
    str = pgnInputTextArea.value;
    str = str.replace(/\[/g,"("); //sometimes he uses [ incorrectly for variations
    str = str.replace(/\]/g,")");
    str = str.replace(/[\n¬]*/g,""); // remove newlines and that weird character that MS Word sticks in
    str = str.replace(/\s{2,}/g," "); // turn more than one space into one space
    while ( str =~ /((?:\s?[\(\)]?\s?[\(\)]?\s?[0-9]{1,3}\.{1,3}\s[NBRQK]?[a-h1-8]?x?[a-hO][1-8-][O-]{0,3}[!?+#=]{0,2}[NBRQ]?[!?+#]{0,2}(?:\s[NBRQK]?[a-h1-8]?x?[a-hO][1-8-][O-]{0,3}[!?+#=]{0,2}[NBRQ]?[!?+#]{0,2})?\s?[()]?\s?[()]?\s?)+)|((?:(?!\s?[\(\)]?\s?[\(\)]?\s?[0-9]{1,3}\.{1,3}\s[NBRQK]?[a-h1-8]?x?[a-hO][1-8-][O-]{0,3}[!?+#=]{0,2}[NBRQ]?[!?+#]{0,2})[^\)\(])+)|((?:\)\s\())/g ) {
        if ($1.length > 0) { //
            pgnTextEdited += $1;
        }
        else if ($2.length > 0) {
            pgnTextEdited += '{' + $2 + '}';
        }
        else if ($3.length > 0) {
            pgnTextEdited += $3;
        }
    }
    pgnOutputArea.innerHTML = pgnTextEdited;
}
Question 2: Regarding the =~ in the while statement
while ( str =~
I got the =~ from helpful code in my original thread, but it was written in Perl. I don't quite understand how the =~ operator works. Can I use this same operator in JavaScript, or should I be using something else?
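For reference while reading Question 2: JavaScript has no =~ operator, and the usual way to walk through successive matches is to call exec on a regex that carries the g flag. The sketch below is only an illustration of that loop shape. The function name tagTokens and the tiny two-group pattern are hypothetical stand-ins, not the chess regex above.

```javascript
// Hypothetical, simplified pattern: group 1 = digits, group 2 = lowercase letters.
// It stands in for the much longer chess regex purely to show the loop shape.
function tagTokens(input) {
    var re = /(\d+)|([a-z]+)/g;   // the g flag makes exec() resume after the previous match
    var out = '';
    var m;
    // exec() returns an array of captures for each match, or null when none are left
    while ((m = re.exec(input)) !== null) {
        if (m[1] !== undefined) {
            out += m[1];                 // group 1 took part in this match
        } else if (m[2] !== undefined) {
            out += '{' + m[2] + '}';     // group 2 took part in this match
        }
    }
    return out;
}

console.log(tagTokens("12ab 3c")); // prints "12{ab}3{c}"
```

Note that text the pattern does not match (the space in the example) is simply skipped, so a real pattern would need to cover everything that should reach the output.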
Question 3: Can I use .length the way I am, when I say
if ($1.length > 0)
to see if the first capture group had a match?
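As background for Question 3: $1 is not a predefined variable in JavaScript, and in the array returned by exec a capture group that did not take part in the match is undefined, so reading .length on it directly would throw a TypeError. A minimal sketch of a safer check follows. The helper name groupMatched is hypothetical and the pattern is a toy example.

```javascript
// Hypothetical helper: true only if group i took part in the match
// and captured at least one character.
function groupMatched(m, i) {
    return m !== null && m[i] !== undefined && m[i].length > 0;
}

var m = /(a)|(b)/.exec('b');
// Here m[1] is undefined (the first alternative did not match) and m[2] is 'b'.
console.log(groupMatched(m, 1)); // prints false
console.log(groupMatched(m, 2)); // prints true
```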
Thank you in advance for any help. (If the regex101 link doesn't work for you, you can get a sample pgn to test on from the original thread).