add language PR #75

Closed
sega-gremlen opened this issue Feb 1, 2024 · 6 comments · Fixed by #83

Comments

@sega-gremlen
Contributor

I can add another language (tested on audio messages), but I don't know the best way to add a language selection feature.
I saw a PR adding Korean (#43), but it was rejected for some reason. Maybe the author can tell me how he wants to implement multi-language support?

@Xewdy444
Owner

Xewdy444 commented Feb 2, 2024

I would be willing to implement this. The author of that pull request closed it because it was incomplete, but his idea can still be used. I will need translations for the text used for locators in recaptcha_box.py.

This format would be fine:

TRANSLATIONS = {
    "im_not_a_robot": ["I'm not a robot", "로봇이 아닙니다"],
    "get_an_audio_challenge": ["Get an audio challenge", "음성 보안문자 듣기"],
    "get_a_new_challenge": ["Get a new challenge", "보안문자 새로 받기"],
}

Another possible concern is transcribing the audio challenge for other reCAPTCHA languages. The recognize_google() method of speech_recognition.Recognizer() takes a language code and defaults to en-US, so I'm not sure it would be able to transcribe other languages unless that code is changed.
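A minimal sketch of forwarding the language code explicitly, assuming the transcription step is wrapped in a helper. `transcribe_challenge` and `FakeRecognizer` are hypothetical names used only for illustration; the only real API relied on is the `language` keyword of `recognize_google()`, and the fake recognizer exists so the sketch runs without network access.

```python
from typing import Any


def transcribe_challenge(recognizer: Any, audio: Any, language: str = "en-US") -> str:
    """Transcribe audio, forwarding an explicit language code (hypothetical helper)."""
    # recognize_google() accepts a `language` keyword; passing it explicitly
    # keeps the en-US default visible and lets callers override it.
    return recognizer.recognize_google(audio, language=language)


class FakeRecognizer:
    """Stand-in for speech_recognition.Recognizer so the sketch runs offline."""

    def recognize_google(self, audio_data: Any, language: str = "en-US") -> str:
        # Echo the language code instead of calling the real service.
        return f"[{language}] transcript"


print(transcribe_challenge(FakeRecognizer(), b"audio"))
print(transcribe_challenge(FakeRecognizer(), b"audio", language="ko-KR"))
```

With a real `speech_recognition.Recognizer()` the call shape would be the same; whether a non-English code is needed depends on which language the audio challenge is actually spoken in.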

@sega-gremlen
Contributor Author

For Russian captchas, the audio check is also in English, so your script works fine on Russian captchas with translation.

Did I understand you correctly that you want to select a language by index? Something like this:

@property
def checkbox(self) -> Locator:
    """The reCAPTCHA checkbox locator."""
    return self.anchor_frame.get_by_role("checkbox", name=TRANSLATIONS['im_not_a_robot'][language_code])

Too bad Playwright can't pick the matching name from the list itself, or am I missing something?
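For comparison, explicit selection could also be keyed by language code rather than by list position. This is only a sketch of that alternative: `TRANSLATIONS_BY_LANGUAGE`, `translate`, and the `"en"`/`"ko"` keys are hypothetical and not part of the project; the strings are the ones quoted earlier in this thread.

```python
# Hypothetical layout: each locator text maps language codes to translations.
TRANSLATIONS_BY_LANGUAGE = {
    "im_not_a_robot": {"en": "I'm not a robot", "ko": "로봇이 아닙니다"},
    "get_an_audio_challenge": {"en": "Get an audio challenge", "ko": "음성 보안문자 듣기"},
    "get_a_new_challenge": {"en": "Get a new challenge", "ko": "보안문자 새로 받기"},
}


def translate(key: str, language_code: str, fallback: str = "en") -> str:
    """Return the text for `key` in `language_code`, falling back to English."""
    by_language = TRANSLATIONS_BY_LANGUAGE[key]
    return by_language.get(language_code, by_language[fallback])


print(translate("im_not_a_robot", "ko"))
print(translate("im_not_a_robot", "ru"))  # no Russian entry here, so English is used
```

A code-keyed mapping avoids depending on the order of the lists, at the cost of requiring the caller to know the page language up front.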

@Xewdy444
Owner

Xewdy444 commented Feb 2, 2024

> For Russian captchas, the audio check is also in English, so your script works fine on Russian captchas with translation.

Great, this will work fine then.


I'll probably end up doing something like this:

@property
def checkbox(self) -> Locator:
    """The reCAPTCHA checkbox locator."""
    return self.anchor_frame.get_by_role("checkbox", name=re.compile("|".join(TRANSLATIONS["im_not_a_robot"])))

This will generate a regex pattern that will match any of the translations for the text of that locator.
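A self-contained sketch of that pattern-building step, using only the standard library so it can be run without a browser. `translation_pattern` is a hypothetical helper name, and the `re.escape` call is an addition not present in the snippet above; it only matters if a translation ever contains regex metacharacters such as `.` or `?`.

```python
import re

TRANSLATIONS = {
    "im_not_a_robot": ["I'm not a robot", "로봇이 아닙니다"],
    "get_an_audio_challenge": ["Get an audio challenge", "음성 보안문자 듣기"],
    "get_a_new_challenge": ["Get a new challenge", "보안문자 새로 받기"],
}


def translation_pattern(key: str) -> re.Pattern:
    """Compile one pattern that matches any translation listed under `key`."""
    # Escape each translation so it is matched literally, then join with "|".
    alternatives = (re.escape(text) for text in TRANSLATIONS[key])
    return re.compile("|".join(alternatives))


pattern = translation_pattern("im_not_a_robot")
print(bool(pattern.search("I'm not a robot")))         # English text matches
print(bool(pattern.search("로봇이 아닙니다")))            # Korean text matches
print(bool(pattern.search("Get an audio challenge")))  # other locator text does not
```

The compiled pattern can then be passed as the `name` argument of `get_by_role`, as in the property above, so no language needs to be selected in advance.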

@Xewdy444
Owner

Do you have the Russian translations for the reCAPTCHA text?

@sega-gremlen
Contributor Author

Yeah, I have all the translations except:

"Multiple correct solutions required - please solve more."

I don't know what to call this case.

I know I'm a little slow on the PR. I'll try to get it done in the next 2 days, ok? I've had too little time.
I want to add my first PR to my piggy bank :)

@Xewdy444
Owner

Xewdy444 commented Mar 2, 2024

That text is shown when an incorrect answer is provided to the audio challenge.

@Xewdy444 Xewdy444 linked a pull request Mar 3, 2024 that will close this issue

2 participants