Use ICU4X to run parts of util.unicode.org #1004
Comments
Hmmm, maybe we could block that user, and throttle anyone with more than 1
query per 10 seconds?
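For illustration, the throttle suggested above (at most one query per 10 seconds per client) might look like the following minimal sketch. It is written in Python purely for brevity and is not the site's actual server code; the class name `PerClientThrottle`, the `allow` method, and the idea of keying on a client identifier string (for example an IP address or user agent) are all assumptions.

```python
import time
from typing import Callable, Dict


class PerClientThrottle:
    """Hypothetical helper: allow each client at most one query per interval."""

    def __init__(self, min_interval: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # min_interval is in seconds; 10.0 matches "1 query per 10 seconds".
        self.min_interval = min_interval
        # The clock is injectable so the logic can be tested without sleeping.
        self._clock = clock
        # Maps a client identifier to the time its last query was allowed.
        self._last_allowed: Dict[str, float] = {}

    def allow(self, client_id: str) -> bool:
        """Return True if this client's query should be served now."""
        now = self._clock()
        last = self._last_allowed.get(client_id)
        if last is not None and now - last < self.min_interval:
            # Too soon after the previous allowed query: reject.
            # Rejected queries do not push the window forward.
            return False
        self._last_allowed[client_id] = now
        return True
```

A request handler would call `allow(client_id)` before doing any work and return an error (for instance HTTP 429) when it returns `False`. Note that this sketch never evicts old entries, so a real deployment would need to bound the size of the map.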
On Sun, Jan 26, 2025, 06:30 Robin Leroy wrote:
Here’s the current traffic from one specific (slightly odd) user agent:
image.png <https://github.com/user-attachments/assets/b70a6309-d506-45f9-b321-959704609b7c>
@sffc is this ticket a dup? Or have we just talked about it without an issue? If node is Node.js, maybe someone is scraping.
This issue is to track migrating parts of util.unicode.org to ICU4X, which has been discussed in various forums, but I couldn't find a canonical issue. Investigation of other server-cost mitigation or rate-limiting techniques could be discussed elsewhere. That doesn't, however, invalidate the motivation for making popular parts of the site run client-side.
Currently util.unicode.org runs on top of ICU4J. It works fine, but it is sometimes slow or hits the rate limits that we've imposed to cap server costs, as it is doing as I write this message.
We should add ICU4X-backed tooling to parts of util.unicode.org via WebAssembly. This would reduce both latency (all calculations run client-side) and serving costs (the ICU4X wasm file can be cached and served cost-efficiently from a CDN).
The Unicode Tools are designed to run on the latest (even unreleased) version of the Unicode Standard, and so part of this project may involve improving some of the ICU4X tooling so that it can read raw UCD files. See unicode-org/icu4x#4602
CC @josh-hadley @eggrobin