Stats against Weblate spam #1

Open
maboroshin opened this issue Aug 17, 2023 · 13 comments

Comments

@maboroshin
Owner

maboroshin commented Aug 17, 2023

The spam is not related to the Weblate team. References to data are needed, because fanciful, subjective claims are used extensively. We need a way to quantify this.

| (Aug 2023) | Transifex | Crowdin | Weblate |
|---|---|---|---|
| Place | USA | Estonia | Czech Republic |
| Projects | 40,000 (nominal, 2018) | 166K (nominal) | 1,164 (nominal) |
| Open projects | 16,000 (nominal) | 16,869 (actual) | 987 (actual) |
| Users | 0.5M (nominal) | 2.1M (nominal) | 0.067M (67,413, nominal) |
| Team | 10 people | 8 people | 3 people |
| Employees | 84 | 117 | 4 + contributors |
| Word limits | No limit? | 60,000 words | 160,000 strings in 2024 (was 10,000 in 2023) |
| Display of translation rate | Per project | Per project | 1 page |
| Integrity (Details) | 1 page | 1 page | Per string |
| Access control | Free | Free | Free |
| Participation | Per project | Per project | Anyone |
| Anti-Harassment Policy | O | O | X |
| Leakage of personal data | ? | ? | X |

Service down

Many times a month, Weblate will not allow you to edit anything: the whole of Weblate goes into maintenance mode for several hours.

Individual projects can also be blocked, for as long as several days. The block continues until the administrator takes action; in some cases it has lasted more than a month. Please search for "The translation is temporarily closed for contributions due to maintenance, please come back later".

I don't encounter this with other services, or at least it happens much less often there.

Note for numbers

  • Crowdin's Advanced Search could be browsed up to page 563, at 30 projects per page: 30*562+9=16869.
  • Weblate here means "Hosted Weblate", not all self-hosted server instances. Weblate's top page claims 2500 libre software projects. The explore page displays 1165 projects on Hosted Weblate, and Hosted Weblate itself shows 1164. However, only 987 projects are actually available for display. The explore page also lists over 350 projects on servers other than Hosted Weblate; Codeberg Weblate has over 100.
  • Launchpad, which includes Ubuntu and Linux Mint, has 2,111 projects (actual). Its integration with Git is insufficient.
  • WordPress.org (GlotPress) has 59,667 plugins for WordPress (which is a blog platform, not general software). Up to the 15,000th item on the list, about half of the entries had translations of the plugin itself; further down the list, more and more entries allow only the description to be translated. So even taking half of the first 30,000, about 15,000 plugins have translations (estimate).
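The Crowdin count in the first note can be reproduced with a short sketch. It assumes, as the formula 30*562+9 implies, that pages 1 to 562 were full and that the last page (563) held 9 projects; the function name is only illustrative.

```python
# Figures taken from the note above: 563 pages, 30 projects per page,
# and 9 projects on the final (partial) page.
PROJECTS_PER_PAGE = 30
LAST_PAGE = 563
PROJECTS_ON_LAST_PAGE = 9


def count_listed_projects(last_page, per_page, on_last_page):
    """Count items in a paginated listing whose final page is partial."""
    full_pages = last_page - 1  # every page before the last is full
    return full_pages * per_page + on_last_page


crowdin_open_projects = count_listed_projects(
    LAST_PAGE, PROJECTS_PER_PAGE, PROJECTS_ON_LAST_PAGE
)
print(crowdin_open_projects)
```

The result matches the 16,869 "actual" open projects given for Crowdin in the table.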

Description

WordPress and Launchpad have translation guidelines, so a large number of translations can be done consistently. Depending on the project, Transifex and Crowdin require approval to participate, and according to their terms of use both are open only to participants who are at least 18 years old. Many Weblate projects are open to everyone, including children.

  • Comparison of Translation Platform UI: Weblate can display all translation rates on one page, which is excellent. However, its list of translated strings is very poor. Transifex and Crowdin can display all strings, translation history, comments, etc. on one page via Ajax. Weblate requires a separate window to view the comments associated with a single string, so viewing history and others' comments in Weblate means opening a huge number of windows.

Leakage of personal data

GDPR defines email addresses as personal data. Weblate has already output email addresses to GitHub, and it is difficult to erase all of them. This is worse when combined with the GDPR right to erasure. guardianproject/orbot#858 See also detail

A project was offered a migration from Transifex to Weblate, and the contributors' email addresses were leaked. blueman-project/blueman#1149

Revision

  • The Weblate string limit did indeed increase from 10,000 strings in 2023 to 160,000 in 2024. Sorry, I haven't checked the 2024 figures much since then.
@maboroshin
Owner Author

maboroshin commented Aug 17, 2023

Subjective claims were spammed repeatedly without data or citation: Transifex and Crowdin are bad, Weblate is the best. SponsorBlock has launched DeArrow, whose goal is to correct hype. Thus, people are sensitive to this.

Since last time, Transifex updated their terms, which I found to be more egregious than what was previously pretty far fetched. Also, Weblate got a lot better, and has none of those problems.
jonasoreland/runnerup#907

Clearly rejected. Changing translation platforms could result in loss of existing history and contributors.


and rather use a tool that allows spanning multiple languages and strings quickly and at a glance.
NightscoutFoundation/xDrip#3420

He then introduces Weblate. But the fact is that Weblate separates string listings and multiple languages into different views, while the original service (Crowdin) combines these in one window.


Transifex is broken ... I still discover new bugs, daily. rakekniven/Transifex-Issues#19 (Locked)

Does Weblate have no bugs? I have also filed many bug reports for Weblate.


but can't use Transifex since it takes ownership in my work doing so. It is also a bad closed source tool.
LibrePCB/LibrePCB#615

It is countered that the ownership issues he mentions also occur in Weblate.


Crowdin has a very clunky UI for getting anything done, and things get held up in their voting system. With fewer libre software translators on Crowdin, and good ones fewer still.
ajayyy/SponsorBlock#343

Rejected. Crowdin's history listing and voting system are easy to use. Details. In the Japanese translation of SponsorBlock, strange translations are eliminated with negative ratings. Crowdin seems to have a lot of users. I don't know what data he is basing this on.


Quality on Crowdin is poor, and it has few translators. microCOVID/microCOVID#670

with very few libre software projects left on Crowdin. ukanth/afwall#1184

@maboroshin
Owner Author

maboroshin commented Aug 17, 2023

There were a total of 60 Weblate spam issues, and nearly 90% of the projects have not migrated. He sees privacy as a problem, so I don't understand why he would allow email addresses to be published.

Comments made without creating an issue (not counted):

  • https://github.com/ferdium/ferdium-app/issues/185#issuecomment-1150312066
  • https://github.com/danth/transfer/issues/9#issuecomment-1132800170

@orangesunny

Hello @maboroshin, Benjamin from Weblate here.

I would like to offer essential information. The Weblate team do not have anything to do with this effort. I got a notification about your comment in blueman-project/blueman#1149 (comment) as I contributed to the discussion there. It guided me here, and I left a new comment there, too.

Suggesting a tool in an open-source project repository does not seem like spam. The community and the maintainers can reject the suggestion, close the issue, even lock it or mark it as SPAM if they feel it is.

Keeping a good attitude in discussions can be a tricky discipline sometimes. Everyone should be respectful, no one should be forced to do or use something; it is not about winning. If someone is offensive, you can tell them and/or report them.

@comradekingu is a translator, Weblate user, a fan of open source. They do not work for Weblate, and we never asked him to “advertise” Weblate this way; we do not advertise Weblate at all.

Everyone is free to use Weblate as it is libre software. We welcome new open source projects the same way we welcome new commercial customers: by word of mouth from a currently satisfied Weblate user.


Signing commits with e-mail addresses is a feature; it is how Git works. Open repositories are a feature of open-source software. On top of that:

If you want, @maboroshin, we can connect and chat about anything Weblate, this topic included. Our care (at) weblate dot org address seems like a good place to start for that. We also use Discussions.

@orangesunny

from the initial post:

... Weblate is open to everyone and there is no access control for free plans.

I am not going to fact-check all the issues in this comparison, but this is not correct. Access control is available in a free plan, but yes, you can’t limit the visibility of the project; it’s open-source after all. Any project owner/admin can prevent a user from contributing to their project, etc.

More here: https://docs.weblate.org/en/latest/admin/access.html#project-access-control

https://rocketreach.co/weblate-profile_b5ebc47af42e8635

This seems more like a collection of things randomly scraped from the internet than a reliable source. The information is incorrect, and we are not affiliated with it.

Weblate numbers in the sheet are not correct, you can’t count all projects using Weblate.
There are instances of Weblate unreachable for the public, running in company intranets, etc. Even Discover Weblate is a voluntary, beta feature, not all Weblate instances hosting open-source projects are shown there.

@maboroshin
Owner Author

maboroshin commented Aug 17, 2023

Spam

I understand. I accept that it is not the Weblate team that is doing this. If there were a monetary reward, it would be prohibited by the state as stealth marketing.


Leak

Since the end of 2022 we can choose not to have Weblate publish our email addresses. That option was added because of the following concerns about leaks. However, addresses that were already published remain in each GitHub commit. Under GDPR, email addresses are personal data. Anyone can view that personal data; it is exposed outside of Weblate and can't be erased.
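To illustrate why an address stays visible, here is a minimal sketch of reading the author header that Git stores in every commit object, in the form `author Name <email> timestamp timezone`. The sample line, the helper names and the placeholder domain `users.noreply.example.org` are hypothetical and are not Weblate's actual values.

```python
import re

# A raw Git commit object carries a header of the form:
#   author <name> <<email>> <unix timestamp> <timezone offset>
AUTHOR_LINE = re.compile(
    r"^author (?P<name>.*?) <(?P<email>[^<>]*)> (?P<timestamp>\d+) (?P<tz>[+-]\d{4})$"
)


def parse_author_line(line):
    """Return (name, email) from the author header of a commit object."""
    match = AUTHOR_LINE.match(line)
    if match is None:
        raise ValueError("not a commit author header: %r" % line)
    return match.group("name"), match.group("email")


def uses_placeholder_address(email, placeholder_domain):
    """True if the address belongs to a (hypothetical) generated no-reply domain."""
    return email.lower().endswith("@" + placeholder_domain.lower())


# Hypothetical sample header; anyone who can read the repository can read this.
sample = "author Example Translator <translator@example.org> 1692230400 +0200"
name, email = parse_author_line(sample)
print(name, email)
print(uses_placeholder_address(email, "users.noreply.example.org"))
```

Because the header is part of the commit itself, changing a setting later affects only new commits; the commits already pushed keep the original address unless the repository history is rewritten.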

pappasadrian said

This leads to leaks of personal information publically, by including addresses in public code. This is not respectful towards translators who contribute, and potentially illegal in some jurisdictions (Especially considering that there's no specific consent given. Everyone is opted-in by default, with no option to opt out).
... not a clear option for people who don't want their personal information leaked.
WeblateOrg issue#6508

p0358 said

This way weblate instance owner could ensure no e-mails would get leaked by default.
WeblateOrg PL#6508

However not everyone registering on Weblate with the will to translate might be aware their e-mail could leak in the commit metadata, as such someone could still do that accidentally, and the installation owner currently has no easy way of ensuring it doesn't happen.
WeblateOrg issue#8451

(PS)
The Weblate Terms of Service states:

The User agrees to use of own name and e-mail as authorship in the VCS commits. The User understands that this grant is non revocable due to nature of the VCS.

There is no mention of public disclosure of email addresses. Before 2022, no one had selected a setting such as "make my email address public", so there would have been no explicit consent.

(End of PS)

Access control

Please read carefully the page you referred to.

Projects running the gratis Libre plan on Hosted Weblate are always Public. You can switch to the paid plan if you want to restrict access to your project.
https://docs.weblate.org/en/latest/admin/access.html#project-access-control

Reliable source

I'm sorry. Please let me know if you have any reliable sources. I will replace it.

@orangesunny

orangesunny commented Aug 17, 2023

Feels to me like you are trying to solve problems when there are no problems. Thus, I am unable to help.

  • “Spam“: I will try once again, for the last time as I do not have anything to add: Weblate team does not have any connection to such efforts. We did not ask for it, we did not pay for it, we are not responsible for the actions of independent individuals. If anyone feels it is marketing or SPAM, they can use the button GitHub has for such reports. This feature enables any GitHub user to deal with SPAM. We do not hold any power over this.

  • “Leak“: It is a Git feature as it was a feature before the end of 2022. The Weblate option to use generated signature address does not change anything. It was always correctly described in the terms. We also can’t change the history of anyone’s repository.

  • Access control: I know what is written there, I based my previous comment on the article. I also administer Hosted Weblate, where we adjust access control for projects with gratis Libre hosting. Do you feel like anything should be written differently?

  • Reliable source:
    No worries. All needed information about the core Weblate team is available at https://weblate.org/about/. All contributors, including developers, translators, documentation writers, etc., are visible at https://github.com/WeblateOrg/weblate/graphs/contributors and in respective repositories.

@maboroshin
Owner Author

maboroshin commented Aug 17, 2023

Leak: Anyone can see contributor's email address. Some call it a feature, some call it a leak. I quoted others' statements above.

Access control: The documentation should say that access control is available even with the free Libre plan. With the current documentation, many administrators would also conclude that access control is not possible.

Reliable source: It doesn't seem to have the information I need. I will try to search again later.


PS

Blocking users from a project has been possible since mid-2021 (version 4.7). It was mentioned under another heading, three pages further down. I will correct the description. I appreciate your explanation.

@comradekingu

No, only you call it a leak when it isn't.
E-mail addresses in Git commits are the norm; hiding the real e-mail address is a feature.
Zero admins have actual problems with access control. You convinced yourself instead of actually gaining the experience or listening to advice.
Send patches containing the changes you want implemented, that is how it works.
It is plainly obvious your numbers are wrong, and that is a Weblate employee telling you as much.

@maboroshin
Owner Author

maboroshin commented Aug 17, 2023

I'm not the only one; I have already cited others.

Access control is a feature found in Transifex, Crowdin, and Weblate. Is this a feature that is not in demand? It is a feature in use: WeblateOrg/weblate#3548 WeblateOrg/weblate#8470

If you know of correct published information, I will switch to it.


https://rocketreach.co/weblate-profile_b5ebc47af42e8635
This seems more like a collection of things randomly scraped from the internet than a reliable source. The information is incorrect, and we are not affiliated with it.

All data has been replaced with LinkedIn data. Compared with RocketReach, the number of Weblate employees remains unchanged at 4, and the overall description is the same.

@comradekingu

Why would it be strange that the last few libre software projects actually using C or TX may not be interested in putting in the effort to move?
It isn't as if they are not finding the same limitations and coming to the same conclusions I do.
Similarly, there aren't any success stories that I know of.

Crowdin for example has lots of employees that only do marketing, while pretending otherwise.
Somehow it is negative for me to counteract this sneaky marketing that preys on the good name of libre software…!?
You would like to think projects shifting over is all my doing? It isn't.

To deal with what seems the remaining question you have:
Weblate has lots of access control, not all of which is exposed on Hosted Weblate in the libre plan.
Could it be that I have actually used all three platforms extensively?

@comradekingu

You keep linking to this as "objective" without disclosing that it is your own creation, and calling it "research".
Just to start somewhere. The word-limit isn't 10k strings, nor is it for libre projects.
Would it be too much to ask to be able to read a chart before attempting to make one?

The link is right where you posted it, but it bears repeating
https://weblate.org/en/hosting/#hosted

image

NightscoutFoundation/xDrip#3420 (comment)
Screenshot from 2024-08-31 13 25 11

You evidently don't know how the platforms work, and that is fine.

You haven't read the terms and conditions, cookie policy, etc. of either TX or C, and instead put a question-mark for their respective entries in "Leakage of personal data".

https://support.crowdin.com/privacy-policy/
https://www.transifex.com/legal/terms/

It is a spaghetti mess of multiple documents and potentially indemnifying statements mixed in with a lot of caveats to explain the actual behavior in what "may" happen, by design.

It doesn't help that something only available as a service makes one completely dependent on these terms, and their continual changes.
Yes, I have read everything there multiple times, and have detailed it for people many times.
I don't think that is quite productive to do for your level of understanding.

For contrast, for those playing at home: https://weblate.org/en/terms/

@maboroshin
Owner Author

maboroshin commented Aug 31, 2024

It is good news that the number of strings has increased to more than 10,000. But certainly in 2023 it's 10,000 strings for Libre softwares.
http://web.archive.org/web/20230810032325/https://weblate.org/ja/hosting/#hosted


If personal data has in fact leaked, please show me the data. I can see the email addresses of Weblate users on GitHub. Issues have continually been raised about Weblate's handling of personal information, because Weblate was not strictly compliant with GDPR. A recent one:


Even the cookie policy will be clearly stated in order to comply with GDPR.

There is no transparency about what Weblate is actually doing. Transifex and Crowdin are strict about GDPR, but you object that their documents are long, which is a consequence of GDPR. Your claims are fanciful and not specific.

@nijel

nijel commented Sep 1, 2024

It is a spaghetti mess of multiple documents and potentially indemnifying statements mixed in with a lot of caveats to explain the actual behavior in what "may" happen, by design.

You usually end up with that kind of mess as the legal environment evolves, and you try to update the terms without rewriting the whole of them. This doesn't necessarily have to be a bad thing. On the other side, we at Weblate have ignored some legal changes in the past years, and we will soon have a wholly new terms of service because of that. These will be much more complex than the current ones, even if I fight with the lawyers to make them as simple as possible.

It is good news that the number of strings has increased to more than 10,000. But certainly in 2023 it's 10,000 strings for Libre softwares.

We've moved from counting source strings to hosted strings, so these numbers have changed, but the numbers also count different things. We allowed 10,000 apples before, and now we allow 160,000 cherries. It might be better for some projects, but it also might be worse for others, but we believe it's a more fair way of estimating translation project size. On the other side, we would rather not use the words metric (what Transifex and Crowdin use) because it's pretty ambiguous for many languages.
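The "apples versus cherries" point can be made concrete with a small sketch. It assumes that hosted strings are counted as source strings multiplied by the number of languages; that definition is an assumption made here for illustration and is not stated in this thread, and the two example projects are hypothetical.

```python
# Limits quoted in this thread for the gratis libre plan.
OLD_SOURCE_STRING_LIMIT = 10_000   # 2023: counted in source strings
NEW_HOSTED_STRING_LIMIT = 160_000  # 2024: counted in hosted strings


def hosted_strings(source_strings, languages):
    # ASSUMPTION: hosted strings = source strings x number of languages.
    # The thread only says the two metrics count different things.
    return source_strings * languages


def fits_old_limit(source_strings):
    return source_strings <= OLD_SOURCE_STRING_LIMIT


def fits_new_limit(source_strings, languages):
    return hosted_strings(source_strings, languages) <= NEW_HOSTED_STRING_LIMIT


# Hypothetical projects: (source strings, languages)
examples = {
    "small project, many languages": (2_000, 100),
    "large project, few languages": (15_000, 5),
}
for label, (source, langs) in examples.items():
    print(label, fits_old_limit(source), fits_new_limit(source, langs))
```

Under that assumption, a project with few strings but many languages can fit the old limit and exceed the new one, while a project with many strings in few languages fits only the new one, which is consistent with the remark that the change is better for some projects and worse for others.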

And importantly, you can choose not to use cookies at first. This is called the "opt-in". The opt-in is that it's GDPR compliant.

You can only choose not to use the non-essential cookies. Weblate uses essential cookie only (for session/authentication). These services have a complex cookie policy because they heavily depend on third-party cookies. There is no need for a complex cookie policy when the service uses a single cookie (the second cookie we've used so far is just about to be removed in WeblateOrg/weblate#12383).

4 participants