Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make better use of i18n #20096

Merged
merged 36 commits into from
Jun 26, 2022
Merged

Make better use of i18n #20096

merged 36 commits into from
Jun 26, 2022

Conversation

Gusted
Copy link
Contributor

@Gusted Gusted commented Jun 23, 2022

  • Previously, we relied on performing numerous expensive map lookups and storing each parsed ini file in memory.
    As we add more translations and languages, this strategy won't scale and has become visible and measurable in memory utilization. All parsed ini currently uses about 20-30Mb of memory, which is nothing on modern computers. Every bit counts when running Gitea on low-powered devices like Raspberry Pis and Android phones. This PR fixes that by using a new data structure that don't require more memory than is necessary and is scalable for the future.
  • Startup memory is now closer to 60Mb than 90Mb, making it easier to stay under 100Mb. The new implementation needs 7Mb to hold a single map for translation key -> index and for each locale a maps index -> translation value.
  • The lookup for translation values may be greatly improved, leading to faster performance overall, even though I haven't performed any tests or benchmarks comparing the current and earlier implementations. Additionally, the language struct, which contains the important map for index -> translation value, so you don't have to do another map lookup to get the locale struct.

Before:

image

25Mb to initialize the locales(After Go's GC)
image

After:

image

7Mb to initialize the locales(After Go's GC)
image


As usual, goodluck reviewing this piece of beauty and mess.

@Gusted Gusted added this to the 1.18.0 milestone Jun 23, 2022
@Gusted Gusted added performance/memory Performance issues affecting memory use performance/cpu type/enhancement An improvement of existing functionality labels Jun 23, 2022
@wxiaoguang
Copy link
Contributor

I think it's on the correct direction for the i18n refactoring.

The only question in my mind is: is the perfect hash map necessary? If the simple map can still save the memory and works fine, I would prefer to use a simple map (like map[string][]string => map[key][offset]) to reduce maintenance work.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jun 23, 2022
@Gusted
Copy link
Contributor Author

Gusted commented Jun 24, 2022

is the perfect hash map necessary? If the simple map can still save the memory and works fine, I would prefer to use a simple map (like map[string][]string => map[key][offset]) to reduce maintenance work.

So, I'd preferred the perfect hash as it was made for these kind of situations, need fast query and the keys are known before-hand. Thereby there was no really good quality perfect hash library(so that's why I crafted one and included in the PR). I've pushed a commit(d498472) that utilized the same concept of the language offset and then instead use golang's Map feature. The memory is bit more on the heap(it says 10Mb) but the overall memory usage still seems to be around the same levels of what the previous implementation had.

6bfa1ba actually is the one that moves back the original data structure that is optimized for memory.

Gusted added 3 commits June 24, 2022 03:27
- Uses less memory by creating for each language a map.
@wxiaoguang
Copy link
Contributor

L-G-T-M (except CI failure is related)

And, could the translationKeys be set to nil to save some memory? It seems that it's not used after the loop IIRC.

@delvh
Copy link
Member

delvh commented Jun 24, 2022

--- FAIL: TestLDAPUserSigninFailed (0.30s)
testlogger.go:78: 2022/06/24 01:51:14 modules/git/git.go:133:checkInit() [W] git module has been initialized already, duplicate init should be fixed

That issue once again...

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Jun 24, 2022

--- FAIL: TestLDAPUserSigninFailed (0.30s)
testlogger.go:78: 2022/06/24 01:51:14 modules/git/git.go:133:checkInit() [W] git module has been initialized already, duplicate init should be fixed

That issue once again...

Off-topic Unrelated .... just a warning, no harm. It might be resolved after another big git module refactoring.

Copy link
Member

@delvh delvh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, I got the gist of what you're doing.
If it works, LGTM.

modules/translation/i18n/i18n.go Outdated Show resolved Hide resolved
modules/translation/i18n/i18n.go Outdated Show resolved Hide resolved
modules/translation/i18n/i18n.go Outdated Show resolved Hide resolved
modules/translation/translation.go Outdated Show resolved Hide resolved
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Jun 24, 2022
zeripath and others added 4 commits June 25, 2022 20:00
- Separate dev and production locale stores.
- Allow for live-reloading in dev mode.

Co-authored-by: zeripath <[email protected]>
@Gusted
Copy link
Contributor Author

Gusted commented Jun 25, 2022

Pushed commits to separate dev and production localestores(so it's easier to implement live-reloading etc.). Provided by @zeripath

@wxiaoguang
Copy link
Contributor

I think some code can be simplified, will try on it later.

@Gusted
Copy link
Contributor Author

Gusted commented Jun 25, 2022

I think some code can be simplified, will try on it later.

Even more refactors 😄 I will mark this PR for 1.19 already

@wxiaoguang
Copy link
Contributor

I believe 1.18 is fine (will try on it later means several hours later, I am going outside soon 😂) .

@wxiaoguang
Copy link
Contributor

I proposed a PR about my idea. Gusted#5

Major benefits:

  • more simple (+185 −509)
  • no more go-routine, no need to close (in case it's forgotten ...)
  • no duplicate code, only one Tr / reloadLocaleByIni function, more easier to be maintained
  • works fine for dev mode, the reload delay is (at most) 1 second, it won't affect development in real life: change locale files -> refresh page to see the result -> more than 1 second has been elapsed.

@Gusted Gusted added type/refactoring Existing code has been cleaned up. There should be no new functionality. and removed type/enhancement An improvement of existing functionality labels Jun 26, 2022
Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(approve, although I also made some changes, open for suggestions)

@lunny lunny merged commit 5d3f99c into go-gitea:main Jun 26, 2022
@Gusted Gusted deleted the wow-maths branch June 26, 2022 14:22
zjjhot added a commit to zjjhot/gitea that referenced this pull request Jun 27, 2022
* giteaofficial/main:
  Add spacing between the properties of the key (go-gitea#20145)
  Remove U2F support (go-gitea#20141)
  Make better use of i18n  (go-gitea#20096)
Gusted pushed a commit to Gusted/gitea that referenced this pull request Jun 27, 2022
- Currently we're using the `i18n` variable naming for the `locale`
struct. This contains locale's specific information and cannot be used
for general i18n purpose, therefore refactoring it to `locale` makes
more sense.
- Ref: go-gitea#20096 (comment)
@Gusted Gusted mentioned this pull request Jun 27, 2022
jolheiser pushed a commit that referenced this pull request Jun 27, 2022
* Refactor `i18n` to `locale`

- Currently we're using the `i18n` variable naming for the `locale`
struct. This contains locale's specific information and cannot be used
for general i18n purpose, therefore refactoring it to `locale` makes
more sense.
- Ref: #20096 (comment)

* Update routers/install/install.go
vsysoev pushed a commit to IntegraSDL/gitea that referenced this pull request Aug 10, 2022
* Prototyping

* Start work on creating offsets

* Modify tests

* Start prototyping with actual MPH

* Twiddle around

* Twiddle around comments

* Convert templates

* Fix external languages

* Fix latest translation

* Fix some test

* Tidy up code

* Use simple map

* go mod tidy

* Move back to data structure

- Uses less memory by creating for each language a map.

* Apply suggestions from code review

Co-authored-by: delvh <[email protected]>

* Add some comments

* Fix tests

* Try to fix tests

* Use en-US as defacto fallback

* Use correct slices

* refactor (go-gitea#4)

* Remove TryTr, add log for missing translation key

* Refactor i18n

- Separate dev and production locale stores.
- Allow for live-reloading in dev mode.

Co-authored-by: zeripath <[email protected]>

* Fix live-reloading & check for errors

* Make linter happy

* live-reload with periodic check (go-gitea#5)

* Fix tests

Co-authored-by: delvh <[email protected]>
Co-authored-by: 6543 <[email protected]>
Co-authored-by: wxiaoguang <[email protected]>
Co-authored-by: zeripath <[email protected]>
vsysoev pushed a commit to IntegraSDL/gitea that referenced this pull request Aug 10, 2022
* Refactor `i18n` to `locale`

- Currently we're using the `i18n` variable naming for the `locale`
struct. This contains locale's specific information and cannot be used
for general i18n purpose, therefore refactoring it to `locale` makes
more sense.
- Ref: go-gitea#20096 (comment)

* Update routers/install/install.go
@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. performance/memory Performance issues affecting memory use type/refactoring Existing code has been cleaned up. There should be no new functionality.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants