Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues search result is incorrect #27540

Closed
mei-rune opened this issue Oct 9, 2023 · 10 comments · Fixed by #27554
Closed

Issues search result is incorrect #27540

mei-rune opened this issue Oct 9, 2023 · 10 comments · Fixed by #27554

Comments

@mei-rune
Copy link
Contributor

mei-rune commented Oct 9, 2023

Description

Issues search result is incorrect, 搜索结果不正确,如下图中有记录包含"固资"
1696852157572

search results, 我用 "固资" 搜索时返回空

1696852220814

https://xxxx/xxxxx/test/issues?type=all&state=open&labels=&milestone=0&project=0&assignee=0&poster=0&q=%E5%9B%BA%E8%B5%84

日志中无出错记录。

Gitea Version

1.21.0+rc0-44-g7ea7f2b37 built with GNU Make 4.3, go1.21.1 : bindata, sqlite, sqlite_unlock_notify

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

2.30.0, Wire Protocol Version 2 Enabled

Operating System

windows 2016

How are you running Gitea?

run as windows service

Database

SQLite

@lng2020
Copy link
Member

lng2020 commented Oct 9, 2023

Do you enable another search engine or just using the default bleve engine?

@mei-rune
Copy link
Contributor Author

mei-rune commented Oct 9, 2023

I am also a programmer and proficient in Go, able to compile Gitea myself, but I am unfamiliar with Gitea and need someone to teach me where to add logs so that I can debug it.

Alternatively, can anyone tell me how to print the SQL execution logs of the database so that I can analyze if there are any SQL errors?

我也是程序员, 也会 golang, 可以自已编译 gitea, 但我对 gitea 不了解, 需要有人教我在什么地方加日志, 这样我可以调试一下

或者谁能告诉我如何打印数据库的 sql 执行日志, 我好分析一下是不是 sql 错了

@mei-rune
Copy link
Contributor Author

mei-rune commented Oct 9, 2023

Do you enable another search engine or just using the default db engine?

No enable another search engine, and just using the default db engine
没有用外部搜索, 仅用了缺省的。

@mei-rune
Copy link
Contributor Author

mei-rune commented Oct 9, 2023

2023/10/09 20:49:21 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:21 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:49:26 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:26 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:49:26 ...queue/workergroup.go:174:func1() [D] Queue "issue_indexer" stops idle worker
2023/10/09 20:49:26 ...queue/workergroup.go:143:func1() [D] Queue "issue_indexer" starts new worker
2023/10/09 20:49:32 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:32 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:49:35 ...queue/workergroup.go:174:func1() [D] Queue "issue_indexer" stops idle worker
2023/10/09 20:49:40 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:40 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:49:45 ...queue/workergroup.go:143:func1() [D] Queue "issue_indexer" starts new worker
2023/10/09 20:49:45 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:45 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:49:51 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:51 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:49:54 ...queue/workergroup.go:174:func1() [D] Queue "issue_indexer" stops idle worker
2023/10/09 20:49:55 ...queue/workergroup.go:143:func1() [D] Queue "issue_indexer" starts new worker
2023/10/09 20:49:56 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:49:56 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:01 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:01 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:05 ...queue/workergroup.go:174:func1() [D] Queue "issue_indexer" stops idle worker
2023/10/09 20:50:05 ...queue/workergroup.go:143:func1() [D] Queue "issue_indexer" starts new worker
2023/10/09 20:50:06 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:06 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:06 ...queue/workergroup.go:174:func1() [D] Queue "issue_indexer" stops idle worker
2023/10/09 20:50:11 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:11 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:16 ...queue/workergroup.go:143:func1() [D] Queue "issue_indexer" starts new worker
2023/10/09 20:50:16 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:16 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:21 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:21 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:25 ...queue/workergroup.go:174:func1() [D] Queue "issue_indexer" stops idle worker
2023/10/09 20:50:25 ...queue/workergroup.go:143:func1() [D] Queue "issue_indexer" starts new worker
2023/10/09 20:50:26 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:26 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded
2023/10/09 20:50:31 ...dexer/issues/util.go:170:pushIssueIndexerQueue() [W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.
2023/10/09 20:50:31 ...er/issues/indexer.go:243:PopulateIssueIndexer() [W] Retry to populate issue indexer for repo 20: context deadline exceeded

@mei-rune
Copy link
Contributor Author

mei-rune commented Oct 9, 2023

GetIssueIDsByRepoID() return context deadline exceeded

@lng2020
Copy link
Member

lng2020 commented Oct 9, 2023

Alternatively, can anyone tell me how to print the SQL execution logs of the database so that I can analyze if there are any SQL errors?

https://docs.gitea.com/next/administration/config-cheat-sheet?_highlight=cheat#log-log for print detailed logs and

;; Show the database generated SQL

for SQL query logs.

@mei-rune
Copy link
Contributor Author

mei-rune commented Oct 9, 2023

问题解决了, 原因是下面的函数

func updateRepoIndexer(ctx context.Context, repoID int64) error {
ids, err := issue_model.GetIssueIDsByRepoID(ctx, repoID)
if err != nil {
return fmt.Errorf("issue_model.GetIssueIDsByRepoID: %w", err)
}
for _, id := range ids {
if err := updateIssueIndexer(id); err != nil {
return err
}
}
return nil
}

我们有一个 repo 中的 issue 很多有 6000 多个, GetIssueIDsByRepoID() 后, 将每一个 id 作为参数调用 updateIssueIndexer 时出错了, 返回了 context.DeadlineExceeded
出错行在 gitea\modules\queue\base_channel.go:66 行,如下

case <-time.After(pushBlockTime):
	return context.DeadlineExceeded

出错原因是 queue 满了, 要命的是 updateRepoIndexer() 会不停的被重新调用,不停地出错。

解决方法是配置中添加下面的行就好了, 只要确保 queue length 大于 repo 中的 issue 数目一般就不会出错

[queue.issue_indexer]
LENGTH = 100000

总结, 是我配置的不合理, 但是我仍然认为 这是一个 bug, 因为它的后果太严重了

改进代码如下

func updateRepoIndexer(ctx context.Context, repoID int64) error {
	ids, err := issue_model.GetIssueIDsByRepoID(ctx, repoID)
	if err != nil {
		return fmt.Errorf("issue_model.GetIssueIDsByRepoID: %w", err)
	}
	for _, id := range ids {
 		for {
 			err := updateIssueIndexer(id)
			if err == nil {
			 	break
			}
			if !errors.Is(context.DeadlineExceeded) {
				return err
			}
            
            //  在这里加入重试, 同时 check ctx.Done() 就不会影响退出。
			select {
			case <- ctx.Done():
				return err
			case <- time.After(1* time.Minute):
			}
		}
	}
	return nil
}

@lunny
Copy link
Member

lunny commented Oct 9, 2023

@wolfogre Could you have time to take a look at this?

@wolfogre
Copy link
Member

wolfogre commented Oct 10, 2023

The root problem is that indexing all issues of a repository could be a big task, and it could cause a blocking. So we use a queue to cache the task. But what if the queue was full? I noticed this case and added the warning log in the last refactoring to help users know this (the old code just ignore it):

[W] It seems that issue indexer is slow and the queue is full. Please check the issue indexer or increase the queue size.

So the best solution is increasing the size of queue (just like the log said).

@mei-rune Thanks for your advice. I have indeed found some logic that could be improved. I will submit a patch later. Please note that if the queue size is insufficient, index loss is still possible.


BTW, @lng2020 please note that the default issue indexer is not DB, it's bleve.

@mei-rune
Copy link
Contributor Author

@wolfogre 你的理解正确, 我希望不能因为我的 queue 长度导致这么大问题, 我发了上面的补丁尝试解决问题,我对 gitea 架构不熟悉, 仅解决了眼前的问题, 不确定对全局是否有影响。

This was referenced Oct 10, 2023
silverwind pushed a commit that referenced this issue Oct 10, 2023
Backport #27555 by @wolfogre

It should be OK to increase the default queue length since the default
type is "level".

IMO, the old default length (100) is a little too small. See
#27540 (comment)

IIRC, a larger length could lead to more memory usage only when the type
is "channel," but it's an obscure case. Otherwise, it's just a limit
(for "level" or "redis").

Co-authored-by: Jason Song <[email protected]>
techknowlogick pushed a commit that referenced this issue Oct 15, 2023
GiteaBot pushed a commit to GiteaBot/gitea that referenced this issue Oct 15, 2023
lunny pushed a commit that referenced this issue Oct 16, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 30, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants