
server : (web ui) Various improvements, now use vite as bundler #10599

Merged 11 commits into ggerganov:master on Dec 3, 2024

Conversation

@ngxson (Collaborator) commented Nov 30, 2024

Motivation

The new web UI has received significantly more positive feedback than anticipated, prompting consideration for further enhancements.

Currently, we operate without a bundler, running code directly from index.html with third-party libraries fetched via deps.sh. However, this approach has limitations, particularly with daisyui component compatibility. It also inflates the binary size with many redundant parts.

Since many llama.cpp contributors come from a lower-level programming background, here's a brief overview of the current UI tech stack and why each piece is needed:

  • Tailwindcss: A CSS framework that simplifies styling, used by major platforms like ChatGPT, Claude, and Hugging Face
  • Daisyui: A tailwindcss-based component library offering ready-to-use elements like chat bubbles, buttons, and themes
  • Due to the large size of these libraries (up to 2MB for pre-compiled versions), we need vite as a bundler to eliminate unused code. This also enables compilation into a single .html file, eliminating runtime dependencies.

Improvements

Key updates in this PR:

  • Project relocated to server/webui using npm for dependency management (with deps.sh script removed)
  • Integration of the vite bundler (build instructions in server/README.md). The output index.html is just under 500 KB, smaller than the old deps.sh approach (note: this new index.html contains everything needed)
  • Enhanced mobile compatibility (screenshots below)

As for binary size, this PR reduces the final compiled binary by 1 MB compared to master:

# master
$ ls -lah llama-server
-rwxr-xr-x  1 ngxson  staff   5.5M Nov 30 15:02 llama-server

# PR
$ ls -lah llama-server
-rwxr-xr-x  1 ngxson  staff   4.5M Nov 30 14:52 llama-server

[Mobile screenshots: IMG_1847, IMG_1846, IMG_1845]


@github-actions bot added the examples, devops (improvements to build systems and github actions), and server labels Nov 30, 2024
@ngxson ngxson marked this pull request as ready for review November 30, 2024 13:57
@ngxson ngxson requested a review from ggerganov November 30, 2024 13:57
@ggerganov (Owner) commented:
Do I need to do any extra steps? I pulled your branch and ran llama-server like this:

cd build
make -j && ./bin/llama-server -m ../models/qwen2.5-32b-instruct/ggml-model-q8_0.gguf

...

0.01.190.151 I srv          init: initializing slots, n_slots = 1
0.01.190.163 I slot         init: id  0 | task -1 | new slot n_ctx_slot = 4096
0.01.190.233 I main: model loaded
0.01.190.255 I main: chat template, built_in: 1, chat_example: '<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
'
0.01.190.257 I main: server is listening on http://127.0.0.1:8080 - starting the main loop
0.01.190.259 I srv  update_slots: all slots are idle

But it renders like this in the browser:

[Screenshot: the web UI renders broken in the browser]

If I go back to master, it renders OK.

@ngxson (Collaborator, Author) commented Dec 1, 2024

@ggerganov Could you try make clean? In the initial PR, some users reported that the generated .hpp files were not up to date. I added some rules to the Makefile to clean them:

	find examples/server -type f -name "*.js.hpp"   -delete
	find examples/server -type f -name "*.mjs.hpp"  -delete
	find examples/server -type f -name "*.css.hpp"  -delete
	find examples/server -type f -name "*.html.hpp" -delete

@slaren (Collaborator) commented Dec 1, 2024

I had the same issue as @ggerganov. Running make clean and reconfiguring the cmake build fixed it for me.

However, just reconfiguring cmake on its own is not enough. That should be fixed, or it is going to be very confusing.

@ggerganov (Owner) commented:
Yes, make clean in the root folder and then reconfiguring the cmake build did it.

@ngxson (Collaborator, Author) commented Dec 1, 2024

I think I have found a fix after playing around a bit more with Makefile's .PHONY and cmake's set_source_files_properties. (Not sure if this is the correct way, but feel free to suggest other improvements.)
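
For readers less familiar with CMake, here is a minimal sketch of the idea behind that property (hypothetical file name; not the exact change in this PR):

# Mark the embedded web UI header as a build product rather than a
# hand-written source, so CMake does not treat a stale copy as up to date.
set_source_files_properties(
    index.html.hpp
    PROPERTIES GENERATED TRUE)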

@ggerganov (Owner) commented:
The CMake build still requires running make clean in the root folder if you previously used the Makefile build. Can we update the CMake build to do the find ... -delete step? We will soon remove the Makefile build, so no need to worry about that one, but the CMake build is affected as it stands.

@ngxson (Collaborator, Author) commented Dec 2, 2024

I added a pre-build step to remove the generated .hpp files
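
For illustration, a configure-time cleanup along these lines might look like the following CMake sketch (hypothetical paths; the actual step in the PR may differ):

# Remove generated headers left over from earlier builds so that this
# build regenerates them from the new web UI sources.
file(GLOB STALE_HPP
    "${CMAKE_CURRENT_SOURCE_DIR}/*.js.hpp"
    "${CMAKE_CURRENT_SOURCE_DIR}/*.mjs.hpp"
    "${CMAKE_CURRENT_SOURCE_DIR}/*.css.hpp"
    "${CMAKE_CURRENT_SOURCE_DIR}/*.html.hpp")
if (STALE_HPP)
    file(REMOVE ${STALE_HPP})
endif()

Deleting the files at configure time keeps the source tree clean, though, as noted further down the thread, it can make the dependent target recompile on every configure.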

@ngxson ngxson merged commit 91c36c2 into ggerganov:master Dec 3, 2024
50 checks passed
@pothitos (Contributor) commented Dec 4, 2024

Many thanks for the new awesome front-end 🙏

> I added a pre-build step to remove the generated .hpp files

Unfortunately, this change causes llama-server to recompile every time you run CMake, even when no files have changed.

I don't know if it's correct to write this comment here, or whether I should open an issue.

@ngxson (Collaborator, Author) commented Dec 4, 2024

I don't observe the same behavior on my side (running cmake --build a second time does not rebuild everything):

cmake --build build -j --target llama-server
[  4%] Built target build_info
[ 20%] Built target ggml-base
[ 28%] Built target ggml-metal
[ 32%] Built target ggml-blas
[ 52%] Built target ggml-cpu
[ 56%] Built target ggml
[ 72%] Built target llama
[ 92%] Built target common
[100%] Built target llama-server

@slaren (Collaborator) commented Dec 4, 2024

It does rebuild server.cpp, since it depends on the generated file. You should be able to see it with -v. I think the cleaner solution would be to move all the generated files to the cmake directory, and do not delete them. cmake should be able to regenerate them automatically when necessary, and removing the build directory always results in a clean start.

I imagine that the reason it doesn't work like that in the first place is because we needed to support the Makefile, but that's no longer the case.
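
A minimal CMake sketch of that suggestion, assuming the HTML is embedded with an xxd-style step (hypothetical names and paths; the shell redirection assumes a Unix generator):

# Generate the embedded UI header into the build tree; removing the build
# directory then always yields a clean start, and CMake re-runs the command
# whenever public/index.html changes.
set(UI_INPUT  "${CMAKE_CURRENT_SOURCE_DIR}/public/index.html")
set(UI_OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/index.html.hpp")
add_custom_command(
    OUTPUT  "${UI_OUTPUT}"
    COMMAND xxd -i index.html > "${UI_OUTPUT}"
    WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/public"
    DEPENDS "${UI_INPUT}"
    COMMENT "Embedding web UI into index.html.hpp")
target_sources(llama-server PRIVATE "${UI_OUTPUT}")
target_include_directories(llama-server PRIVATE "${CMAKE_CURRENT_BINARY_DIR}")

Because the OUTPUT lives in the build tree and the DEPENDS edge points at the real source, nothing is deleted or regenerated unless index.html actually changes.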

tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Dec 7, 2024
server : (web ui) Various improvements, now use vite as bundler (ggerganov#10599)

* hide buttons in dropdown menu

* use npm as deps manager and vite as bundler

* fix build

* fix build (2)

* fix responsive on mobile

* fix more problems on mobile

* sync build

* (test) add CI step for verifying build

* fix ci

* force rebuild .hpp files

* cmake: clean up generated files pre build
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024 (same commit message as above)