Shell

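If you work in computing, don't tell me you have never used a shell.

Since you have used (and maybe even loved) the shell, why not try to understand it in depth? This article starts with the basic concepts of a shell, moves on to basic operation, and finally studies how a shell is designed by reading the source code of an open-source project.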

Getting to the point

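A shell is a tool that listens for user commands, parses them, and then asks the operating system kernel to carry them out through system calls. Because it does its work through the functions the OS kernel provides, you can think of it as the outer shell wrapped around the kernel.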

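Basic shell operation

In general, a shell works as follows:

- Show the prompt ($)
- The user types a command
- Locate the command by searching the path
- Create a child process to run the command
- Print the result, or an error message if the command fails
- Show the prompt again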
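
The loop above can be sketched in a few lines of bash. This is only an illustration under stated assumptions, not how any real shell is implemented: `mini_repl` is a hypothetical name, word splitting is left to `read -a`, the lookup is done with the `command -v` builtin, and a subshell `( ... )` stands in for the child process.

```shell
#!/usr/bin/env bash
# mini_repl (hypothetical name): a toy prompt-read-locate-run loop.
mini_repl() {
    local -a words
    # Print the prompt on stderr, then read one line and split it into words.
    while printf '$ ' >&2; read -r -a words; do
        if [ "${#words[@]}" -eq 0 ]; then
            continue                      # empty line: show the prompt again
        fi
        # Locate the command (PATH lookup; builtins and functions also count).
        if ! command -v "${words[0]}" >/dev/null 2>&1; then
            echo "${words[0]}: command not found" >&2
            continue
        fi
        # Run it in a child process (a subshell here); its output goes to stdout.
        ( "${words[@]}" ) || true
    done
}
```

Calling `mini_repl` in a terminal gives a `$ ` prompt that accepts simple commands such as `ls -l`; pressing Ctrl-D ends the loop. Quoting, pipes and redirection are deliberately left out of this sketch.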

Wildcards

Most shells support the following wildcards:

* -- matches any string of characters, of any length (including none)
```
cat *.html
```
? -- matches any single character
```
cat in?ex.html
```
[] -- matches any one of the characters inside the brackets
```
cat index[12].html
```
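
To see what each pattern actually matches, the following sketch creates a few throwaway files and lets the shell expand the three wildcards. The function name `glob_demo` and the file names are made up for this illustration; the order of the `*.html` matches depends on the locale, so it is not shown here.

```shell
#!/usr/bin/env bash
# glob_demo (hypothetical name): show what each wildcard matches.
glob_demo() (
    # The ( ... ) body runs in a subshell, so the cd does not leak out.
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch index.html index1.html index2.html inbex.html notes.txt

    # Patterns inside double quotes stay literal; inside $( ) they expand.
    echo "*.html         -> $(echo *.html)"
    echo "in?ex.html     -> $(echo in?ex.html)"
    echo "index[12].html -> $(echo index[12].html)"

    cd / && rm -rf "$dir"
)
```

Here `in?ex.html` matches `inbex.html` and `index.html`, `index[12].html` matches `index1.html` and `index2.html`, and `notes.txt` is never matched because it does not end in `.html`.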

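Special characters

\ -- removes the special meaning of the character that follows it
Suppose I want to print a literal *. I cannot write * directly, because it is already a wildcard. Instead, I can do this: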
```
echo \* star \*
```
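' -- removes the special meaning of every character in the string
" -- removes the special meaning of every character in the string except $, `, and \
` -- runs the enclosed string as a command and substitutes its output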
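
The differences between these characters are easiest to see side by side. The sketch below is only an illustration; `quote_demo` and the variable `name` are hypothetical names.

```shell
#!/usr/bin/env bash
# quote_demo (hypothetical name): compare the quoting characters.
quote_demo() {
    local name=world
    echo \*                      # backslash: prints a literal *
    echo 'hello $name'           # single quotes: nothing is expanded
    echo "hello $name"           # double quotes: $name is still expanded
    echo "a \"quoted\" word"     # double quotes: \ still escapes "
    echo "sum: `expr 1 + 2`"     # backticks: the command's output is substituted
}
```

Running `quote_demo` prints `*`, `hello $name`, `hello world`, `a "quoted" word` and `sum: 3`, one per line.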

file descriptor

Please refer to the file system article in this series, which already covers what a file descriptor is.

pipe

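A pipe lets the user chain several commands together. Consider the following command:

```
cat file_1 | sort
```

Without the pipe |, cat prints the contents of file_1 to the terminal. With the pipe, the output of cat becomes the input of sort, and nothing is printed until sorting has finished.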
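
A quick way to try this is the sketch below, which writes three unsorted lines to a temporary file and prints it with and without the pipe. `pipe_demo` is a hypothetical name, and the temporary file stands in for file_1.

```shell
#!/usr/bin/env bash
# pipe_demo (hypothetical name): feed the output of cat into sort.
pipe_demo() {
    local file_1
    file_1=$(mktemp) || return 1
    printf '%s\n' pear apple mango > "$file_1"

    echo "without pipe:"
    cat "$file_1"                 # pear, apple, mango (file order)
    echo "with pipe:"
    cat "$file_1" | sort          # apple, mango, pear (sorted)

    rm -f "$file_1"
}
```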

redirection

We can use the redirection operators > and < to change where terminal output goes and where input comes from.

Output redirection
```
cat file_1 > file_2
```
Writes the contents of file_1 to file_2, replacing whatever file_2 contained.
Appending output redirection
```
cat file_1 >> file_2
```
If file_2 already has content that needs to be kept, use >> to append the contents of file_1 after the existing contents of file_2.
Input redirection
```
cat <file_1 >file_2 
```
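Input redirection replaces the data that would normally come from the keyboard with a device or file chosen by the user. The command above reads file_1 as input and writes the output to file_2.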
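
The three forms can be tried safely in a scratch directory. The sketch below is only an illustration: `redir_demo` is a hypothetical name, and file_3 is an extra file introduced here so that the result of the input redirection can be inspected separately.

```shell
#!/usr/bin/env bash
# redir_demo (hypothetical name): try >, >> and < on throwaway files.
redir_demo() (
    # Subshell body: the cd below does not affect the caller.
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    echo "first" > file_1

    cat file_1 > file_2           # file_2 now holds one line: first
    cat file_1 >> file_2          # file_2 now holds two lines: first, first
    cat < file_1 > file_3         # stdin comes from file_1, stdout goes to file_3

    echo "file_2:"; cat file_2
    echo "file_3:"; cat file_3

    cd / && rm -rf "$dir"
)
```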

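Case study: Picoshell

The author's answer

This project comes from a quiz in the Linux kernel design course at National Cheng Kung University. The quiz supplies an unfinished mini shell program and uses it to test how well students understand C programming: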

fork, wait, exec
stdin, stdout, stderr
pointer

How it works

Let's start with main():
```
int main()
{
    while (1) {
        prompt();
        char buf[512] = {0}; /* input buffer */
        char *c = buf;
        if (!fgets(c + 1, sizeof(buf) - 1, stdin))
            exit(0);
        for (; *++c;) /* skip to end of line */
            ;
        run(c, 0);
    }
    return 0;
}
```
From this we can see that main() keeps repeating the body of the while loop:
- Call prompt()
```
/*
 *  Print the "$ " prompt to stderr (fd 2)
 */
static void prompt()
{
    write(2, "$ ", 2);
}
```
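- Set up the buffer and read one line from stdin into it
- Use the for loop to advance the pointer c to the terminating null byte at the end of the input

Note the call fgets(c + 1, sizeof(buf) - 1, stdin): it leaves the first byte of the buffer unused, so that byte stays 0. The reason becomes clear once we look at run().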

Running run():

Before reading the source of run(), let's look at the other static functions defined alongside it. run() calls them to recognize special tokens and to print error messages:

```
/* Display error message, optionally - exit */
static void fatal(int retval, int leave)
{
    if (retval >= 0)
        return;
    write(2, "?\n", 2);
    if (leave)
        exit(1);
}

/* Helper functions to detect token class */
static inline int is_delim(int c)
{
    return c == 0 || c == '|';
}

static inline int is_redir(int c)
{
    return c == '>' || c == '<';
}

static inline int is_blank(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

static int is_special(int c)
{
    return is_delim(c) || is_redir(c) || is_blank(c);
}
```

String handling in run()

```
size_t length;
char *redir_stdin = NULL, *redir_stdout = NULL;
int pipefds[2] = {0, 0}, outfd = 0;
char *v[99] = {0};
char **u = &v[98]; /* end of words */
for (;;) {
    c--;
    if (is_delim(*c)) /* if NULL (start of string) or pipe: break */
        break;
    if (!is_special(*c)) {
        /* Copy word of regular chars into previous u */
        /* Submit your code here */
        length = 0;
        while (!is_special(*c)) { /* walk back to just before the word */
            length++;
            c--;
        }
        u--;
        *u = malloc(length + 1);
        c++; /* c now points at the first char of the word */
        strncpy(*u, c, length);
        (*u)[length] = '\0'; /* terminate the copied word, not the pointer array */
        /* Filled in up to here */
    }
    if (is_redir(*c)) { /* If < or > */
        if (*c == '<')
            redir_stdin = *u;
        else
            redir_stdout = *u;
        if ((u - v) != 98)
            u++;
    }
}
if ((u - v) == 98) /* empty input */
    return;

if (!strcmp(*u, "cd")) { /* built-in command: cd */
    fatal(chdir(u[1]), 0);
    return; /* actually, should run() again */
}
```

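In run(), v is an array of char pointers, which means it holds multiple strings: it is where the words go once the command has been parsed.
u is a pointer to a char pointer. It starts at the last slot of v and is moved backward as words are stored, so it always points to the most recently stored word.
Each iteration of the for loop moves the pointer c one step backward, so the command is parsed from its end toward its beginning.
The check below tells us that we have either reached the start of the command (the 0 byte that fgets(c + 1, sizeof(buf) - 1, stdin) left in the first position of the buffer, as you probably remember) or run into a pipe.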
```
if (is_delim(*c)) /* if NULL (start of string) or pipe: break */
    break;
```
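
To make the backward scan easier to picture, here is a rough shell rendition of the same idea. It is a sketch under stated assumptions, not picoshell's code: `last_command_words` is a hypothetical name, only blanks and `|` are treated as special, and redirection is ignored. Picoshell does the equivalent in C with pointers into the input buffer.

```shell
#!/usr/bin/env bash
# last_command_words (hypothetical name): scan a command line from its end
# toward its start and print the words of the last command, one per line.
last_command_words() {
    local line=$1 word='' out='' ch
    local i=${#line}
    while [ "$i" -gt 0 ]; do
        i=$((i - 1))
        ch=${line:i:1}
        case $ch in
            '|')                      # delimiter: an earlier command ends here
                break ;;
            ' ' | $'\t')              # blank: the word collected so far is complete
                if [ -n "$word" ]; then
                    out="$word"$'\n'"$out"
                    word=''
                fi ;;
            *)                        # ordinary character: extend the word leftward
                word=$ch$word ;;
        esac
    done
    if [ -n "$word" ]; then           # word that touches the start of the line
        out="$word"$'\n'"$out"
    fi
    printf '%s' "$out"
}
```

For example, `last_command_words 'cat file_1 | sort -r'` prints `sort` and `-r`: the scan stops at the `|`, just as run() stops there and leaves the rest of the line for the next call.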
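Next comes the quiz's answer area, which is mostly string handling. !is_special(*c) is true when the character is an ordinary one. Once the condition holds, I keep moving the pointer backward until it hits a special character. At that point we know that the length characters after the pointer make up the word we want: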

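```
length = 0;
while (!is_special(*c)) { /* walk back to just before the word */
    length++;
    c--;
}
u--;
*u = malloc(length + 1);
c++; /* c now points at the first char of the word */
strncpy(*u, c, length);
(*u)[length] = '\0'; /* terminate the copied word, not the pointer array */
```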

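However, the code I submitted makes a serious mistake. The v array already has storage on the stack for its pointers, and the words themselves already sit in the input buffer, yet I call malloc() to allocate more space on the heap and then call strncpy() to copy each word, which is pure waste. The simplest approach is to point each entry of v directly at the word inside the buffer.

For the correct answer, see sysprog21.

Handling pipes and execution

The rest of the function finishes the pipe and redirection handling, and finally uses execvp to run the command we typed: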

```
    if (*c) { /* stopped at '|': more commands remain to the left */
        pipe(pipefds);
        outfd = pipefds[1]; /* write end of the pipe */
    }

    pid_t pid = fork();
    if (pid) { /* Parent or error */
        fatal(pid, 1);
        if (outfd) {
            run(c, outfd);     /* parse the rest of the cmdline */
            close(outfd);      /* close output fd */
            close(pipefds[0]); /* close read end of the pipe */
        }
        wait(0);
        return;
    }

    if (outfd) {
        dup2(pipefds[0], 0); /* dup read fd to stdin */
        close(pipefds[0]);   /* close read fd */
        close(outfd);        /* close output */
    }

    if (redir_stdin) {
        close(0); /* replace stdin with redir_stdin */
        fatal(open(redir_stdin, 0), 1); /* flags 0 == O_RDONLY */
    }

    if (t) { /* t: the output fd passed in as run()'s second argument */
        dup2(t, 1); /* replace stdout with t */
        close(t);
    }

    if (redir_stdout) {
        close(1);
        fatal(creat(redir_stdout, 438), 1); /* replace stdout with redir; 438 == 0666 */
    }
    fatal(execvp(*u, u), 1);
    /*
     *  Because I used malloc(), I also free the memory at the end, which is
     *  really dumb: execvp() only returns on failure, and fatal() then exits,
     *  so this loop is never reached.
     */
    for (int i = 0; i < 99; i++) {
        free(v[i]);
    }
}
```
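
The fork, exec and wait pattern used above can also be observed from the shell itself. The sketch below is only an analogy, with `fork_exec_demo` as a hypothetical name: `( ... ) &` plays the role of fork(), the `exec` builtin plays the role of execvp(), and `wait` plays the role of wait(). It also shows why nothing placed after a successful exec ever runs.

```shell
#!/usr/bin/env bash
# fork_exec_demo (hypothetical name): fork, exec and wait, shell style.
fork_exec_demo() {
    # "( ... ) &" forks a child. Inside it, exec replaces the child with echo,
    # so the second command of the subshell can never run.
    ( exec echo "child: now running echo"; echo "child: never reached" ) &
    wait "$!"                     # the parent blocks until that child exits
    echo "parent: child exited with status $?"
}
```

Running `fork_exec_demo` prints the line from the child followed by `parent: child exited with status 0`; the "never reached" line does not appear.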

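Summary

The care Professor Jserv put into designing this quiz speaks for itself. When I took the undergraduate operating systems course, I only ever met concepts such as the shell and pipes in the dinosaur book, so when I saw this assignment I decided to play with it. The reference solution only needs about 10 lines of code, but writing those few lines requires a solid command of C, an understanding of how a shell behaves, and knowing how to use fork() and exec(). On top of that, I have mostly worked on the modern web for the past two years, and my C skills had long since been handed back to my first-year programming instructor (really sorry, I will do better). Tracing this much code and brushing up on rusty skills while writing this series has been a real pleasure. I hope this article helps readers understand the shell better too. Bye!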

