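Background

I wrote some code to test whether system calls still work correctly when seccomp is enabled. The idea is simple: spawn a child process that runs a syscall test case, then use the `errno` returned by the test case to print an error message. To tell apart failures in syscalls other than the one under test, I decided to return `-errno` for those. The core code is as follows: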
```c
/* test runner */
#define REPORT(func, status) \
    do { \
        if ((status) < 0) { \
            fprintf(stdout, "run '%s' failed during other syscall: %s\n", \
                    (func), strerror(-status)); \
        } else if ((status) != EXIT_SUCCESS) { \
            fprintf(stdout, "'%s' test failed: %s\n", \
                    (func), strerror(status)); \
        } else { \
            fprintf(stdout, "'%s' test pass!\n", (func)); \
        } \
    } while(0)

int main(int argc, char *argv[])
{
    char *syscalls[] = {};
    ...
    for (int i = 0; i < sizeof(syscalls)/sizeof(char *); i++) {
        retval = snprintf(path, sizeof(path), "%s/%s", prefix, syscalls[i]);
        path[retval] = '\0';

        retval = posix_spawnp(&prog, path, NULL, NULL, argv, environ);
        if (retval != 0) {
            fprintf(stderr, "posix_spawnp('%s') failed: %s\n",
                    path, strerror(errno));
        }
        waitpid(prog, &status, 0);
        if (retval == 0)
            REPORT(syscalls[i], status);
    }
    ...

/* test case */
int main(int argc, char *argv[])
{
    ...
    fd = syscall(__NR_open, "/dev/null", O_WRONLY);
    if (fd < 0)
        return -errno;

    syscall(__NR_write, fd, buffer, strlen(buffer));
    return errno;
}
```
The code above has two major problems:

- the return value of `waitpid()`
- how `status` is handled after `waitpid(prog, &status, 0)` returns
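Question

Answer

The return value of `waitpid()`

`waitpid()` returns the PID of the child on success and -1 on error. I remembered that it can return without having waited for the child to exit (when the call is interrupted by a signal, `errno` is set to `EINTR`), so I checked the book 《Linux环境编程》 and the man page, and changed the call to this: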
```c
while (waitpid(prog, &status, 0) == -1 && errno == EINTR)
    continue;
```
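The retry loop can be wrapped in a small helper so that other `waitpid()` errors are still reported to the caller. This is only a sketch: `wait_for_child` and `spawn_exiting_child` are hypothetical names that do not appear in the original code, and the second one exists purely to give the first something to wait for.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for `pid`, retrying whenever a signal
 * interrupts the call. Returns the pid on success, or -1 with errno
 * set for any error other than EINTR. */
static pid_t wait_for_child(pid_t pid, int *status)
{
    pid_t ret;

    assert(status != NULL);
    do {
        ret = waitpid(pid, status, 0);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Hypothetical helper for trying it out: fork a child that exits
 * immediately with `code`. Returns the child's pid, or -1 if fork failed. */
static pid_t spawn_exiting_child(int code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);    /* child: terminate right away */
    return pid;         /* parent */
}
```

Calling `wait_for_child(spawn_exiting_child(7), &status)` from a `main` should leave a status for which `WIFEXITED(status)` holds and `WEXITSTATUS(status)` is 7.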
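The child's return value (exit code)

The `status` argument of `waitpid()` records the child's return value (the `xxx` in the final `return xxx` of `main`, or in `exit(xxx)`). According to the man page and the `bits/waitstatus.h` header, only 16 bits of `status` are significant: the high 8 bits (`status & 0xff00`) hold the child's return value, and the low 8 bits (`status & 0x00ff`) describe termination by a signal. Within that low byte, the low 7 bits are the signal number, where 0 means a normal exit, and bit 7 flags a core dump.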
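To make the layout concrete, the sketch below forks a child, collects the raw status word, and decodes it by hand. The helper names (`raw_status_of_exit`, `exit_code_bits`, `term_signal_bits`) are made up for illustration, and the manual masks assume the Linux/glibc layout described above; portable code should stick to `WEXITSTATUS()` and `WTERMSIG()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return the
 * raw status word filled in by waitpid(), or -1 if fork/waitpid failed.
 * (EINTR handling is left out to keep the sketch short.) */
static int raw_status_of_exit(int code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                    /* child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}

/* Manual decoding that mirrors the bit layout (Linux/glibc assumption). */
static int exit_code_bits(int status)
{
    assert(status >= 0);
    return (status & 0xff00) >> 8;      /* high byte: exit code */
}

static int term_signal_bits(int status)
{
    assert(status >= 0);
    return status & 0x7f;               /* low 7 bits: terminating signal */
}
```

With these helpers, a child that exits with 42 yields `exit_code_bits(status) == 42`, matching `WEXITSTATUS(status)`. A child that exits with -2 is seen by the parent as 254, which is why the original `status < 0` check could never fire.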
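Since only 8 bits of the exit code survive, a negative return value such as `-errno` never reaches the parent as a negative number. The errno list I consulted had 124 as its largest value, so an errno fits below 0x80 and an offset of 0x80 can mark errors from other syscalls. (Linux headers also define a few larger values, up to 133 for `EHWPOISON` on most architectures, which this encoding cannot represent.) I changed the code to this: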
```c
/* test case */
#define ERRNO_OFFSET 0x80

int main(int argc, char *argv[])
{
    ...
    fd = syscall(__NR_open, "/dev/null", O_WRONLY);
    if (fd < 0)
        return errno+ERRNO_OFFSET;
    ...

/* test runner */
#define REPORT(func, status) \
    do { \
        if (WIFEXITED((status)) && \
            WEXITSTATUS((status)) > ERRNO_OFFSET) { \
            fprintf(stdout, "run '%s' failed during other syscall: %s\n", \
                    (func), strerror(WEXITSTATUS((status))-ERRNO_OFFSET)); \
        } else if (WIFEXITED((status)) && \
                   WEXITSTATUS((status)) != EXIT_SUCCESS) { \
            fprintf(stdout, "'%s' test failed: %s\n", \
                    (func), strerror(WEXITSTATUS((status)))); \
        } else if (WIFEXITED((status)) && \
                   WEXITSTATUS((status)) == EXIT_SUCCESS) { \
            fprintf(stdout, "'%s' test pass!\n", (func)); \
        } else { \
            fprintf(stdout, "'%s' test terminated abnormally\n", (func)); \
        } \
    } while(0)
```
Details

One open question: the `wstatus` argument of `wait()` is an `int`, so why are only 16 bits used? I found the following answer on Stack Overflow:
> POSIX requires that the full exit value be passed in the `si_status` member of the `siginfo_t` structure passed to the SIGCHLD handler, if it is appropriately established via a call to `sigaction` with `SA_SIGINFO` specified in the flags:
>
> > If si_code is equal to CLD_EXITED, then si_status holds the exit value of the process; otherwise, it is equal to the signal that caused the process to change state. The exit value in si_status shall be equal to the full exit value (that is, the value passed to _exit(), _Exit(), or exit(), or returned from main()); **it shall not be limited to the least significant eight bits of the value**. (Emphasis mine).
>
> Note that upon testing, it appears that Linux does not honour this requirement and returns only the lower 8 bits of the exit code in the si_status member. Other operating systems may correctly return the full status; FreeBSD does. See test program here.
>
> Be wary, though, that is not completely clear that you will receive an individual SIGCHLD signal for every child process termination (multiple pending instances of a signal can be merged), so this technique is not completely infallible. It is probably better to find another way to communicate a value between processes if you need more than 8 bits.
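In short: POSIX does not limit the number of bits in the exit value delivered through `si_status`; Linux imposes the 8-bit limit on its own, while FreeBSD does not. The parent is sent SIGCHLD when a child terminates (although, as the answer notes, pending instances of the signal can be merged), so the child's return value can also be obtained by handling SIGCHLD: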
```c
/* written from the signal handler, so it must be volatile */
volatile sig_atomic_t exit_status;

void sigchld_handler(int n, siginfo_t *si, void *v)
{
    exit_status = si->si_status;
}

int main(int argc, char **argv)
{
    ...
    struct sigaction act;
    act.sa_sigaction = sigchld_handler;
    act.sa_flags = SA_SIGINFO;
    sigemptyset(&act.sa_mask);
    sigaction(SIGCHLD, &act, NULL);
    ...
}
```
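Going one step further: how does Linux limit a process's exit value to 8 bits?

The masking is done in the `exit()` and `exit_group()` system calls, and `wait4()` later retrieves the exit code: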
```c
SYSCALL_DEFINE1(exit, int, error_code)
{
    do_exit((error_code&0xff)<<8);
}

SYSCALL_DEFINE1(exit_group, int, error_code)
{
    do_group_exit((error_code & 0xff) << 8);
    return 0;
}

static int wait_task_zombie(struct wait_opts *wo, struct task_struct *p)
{
    ...
    status = (p->signal->flags & SIGNAL_GROUP_EXIT)
        ? p->signal->group_exit_code : p->exit_code;
    wo->wo_stat = status;
    ...
out_info:
    infop = wo->wo_info;
    if (infop) {
        if ((status & 0x7f) == 0) {
            infop->cause = CLD_EXITED;
            infop->status = status >> 8;
        } else {
            infop->cause = (status & 0x80) ? CLD_DUMPED : CLD_KILLED;
            infop->status = status & 0x7f;
        }
        ...
```
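The arithmetic in the kernel excerpt can be replayed in user space to see why anything above 8 bits is lost. The sketch below is a standalone re-implementation for illustration only, not kernel code: `kernel_exit_status`, `decode_kernel_status` and `struct child_info` are hypothetical names, and only the shifts and masks are taken from the excerpt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>     /* CLD_EXITED, CLD_KILLED, CLD_DUMPED, SIG* */

/* Hypothetical stand-in for the cause/status pair the kernel fills in. */
struct child_info {
    int cause;
    int status;
};

/* Same expression as in exit()/exit_group(): keep the low 8 bits of the
 * exit value and move them into the high byte of the status word. */
static int kernel_exit_status(int error_code)
{
    return (error_code & 0xff) << 8;
}

/* Same branches as the out_info part of wait_task_zombie(). */
static struct child_info decode_kernel_status(int status)
{
    struct child_info info;

    assert(status >= 0);
    if ((status & 0x7f) == 0) {
        info.cause = CLD_EXITED;            /* normal exit */
        info.status = status >> 8;          /* exit code */
    } else {
        info.cause = (status & 0x80) ? CLD_DUMPED : CLD_KILLED;
        info.status = status & 0x7f;        /* signal number */
    }
    return info;
}
```

Feeding `kernel_exit_status(0x1234)` into `decode_kernel_status()` gives `CLD_EXITED` with a status of `0x34`: the upper bits are already gone before the value is ever stored, so neither `wait4()` nor `si_status` can recover them.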
As for the low 8 bits of the status that `wait()` obtains, they may be related to this line of code:
```c
static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
{
    ...
    signal->flags = SIGNAL_GROUP_EXIT;
    signal->group_exit_code = sig;
    signal->group_stop_count = 0;
    ...
```
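Whatever the exact kernel path, the effect is easy to observe from user space: when a child is killed by a signal, the signal number shows up in the low bits of the status and the exit-code byte stays zero. The sketch below checks this; `raw_status_of_signal` is a hypothetical helper, and `SIGKILL` is suggested for experiments because it cannot be caught or ignored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sends `sig` to itself, and return
 * the raw status word from waitpid(), or -1 if fork/waitpid failed.
 * (EINTR handling is left out to keep the sketch short.) */
static int raw_status_of_signal(int sig)
{
    int status = 0;
    pid_t pid;

    assert(sig > 0);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        kill(getpid(), sig);
        _exit(0);       /* only reached if the signal did not kill us */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

For `raw_status_of_signal(SIGKILL)`, `WIFSIGNALED(status)` holds, `WTERMSIG(status)` and `status & 0x7f` both equal `SIGKILL`, and the high byte is 0.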
Full code

The full code is here.
Reference