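Introduction
Android receives a major version update every year. Android applications are written in Java, while the system underneath is built on the Linux kernel. The boot sequence spans the whole path from the kernel through the runtime to the Java world (kernel --> runtime --> Java), so understanding how Android boots is key to understanding the Android architecture as a whole.
In addition, anyone optimizing boot time must first understand the Android boot sequence.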
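Android System Architecture
Let us start with the classic layered architecture diagram of Android published by Google:
For the purpose of studying how things work, the Android architecture can be divided into five layers, from bottom to top: Linux Kernel, HAL, Native libraries & Runtime, Framework, and App. Each layer contains a large number of modules and subsystems.
The diagram above was published by Google several years ago. It is a classic, but to understand Android more deeply, the diagram below presents the system layer by layer from the perspective of its key processes:
The rest of this article follows that diagram to analyze how Android boots.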
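Android Boot Process Analysis
As the diagram shows, Android boots from the bottom up: Bootloader --> Linux Kernel --> Native --> Framework --> App. In detail:
Bootloader: Android is built on the Linux kernel, so this stage works the same way as booting an ordinary Linux system (PC and embedded environments do differ: an embedded system usually has no BIOS-like firmware, so the Bootloader handles the entire loading job). When the phone powers on, it first runs the Bootloader, a small program that runs before the Linux Kernel. Its main tasks are initializing hardware devices, checking RAM, building the memory map, and so on, which brings the hardware and software into a suitable state for the Linux Kernel to run in the next stage.
Linux Kernel: the kernel image is commonly described as containing two parts, real-mode code and protected-mode code (this split is specific to x86). After the Bootloader loads the kernel into memory, it places the two parts at different memory addresses, then runs the real-mode code followed by the protected-mode code. The kernel first starts the swapper process (pid=0), also called the idle process, which initializes the kernel's functional modules and drivers; it then starts the init process (pid=1); and after that the kthreadd process (pid=2), a kernel-level process that is the ancestor of all kernel threads.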
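init: the init process is created during the previous stage, when the Linux Kernel starts. It is the first user-space process in Android and plays a crucial role during boot. It parses init.rc and the other init.<xxx>.rc files. Its main tasks are:
(2) Start ServiceManager, the central registry of Binder services;
(4) Provide the property service, for example:
on property:sys.boot_completed=1
    start myCode
The example above runs the myCode program once Android has finished booting.
(5) Spawn the Zygote process. Zygote is the first Java process in Android and the parent of all Java processes;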
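- Zygote: init forks Zygote after parsing init.rc. Zygote also plays a crucial role. Its main tasks are:
  (1) Load the ZygoteInit class and register the Zygote socket;
  (2) Load the Dalvik/ART virtual machine;
  (3) preloadClass and preloadResource, which load system classes and system resources ahead of time;
  (4) Spawn the System_server process, the first process that Zygote forks;
  (5) Having done all this, Zygote steps back and sleeps on standby; when a request to create a new process arrives later, it wakes up and does the work;
- System_server: System_server starts and manages the entire Android Framework. Its main job is to start Android's system services, including AMS, PMS, and WMS;
- Home Launcher: once System_server has started all the system services, the system is ready, and System_server notifies the services through systemReady. When AMS receives this notification, it asks Zygote to create a new process with its own virtual machine instance, and Zygote forks that process. AMS then looks for an Activity that declares <category android:name = "android.intent.category.HOME" /> and starts it. The system Launcher app declares this category, so Launcher is started;
Launcher is Android's home-screen app. It provides the first UI the user sees and handles interaction with the user.
That is the complete Android boot sequence.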
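To make the ordering concrete, here is a minimal sketch that encodes the stages listed above as an ordered enum. The class name BootChain, the stage names, and the one-line role descriptions are illustrative choices for this article, not identifiers from AOSP.

```java
import java.util.ArrayList;
import java.util.List;

public class BootChain {
    /** Boot stages in the order described above; names are illustrative only. */
    public enum Stage {
        BOOTLOADER("initializes hardware and loads the kernel"),
        LINUX_KERNEL("starts swapper (pid 0), init (pid 1), kthreadd (pid 2)"),
        INIT("parses init.rc, starts ServiceManager and Zygote"),
        ZYGOTE("loads the VM, preloads classes, forks system_server"),
        SYSTEM_SERVER("starts AMS, PMS, WMS and other system services"),
        LAUNCHER("shows the home screen");

        public final String role;

        Stage(String role) {
            this.role = role;
        }
    }

    /** Returns the stages that must already have run before the given stage. */
    public static List<Stage> prerequisites(Stage stage) {
        List<Stage> result = new ArrayList<>();
        for (Stage s : Stage.values()) {
            if (s.ordinal() < stage.ordinal()) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Stage s : Stage.values()) {
            System.out.println(s + ": " + s.role);
        }
        System.out.println("Before ZYGOTE: " + prerequisites(Stage.ZYGOTE));
    }
}
```

The enum's declaration order is the only source of truth here, which mirrors the strictly sequential hand-off between stages.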
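How It Works
Three stages of the Android boot are critical: init, Zygote, and System_server. The diagram below shows how these heavyweight processes relate to each other:
The following sections walk through the Android source code to explain how the boot process works.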
init
[init/init.cpp]
int main(int argc, char** argv) {
    if (!strcmp(basename(argv[0]), "ueventd")) {
        return ueventd_main(argc, argv);
    }
    if (!strcmp(basename(argv[0]), "watchdogd")) {
        return watchdogd_main(argc, argv);
    }
    if (argc > 1 && !strcmp(argv[1], "subcontext")) {
        InitKernelLogging(argv);
        const BuiltinFunctionMap function_map;
        return SubcontextMain(argc, argv, &function_map);
    }
    if (REBOOT_BOOTLOADER_ON_PANIC) {
        InstallRebootSignalHandlers();
    }
    bool is_first_stage = (getenv("INIT_SECOND_STAGE") == nullptr);
    if (is_first_stage) {
        boot_clock::time_point start_time = boot_clock::now();
        // Clear the umask.
        umask(0);
        clearenv();
        setenv("PATH", _PATH_DEFPATH, 1);
        mkdir("/dev/socket", 0755);
        // Mount staging areas for devices managed by vold
        // See storage config details at http://source.android.com/devices/storage/
        mount("tmpfs", "/mnt", "tmpfs", MS_NOEXEC | MS_NOSUID | MS_NODEV,
              "mode=0755,uid=0,gid=1000");
        // /mnt/vendor is used to mount vendor-specific partitions that can not be
        // part of the vendor partition, e.g. because they are mounted read-write.
        mkdir("/mnt/vendor", 0755);
        // Now that tmpfs is mounted on /dev and we have /dev/kmsg, we can actually
        // talk to the outside world...
        InitKernelLogging(argv);
        LOG(INFO) << "init first stage started!";
        if (!DoFirstStageMount()) {
            LOG(FATAL) << "Failed to mount required partitions early ...";
        }
        SetInitAvbVersionInRecovery();
        // Enable seccomp if global boot option was passed (otherwise it is enabled in zygote).
        global_seccomp();
        // Set up SELinux, loading the SELinux policy.
        SelinuxSetupKernelLogging();
        SelinuxInitialize();
        // We're in the kernel domain, so re-exec init to transition to the init domain now
        // that the SELinux policy has been loaded.
        if (selinux_android_restorecon("/init", 0) == -1) {
            PLOG(FATAL) << "restorecon failed of /init failed";
        }
        setenv("INIT_SECOND_STAGE", "true", 1);
        static constexpr uint32_t kNanosecondsPerMillisecond = 1e6;
        uint64_t start_ms = start_time.time_since_epoch().count() / kNanosecondsPerMillisecond;
        setenv("INIT_STARTED_AT", std::to_string(start_ms).c_str(), 1);
        char* path = argv[0];
        char* args[] = { path, nullptr };
        execv(path, args);
        // execv() only returns if an error happened, in which case we
        // panic and never fall through this conditional.
        PLOG(FATAL) << "execv(\"" << path << "\") failed";
    }
    // At this point we're in the second stage of init.
    InitKernelLogging(argv);
    LOG(INFO) << "init second stage started!";
    // Set up a session keyring that all processes will have access to. It
    // will hold things like FBE encryption keys. No process should override
    // its session keyring.
    keyctl_get_keyring_ID(KEY_SPEC_SESSION_KEYRING, 1);
    // Indicate that booting is in progress to background fw loaders, etc.
    close(open("/dev/.booting", O_WRONLY | O_CREAT | O_CLOEXEC, 0000));
    property_init();
    // If arguments are passed both on the command line and in DT,
    // properties set in DT always have priority over the command-line ones.
    process_kernel_dt();
    process_kernel_cmdline();
    // Propagate the kernel variables to internal variables
    // used by init as well as the current required properties.
    export_kernel_boot_props();
    // Make the time that init started available for bootstat to log.
    property_set("ro.boottime.init", getenv("INIT_STARTED_AT"));
    property_set("ro.boottime.init.selinux", getenv("INIT_SELINUX_TOOK"));
    // Set libavb version for Framework-only OTA match in Treble build.
    const char* avb_version = getenv("INIT_AVB_VERSION");
    if (avb_version) property_set("ro.boot.avb_version", avb_version);
    // Clean up our environment.
    unsetenv("INIT_SECOND_STAGE");
    unsetenv("INIT_STARTED_AT");
    unsetenv("INIT_SELINUX_TOOK");
    unsetenv("INIT_AVB_VERSION");
    // Now set up SELinux for second stage.
    SelinuxSetupKernelLogging();
    SelabelInitialize();
    SelinuxRestoreContext();
    epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    if (epoll_fd == -1) {
        PLOG(FATAL) << "epoll_create1 failed";
    }
    sigchld_handler_init();
    if (!IsRebootCapable()) {
        // If init does not have the CAP_SYS_BOOT capability, it is running in a container.
        // In that case, receiving SIGTERM will cause the system to shut down.
        InstallSigtermHandler();
    }
    property_load_boot_defaults();
    export_oem_lock_status();
    start_property_service();
    set_usb_controller();
    const BuiltinFunctionMap function_map;
    Action::set_function_map(&function_map);
    subcontexts = InitializeSubcontexts();
    ActionManager& am = ActionManager::GetInstance();
    ServiceList& sm = ServiceList::GetInstance();
    LoadBootScripts(am, sm);
    // Turning this on and letting the INFO logging be discarded adds 0.2s to
    // Nexus 9 boot time, so it's disabled by default.
    if (false) DumpState();
    am.QueueEventTrigger("early-init");
    // Queue an action that waits for coldboot done so we know ueventd has set up all of /dev...
    am.QueueBuiltinAction(wait_for_coldboot_done_action, "wait_for_coldboot_done");
    // ... so that we can start queuing up actions that require stuff from /dev.
    am.QueueBuiltinAction(MixHwrngIntoLinuxRngAction, "MixHwrngIntoLinuxRng");
    am.QueueBuiltinAction(SetMmapRndBitsAction, "SetMmapRndBits");
    am.QueueBuiltinAction(SetKptrRestrictAction, "SetKptrRestrict");
    am.QueueBuiltinAction(keychord_init_action, "keychord_init");
    am.QueueBuiltinAction(console_init_action, "console_init");
    // Trigger all the boot actions to get us started.
    am.QueueEventTrigger("init");
    // Repeat mix_hwrng_into_linux_rng in case /dev/hw_random or /dev/random
    // wasn't ready immediately after wait_for_coldboot_done
    am.QueueBuiltinAction(MixHwrngIntoLinuxRngAction, "MixHwrngIntoLinuxRng");
    // Don't mount filesystems or start core system services in charger mode.
    std::string bootmode = GetProperty("ro.bootmode", "");
    if (bootmode == "charger") {
        am.QueueEventTrigger("charger");
    } else {
        am.QueueEventTrigger("late-init");
    }
    // Run all property triggers based on current state of the properties.
    am.QueueBuiltinAction(queue_property_triggers_action, "queue_property_triggers");
    while (true) {
        // By default, sleep until something happens.
        int epoll_timeout_ms = -1;
        if (do_shutdown && !shutting_down) {
            do_shutdown = false;
            if (HandlePowerctlMessage(shutdown_command)) {
                shutting_down = true;
            }
        }
        if (!(waiting_for_prop || Service::is_exec_service_running())) {
            am.ExecuteOneCommand();
        }
        if (!(waiting_for_prop || Service::is_exec_service_running())) {
            if (!shutting_down) {
                auto next_process_restart_time = RestartProcesses();
                // If there's a process that needs restarting, wake up in time for that.
                if (next_process_restart_time) {
                    epoll_timeout_ms = std::chrono::ceil<std::chrono::milliseconds>(
                                           *next_process_restart_time - boot_clock::now())
                                           .count();
                    if (epoll_timeout_ms < 0) epoll_timeout_ms = 0;
                }
            }
            // If there's more work to do, wake up again immediately.
            if (am.HasMoreCommands()) epoll_timeout_ms = 0;
        }
        epoll_event ev;
        int nr = TEMP_FAILURE_RETRY(epoll_wait(epoll_fd, &ev, 1, epoll_timeout_ms));
        if (nr == -1) {
            PLOG(ERROR) << "epoll_wait failed";
        } else if (nr == 1) {
            ((void (*)()) ev.data.ptr)();
        }
    }
    return 0;
}
The LoadBootScripts function is defined as follows:
static void LoadBootScripts(ActionManager& action_manager, ServiceList& service_list) {
    Parser parser = CreateParser(action_manager, service_list);
    std::string bootscript = GetProperty("ro.boot.init_rc", "");
    if (bootscript.empty()) {
        parser.ParseConfig("/init.rc");
        if (!parser.ParseConfig("/system/etc/init")) {
            late_import_paths.emplace_back("/system/etc/init");
        }
        if (!parser.ParseConfig("/product/etc/init")) {
            late_import_paths.emplace_back("/product/etc/init");
        }
        if (!parser.ParseConfig("/odm/etc/init")) {
            late_import_paths.emplace_back("/odm/etc/init");
        }
        if (!parser.ParseConfig("/vendor/etc/init")) {
            late_import_paths.emplace_back("/vendor/etc/init");
        }
    } else {
        parser.ParseConfig(bootscript);
    }
}
From the code above, the main responsibilities of the init process are:
- Parse and run all the init.rc files;
- Create the device nodes;
- Provide the property service;
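The trigger-driven model behind these responsibilities can be illustrated with a toy simulation. The sketch below is written in Java purely for illustration (init itself is C++), and MiniInit and all of its methods are hypothetical names; the only things taken from the text above are the trigger names early-init, init, late-init and the property trigger sys.boot_completed=1. Mapping "start zygote" directly onto late-init is a simplification made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MiniInit {
    // trigger name -> commands registered for it (what parsing the .rc files would produce)
    private final Map<String, List<String>> actions = new HashMap<>();
    // triggers waiting to be processed, in FIFO order
    private final Deque<String> pending = new ArrayDeque<>();
    // commands "executed" so far; this toy only records them
    private final List<String> executed = new ArrayList<>();

    /** Registers a command under a trigger, like an "on <trigger>" section. */
    public void on(String trigger, String command) {
        actions.computeIfAbsent(trigger, k -> new ArrayList<>()).add(command);
    }

    public void queueEventTrigger(String trigger) {
        pending.add(trigger);
    }

    /** Setting a property queues the matching "property:name=value" trigger. */
    public void setProperty(String name, String value) {
        pending.add("property:" + name + "=" + value);
    }

    public boolean hasMoreWork() {
        return !pending.isEmpty();
    }

    /** Runs every command of the oldest pending trigger (coarser than real init). */
    public void executeOneTrigger() {
        String trigger = pending.poll();
        if (trigger == null) {
            return;
        }
        executed.addAll(actions.getOrDefault(trigger, Collections.emptyList()));
    }

    /** Replays a simplified boot and returns the commands in execution order. */
    public static List<String> runDemo() {
        MiniInit init = new MiniInit();
        init.on("early-init", "start ueventd");
        init.on("late-init", "start zygote"); // simplification for this sketch
        init.on("property:sys.boot_completed=1", "start myCode");

        init.queueEventTrigger("early-init");
        init.queueEventTrigger("init");
        init.queueEventTrigger("late-init");
        while (init.hasMoreWork()) {
            init.executeOneTrigger();
        }

        // Later, the framework reports that boot has completed.
        init.setProperty("sys.boot_completed", "1");
        while (init.hasMoreWork()) {
            init.executeOneTrigger();
        }
        return init.executed;
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // [start ueventd, start zygote, start myCode]
    }
}
```

The point of the queue is that triggers are processed in the order they were queued, which is why early-init actions always run before late-init actions.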
Zygote
When init parses the following entry, it starts Zygote:
service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    # started together with the "main" class
    class main
    # create the socket
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart media
    onrestart restart netd
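As you can see, Zygote is really app_process running under the name zygote. From here on we are inside the Zygote process.
After the Zygote process starts, it runs the main function in frameworks/base/cmds/app_process/app_main.cpp.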
int main(int argc, char* const argv[])
{
    ..... // condition checks
    AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
    // parse the command line
    .....
    while (i < argc) {
        ..... // parse the arguments
    }
    .....
    if (!niceName.isEmpty()) {
        runtime.setArgv0(niceName.string(), true /* setProcName */);
    }
    // two branches follow
    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
    }
}
When the app_process process starts, there are two branches:
- if zygote is true, it runs ZygoteInit
- if zygote is false and a class name was given, it runs RuntimeInit
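The branch selection can be sketched as a small, self-contained function. This is a simplified Java model of the argument handling (the real code is C++ in app_process); the class AppProcessArgs, the method selectEntryClass, and the sample class name com.example.Main are hypothetical, and every flag other than --zygote is simply ignored here.

```java
public class AppProcessArgs {
    /**
     * Picks the Java entry class for the given arguments (program name excluded).
     * Assumed layout: [VM options starting with '-'] <parent dir> [--flags] [class name].
     */
    public static String selectEntryClass(String[] args) {
        int i = 0;
        // Skip leading VM options such as -Xzygote.
        while (i < args.length && args[i].startsWith("-")) {
            i++;
        }
        i++; // skip the parent directory argument, which is unused

        boolean zygote = false;
        String className = null;
        while (i < args.length) {
            String arg = args[i++];
            if (arg.equals("--zygote")) {
                zygote = true;
            } else if (arg.startsWith("--")) {
                // other flags (for example --start-system-server) are ignored in this sketch
            } else {
                className = arg; // first non-flag argument is the class to run
                break;
            }
        }

        if (zygote) {
            return "com.android.internal.os.ZygoteInit";
        } else if (className != null) {
            return "com.android.internal.os.RuntimeInit";
        }
        throw new IllegalArgumentException("no class name or --zygote supplied");
    }

    public static void main(String[] args) {
        String[] boot = {"-Xzygote", "/system/bin", "--zygote", "--start-system-server"};
        System.out.println(selectEntryClass(boot));

        String[] tool = {"/system/bin", "com.example.Main"};
        System.out.println(selectEntryClass(tool));
    }
}
```

With the arguments from the zygote service entry in init.rc, the first call selects ZygoteInit; a plain class name selects RuntimeInit, which is the path used when app_process runs an ordinary Java class.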
During Android boot zygote is always true, so execution reaches ZygoteInit.main():
public static void main(String argv[]) {
    ZygoteServer zygoteServer = new ZygoteServer();
    .....
    try {
        .....
        // register the socket
        zygoteServer.registerServerSocketFromEnv(socketName);
        if (!enableLazyPreload) {
            .....
            preload(bootTimingsTraceLog); // preload classes and resources
            .....
        }
        .....
        if (startSystemServer) {
            // fork System_server
            Runnable r = forkSystemServer(abiList, socketName, zygoteServer);
            // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
            // child (system_server) process.
            if (r != null) {
                r.run();
                return;
            }
        }
        .....
        // job done: wait in the select loop
        caller = zygoteServer.runSelectLoop(abiList);
    }
    .....
}
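The Zygote process creates the Java virtual machine and registers the JNI methods, becoming the true parent from which Java processes are spawned. After creating the system_server process, Zygote steps back and calls runSelectLoop() to wait on standby; when a request to create a new process arrives, it wakes up immediately and does the work.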
System_server
Zygote creates the System_server process by forking, specifically in the forkSystemServer function:
private static Runnable forkSystemServer(String abiList, String socketName,
        ZygoteServer zygoteServer) {
    ...
    // fork the system_server child process
    pid = Zygote.forkSystemServer(
            parsedArgs.uid, parsedArgs.gid,
            parsedArgs.gids,
            parsedArgs.runtimeFlags,
            null,
            parsedArgs.permittedCapabilities,
            parsedArgs.effectiveCapabilities);
    ...
    if (pid == 0) {
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        zygoteServer.closeServerSocket();
        // continue inside the System_server process
        return handleSystemServerProcess(parsedArgs);
    }
    return null;
}
Next, look at the handleSystemServerProcess function:
private static Runnable handleSystemServerProcess(ZygoteConnection.Arguments parsedArgs) {
    ...
    if (parsedArgs.niceName != null) {
        // set the current process name to "system_server"
        Process.setArgV0(parsedArgs.niceName);
    }
    final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
    if (systemServerClasspath != null) {
        // run dex optimization, e.g. for services.jar
        performSystemServerDexOpt(systemServerClasspath);
        ......
    }
    if (parsedArgs.invokeWith != null) {
        ...
    } else {
        ClassLoader cl = null;
        if (systemServerClasspath != null) {
            cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);
            Thread.currentThread().setContextClassLoader(cl);
        }
        /*
         * Pass the remaining arguments to SystemServer.
         */
        return ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
    }
}
handleSystemServerProcess sets the process name, performs the dexopt optimization, and then calls ZygoteInit.zygoteInit:
public static final Runnable zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader) {
    Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "RuntimeInit");
    RuntimeInit.redirectLogStreams(); // redirect log output
    RuntimeInit.commonInit(); // common initialization
    ZygoteInit.nativeZygoteInit(); // zygote initialization
    return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
}
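Here, applicationInit goes through several layers of calls and ends up creating a MethodAndArgsCaller(m, argv) object. In the code shown here it is returned as a Runnable rather than thrown as an exception: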
protected static Runnable applicationInit(int targetSdkVersion, String[] argv,
        ClassLoader classLoader) {
    ...
    VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
    VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
    final Arguments args = new Arguments(argv);
    // find the static main() method of the target class
    return findStaticMain(args.startClass, args.startArgs, classLoader);
}
private static Runnable findStaticMain(String className, String[] argv,
        ClassLoader classLoader) {
    // className is SystemServer here (exception handling omitted)
    Class<?> cl = Class.forName(className, true, classLoader);
    Method m = cl.getMethod("main", new Class[] { String[].class });
    // create the MethodAndArgsCaller object
    return new MethodAndArgsCaller(m, argv);
}
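This sets the virtual machine's target heap utilization to 0.75 and sets the target SDK version, and finally creates a MethodAndArgsCaller object, which is defined as follows: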
static class MethodAndArgsCaller implements Runnable {
    /** method to call */
    private final Method mMethod;
    /** argument array */
    private final String[] mArgs;
    public MethodAndArgsCaller(Method method, String[] args) {
        mMethod = method;
        mArgs = args;
    }
    public void run() {
        try {
            mMethod.invoke(null, new Object[] { mArgs });
        } catch (IllegalAccessException ex) {
            throw new RuntimeException(ex);
        } catch (InvocationTargetException ex) {
            Throwable cause = ex.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            } else if (cause instanceof Error) {
                throw (Error) cause;
            }
            throw new RuntimeException(ex);
        }
    }
}
Here, run() calls mMethod.invoke, which uses reflection to call SystemServer.main().
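The same pattern can be tried outside Android with plain Java reflection. The sketch below is a stand-alone illustration, not framework code: ReflectiveMainDemo and the nested FakeServer class (a stand-in for SystemServer) are hypothetical, and checked exceptions are wrapped in RuntimeException to keep the example short.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ReflectiveMainDemo {
    /** Hypothetical stand-in for SystemServer: a class with a static main(). */
    public static class FakeServer {
        public static String lastArg;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : null;
            System.out.println("FakeServer.main called with " + lastArg);
        }
    }

    /** Looks up className's static main(String[]) and wraps the call in a Runnable. */
    public static Runnable findStaticMain(String className, String[] argv, ClassLoader loader) {
        final Method m;
        try {
            Class<?> cl = Class.forName(className, true, loader);
            m = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            throw new RuntimeException("Missing class or main(): " + className, e);
        }
        return () -> {
            try {
                // The receiver is null because main() is static; the String[] is
                // wrapped in an Object[] so it is passed as a single argument.
                m.invoke(null, new Object[] { argv });
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new RuntimeException(e);
            }
        };
    }

    public static void main(String[] args) {
        Runnable r = findStaticMain(FakeServer.class.getName(),
                new String[] { "hello" },
                ReflectiveMainDemo.class.getClassLoader());
        r.run(); // nothing is invoked until the caller decides to run it
    }
}
```

Returning a Runnable instead of calling main() directly lets the caller decide when to invoke it, which matches how ZygoteInit.main() receives the Runnable from forkSystemServer and then calls r.run().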
public final class SystemServer {
    ...
    public static void main(String[] args) {
        // create the SystemServer object, then call its run() method
        new SystemServer().run();
    }

    private void run() {
        .....
        // load the android_servers native library; its source lives under frameworks/base/services/
        System.loadLibrary("android_servers");
        // check whether the previous shutdown failed to complete; this call may not return
        performPendingShutdown();
        createSystemContext(); // initialize the system context
        // create the system service manager
        mSystemServiceManager = new SystemServiceManager(mSystemContext);
        LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);
        // start the system services
        try {
            startBootstrapServices(); // bootstrap services
            startCoreServices(); // core services
            startOtherServices(); // other services
        } catch (Throwable ex) {
            Slog.e("System", "************ Failure starting system services", ex);
            throw ex;
        }
        // loop forever
        Looper.loop();
        throw new RuntimeException("Main thread loop unexpectedly exited");
    }
}
The most important parts here are startBootstrapServices(), startCoreServices(), and startOtherServices(); these three functions start all of Android's system services.
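The phased startup can be sketched as follows. This is a toy model, not the framework implementation: MiniSystemServer is a hypothetical class that only records names, and the assignment of one or two example services to each phase is a heavily reduced illustration of what the real methods start.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniSystemServer {
    /** Services in the order they were started; stands in for a service manager. */
    private final List<String> started = new ArrayList<>();

    private void startService(String name) {
        // A real implementation would construct the service and call its start hook.
        started.add(name);
    }

    private void startBootstrapServices() {
        // Services that the rest of the system depends on come first.
        startService("ActivityManagerService");
        startService("PackageManagerService");
    }

    private void startCoreServices() {
        startService("BatteryService");
    }

    private void startOtherServices() {
        startService("WindowManagerService");
    }

    /** Runs the three phases in order and returns the resulting start order. */
    public List<String> run() {
        startBootstrapServices();
        startCoreServices();
        startOtherServices();
        return started;
    }

    public static void main(String[] args) {
        System.out.println(new MiniSystemServer().run());
    }
}
```

Splitting startup into phases makes the dependency order explicit: anything in a later phase may assume that every service from the earlier phases already exists.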
Once the system services have started, systemReady is signalled, and as a result AMS launches the Home Launcher system app.
This concludes the analysis of how the Android boot process works.
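Summary
This article records my own study of the Android boot process: starting from the architecture, breaking the boot into its individual stages, and then analyzing the whole sequence from the code.
A follow-up article will analyze how an Android app starts.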
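A Note from the Author
I enjoy computer technology, mainly in these areas: Android, the Linux kernel, embedded software and hardware, robotics, and smart hardware. I am also interested in other technology stacks.
I also love life and enjoy it, and I like to record it with photos and videos.
If you also love learning and technology, you are welcome to discuss with me. Who knows, we might become good friends. If you find anything wrong in this article, please point it out so that we can all learn and improve together.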