Flink's Class Loading Mechanism

2021-05-27 11:33:14

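In the JVM, loading a class goes through roughly five stages: loading, linking (verification, preparation, and resolution), and initialization. When we talk about "loading a class" in everyday use, we usually mean that a class loader (ClassLoader) takes the fully qualified name of a class, obtains the binary bytecode stream that defines it, and constructs the class definition from that stream.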

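Let's first look at the ordinary class loading order.

As the figure shows, a class loading request is normally delegated upward first, all the way to the top-level bootstrap class loader, and a loader only tries to load the class itself when its parent cannot find it. Flink provides a loader that follows this order as well, ParentFirstClassLoader: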

/**
 * Regular URLClassLoader that first loads from the parent and only after that from the URLs.
 */
public static class ParentFirstClassLoader extends FlinkUserCodeClassLoader {

    ParentFirstClassLoader(
            URL[] urls, ClassLoader parent, Consumer<Throwable> classLoadingExceptionHandler) {
        super(urls, parent, classLoadingExceptionHandler);
    }

    static {
        ClassLoader.registerAsParallelCapable();
    }
}
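
The parent delegation order described above can be observed with plain JDK classes. The following is a minimal sketch, not Flink code: the class name LoaderChain and the helper chainOf are made up for illustration. It walks from a starting loader up through its parents, which is the same path a parent-first loading request travels.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {

    /** Collects the given loader and all of its ancestors, child first. */
    public static List<ClassLoader> chainOf(ClassLoader start) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    public static void main(String[] args) {
        // Print the application loader followed by each parent loader.
        for (ClassLoader cl : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(cl);
        }
        // On HotSpot JVMs the bootstrap loader is represented as null,
        // so core classes such as String report no loader object.
        System.out.println("String loaded by: " + String.class.getClassLoader());
    }
}
```

The exact loader names printed depend on the JDK version, so no output is shown here.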

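Flink, however, is a distributed compute engine, and jobs routinely bring third-party JARs that need to be loaded, so delegating everything to the system class loader is not realistic. If the parent delegation model were still used, the version of a class that ships with the Flink framework would be loaded first, which leads to puzzling compatibility problems such as NoSuchMethodError and IllegalAccessError. Flink therefore implements the ChildFirstClassLoader class loader and uses it as the default strategy. It breaks the parent delegation model so that classes from user code are loaded first; the official documentation calls this "Inverted Class Loading".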

/**
 * A variant of the URLClassLoader that first loads from the URLs and only after that from the
 * parent.
 *
 * <p>{@link #getResourceAsStream(String)} uses {@link #getResource(String)} internally so we don't
 * override that.
 */
public final class ChildFirstClassLoader extends FlinkUserCodeClassLoader {
 
    /**
     * The classes that should always go through the parent ClassLoader. This is relevant for Flink
     * classes, for example, to avoid loading Flink classes that cross the user-code/system-code
     * barrier in the user-code ClassLoader.
     */
    private final String[] alwaysParentFirstPatterns;
 
    public ChildFirstClassLoader(
            URL[] urls,
            ClassLoader parent,
            String[] alwaysParentFirstPatterns,
            Consumer<Throwable> classLoadingExceptionHandler) {
        super(urls, parent, classLoadingExceptionHandler);
        this.alwaysParentFirstPatterns = alwaysParentFirstPatterns;
    }
 
    @Override
    protected Class<?> loadClassWithoutExceptionHandling(String name, boolean resolve)
            throws ClassNotFoundException {
 
        // First, check if the class has already been loaded
        Class<?> c = findLoadedClass(name);
 
        if (c == null) {
            // check whether the class should go parent-first
            for (String alwaysParentFirstPattern : alwaysParentFirstPatterns) {
                if (name.startsWith(alwaysParentFirstPattern)) {
                    return super.loadClassWithoutExceptionHandling(name, resolve);
                }
            }
 
            try {
                // check the URLs
                c = findClass(name);
            } catch (ClassNotFoundException e) {
                // let URLClassLoader do it, which will eventually call the parent
                c = super.loadClassWithoutExceptionHandling(name, resolve);
            }
        } else if (resolve) {
            resolveClass(c);
        }
 
        return c;
    }
 
    @Override
    public URL getResource(String name) {
        // first, try and find it via the URLClassloader
        URL urlClassLoaderResource = findResource(name);
 
        if (urlClassLoaderResource != null) {
            return urlClassLoaderResource;
        }
 
        // delegate to super
        return super.getResource(name);
    }
 
    @Override
    public Enumeration<URL> getResources(String name) throws IOException {
        // first get resources from URLClassloader
        Enumeration<URL> urlClassLoaderResources = findResources(name);
 
        final List<URL> result = new ArrayList<>();
 
        while (urlClassLoaderResources.hasMoreElements()) {
            result.add(urlClassLoaderResources.nextElement());
        }
 
        // get parent urls
        Enumeration<URL> parentResources = getParent().getResources(name);
 
        while (parentResources.hasMoreElements()) {
            result.add(parentResources.nextElement());
        }
 
        return new Enumeration<URL>() {
            Iterator<URL> iter = result.iterator();
 
            public boolean hasMoreElements() {
                return iter.hasNext();
            }
 
            public URL nextElement() {
                return iter.next();
            }
        };
    }
 
    static {
        ClassLoader.registerAsParallelCapable();
    }
}
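
The lookup order implemented by loadClassWithoutExceptionHandling above can be modeled without any real class loading. The sketch below is an illustration under stated assumptions, not Flink's implementation: the class ResolutionOrderModel, the method resolve, and all class names in the sets are hypothetical, and plain sets of names stand in for the user JAR and the cluster classpath.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResolutionOrderModel {

    /**
     * Returns "child" if the class would come from the user JAR, "parent" if it
     * would come from the cluster classpath, or null if neither side has it.
     */
    public static String resolve(
            String name,
            Set<String> userJar,
            Set<String> clusterClasspath,
            List<String> parentFirstPrefixes,
            boolean childFirst) {
        // Names matching an always-parent-first prefix skip the child-first path.
        boolean forcedParent = parentFirstPrefixes.stream().anyMatch(name::startsWith);
        boolean tryChildFirst = childFirst && !forcedParent;

        if (tryChildFirst && userJar.contains(name)) {
            return "child";
        }
        if (clusterClasspath.contains(name)) {
            return "parent";
        }
        // Regular delegation falls back to the loader's own URLs after the parent fails.
        if (userJar.contains(name)) {
            return "child";
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> userJar = new HashSet<>(Arrays.asList(
                "com.example.Job", "com.example.lib.Util", "org.apache.flink.Demo"));
        Set<String> cluster = new HashSet<>(Arrays.asList(
                "com.example.lib.Util", "org.apache.flink.Demo"));
        List<String> prefixes = Arrays.asList("java.", "org.apache.flink.");

        // The same library class resolves differently under the two orders.
        System.out.println(resolve("com.example.lib.Util", userJar, cluster, prefixes, true));
        System.out.println(resolve("com.example.lib.Util", userJar, cluster, prefixes, false));
        // A name under an always-parent-first prefix stays with the parent.
        System.out.println(resolve("org.apache.flink.Demo", userJar, cluster, prefixes, true));
    }
}
```

In this model a library class that exists on both sides comes from the user JAR under child-first and from the cluster under parent-first, which is the version conflict the inverted order is meant to avoid, while names under an always-parent-first prefix are never taken from the user JAR as long as the parent has them.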

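Next, let's analyze this from the class inheritance diagram and the call entry point.

Inheritance diagram:

Call graph:

The default is ChildFirstClassLoader.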

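The relevant configuration options are:

| Option | Default | Description |
| --- | --- | --- |
| classloader.resolve-order | child-first | Class resolution order. child-first loads classes from the Flink job (the user JAR) first; parent-first loads classes from the Flink cluster first. |
| classloader.parent-first-patterns.default | java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback;org.xml;javax.xml;org.apache.xerces;org.w3c | Semicolon-separated list of class name prefixes that are always loaded from the Flink cluster first. The classes covered by alwaysParentFirstPatterns are the foundation of Java, Flink, and similar components and must not be overridden by user code. |
| classloader.parent-first-patterns.additional | (none) | Additional semicolon-separated prefixes for classes that conflict under child-first mode and should instead be loaded with the parent delegation model. |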
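
To make the semicolon-separated format concrete, here is a small sketch of how the three options could be interpreted. It is an assumption-laden illustration rather than Flink's own parsing code: the class ClassLoaderConfigSketch, the enum ResolveOrder, and the methods parseResolveOrder and mergePatterns are hypothetical names, and the prefix com.example.shared. is an invented user setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ClassLoaderConfigSketch {

    /** Hypothetical stand-in for the two values of classloader.resolve-order. */
    public enum ResolveOrder { CHILD_FIRST, PARENT_FIRST }

    public static ResolveOrder parseResolveOrder(String value) {
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "child-first":
                return ResolveOrder.CHILD_FIRST;
            case "parent-first":
                return ResolveOrder.PARENT_FIRST;
            default:
                throw new IllegalArgumentException("Unknown resolve order: " + value);
        }
    }

    /** Splits both lists on ';', trims whitespace, and drops empty entries. */
    public static String[] mergePatterns(String defaults, String additional) {
        List<String> merged = new ArrayList<>();
        for (String list : new String[] {defaults, additional}) {
            if (list == null) {
                continue;
            }
            for (String prefix : list.split(";")) {
                String trimmed = prefix.trim();
                if (!trimmed.isEmpty()) {
                    merged.add(trimmed);
                }
            }
        }
        return merged.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // The result has the shape of the alwaysParentFirstPatterns array.
        String[] patterns = mergePatterns("java.;scala.; org.apache.flink.", "com.example.shared.");
        System.out.println(String.join(",", patterns));
        System.out.println(parseResolveOrder("child-first"));
    }
}
```

The merged array has the same shape as the alwaysParentFirstPatterns argument of ChildFirstClassLoader, so an extra prefix supplied through the additional option ends up being checked by the same startsWith loop as the defaults.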
