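Preface
An earlier post, "ThreadLocal Explained", covered the structure of ThreadLocal.
Netty, however, built its own FastThreadLocal. This post compares the two from several angles. Netty really is an excellent framework, and many of its classes and methods are worth studying closely.
FastThreadLocal vs. ThreadLocal
1. Performance
Start with performance. The most likely reason to reinvent a wheel is that the existing one does not meet the performance requirements.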
import io.netty.util.concurrent.FastThreadLocal;
import io.netty.util.concurrent.FastThreadLocalThread;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class FastThreadLocalTest {
    public static void main(String[] args) {
        testFast(100);
        testSlow(100);
    }

    // Times 1,000,000 get() calls on each of threadLocalCount FastThreadLocal instances.
    @SuppressWarnings("unchecked")
    private static void testFast(final int threadLocalCount) {
        final FastThreadLocal<String>[] caches = new FastThreadLocal[threadLocalCount];
        final Thread mainThread = Thread.currentThread();
        for (int i = 0; i < threadLocalCount; i++) {
            caches[i] = new FastThreadLocal<>();
        }
        // FastThreadLocal only takes its fast path on a FastThreadLocalThread.
        Thread t = new FastThreadLocalThread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < threadLocalCount; i++) {
                    caches[i].set("float.lu");
                }
                long start = System.nanoTime();
                for (int i = 0; i < threadLocalCount; i++) {
                    for (int j = 0; j < 1000000; j++) {
                        caches[i].get();
                    }
                }
                long end = System.nanoTime();
                System.out.println("fast[" + TimeUnit.NANOSECONDS.toMillis(end - start) + "]ms");
                LockSupport.unpark(mainThread);
            }
        });
        t.start();
        // Block the main thread until the worker unparks it.
        LockSupport.park(mainThread);
    }

    // Same measurement with java.lang.ThreadLocal on a plain Thread.
    @SuppressWarnings("unchecked")
    private static void testSlow(final int threadLocalCount) {
        final ThreadLocal<String>[] caches = new ThreadLocal[threadLocalCount];
        final Thread mainThread = Thread.currentThread();
        for (int i = 0; i < threadLocalCount; i++) {
            caches[i] = new ThreadLocal<>();
        }
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < threadLocalCount; i++) {
                    caches[i].set("float.lu");
                }
                long start = System.nanoTime();
                for (int i = 0; i < threadLocalCount; i++) {
                    for (int j = 0; j < 1000000; j++) {
                        caches[i].get();
                    }
                }
                long end = System.nanoTime();
                System.out.println("slow[" + TimeUnit.NANOSECONDS.toMillis(end - start) + "]ms");
                LockSupport.unpark(mainThread);
            }
        });
        t.start();
        LockSupport.park(mainThread);
    }
}
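// Output
fast[15]ms
slow[302]ms
In this run, the 100 million get() calls took 15 ms with FastThreadLocal versus 302 ms with ThreadLocal, roughly a 20x difference.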
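2. Data structure
The two data structures are broadly similar: each thread carries a map, and the thread-local instance serves as the key. The lookup and storage algorithms, however, differ in the details.
get()
Overall idea: get the map from the thread, then get the value from the map.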
ThreadLocal.get()
public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}
Looking up the value in the map:
// ThreadLocal.ThreadLocalMap
private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}
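If the key in the home slot matches, the entry is returned directly.
If it does not match, the lookup probes linearly through the following slots, expunging stale entries along the way, until it finds the key or reaches an empty slot.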
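The probing lookup described above can be sketched in isolation. This is a simplified, hypothetical table, not the JDK's ThreadLocalMap: the class name LinearProbeTable, the explicit hash parameter, and the lastProbeCount field are all assumptions made for illustration, and stale-entry cleanup and resizing are left out.

```java
/** Hypothetical, simplified open-addressed table; NOT the JDK's ThreadLocalMap. */
final class LinearProbeTable {
    private final Object[] keys;
    private final Object[] values;
    private final int mask;
    private int size;

    /** Slots inspected by the most recent get(); exposed only for illustration. */
    int lastProbeCount;

    LinearProbeTable(int capacity) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        keys = new Object[capacity];
        values = new Object[capacity];
        mask = capacity - 1;
    }

    /** Stores value under a non-null key, probing forward from the home slot. */
    void put(Object key, int hash, Object value) {
        int i = hash & mask;
        while (keys[i] != null && keys[i] != key) {
            i = (i + 1) & mask; // collision: try the next slot, wrapping around
        }
        if (keys[i] == null) {
            // Keep one slot empty so that every probe loop terminates.
            if (size == mask) {
                throw new IllegalStateException("table full");
            }
            size++;
            keys[i] = key;
        }
        values[i] = value;
    }

    /** Returns the value for key, or null once an empty slot ends the probe. */
    Object get(Object key, int hash) {
        int i = hash & mask;
        lastProbeCount = 0;
        while (keys[i] != null) {
            lastProbeCount++;
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & mask;
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbeTable table = new LinearProbeTable(8);
        Object a = new Object();
        Object b = new Object();
        table.put(a, 3, "A");
        table.put(b, 11, "B"); // 11 & 7 == 3, so b collides with a
        System.out.println(table.get(a, 3) + " after " + table.lastProbeCount + " probe(s)");
        System.out.println(table.get(b, 11) + " after " + table.lastProbeCount + " probe(s)");
    }
}
```

The second key shares a home slot with the first, so reading it costs one extra probe. That extra work on every read is the cost that the rest of the comparison is about.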
FastThreadLocal.get()
// FastThreadLocal
public final V get(InternalThreadLocalMap threadLocalMap) {
    Object v = threadLocalMap.indexedVariable(index);
    if (v != InternalThreadLocalMap.UNSET) {
        return (V) v;
    }
    return initialize(threadLocalMap);
}

// InternalThreadLocalMap
public Object indexedVariable(int index) {
    Object[] lookup = indexedVariables;
    return index < lookup.length ? lookup[index] : UNSET;
}
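This is clearly faster: with the index in hand, the value is read straight from the array, with no probing loop.
set()
The main work is putting the value into the map.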
ThreadLocal.set()
// ThreadLocal
public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

// ThreadLocal.ThreadLocalMap
private void set(ThreadLocal<?> key, Object value) {
    // We don't use a fast path as with get() because it is at
    // least as common to use set() to create new entries as
    // it is to replace existing ones, in which case, a fast
    // path would fail more often than not.
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);

    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;
            return;
        }

        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    tab[i] = new Entry(key, value);
    int sz = size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}
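- Compute the index by masking the hash with len - 1, which is equivalent to a modulo because the table length is a power of two
- If the key matches, overwrite the value
- If the key does not match, probe linearly and store the entry in a stale slot or in the first empty slot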
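The first step above relies on the mask and the modulo producing the same index. A minimal check of that equivalence, assuming a power-of-two table length, might look like the following; MaskIndexDemo and its method names are made up for this sketch.

```java
/** Illustrative check that hash & (len - 1) equals a non-negative modulo for power-of-two len. */
final class MaskIndexDemo {
    /** Index computed the way the probing table does it: a single bitwise AND. */
    static int maskIndex(int hash, int len) {
        return hash & (len - 1);
    }

    /** Index computed with a mathematical (never negative) modulo. */
    static int modIndex(int hash, int len) {
        return Math.floorMod(hash, len);
    }

    public static void main(String[] args) {
        int len = 16; // must be a power of two for the equivalence to hold
        int[] hashes = {0, 5, 16, 21, -5, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int hash : hashes) {
            System.out.println(hash + " -> mask " + maskIndex(hash, len)
                    + ", mod " + modIndex(hash, len));
        }
    }
}
```

The mask form avoids a division and also handles negative hash values, which the plain % operator would turn into negative indices.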
FastThreadLocal.set()
// FastThreadLocal
public final void set(V value) {
    if (value != InternalThreadLocalMap.UNSET) {
        InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
        if (setKnownNotUnset(threadLocalMap, value)) {
            registerCleaner(threadLocalMap);
        }
    } else {
        remove();
    }
}

// InternalThreadLocalMap
public boolean setIndexedVariable(int index, Object value) {
    Object[] lookup = indexedVariables;
    if (index < lookup.length) {
        Object oldValue = lookup[index];
        lookup[index] = value;
        return oldValue == UNSET;
    } else {
        expandIndexedVariableTableAndSet(index, value);
        return true;
    }
}
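This amounts to writing the value straight into its slot in the array, growing the array first if the index is out of range.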
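The array-slot idea can be tried out without Netty. The sketch below is a hypothetical stand-in, not Netty's InternalThreadLocalMap: the class name IndexedSlots and the doubling growth policy are assumptions chosen for brevity, and indices are assumed to be non-negative.

```java
import java.util.Arrays;

/** Hypothetical index-addressed slot array; a stand-in, not Netty's InternalThreadLocalMap. */
final class IndexedSlots {
    /** Sentinel marking a slot that has never been written. */
    static final Object UNSET = new Object();

    private Object[] slots;

    IndexedSlots(int initialCapacity) {
        slots = new Object[initialCapacity];
        Arrays.fill(slots, UNSET);
    }

    /** Reads a slot with one bounds check and one array access; no hashing, no probing. */
    Object get(int index) {
        Object[] lookup = slots;
        return index < lookup.length ? lookup[index] : UNSET;
    }

    /** Writes a slot, growing the array if needed; returns true if the slot was UNSET before. */
    boolean set(int index, Object value) {
        if (index >= slots.length) {
            int oldLength = slots.length;
            // Assumed growth policy for this sketch: at least double, and large enough for index.
            int newLength = Math.max(index + 1, oldLength * 2);
            slots = Arrays.copyOf(slots, newLength);
            Arrays.fill(slots, oldLength, newLength, UNSET);
        }
        Object old = slots[index];
        slots[index] = value;
        return old == UNSET;
    }

    int capacity() {
        return slots.length;
    }

    public static void main(String[] args) {
        IndexedSlots slots = new IndexedSlots(4);
        System.out.println(slots.set(2, "x"));   // first write to slot 2
        System.out.println(slots.set(2, "y"));   // overwrite of slot 2
        System.out.println(slots.set(10, "z"));  // forces the array to grow
        System.out.println(slots.get(2) + " " + slots.get(10) + " capacity=" + slots.capacity());
    }
}
```

Because two keys never share an index, there is no collision case to handle; the only slow path is the occasional array growth.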
Summary
At this point the differences between the two are clear:
| Difference | ThreadLocal | FastThreadLocal |
|---|---|---|
| Map | ThreadLocalMap | InternalThreadLocalMap extends UnpaddedInternalThreadLocalMap |
| Thread | Thread | FastThreadLocalThread extends Thread |
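The main difference lies in how the internal map is handled. Neither class uses HashMap; each defines its own map structure and behavior. The earlier post "HashMap Source Code Analysis" described two ways a map can resolve collisions: separate chaining and linear probing. HashMap uses separate chaining, while ThreadLocal uses linear probing.
Linear probing is simple, but it has a drawback: entries whose hashes land near each other pile up into clusters. A collision that happens on insertion happens again on every later lookup of that key.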
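A tiny experiment makes the clustering effect concrete. The sketch below only tracks which slots are occupied; ClusteringDemo and its insert method are hypothetical names, and the table is assumed never to fill up.

```java
/** Illustrative demo of clustering under linear probing; tracks slot occupancy only. */
final class ClusteringDemo {
    /**
     * Claims the first free slot at or after the home slot of hash.
     * Returns the number of slots inspected. Assumes the table is never full.
     */
    static int insert(boolean[] used, int hash) {
        int mask = used.length - 1; // used.length must be a power of two
        int i = hash & mask;
        int probes = 1;
        while (used[i]) {
            i = (i + 1) & mask;
            probes++;
        }
        used[i] = true;
        return probes;
    }

    public static void main(String[] args) {
        boolean[] used = new boolean[16];
        System.out.println("hash 3: " + insert(used, 3) + " probe(s)"); // takes slot 3
        System.out.println("hash 3: " + insert(used, 3) + " probe(s)"); // spills to slot 4
        System.out.println("hash 3: " + insert(used, 3) + " probe(s)"); // spills to slot 5
        // No earlier key has home slot 4, yet this insert still has to walk the cluster.
        System.out.println("hash 4: " + insert(used, 4) + " probe(s)");
    }
}
```

The last insert needs three probes even though its own home slot was never claimed by a key with the same hash. Collisions on one slot therefore slow down operations on neighbouring slots as well.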
These collisions cost performance. FastThreadLocal takes a simpler route and indexes directly into an array:
public FastThreadLocal() {
    index = InternalThreadLocalMap.nextVariableIndex();
}
UnpaddedInternalThreadLocalMap
Object[] indexedVariables;

public static int nextVariableIndex() {
    int index = nextIndex.getAndIncrement();
    if (index < 0) {
        nextIndex.decrementAndGet();
        throw new IllegalStateException("too many thread-local indexed variables");
    }
    return index;
}
The whole map is just an array. Every FastThreadLocal is assigned its index when it is created, and in each thread its value is simply the array element at that index.
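Putting the two pieces together, a global index counter plus one array per thread, gives the following sketch. It is a hypothetical miniature, not Netty's implementation: IndexedLocal, its fixed CAPACITY, and the use of a plain java.lang.ThreadLocal to hold each thread's array are all assumptions made to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical miniature of an index-based thread-local; not Netty's FastThreadLocal. */
final class IndexedLocal<V> {
    private static final int CAPACITY = 32; // fixed size for brevity; array growth is omitted
    private static final Object UNSET = new Object();
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();

    // Assumption for this sketch: each thread's slot array lives in a plain ThreadLocal.
    private static final ThreadLocal<Object[]> SLOTS = ThreadLocal.withInitial(() -> {
        Object[] slots = new Object[CAPACITY];
        Arrays.fill(slots, UNSET);
        return slots;
    });

    /** Assigned once at construction; the same in every thread. */
    private final int index;

    IndexedLocal() {
        index = NEXT_INDEX.getAndIncrement();
        if (index >= CAPACITY) {
            throw new IllegalStateException("sketch supports at most " + CAPACITY + " instances");
        }
    }

    int index() {
        return index;
    }

    void set(V value) {
        SLOTS.get()[index] = value;
    }

    @SuppressWarnings("unchecked")
    V get() {
        Object v = SLOTS.get()[index];
        return v == UNSET ? null : (V) v;
    }

    /** Sets a value on two threads and returns {value seen by caller, value seen by other thread}. */
    static String[] demo() {
        IndexedLocal<String> local = new IndexedLocal<>();
        local.set("main");
        String[] seen = new String[2];
        Thread other = new Thread(() -> {
            local.set("other"); // same index, but a different thread's array
            seen[1] = local.get();
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[0] = local.get();
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("caller sees " + seen[0] + ", other thread sees " + seen[1]);
    }
}
```

Both threads use the same index, yet each reads back its own value, because the index selects a slot in whichever array belongs to the current thread.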