...
The examples above show that the TLSDESC access mode is a significant optimization over GD for TLS variables that live in the static TLS block. In TLSDESC mode the GOT entry directly yields the variable's offset in the static TLS block, while GD has to go through the dtv array to calculate the address.
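The difference between the two access paths can be modeled in a few lines. This is a simplified user-space sketch, not bionic's code: `resolve_static`, `tlsdesc_access`, `gd_access` and the `Dtv` layout are hypothetical names, the thread pointer is just a `char*` into an ordinary buffer, and module IDs are assumed to start at 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a TLSDESC GOT pair: a resolver and its argument.
struct TlsDescriptor {
  size_t (*resolver)(const TlsDescriptor*);
  size_t arg;
};

// Static-TLS case: the argument already is the offset from the thread
// pointer, so the resolver has nothing to look up.
inline size_t resolve_static(const TlsDescriptor* desc) { return desc->arg; }

// TLSDESC access: thread pointer plus whatever the resolver returns.
inline char* tlsdesc_access(char* thread_pointer, const TlsDescriptor& desc) {
  return thread_pointer + desc.resolver(&desc);
}

// Hypothetical model of the GD inputs: a (module, offset) pair and a dtv.
struct TlsIndex {
  size_t module_id;
  size_t offset;
};
struct Dtv {
  std::vector<char*> modules;  // one TLS block pointer per module
};

// GD access: one extra indirection through the dtv (assumes IDs start at 1).
inline char* gd_access(const Dtv& dtv, const TlsIndex& ti) {
  assert(ti.module_id >= 1 && ti.module_id <= dtv.modules.size());
  return dtv.modules[ti.module_id - 1] + ti.offset;
}
```

Both helpers end up at the same address for the same variable; the TLSDESC path simply gets there without touching the dtv.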
...
TLS Initialization for Dynamically Loaded Libraries
In bionic, when dynamically loading a library through dlopen, the initialization steps for TLS are as follows:
In the do_dlopen -> find_library -> soinfo::register_soinfo_tls -> register_tls_module flow, the module ID is obtained and added to the TlsModules variable.
In the soinfo::relocate -> plain_relocate -> plain_relocate_impl -> process_relocation -> process_relocation_impl flow, relocation initializes the GOT entries of the TLS variables.
For GD (Global Dynamic): The relocation types include R_AARCH64_TLS_DTPMOD64 and R_AARCH64_TLS_DTPREL64. The adjacent GOT entries are initialized with the module ID and the variable's offset within its TLS segment.
For LD (Local Dynamic): On AArch64, LD is implemented the same way as GD.
For IE (Initial Exec) / LE (Local Exec): Not supported.
For TLSDESC: The relocation type is R_AARCH64_TLSDESC. The two adjacent GOT entries are initialized with the address of the tlsdesc_resolver_dynamic function and the address of a TlsDynamicResolverArg variable.
a. The TlsIndex in TlsDynamicResolverArg is initialized with the module ID and the offset of the TLS variable within its TLS program segment. The update flag (generation) is initialized with the library's generation, which indicates whether the dynamic library has been updated.
b. The TlsDynamicResolverArg variables are stored in the soinfo::tlsdescargs array. Because this array may be reallocated as it grows, Relocator::deferred_tlsdesc_relocs buffers the relocation information, and the GOT entries of the TLS variables are only updated once all relocations for the library have completed.
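The reason for deferring the TLSDESC fix-ups in step b can be shown with a small model. This is a sketch under assumptions, not the linker's code: the `Relocator` here is heavily reduced, `process_tlsdesc` and `finish` are hypothetical names, and the descriptor is modeled as two plain pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct TlsIndex {
  size_t module_id;
  size_t offset;
};

// Simplified stand-ins for the types named in the text.
struct TlsDynamicResolverArg {
  size_t generation;
  TlsIndex index;
};
struct TlsDescriptor {  // hypothetical model of the two GOT words
  const void* resolver;
  const TlsDynamicResolverArg* arg;
};

struct Relocator {  // hypothetical, heavily reduced
  std::vector<TlsDynamicResolverArg> tlsdesc_args;
  std::vector<std::pair<TlsDescriptor*, size_t>> deferred_tlsdesc_relocs;

  // Called once per TLSDESC relocation: record the argument, but do not
  // take its address yet, because push_back may move the whole array.
  void process_tlsdesc(TlsDescriptor* slot, size_t generation, TlsIndex index) {
    tlsdesc_args.push_back({generation, index});
    deferred_tlsdesc_relocs.push_back({slot, tlsdesc_args.size() - 1});
  }

  // Called after all relocations: the array no longer grows, so the
  // addresses written into the GOT slots stay valid.
  void finish(const void* resolver) {
    for (const auto& reloc : deferred_tlsdesc_relocs) {
      assert(reloc.second < tlsdesc_args.size());
      reloc.first->resolver = resolver;
      reloc.first->arg = &tlsdesc_args[reloc.second];
    }
    deferred_tlsdesc_relocs.clear();
  }
};
```

Storing an index instead of a pointer during the first pass is the design choice that matters: an address taken before the last `push_back` could dangle after a reallocation.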
Initialization of TLS during Thread Creation
When a thread is created with pthread_create, all TLS data structures of the main program must be copied for the new thread (pthread_create -> __allocate_thread). The steps are as follows:
__allocate_thread_mapping is called to allocate the thread's stack mapping, which also contains the static TLS block. The regions are allocated in this order: stack guard, stack, static TLS, guard page.
__init_static_tls is called to copy the TLS segment contents of the libraries recorded in the TlsModules variable into the static TLS block.
__init_tcb is called to update the bionic_tcb (thread control block).
__init_tcb_dtv is called to initialize TLS_SLOT_DTV in the bionic_tcb, with its generation set to 0.
__init_bionic_tls_ptrs is called to update the bionic_tls address.
clone is called with the address of the static TLS block, and the kernel sets the thread pointer register tpidr_el0 accordingly.
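The copy performed in the __init_static_tls step can be sketched as follows. This is a minimal model, assuming a zero-filled block and simplified `TlsSegment` / `TlsModule` structs; the `static_offset` field and the use of `SIZE_MAX` to mark a module without static TLS are assumptions of this sketch, and locking is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct TlsSegment {
  size_t size;           // total size, including zero-initialized data
  const void* init_ptr;  // initialization image
  size_t init_size;      // number of image bytes to copy
};

struct TlsModule {
  TlsSegment segment;
  size_t static_offset;  // offset in the static TLS block, or SIZE_MAX
};

// Copies each static module's initialization image into a fresh, zero-filled
// static TLS block. Modules without a static offset are skipped.
inline void init_static_tls(char* static_tls, const std::vector<TlsModule>& modules) {
  for (const TlsModule& module : modules) {
    if (module.static_offset == SIZE_MAX) continue;  // dynamic TLS only
    if (module.segment.init_size == 0) continue;     // nothing to copy
    assert(module.segment.init_size <= module.segment.size);
    std::memcpy(static_tls + module.static_offset, module.segment.init_ptr,
                module.segment.init_size);
  }
}
```

Only `init_size` bytes are copied; the rest of each segment keeps the zeros that a freshly mapped block already contains.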
Implementation of the __tls_get_addr Function
The GD (Global Dynamic) and LD (Local Dynamic) access modes use the __tls_get_addr function to obtain the absolute address of a TLS variable. The function may have to update the dtv (Dynamic Thread Vector), and whether it does is controlled by three generation values:
Global generation: Stored in __libc_tls_generation_copy, which is a copy of TlsModules::generation. Each time a dynamic library with a TLS program segment is added, this value is incremented to signal the addition. Library removal does not need to be handled here.
Generation in the dtv array: Stored in the first element of the array and initialized to 0. Each time the dtv array is updated, this value is set to the global generation at that moment. If it differs from the global generation, a new dynamic library has been loaded and the dtv array must be updated.
Generation of the dynamic library: Stored in TlsModule::first_generation and initialized to the global generation at the time the library is loaded. It is used to determine whether the library a dtv entry points to has changed, that is, whether the entry belongs to an old library.
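How the three values interact can be sketched with a small model. This is an illustration under assumptions, not bionic's update_tls_dtv: `register_module`, `unregister_module`, `update_dtv` and `kGenerationNone` are hypothetical names, module IDs are assumed to be slot index plus one, and locking and atomics are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr size_t kGenerationNone = 0;  // hypothetical marker for "unused"

struct TlsModule {
  size_t first_generation = kGenerationNone;  // global generation at load time
};

struct TlsModules {
  size_t generation = kGenerationNone;  // global generation
  std::vector<TlsModule> table;

  // Loading a library with a TLS segment bumps the global generation and
  // stamps the module with it. Freed slots are reused. Returns the module ID.
  size_t register_module() {
    size_t idx = 0;
    while (idx < table.size() && table[idx].first_generation != kGenerationNone) ++idx;
    if (idx == table.size()) table.emplace_back();
    table[idx].first_generation = ++generation;
    return idx + 1;
  }

  // Unloading only frees the slot; the global generation is left alone.
  void unregister_module(size_t module_id) {
    assert(module_id >= 1 && module_id <= table.size());
    table[module_id - 1].first_generation = kGenerationNone;
  }
};

struct TlsDtv {
  size_t generation = kGenerationNone;          // starts at 0
  std::vector<std::unique_ptr<char[]>> blocks;  // per-module TLS blocks
};

// Brings a thread's dtv up to date. Returns true if an update was needed.
inline bool update_dtv(TlsDtv& dtv, const TlsModules& modules) {
  if (dtv.generation == modules.generation) return false;
  dtv.blocks.resize(modules.table.size());
  for (size_t i = 0; i < dtv.blocks.size(); ++i) {
    const size_t first = modules.table[i].first_generation;
    // A block is stale if its slot is unused now, or was (re)filled by a
    // library loaded after this dtv was last updated.
    if (first == kGenerationNone || first > dtv.generation) dtv.blocks[i].reset();
  }
  dtv.generation = modules.generation;
  return true;
}
```

In this model an unload followed by a load into the same slot is caught by the `first > dtv.generation` comparison, which is why the unload itself does not need to bump the global generation.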
Code block |
---|
struct TlsIndex {
  size_t module_id;
  size_t offset;
};
// ti (a TlsIndex) is stored in the GOT of the dynamic library. It is initialized during relocation and occupies two GOT entries.
extern "C" void* __tls_get_addr(const TlsIndex* ti) {
  // Get the calling thread's dtv.
  TlsDtv* dtv = __get_tcb_dtv(__get_bionic_tcb());
  // Load the global generation.
  size_t generation = atomic_load(&__libc_tls_generation_copy);
  if (__predict_true(generation == dtv->generation)) {
    void* mod_ptr = dtv->modules[__tls_module_id_to_idx(ti->module_id)];
    if (__predict_true(mod_ptr != nullptr)) {
      // Fast path: no library has been added and the module's block is already allocated, so return the variable's address.
      return static_cast<char*>(mod_ptr) + ti->offset + TLS_DTV_OFFSET;
    }
    // The dynamic TLS block is allocated lazily, on the first access to one of the library's TLS variables, so fall through to the slow path.
  }
  // A library was added or this is the first access: allocate and initialize the dtv and the dynamic TLS block.
  return tls_get_addr_slow_path(ti);
} |
The tls_get_addr_slow_path function allocates and initializes the dtv (Dynamic Thread Vector) and the dynamic TLS block.
Code block |
---|
__attribute__((noinline)) static void* tls_get_addr_slow_path(const TlsIndex* ti) {
  TlsModules& modules = __libc_shared_globals()->tls_modules;
  bionic_tcb* tcb = __get_bionic_tcb();
  // Block signals for the duration of this scope.
  ScopedSignalBlocker ssb;
  // Take the write lock so that multiple threads cannot modify the global __libc_shared_globals()->tls_modules at the same time.
  ScopedWriteLock locker(&modules.rwlock);
  // Update the dtv array, reallocating it if necessary.
  update_tls_dtv(tcb);
  TlsDtv* dtv = __get_tcb_dtv(tcb);
  const size_t module_idx = __tls_module_id_to_idx(ti->module_id);
  void* mod_ptr = dtv->modules[module_idx];
  if (mod_ptr == nullptr) {
    // The module's dynamic TLS block does not exist yet: allocate it, copy the initialization image of the library's TLS program segment into it, and record the pointer in the dtv.
    const TlsSegment& segment = modules.module_table[module_idx].segment;
    mod_ptr = __libc_shared_globals()->tls_allocator.memalign(segment.alignment, segment.size);
    if (segment.init_size > 0) {
      memcpy(mod_ptr, segment.init_ptr, segment.init_size);
    }
    dtv->modules[module_idx] = mod_ptr;
    // Report the allocation to the listener, if any.
    if (modules.on_creation_cb != nullptr) {
      modules.on_creation_cb(mod_ptr, static_cast<void*>(static_cast<char*>(mod_ptr) + segment.size));
    }
  }
  return static_cast<char*>(mod_ptr) + ti->offset + TLS_DTV_OFFSET;
} |
...