While runtime compilation has in practice been largely restricted to programming languages that execute on virtual machines, such as Java and C#, parallel OpenMP programs show many promising traits for efficient and effective runtime optimization. This work introduces stOMP, a specializing thread library for OpenMP. Using a combined compile-time and runtime system, stOMP specializes OpenMP parallel regions for frequently seen values and for the configuration of the runtime system. We present a detailed description of the system, focusing on the optimizations implemented and on the techniques used to minimize runtime overhead: a context-based hot-spot detector; a pruning mechanism that eliminates poorly behaved variables as specialization targets; several runtime optimization policies; and several code optimizations and transformations that enable further performance improvements. We evaluate our work on the SPEC OMP benchmark suite, showing a performance increase of up to 7.8%.