24.22.6.1.2. gem5 O3ThreadContext
Instantiation happens in the FullO3CPU constructor:
FullO3CPU<Impl>::FullO3CPU(DerivO3CPUParams *params)
    for (ThreadID tid = 0; tid < this->numThreads; ++tid) {
        if (FullSystem) {
            // SMT is not supported in FS mode yet.
            assert(this->numThreads == 1);
            this->thread[tid] = new Thread(this, 0, NULL);
        // Setup the TC that will serve as the interface to the threads/CPU.
        O3ThreadContext<Impl> *o3_tc = new O3ThreadContext<Impl>;
and the SimObject DerivO3CPU simply derives from a FullO3CPU instantiation:
class DerivO3CPU : public FullO3CPU<O3CPUImpl>
O3ThreadContext is a template class:
template <class Impl> class O3ThreadContext : public ThreadContext
The only Impl in use appears to be O3CPUImpl, which is explicitly instantiated in the source:
template class O3ThreadContext<O3CPUImpl>;
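The pattern above, a class template parameterized by an Impl traits struct and then explicitly instantiated for the single Impl in use, can be illustrated in isolation. The following is a minimal sketch and not gem5 code: ToyImpl, ToyCPU and ToyThreadContext are hypothetical names made up for this example, and the four-entry register array is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only, not gem5 classes.
template <class Impl> class ToyCPU;

// Traits struct in the style of O3CPUImpl: it names the concrete CPU type.
struct ToyImpl
{
    typedef ToyCPU<ToyImpl> CPU;
};

// The CPU owns the register storage.
template <class Impl>
class ToyCPU
{
  public:
    uint64_t readReg(int idx) const
    {
        assert(idx >= 0 && idx < 4);
        return regs[idx];
    }

    void setReg(int idx, uint64_t val)
    {
        assert(idx >= 0 && idx < 4);
        regs[idx] = val;
    }

  private:
    uint64_t regs[4] = {0, 0, 0, 0};
};

// The thread context holds no register data, only a pointer to the CPU.
template <class Impl>
class ToyThreadContext
{
  public:
    typedef typename Impl::CPU CPU;

    explicit ToyThreadContext(CPU *cpu) : cpu(cpu) {}

    uint64_t readReg(int idx) const;

  private:
    CPU *cpu;
};

// Out-of-class member definition, forwarding to the CPU.
template <class Impl>
uint64_t
ToyThreadContext<Impl>::readReg(int idx) const
{
    return cpu->readReg(idx);
}

// Explicit instantiation: emit code for the one Impl that is used.
template class ToyCPU<ToyImpl>;
template class ToyThreadContext<ToyImpl>;
```

With explicit instantiation, the member definitions can live in a single translation unit while other files only see the declarations, which is presumably why the source does it for O3CPUImpl.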
Unlike SimpleThread, however, O3ThreadContext does not contain the register data itself. For example, O3ThreadContext::readIntRegFlat forwards to cpu instead:
template <class Impl>
RegVal
O3ThreadContext<Impl>::readIntRegFlat(RegIndex reg_idx) const
{
    return cpu->readArchIntReg(reg_idx, thread->threadId());
}
where:
    typedef typename Impl::O3CPU O3CPU;
    /** Pointer to the CPU. */
    O3CPU *cpu;
and:
struct O3CPUImpl
{
    /** The O3CPU type to be used. */
    typedef FullO3CPU<O3CPUImpl> O3CPU;
and at long last FullO3CPU is where the register values are actually read, through its regFile member:
template <class Impl>
RegVal
FullO3CPU<Impl>::readArchIntReg(int reg_idx, ThreadID tid)
{
    intRegfileReads++;
    PhysRegIdPtr phys_reg = commitRenameMap[tid].lookup(
            RegId(IntRegClass, reg_idx));
    return regFile.readIntReg(phys_reg);
}
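Our guess is that this difference from SimpleThread comes from register renaming in the out-of-order implementation: an architectural register has no fixed storage location, so each read must first look up the current physical register in commitRenameMap and only then read the value from regFile.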
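To make the renaming argument concrete, here is a minimal sketch of an architectural-to-physical lookup. It is a toy model under loud assumptions, not gem5's implementation: ToyRenamedRegs and all of its members are hypothetical names, physical registers are never freed, and there is no speculation or separate commit stage.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model for illustration only, not gem5 code.
class ToyRenamedRegs
{
  public:
    ToyRenamedRegs(int numArch, int numPhys)
        : commitMap(numArch), physRegs(numPhys, 0), nextFree(numArch)
    {
        assert(numPhys >= numArch);
        // Initially architectural register i lives in physical register i.
        for (int i = 0; i < numArch; ++i)
            commitMap[i] = i;
    }

    // Writing an architectural register allocates a fresh physical
    // register and repoints the map at it (simplification: no freeing).
    void writeArchReg(int archIdx, uint64_t val)
    {
        assert(nextFree < static_cast<int>(physRegs.size()));
        int phys = nextFree++;
        physRegs[phys] = val;
        commitMap.at(archIdx) = phys;
    }

    // Map an architectural index to its current physical index.
    int lookup(int archIdx) const { return commitMap.at(archIdx); }

    // Reading goes through the map first, then the physical register file.
    uint64_t readArchReg(int archIdx) const
    {
        return physRegs[lookup(archIdx)];
    }

  private:
    std::vector<int> commitMap;      // architectural -> physical index
    std::vector<uint64_t> physRegs;  // physical register storage
    int nextFree;                    // next unallocated physical register
};
```

In this toy, each write to the same architectural register lands in a different physical slot, so a thread context that stored a fixed per-register value would have nothing stable to point at. Forwarding every read to the owner of the map and the register file sidesteps that.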