Add read_barrier() calls to pt_mapping_find().
This guards against a rather hypothetical scenario in which the
architecture uses a PTE format with the present bit in a different cache
line from the frame address field, and reorders the load of the present
bit after the load of the frame address despite the control dependency
between the two loads.
Most architectures are known not to reorder control-dependent loads,
but some newer variants of arm32 do.
Note that a read memory barrier would also have to be used in every
lock-free caller of page_mapping_find() to order the read of the present
bit before reads of the rest of the PTE content. Fortunately, such
callers are limited to architecture-dependent TLB-miss handlers, as
generic code always uses the locked variant. Since arm32 has
hardware-walked page tables, it does not call page_mapping_find() and
thus nothing has to be changed there.
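
For illustration only, the ordering described above can be sketched
roughly as follows. This is not the actual patch: pte_sketch_t,
read_barrier_sketch() and pte_lookup_sketch() are hypothetical names,
the PTE layout is invented, and a C11 acquire fence merely stands in for
the kernel's read_barrier().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE layout: the present flag and the frame address are
 * separate fields that could, in principle, sit in different cache lines. */
typedef struct {
	uintptr_t frame_address;
	bool present;
} pte_sketch_t;

/* Assumed stand-in for the kernel's read_barrier(): an acquire fence keeps
 * loads issued before it ordered ahead of loads issued after it. */
static inline void read_barrier_sketch(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Returns true and stores the frame address if the PTE is present. */
static bool pte_lookup_sketch(const volatile pte_sketch_t *pte,
    uintptr_t *frame)
{
	/* First load: the present bit. */
	if (!pte->present)
		return false;

	/* The control dependency alone is not trusted to order the loads. */
	read_barrier_sketch();

	/* Second load: the frame address, read only after the present bit. */
	*frame = pte->frame_address;
	return true;
}
```

A lock-free caller would need the same pattern between its own check of
the present bit and any later read of the PTE content.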