Fork us on GitHub Follow us on Facebook Follow us on Twitter

Opened 7 years ago

Closed 7 years ago

#491 closed defect (fixed)

arm32: Loader issue with -march=armv5 and ALIGN(., 8) in the linker script

Reported by: Jan Vesely Owned by: Jiri Svoboda
Priority: minor Milestone: 0.6.0
Component: helenos/srv/loader Version: mainline
Keywords: Cc:
Blocker for: Depends on:
See also:

Description

Using -march=armv5 to compile integratorcp target (attached armv5.patch):
Task init:ns (18107582119936) killed due to an exception at program counter 0x00000002.
r0 =0x00000000 r1 =0x00000000 r2 =0xf0006000 r3 =0x00000000
r4 =0x00000000 r5 =0x00000000 r6 =0x00000000 r7 =0x00000000
r8 =0x00000000 r9 =0x00000000 r10=0x00000000 fp =0x00000000
r12=0x00000000 sp =0x80000000 lr =0x00000000 spsr=0x60000050
Kill message: Page fault: 0x0.

All initial tasks fail for the same reason.
Program counter is bogus (see #490), but gdb and ns.disassm mismatch:
ns.disassm
00001078 <entry>:

1078: e59f0018 ldr r0, [pc, #24] ; 1098 <entry+0x20>
107c: e5802000 str r2, [r0]
1080: e3a0b000 mov fp, #0
1084: e1a0c00d mov ip, sp

gdb:
⇒ 0x00001078 <+0>: str r2, [r0]

0x0000107c <+4>: mov r11, #0
0x00001080 <+8>: mov r12, sp

Attachments (6)

armv5.patch (1.1 KB) - added by Jan Vesely 7 years ago.
Use -march=armv5
align.patch (355 bytes) - added by Jan Vesely 7 years ago.
Drop explicit alignment from linker script
nsv4.elfdump (5.2 KB) - added by Jan Vesely 7 years ago.
srv/ns elf dump (-march=armv4)
nsv5-broken.elfdump (5.2 KB) - added by Jan Vesely 7 years ago.
srv/ns elf dump (-march=armv5)
nsv5-align-entry.elfdump (5.2 KB) - added by Jan Vesely 7 years ago.
srv/ns elf dump (-march=armv5), fixed by adding .align 8 line to entry.s
nsv5-linker.elfdump (5.2 KB) - added by Jan Vesely 7 years ago.
srv/ns elf dump (-march=armv5), fixed by removing align line from linker script

Download all attachments as: .zip

Change History (22)

Changed 7 years ago by Jan Vesely

Attachment: armv5.patch added

Use -march=armv5

comment:1 Changed 7 years ago by Jan Vesely

dropping line:
/* Make sure the code is aligned reasonably */

. = ALIGN(., 8);

from the linker script fixes the issue. and integratorcp runs compiled for both armv4 and armv5.
(see align.patch)

Changed 7 years ago by Jan Vesely

Attachment: align.patch added

Drop explicit alignment from linker script

comment:2 Changed 7 years ago by Jiří Zárevúcky

This line has been added very recently in http://trac.helenos.org/changeset/mainline%2C1582
Was it added to solve any actual problem?

Last edited 7 years ago by Jiří Zárevúcky (previous) (diff)

comment:3 Changed 7 years ago by Jakub Jermář

The fix in mainline,1582 was probably a follow-up on mainline,1574, which was a follow-up on mainline,1564, which fixed #430.

comment:4 Changed 7 years ago by Jakub Jermář

Quoting from ARM ELF specification:

Under the ARM EABI, a loadable section is aligned on a 4-byte boundary.

Changed 7 years ago by Jan Vesely

Attachment: nsv4.elfdump added

srv/ns elf dump (-march=armv4)

Changed 7 years ago by Jan Vesely

Attachment: nsv5-broken.elfdump added

srv/ns elf dump (-march=armv5)

Changed 7 years ago by Jan Vesely

Attachment: nsv5-align-entry.elfdump added

srv/ns elf dump (-march=armv5), fixed by adding .align 8 line to entry.s

Changed 7 years ago by Jan Vesely

Attachment: nsv5-linker.elfdump added

srv/ns elf dump (-march=armv5), fixed by removing align line from linker script

comment:5 in reply to:  4 Changed 7 years ago by Jan Vesely

Replying to jermar:

Quoting from ARM ELF specification:

Under the ARM EABI, a loadable section is aligned on a 4-byte boundary.

does that mean using ALIGN(8) is incorrect?
do we need that line, shouldn't linker follow target's eabi without it?

8-byte aligned also means 4-byte aligned so why does it break things?
compiling without ALIGN(8) results in entry section like this (ns.disassm):
00001074 <
entry>:

1074: e59f0018 ldr r0, [pc, #24] ; 1094 <entry+0x20>
1078: e5802000 str r2, [r0]

comment:6 Changed 7 years ago by Jiri Svoboda

If I recall correctly, this started with a problem of GNU ld vs. the loader. The loader binary is 'unusual' because it has an interpreter section. Originally this section was not loadable. GNU ld didn't like this for some reason and created an incorrect binary (with garbage interp section) - probably because it though it is not useful. I believe Martin D. tried to fix this by making that section part of the text segment(?) which caused alignment issues hence the added explicit alignment.

I believe that using explicit alignment should be avoided as each section should already come annotated with its correct alignment. I think we should switch to a different way of marking the loader to avoid problems in the future.

comment:7 Changed 7 years ago by Martin Decky

Yes, the previous comment basically summarizes the history. The .interp section needs to be in the text segment in order to not be garbage collected by the linker. But because the size of the .interp section is not necessary a multiple of 4 bytes or 8 bytes, it can disrupt the alignment of the following sections in the text segment what contain real code. That's the reason why I have added the explicit ". = ALIGN(., 8)" statement, but there should be a better/cleaner way to solve it.

On the other hand, I am still quite puzzled what is wrong with the ". = ALIGN(., 8)" statement in the first place. If the ARM code needs a 4-byte alignment, then an 8-byte alignment should work as well. What am I missing?

comment:8 Changed 7 years ago by Jan Vesely

First thing that I noticed is that nsv5-broken has .p_align = 4…

I think this is not a linker problem but rather helenos elf_loader problem.
nsv5-broken has these header values:

.p_type = 1 , /* [PT_LOAD] */
.p_offset = 116 ,
.p_vaddr = 0x1078 ,
.p_paddr = 0x1078 ,

I think that the offset (116 = 0x74) not being the same as offset of the starting address from beginning of the page is the culprit here.

If I understand the loading mechanism, it will load entire page. In this case it will result in the first instruction being at 0x1074 instead of 0x1078, that matches observed behaviour.
All working wlfdumps have p_offset in sync with starting address ⇒ offset from the start of the file is the same as offset from the start of the page.

comment:9 Changed 7 years ago by Martin Decky

Component: helenos/lib/chelenos/srv/loader
Owner: changed from Jakub Jermář to Jiri Svoboda
Priority: majorminor
Summary: arm32: Using -march higher than armv4 results in a crash.arm32: Loader issue with -march=armv5 and ALIGN(., 8) in the linker script

Indeed, it looks like a loader problem to me which is only manifested by the ALIGN(., 8) in the linker script.

As of mainline,1648 the problem cannot be observed even with -march=armv5 (as there is no ALIGN statement in the linker script any more), therefore I am lowering the priority of this ticket to minor. However, it still needs to be fixed properly.

comment:10 in reply to:  9 ; Changed 7 years ago by Jakub Jermář

Replying to decky:

Indeed, it looks like a loader problem to me which is only manifested by the ALIGN(., 8) in the linker script.

Well, srv/loader is not involved in this in any way as the loaded task at hand is an init task ns.

Supporting Jano's observations, I also think this is not an alignment issue, but rather a congruency issue. By using ALIGN(., 8) to increase . (i.e. load address) to the next alignment of 8, we are basically breaking the congruency required by our loaders (I believe both elf_backend and uspace loader), because we are not aligning the file offset at the same time (how would that be achieved)? Anyway, there is a mention of this requirement in Jiri's thesis, section 5.2.4:

The file offset and memory address specify, where the section begins in the file
and in virtual memory, respectively. These two numbers are congruent modulo page
size. Consequently, the section data can be directly mapped from the file into virtual
memory.

comment:11 in reply to:  10 Changed 7 years ago by Jakub Jermář

Replying to jermar:

Anyway, there is a mention of this requirement in Jiri's thesis, section 5.2.4:

The file offset and memory address specify, where the section begins in the file
and in virtual memory, respectively. These two numbers are congruent modulo page
size. Consequently, the section data can be directly mapped from the file into virtual
memory.

Btw, this is not a HelenOS specific requirement, the ELF specification in the chapters on Program Headers and Program Loading states the same thing.

comment:12 in reply to:  10 Changed 7 years ago by Martin Decky

Well, srv/loader is not involved in this in any way as the loaded task at hand is an init task ns.

Sorry, I meant kernel loader, but we don't have a component for that yet.

comment:13 Changed 7 years ago by Jan Vesely

I hoped to get this fixed before merging bbxm branch as it uses arch specific options (armv4,5,7).
if the pointer/offset congruency is mandated behaviour than this looks like ld bug.

comment:14 Changed 7 years ago by Jakub Jermář

ld is just blindly following instructions in the linker script. It is also possible, maybe even more likely, that we are writing our linker scripts in a wrong way. For the starters, for section alignments, we may want to use the ALIGN attribute of the SECTION declaration:

SECTION [ADDRESS] [(TYPE)] :
       [AT(LMA)]
       [ALIGN(SECTION_ALIGN)]
       [SUBALIGN(SUBSECTION_ALIGN)]
       [CONSTRAINT]
       {
         OUTPUT-SECTION-COMMAND
         OUTPUT-SECTION-COMMAND
         ...
       } [>REGION] [AT>LMA_REGION] [:PHDR :PHDR ...] [=FILLEXP]

comment:15 Changed 7 years ago by Jan Vesely

oh, I thought ld would at least print a warning if the result breaks specs…nvm

From what I have found out ALIGN changes only section(not code) alignment (memory locations), it does not modify file offsets and AFAICT there is no way to do it in a linker script. It's the other way around, and the linker script should set addresses so that they correspond to file offsets.

I was able to replicate the same crash on amd64 when align was set to 512 (the offset is 256 so lower values won't do). The reason it worked for armv4 is that gcc -march=armv4 produces 8B aligned code and ALIGN did not modify the address.

On the other hand, using .align (.align 3 in this case) directive in entry.s and -march=armv5 resulted in 8B alignment of both file offset and memory address.

I suggest removing the ALIGN lines from all linker scripts as they don't work as intended and may cause problems.

comment:16 Changed 7 years ago by Jan Vesely

Resolution: fixed
Status: newclosed

This was resolved in changeset:mainline,1648

Note: See TracTickets for help on using tickets.