arm - How to debug an AArch64 translation fault?
I am writing a simple kernel for ARMv8 (AArch64).
MMU config:
- 48 VA bits (T1SZ = 64 - 48 = 16)
- 4K page size
- all physical RAM is flat-mapped into kernel virtual memory (on TTBR1_EL1). The MMU is active with TTBR0_EL1 = 0, so I'm only using addresses of the form 0xffff<addr>, all flat-mapped onto physical memory.
I'm mapping a new address space (starting at 1<<40) to a free physical region. When I try to access address 1<<40, I get an exception (of type "EL1 using SP1, synchronous"):
ESR_EL1=0x96000044 FAR_EL1=0xffff010000000000
Inspecting the other registers, I have:
TTBR1_EL1=0x82000000 TTBR1_EL1[2]=0x0000000082003003
So, based on the ARM Architecture Reference Manual ARMv8 (ARMv8-A profile):
- The ESR (Exception Syndrome Register) decodes as: exception class = 0b100101 (data abort without a change in exception level), see pages D7-1933 sq.; WnR = 1 (the faulting instruction was a write); DFSC = 0b000100 (translation fault at level 0), see page D7-1958.
- FAR_EL1 is the faulting address; it indicates that TTBR1_EL1 is used (since the high bits are all 1). VA bits [47:39], the top 9 bits of the 48-bit address, are 0b000000010, which means entry 2 of the level 0 table is used.
- Entry 2 in the table points to a next-level table (low bits 0b11) at physical address 0x82003000.
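The decoding in the list above can be reproduced with a few bit-field helpers. This is only a sketch: the function names are made up for illustration, and the field positions are the ESR_EL1 and 4K-granule descriptor layouts as I understand them from the ARMv8-A manual.

```c
#include <assert.h>
#include <stdint.h>

/* ESR_EL1 fields used in the question. */
static inline unsigned esr_ec(uint64_t esr)   { return (unsigned)((esr >> 26) & 0x3f); } /* EC, bits [31:26] */
static inline unsigned esr_wnr(uint64_t esr)  { return (unsigned)((esr >> 6) & 0x1); }   /* WnR, bit 6 (data aborts) */
static inline unsigned esr_dfsc(uint64_t esr) { return (unsigned)(esr & 0x3f); }         /* DFSC, bits [5:0] */

/* DFSC 0b0001LL encodes a translation fault at level LL. */
static inline int dfsc_is_translation_fault(unsigned dfsc)
{
    return (dfsc & 0x3c) == 0x04;
}

static inline unsigned dfsc_fault_level(unsigned dfsc)
{
    assert(dfsc_is_translation_fault(dfsc)); /* level is only meaningful for translation faults */
    return dfsc & 0x3;
}

/* Level 0-2 descriptors (4K granule): bits [1:0] == 0b11 means "table",
 * and bits [47:12] hold the physical address of the next-level table. */
static inline int desc_is_table(uint64_t desc)
{
    return (desc & 0x3) == 0x3;
}

static inline uint64_t desc_table_addr(uint64_t desc)
{
    return desc & 0x0000fffffffff000ULL;
}
```

With the values from the question, esr_ec(0x96000044) gives 0x25 (0b100101), esr_wnr gives 1, esr_dfsc gives 0x04, and desc_table_addr(0x0000000082003003) gives 0x82003000.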
So the translation fails at level 0, where it should not.
My question is: am I doing something wrong? Am I missing any information that could lead to a translation fault? And, more generally, how do I debug a translation fault?
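One generic debugging aid is to redo the table walk in software and compare its result with the level reported in DFSC. The sketch below assumes a 4K granule and 48-bit VAs as configured above; `struct phys_window`, `read_desc` and `soft_walk` are hypothetical names, and the window stands in for whatever flat mapping gives the CPU access to the table memory.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_VALID 0x1ULL
#define DESC_TABLE 0x2ULL                /* with VALID: table at levels 0-2, page at level 3 */
#define DESC_ADDR  0x0000fffffffff000ULL /* address bits [47:12] */

/* Hypothetical view of physical memory through a CPU-visible mapping. */
struct phys_window {
    const uint64_t *base; /* pointer corresponding to pa_start */
    uint64_t pa_start;    /* first physical address covered */
    uint64_t size;        /* window size in bytes */
};

static inline uint64_t read_desc(const struct phys_window *w, uint64_t pa)
{
    assert(pa >= w->pa_start && pa + 8 <= w->pa_start + w->size);
    return w->base[(pa - w->pa_start) / 8];
}

/* Table index for a 4K granule: level 0 uses VA[47:39], level 3 uses VA[20:12]. */
static inline unsigned va_index(uint64_t va, int level)
{
    assert(level >= 0 && level <= 3);
    return (unsigned)((va >> (39 - 9 * level)) & 0x1ff);
}

/* Returns the level (0-3) at which the walk hits an invalid or reserved
 * descriptor, or -1 if it reaches a block or page, in which case *pa_out
 * receives the translated physical address. */
static inline int soft_walk(uint64_t ttbr, uint64_t va,
                            const struct phys_window *w, uint64_t *pa_out)
{
    uint64_t table = ttbr & DESC_ADDR;

    for (int level = 0; level <= 3; level++) {
        uint64_t desc = read_desc(w, table + 8ULL * va_index(va, level));
        int table_bit = (desc & DESC_TABLE) != 0;

        if (!(desc & DESC_VALID))
            return level;
        if (level < 3 && table_bit) {
            table = desc & DESC_ADDR; /* descend to the next-level table */
            continue;
        }
        if ((level == 3 && !table_bit) || level == 0)
            return level; /* reserved encodings: no level 0 blocks with a 4K granule */

        /* Block (level 1-2) or page (level 3). */
        uint64_t offset_mask = (1ULL << (12 + 9 * (3 - level))) - 1;
        *pa_out = (desc & DESC_ADDR & ~offset_mask) | (va & offset_mask);
        return -1;
    }
    return 3; /* not reached */
}
```

Note that such a walk reads the tables through the CPU's data path, caches included. If it succeeds while the hardware still faults, the descriptors are probably correct and the hardware walker is simply not seeing them, which points toward a cache or TLB maintenance issue rather than a wrong table layout.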
Update:
Everything works when I write the tables before enabling the MMU.
Whenever I write the tables after enabling the MMU (via the flat-mapped table region), the mapping never works. I wonder why this happens.
I also tried manually writing the selected tables (to rule out any side effect of my mapping function): same result (when the writes are done before the MMU is turned on, it works; after, it fails).
I tried issuing TLBI and DSB SY instructions, followed by an ISB, without effect. Only 1 CPU is running at this time, so caching should not be a problem: write instructions and the MMU talk to the same caches (but I'll test that next).
It turns out I overlooked caching issues within a single core. The problem was that, after turning the MMU on, the CPU and the table walk unit did not have the same view of memory. The ARMv8 Cortex-A programming guide states that the cache has to be cleaned/invalidated to the point of unification (the point where a single core's views of memory agree) after modifying the tables.
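The maintenance described above could look roughly like the following. This is a sketch rather than my actual kernel code: the helper names are invented, the line size is passed in by the caller (on real hardware it would come from CTR_EL0.DminLine), and the instructions are compiled out when not building for AArch64 so the loop logic can be checked on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clean one data cache line, by VA, to the point of unification.
 * Compiles to nothing when not targeting AArch64. */
static inline void dc_cvau(uintptr_t va)
{
#if defined(__aarch64__)
    __asm__ volatile("dc cvau, %0" : : "r"(va) : "memory");
#else
    (void)va;
#endif
}

/* Wait for the cleans to complete, then resynchronize the pipeline. */
static inline void sync_after_table_update(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb ish\n\tisb" : : : "memory");
#endif
}

/* Cleans every cache line overlapping [start, start + len) and returns the
 * number of lines touched. line_size must be a power of two. */
static inline size_t clean_tables_to_pou(const void *start, size_t len, size_t line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

    uintptr_t addr = (uintptr_t)start & ~(uintptr_t)(line_size - 1);
    uintptr_t end = (uintptr_t)start + len;
    size_t lines = 0;

    for (; addr < end; addr += line_size) {
        dc_cvau(addr);
        lines++;
    }
    sync_after_table_update();
    return lines;
}
```

A typical use would be to write a descriptor and then call clean_tables_to_pou(&table[i], sizeof table[i], line_size) before touching the newly mapped address. As far as I understand, a TLBI is only needed on top of this when the entry being replaced was previously valid, since invalid entries are not held in the TLB.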
Two possibilities can explain this behavior (I don't fully understand how the caches work yet):
- First possibility: the MMU does not have the required address in its internal walk cache.
In that case, when updating regular data and making it available to another core's L1, the DSB instruction waits until all cores have a synchronized state (thanks to the coherency network): the other cores know the line has been updated, and when one of them tries to access it, the line gets updated in L2 or migrated from the previous core's L1 to its own L1.
This does not happen for the MMU (it does not participate in coherency), so it still sees the old value in L2.
However, if that were the case, the same thing should happen before the MMU is turned on (because caching is activated well before that), unless memory is considered L1-non-cacheable before the MMU is activated (which is possible; I'll have to double-check that).
A minimal way of fixing the problem may be to change the caching policies for the table pages, but cache maintenance is still necessary to clear possible old values from the MMU.
- Second possibility: in all the cases I tested, the MMU has the faulting address in its internal walk cache, which is not coherent with the data in L1 or L2.
In that case, only an explicit invalidate can eject the old line from the MMU cache. Before the MMU is turned on, the cache contains nothing and never gets the old value (0), only the new one.
I still think this case is unlikely because I tested many cases, and the offset between previously mapped memory (for example, entry 0 in a level 1 table) and newly mapped memory (for example, entry 128 in the same level 1 table) is greater than a cache line (in this case 1024 bytes, which is more than the cache line size).
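On the caching policy mentioned under the first possibility: the cacheability the hardware uses for its table walks on the TTBR1 side is set by the IRGN1, ORGN1 and SH1 fields of TCR_EL1. The TCR_EL1 value is not shown here, so the snippet below is only a guess at what to check; the macro and function names are made up, and the field positions are the ones I know from the ARMv8-A manual.

```c
#include <assert.h>
#include <stdint.h>

/* TCR_EL1 fields for the TTBR1 region. */
#define TCR_T1SZ_SHIFT  16 /* T1SZ,  bits [21:16] */
#define TCR_IRGN1_SHIFT 24 /* IRGN1, bits [25:24]: inner cacheability of walks */
#define TCR_ORGN1_SHIFT 26 /* ORGN1, bits [27:26]: outer cacheability of walks */
#define TCR_SH1_SHIFT   28 /* SH1,   bits [29:28]: shareability of walks */
#define TCR_TG1_4K      (2ULL << 30) /* TG1 = 0b10 selects the 4K granule */
#define TCR_SH_INNER    3ULL

enum walk_cacheability {
    WALK_NC   = 0, /* non-cacheable */
    WALK_WBWA = 1, /* write-back, read-allocate, write-allocate */
    WALK_WT   = 2, /* write-through */
    WALK_WB   = 3  /* write-back, no write-allocate */
};

/* Builds the TTBR1-related part of TCR_EL1 for a 4K granule. */
static inline uint64_t tcr_ttbr1_fields(unsigned va_bits,
                                        enum walk_cacheability inner,
                                        enum walk_cacheability outer)
{
    assert(va_bits >= 25 && va_bits <= 48); /* T1SZ between 39 and 16 */
    return ((uint64_t)(64 - va_bits) << TCR_T1SZ_SHIFT)
         | ((uint64_t)inner << TCR_IRGN1_SHIFT)
         | ((uint64_t)outer << TCR_ORGN1_SHIFT)
         | (TCR_SH_INNER << TCR_SH1_SHIFT)
         | TCR_TG1_4K;
}

/* True when both the inner and outer walk attributes are cacheable. */
static inline int tcr_ttbr1_walks_cacheable(uint64_t tcr)
{
    unsigned irgn1 = (unsigned)((tcr >> TCR_IRGN1_SHIFT) & 0x3);
    unsigned orgn1 = (unsigned)((tcr >> TCR_ORGN1_SHIFT) & 0x3);
    return irgn1 != WALK_NC && orgn1 != WALK_NC;
}
```

If the walks turn out to be non-cacheable while the tables are written through a write-back cacheable mapping, the walker reads memory while the new descriptors may still sit in the data cache, which would match the first possibility.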
So I'm still not sure what causes the problem, but cleaning/invalidating the updated addresses works.