Hard crash corrupting memory chips?
Posted: Fri Sep 11, 2020 8:04 am
Hi guys.
I've been debugging a program I'm working on. I changed some things around in an attempt to clean up some operations and found it had the opposite effect of crashing my program. I eventually determined what was going on from a cryptic crash log and restored the code, which had been working perfectly fine, and it was back to working order again.
But since then, I've started getting other crashes I can't figure out. They don't happen in my program but in other programs in the system. Somehow I have a cascading crash effect and I can't track it down. It hard crashes with a system freeze, so I can only read the result from a serial log. I suspect an interrupt is crashing, as the tasks affected make no sense. It's so bad that the reaper crashes in the middle of dumping a log, and task suspending fails because the task keeps crashing before it can be suspended! This doesn't make sense, as I thought the system disabled interrupts, which should stop most things from crashing.
But the problem is now worse. One night, after I had been testing, it crashed again, but this time it somehow ruined the hardware. After I reset, it kept going into a reboot loop. CFE crashed and reset. Then it loaded amigaboot and crashed. It kept rebooting and crashing. I turned it off for a minute or so. I turned it back on and the crashing remained. I gave up and just went to bed.
The next night I turned it on and it seemed to boot perfectly fine. Then I tried to boot Linux, which loaded the kernel, but then the screen just went black. I rebooted and tried again with my recovery setup and it booted fine. I loaded up the desktop and used it for an hour with no problems.
I've also done some memory testing and suspect the RAM has gone bad. I already had this problem last year and thought I had solved it, but it may be back again. This comes after I replaced the battery and thought my machine was stable again. Can a hard crash permanently damage memory chips? It seems ridiculous, but the timing is impeccable.
This is CFE crashing and rebooting in amigaboot as soon as it loads a config.
Check out these memory tests in CFE. They don't look right to me. But I don't fully understand what the output means either, as CFE doesn't explain it.
Another RAM test after my program crashed.
So, time for some new RAM? I've been running it on one stick for years, which I replaced last year. I'd like to run it on two matching sticks now, if they can still be bought new.
Code:
Booting configuration AmigaOS_4.1_Final_Edition
** Exception 0x0200: SRR0=000000007FD50A10 SRR1=100000000210B000 [MCheck ] cpu0
LR = 000000007FD3DAB0 CTR = 000000007FD4D060
XER = 0000000000000000 DSISR = 00008000
HID0 = 8000000000000000 HID1 = 000000005CE993B1
HID4 = 4400240000080180 HID5 = 0000006600000000
LPCR = 0000000000000002
r0 = 00000000FC800000 r1 = 000000007FFFE5FC
r2 = 000000007FD20838 r3 = 0000000000000000
r4 = 00000000FC801040 r5 = 000000000000000C
r6 = 000000007CE81438 r7 = 0000000000000200
r8 = 0000000000000000 r9 = 0000000000000002
r10 = 000000007CE81438 r11 = 000000007CE81638
r12 = 0000000000004000 r13 = 0000000000000000
r14 = 0000000000000000 r15 = 0000000000000000
r16 = 000000007F2AC238 r17 = 0000000000000CC3
r18 = 00000000003BAE62 r19 = 0000000000000CC3
r20 = 000000000020F10C r21 = 00000000001A1C00
r22 = 0000000000000CC3 r23 = 00000000424E4443
r24 = 000000000020F110 r25 = 000000007FFFF09C
r26 = 0000000000000000 r27 = 0000000000000009
r28 = 000000007FE00B40 r29 = 000000000000660C
r30 = 000000007FDDE9A0 r31 = 0000000000000000
Code:
CFE> testdram
DRAM test complete!
*** command status = 0
CFE> memorytest
Available memory arenas:
phys = 0000000000000000, virt = 0000000000000000, size = 000000007FD1D000
Testing memory.
Testing: phys = 0000000000001500, virt = 0000000000001500, size = 000000007FD1BB00
Writing: a/5/c/3
Reading: a/5/c/3
Writing: address|5555/inv/aaaa|address
Reading: address|5555/inv/aaaa|address
MC_Status: MC0: 00000000 [ ] SBE=0 | MC1: 00000000 [ ] SBE=0
*** command status = 0
CFE> randmemtest
Writing (0000000000000000 -> 0000000010000000) scrambler=7
Reading.
mem[000000000FFFFFC0] 8000000000000000 should be 55555AAAAAAAAAAA (D5555AAAAAAAAAAA)
mem[000000000FFFFFC8] 0000000000000000 should be AAAAA55555555555 (AAAAA55555555555)
mem[000000000FFFFFD8] 0000000000000000 should be AAAAAAAAAAAAAAAA (AAAAAAAAAAAAAAAA)
mem[000000000FFFFFD0] 0000000000000000 should be 5555555555555555 (5555555555555555)
mem[000000000FFFFFE0] 0000000000000000 should be 000000000FFFFFC0 (000000000FFFFFC0)
mem[000000000FFFFFE8] 0000000000000000 should be FFFFFFFFF000003F (FFFFFFFFF000003F)
mem[000000000FFFFFF8] 0000000000000000 should be FFFFFFFFFFFFFFFF (FFFFFFFFFFFFFFFF)
Code:
CFE> memorytest
Available memory arenas:
phys = 0000000000000000, virt = 0000000000000000, size = 000000007FD1D000
Testing memory.
Testing: phys = 0000000000001500, virt = 0000000000001500, size = 000000007FD1BB00
Writing: a/5/c/3
Reading: a/5/c/3
Writing: address|5555/inv/aaaa|address
Reading: address|5555/inv/aaaa|address
MC_Status: MC0: 00000000 [ ] SBE=0 | MC1: 00000000 [ ] SBE=0
*** command status = 0
CFE> testdram
DRAM test complete!
*** command status = 0
CFE> randmemtest
Writing (0000000000000000 -> 0000000010000000) scrambler=7
Reading.
mem[000000000FFFFFC0] 8000000000000000 should be 55555AAAAAAAAAAA (D5555AAAAAAAAAAA)
mem[000000000FFFFFC8] 7FE000087FE00008 should be AAAAA55555555555 (D54AA55D2AB5555D)
mem[000000000FFFFFD8] 7FE000087FE00008 should be AAAAAAAAAAAAAAAA (D54AAAA2D54AAAA2)
mem[000000000FFFFFD0] 7FE000087FE00008 should be 5555555555555555 (2AB5555D2AB5555D)
mem[000000000FFFFFE0] 7FE000087FE00008 should be 000000000FFFFFC0 (7FE00008701FFFC8)
mem[000000000FFFFFE8] 7FE000087FE00008 should be FFFFFFFFF000003F (801FFFF78FE00037)
mem[000000000FFFFFF0] 7FE000087FE00008 should be 0000000000000000 (7FE000087FE00008)
mem[000000000FFFFFF8] 7FE000087FE00008 should be FFFFFFFFFFFFFFFF (801FFFF7801FFFF7)
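On reading the randmemtest lines: in every failure line shown above, the value in parentheses equals the XOR of the value read and the value that "should be" there, so each 1 bit in it marks a bit that came back wrong. The following is a minimal Python sketch that checks this and summarizes where the failures sit. It assumes only the line format visible in the log; the names `parse_failures` and `summarize` are made up for illustration and are not part of CFE.

```python
import re

# Failure lines copied from the second randmemtest run above.
LOG = """\
mem[000000000FFFFFC0] 8000000000000000 should be 55555AAAAAAAAAAA (D5555AAAAAAAAAAA)
mem[000000000FFFFFC8] 7FE000087FE00008 should be AAAAA55555555555 (D54AA55D2AB5555D)
mem[000000000FFFFFD8] 7FE000087FE00008 should be AAAAAAAAAAAAAAAA (D54AAAA2D54AAAA2)
mem[000000000FFFFFD0] 7FE000087FE00008 should be 5555555555555555 (2AB5555D2AB5555D)
mem[000000000FFFFFE0] 7FE000087FE00008 should be 000000000FFFFFC0 (7FE00008701FFFC8)
mem[000000000FFFFFE8] 7FE000087FE00008 should be FFFFFFFFF000003F (801FFFF78FE00037)
mem[000000000FFFFFF0] 7FE000087FE00008 should be 0000000000000000 (7FE000087FE00008)
mem[000000000FFFFFF8] 7FE000087FE00008 should be FFFFFFFFFFFFFFFF (801FFFF7801FFFF7)
"""

# Assumed format: mem[ADDR] GOT should be WANT (DIFF), all 64-bit hex.
LINE_RE = re.compile(
    r"mem\[([0-9A-F]{16})\]\s+([0-9A-F]{16})\s+should be\s+"
    r"([0-9A-F]{16})\s+\(([0-9A-F]{16})\)"
)


def parse_failures(text):
    """Turn each failure line into a dict of integers."""
    failures = []
    for match in LINE_RE.finditer(text):
        addr, got, want, diff = (int(g, 16) for g in match.groups())
        failures.append({
            "addr": addr,
            "got": got,
            "want": want,
            "diff": diff,
            # Number of bits that differ between read and expected value.
            "bad_bits": bin(got ^ want).count("1"),
        })
    return failures


def summarize(failures):
    """Report the address span and whether (DIFF) is really GOT xor WANT."""
    addrs = [f["addr"] for f in failures]
    first, last = min(addrs), max(addrs)
    return {
        "count": len(failures),
        "first": first,
        "last": last,
        "xor_matches": all(f["got"] ^ f["want"] == f["diff"] for f in failures),
        # True when every failing address lies in one 64-byte aligned block.
        "same_64_byte_block": (first & ~0x3F) == (last & ~0x3F),
    }


if __name__ == "__main__":
    result = summarize(parse_failures(LOG))
    print(f"{result['count']} mismatches, "
          f"{result['first']:#010x} to {result['last']:#010x}")
    print("parenthesized value is got XOR want:", result["xor_matches"])
    print("all in one 64-byte block:", result["same_64_byte_block"])
```

Run on the log above, this finds eight mismatches, all inside the single 64-byte block starting at 0x0FFFFFC0, which is the last block of the range randmemtest wrote (0x0 to 0x10000000). The sketch only describes the log; it does not say whether the cause is the RAM itself or something else.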