Tuesday, January 22, 2019

14.04 regularly fails to boot (dead, NOT just blank screen)

I'm tearing my hair out with this now: my PC has regularly and repeatedly failed to boot since I upgraded to 14.04 LTS. The sequence is something like this:



  • Power on

  • POST

  • GRUB starts

  • GRUB counts down, calls Ubuntu

  • Blank screen, then loads of text, stuff loading, etc...

  • Then black, dead. Monitors power off due to no video signal.


Keyboard and mouse are dead too, no numlock light, no light from the mouse sensor, caps lock key does nothing (caps lock light doesn't toggle).


I have to HARD power cycle the PC to get it to restart. After a few tries it might eventually boot all the way.


I don't get any error messages when it finally DOES boot.


I don't want to spam random logfiles and debug and stuff all over this post so please just ask for any info I can provide and I'll paste it in. Likewise, any commands to run / experiments to try I'll give 'em a go and report back.


The system is a fairly normal single-boot PC: Dual Xeon X5660 64-bit / 48 GB RAM / 256 GB SSD boot drive / 4 TB drive for my junk / ATI 3650 gfx card / dual DVI monitors. There's nothing else plugged in beyond a LAN cable.


Even vague hints at where to look, logfiles to dig through, etc. would be welcome - I'm fairly familiar with the command line and whatnot, just not familiar enough with the inner workings of Ubuntu to know where to start!


Edit: Some stuff cropped from the logs at long last!


The logs are mostly full of this:


Dec 18 23:01:27 Puter kernel: [ 8571.122775] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:27 Puter kernel: [ 8571.122781] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:27 Puter kernel: [ 8571.122794] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:27 Puter kernel: [ 8571.667217] [drm:drm_mode_addfb], [FB:50]
Dec 18 23:01:27 Puter kernel: [ 8571.667224] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:27 Puter kernel: [ 8571.667236] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:28 Puter kernel: [ 8572.488325] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:28 Puter kernel: [ 8572.488332] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:28 Puter kernel: [ 8572.488344] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800

But that's the same in good times and bad. I've googled it, and it doesn't seem to be anything anyone really cares about. Still, there is literally ~200 MB of this stuff spamming into kern.log non-stop.
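
With that much chatter it is hard to spot anything else in the file, so one option is to filter the DRM debug lines out and read what remains. The following is only a rough sketch: the pattern is based on the lines quoted above, the function name is made up for illustration, and keeping lines that contain "ERROR" is an assumption about how real DRM errors would be tagged.

```python
import re

# Matches the DRM debug lines quoted above, e.g.
# "... kernel: [ 8571.122775] [drm:drm_mode_addfb], [FB:51]"
DRM_DEBUG = re.compile(r"kernel: \[\s*\d+\.\d+\] \[drm:")


def non_drm_lines(lines):
    """Yield lines that are not DRM debug chatter.

    Assumption: a DRM line containing "ERROR" is worth keeping,
    so those are passed through rather than filtered.
    """
    for line in lines:
        if DRM_DEBUG.search(line) and "ERROR" not in line:
            continue
        yield line
```

It could be used along the lines of `with open("/var/log/kern.log", errors="replace") as f: print("".join(non_drm_lines(f)))`, run with enough privileges to read the log.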


Here's the log showing the last (re)boot. Unfortunately, I can't really see any smoking gun in there:


Dec 18 23:01:26 Puter kernel: [ 8570.170086] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:26 Puter kernel: [ 8570.170093] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.170105] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.187846] [drm:drm_mode_addfb], [FB:50]
Dec 18 23:01:26 Puter kernel: [ 8570.187853] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.187865] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.217859] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:26 Puter kernel: [ 8570.217866] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.217878] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.242266] [drm:drm_mode_addfb], [FB:50]
Dec 18 23:01:26 Puter kernel: [ 8570.242273] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.242286] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.466341] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:26 Puter kernel: [ 8570.466348] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.466360] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.506137] [drm:drm_mode_addfb], [FB:50]
Dec 18 23:01:26 Puter kernel: [ 8570.506143] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.506156] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.754652] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:26 Puter kernel: [ 8570.754659] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.754671] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:26 Puter kernel: [ 8570.786391] [drm:drm_mode_addfb], [FB:50]
Dec 18 23:01:26 Puter kernel: [ 8570.786398] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:26 Puter kernel: [ 8570.786410] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:27 Puter kernel: [ 8571.122775] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:27 Puter kernel: [ 8571.122781] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:27 Puter kernel: [ 8571.122794] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:27 Puter kernel: [ 8571.667217] [drm:drm_mode_addfb], [FB:50]
Dec 18 23:01:27 Puter kernel: [ 8571.667224] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:27 Puter kernel: [ 8571.667236] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff880bff782800, cur_bbo = ffff8805fe615400
Dec 18 23:01:28 Puter kernel: [ 8572.488325] [drm:drm_mode_addfb], [FB:51]
Dec 18 23:01:28 Puter kernel: [ 8572.488332] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 18 23:01:28 Puter kernel: [ 8572.488344] [drm:radeon_crtc_page_flip], flip-ioctl() cur_fbo = ffff8805fe615400, cur_bbo = ffff880bff782800
Dec 20 20:59:56 Puter kernel: [ 0.000000] Initializing cgroup subsys cpuset
Dec 20 20:59:56 Puter kernel: [ 0.000000] Initializing cgroup subsys cpu
Dec 20 20:59:56 Puter kernel: [ 0.000000] Initializing cgroup subsys cpuacct
Dec 20 20:59:56 Puter kernel: [ 0.000000] Linux version 3.13.0-74-generic (buildd@lcy01-07) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 (Ubuntu 3.13.0-74.118-generic 3.13.11-ckt30)
Dec 20 20:59:56 Puter kernel: [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.13.0-74-generic root=/dev/mapper/ubuntu--vg-root ro recovery nomodeset
Dec 20 20:59:56 Puter kernel: [ 0.000000] KERNEL supported cpus:
Dec 20 20:59:56 Puter kernel: [ 0.000000] Intel GenuineIntel
Dec 20 20:59:56 Puter kernel: [ 0.000000] AMD AuthenticAMD
Dec 20 20:59:56 Puter kernel: [ 0.000000] Centaur CentaurHauls
Dec 20 20:59:56 Puter kernel: [ 0.000000] e820: BIOS-provided physical RAM map:
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009e3ff] usable
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dbdf9bff] usable
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000dbdf9c00-0x00000000dbe4bbff] ACPI NVS
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000dbe4bc00-0x00000000dbe4dbff] ACPI data
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000dbe4dc00-0x00000000dbffffff] reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fcffffff] reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fed003ff] reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000feefffff] reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffffffff] reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000c23ffffff] usable
Dec 20 20:59:56 Puter kernel: [ 0.000000] NX (Execute Disable) protection: active
Dec 20 20:59:56 Puter kernel: [ 0.000000] SMBIOS 2.5 present.
Dec 20 20:59:56 Puter kernel: [ 0.000000] DMI: Dell Inc. Precision WorkStation T7500 /0D881F, BIOS A14 07/06/2012
Dec 20 20:59:56 Puter kernel: [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
Dec 20 20:59:56 Puter kernel: [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
Dec 20 20:59:56 Puter kernel: [ 0.000000] No AGP bridge found
Dec 20 20:59:56 Puter kernel: [ 0.000000] e820: last_pfn = 0xc24000 max_arch_pfn = 0x400000000
Dec 20 20:59:56 Puter kernel: [ 0.000000] MTRR default type: write-back

Any thoughts really appreciated!


Further edit:


On more googling I came across this bug report, which suggests editing /etc/default/grub to remove the DRM debug flags. I've now done this; only time will tell if it has any effect.
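
For anyone wanting to repeat that edit, here is a minimal sketch of the change. It assumes the debug flag is a `drm.debug=<value>` kernel parameter on one of the `GRUB_CMDLINE_LINUX...` lines (the bug report's exact wording isn't reproduced here), and the function name is hypothetical. It only transforms text; it does not write the file.

```python
import re

# Assumption: the DRM debug flag looks like drm.debug=0xe or drm.debug=14.
# The value stops at whitespace or a quote so the closing quote survives.
DRM_DEBUG_PARAM = re.compile(r"""\s*drm\.debug=[^\s"']+""")


def strip_drm_debug(grub_text):
    """Return the grub defaults text with drm.debug=... removed.

    Only GRUB_CMDLINE_LINUX* lines are touched; comments and other
    settings are passed through unchanged.
    """
    out = []
    for line in grub_text.splitlines(keepends=True):
        if line.startswith("GRUB_CMDLINE_LINUX"):
            line = DRM_DEBUG_PARAM.sub("", line)
        out.append(line)
    return "".join(out)
```

After saving the edited /etc/default/grub (keeping a backup copy first), `sudo update-grub` has to be run for the change to reach the boot menu.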


Another Edit:


After a long period of stability, it's happened again. I can't be 100% sure, but it seems to happen every time Ubuntu runs an update of the Ubuntu Core (I assume that's another word for the kernel). It's totally dead and I have to hard reset it, but if I then select the normal boot option from GRUB (the menu only pops up if the system failed to boot last time), it just works normally as if nothing had happened.
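
One way to check the hunch about kernel updates is to list when kernel packages were installed and compare those times with the failed boots. This is a minimal sketch, assuming the usual /var/log/dpkg.log layout of `date time action package ...`; the function name is made up for illustration.

```python
def kernel_installs(lines):
    """Return (date, time, action, package) for kernel image installs/upgrades.

    Assumes dpkg.log lines of the form:
        YYYY-MM-DD HH:MM:SS install|upgrade <package> <old> <new>
    Other actions (status, configure, ...) are ignored.
    """
    events = []
    for line in lines:
        parts = line.split()
        if (
            len(parts) >= 4
            and parts[2] in ("install", "upgrade")
            and parts[3].startswith("linux-image")
        ):
            events.append((parts[0], parts[1], parts[2], parts[3]))
    return events
```

Feeding it the lines of /var/log/dpkg.log gives a short list of dates to hold against the timestamps of the boots that hung.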


I can't help but wonder whether the kernel update sets some parameter in the GRUB config that's lingering from an older install (for example, from when I had a different gfx card), and whether it's only after a failed boot that GRUB sets some slightly different parameter and the boot sails through fine.
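
The kernel logs its command line at every boot (the boot quoted above, for instance, was started with `ro recovery nomodeset`), so the parameters of different boots can be compared directly. A rough sketch, assuming the "Command line:" entries look like the one in the log above; both function names are hypothetical.

```python
import re

# Matches e.g. "... kernel: [    0.000000] Command line: BOOT_IMAGE=... ro quiet"
CMDLINE = re.compile(r"kernel: \[\s*0\.000000\] Command line: (.*)")


def boot_cmdlines(lines):
    """Return one (timestamp, set_of_params) pair per boot found in the log."""
    boots = []
    for line in lines:
        match = CMDLINE.search(line)
        if match:
            stamp = line[:15]  # syslog prefix such as "Dec 20 20:59:56"
            boots.append((stamp, set(match.group(1).split())))
    return boots


def changed_params(old, new):
    """Return (removed, added) parameters between two boots, each sorted."""
    return sorted(old - new), sorted(new - old)
```

Comparing the parameter sets of a boot that worked with those of the boot just before it would show whether GRUB really passes something different after a failure.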


Doing the suggested egrep -B75 '\[ 0.000000\] Linux version' /var/log/kern.log* returns precisely nothing!
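
A possible explanation, offered as an assumption rather than a confirmed diagnosis: as far as I know the kernel pads the seconds field of its timestamps to five characters, so in the file itself the boot banner probably reads `[    0.000000]` with several spaces (pasting into a web page tends to collapse them into one), and a pattern with exactly one space would then match nothing. Rotated logs ending in .gz are also compressed, which plain egrep cannot read. The sketch below tolerates any amount of padding and opens both kinds of file; the function names are made up for illustration.

```python
import gzip
import re
from collections import deque

# \s* accepts any amount of padding between "[" and the timestamp.
BOOT_START = re.compile(r"\[\s*0\.000000\] Linux version")


def lines_before_boots(lines, context=75):
    """For each boot banner, return the `context` lines logged just before it."""
    recent = deque(maxlen=context)  # rolling window of the latest lines
    results = []
    for line in lines:
        if BOOT_START.search(line):
            results.append(list(recent))
        recent.append(line)
    return results


def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, errors="replace")
```

Running `lines_before_boots(open_log("/var/log/kern.log"))` should then return one list of lines per boot, showing what was logged just before each restart.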


I'm going to see if there's a way to do a clean Ubuntu install without losing installed programs / settings, and see if that cures it. In the meantime, any suggestions are welcome!
