Working around an A64 thermal sensor miscalibration – recompiling .dtb to change the kernel driver trip point

I got a SoPine (Pine64-based board) 2 years ago. Hooked it up just now, trying to make it boot since I want to use it for a project. Flashed Armbian Focal on an 8GB MicroSD (this version: Armbian_20.08.1_Pine64so_focal_current_5.8.5.img). It wouldn’t boot, so I connected to the board’s serial port, and after all the U-Boot logs, here’s what I got after “Starting kernel”:

[    3.070744] thermal thermal_zone0: critical temperature reached (188 C), shutting down
[    3.082243] reboot: Power down

Everything was cold – all ICs on the SoPine board were cold, all ICs on the baseboard were cold, whatever was 200 degrees hot, I couldn’t find it anywhere.

Trying to “bisect” the issue, I loaded an old Armbian Ubuntu 16.04 image from January of 2019 and it actually booted the board without instantly shutting down – that was a good start. What did I notice?

1) armbianmonitor -m actually gave reasonable temperatures, though it did throw some weird Bash errors:

22:37:39: 1152MHz 0.11 8% 1% 2% 0% 5% 0%/usr/bin/armbianmonitor: line 385: read: read error: 0: Operation not permitted
/usr/bin/armbianmonitor: line 386: [: -ge: unary operator expected
°C 41°C 0/7

2) dmesg had output that indicated the kernel couldn’t even read the thermal zone properly:
[ +1.967998] thermal thermal_zone0: failed to read out thermal zone 0
[ +1.968005] thermal thermal_zone0: failed to read out thermal zone 0
[ +1.967997] thermal thermal_zone0: failed to read out thermal zone 0
[ +1.967971] thermal thermal_zone0: failed to read out thermal zone 0

Well, perhaps it was just an old kernel, who knows.

After much talking on Pine64 Discord, we found a relevant-ish bugtracker issue. It sounds like A64 temperature sensors can leave the factory improperly calibrated, and that’s apparently what happened to mine too. The image wouldn’t finish booting because the Linux thermal driver would shut the system down as soon as the kernel came up and noticed the outrageously high temperature reading.

The bugtracker issue:

1) Mentioned a tool that let me see the calibration register values that, apparently, were responsible for temperature calibration.

root@pine64so:~# ./regtool a64-sid
SID:
0x01c14234 : 07ab07b1
0x01c14238 : 000007b4

2) Mentioned you could recompile the .dtb files to change the trip points that the kernel driver should use.

Well, I couldn’t guarantee that I would be able to both change the temperature calibration values and figure out the right values to make the A64 CPU show the right temperature – the factory probably uses some sort of algorithm to calculate those values and flash them into the CPU. However, I could definitely change the trip points.
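For reference, the two SID words from the regtool output can be split into 16-bit halves. This is only a sketch: I’m assuming (unverified) that each half holds the calibration value for one thermal sensor, and the function name is made up for illustration.

```shell
# Split 32-bit SID words (given as hex digits without 0x) into 16-bit halves,
# low half first.
# Assumption: each half is one sensor's calibration value.
split_sid_words() {
    for word in "$@"; do
        printf '0x%04x\n' $(( 0x$word & 0xffff ))
        printf '0x%04x\n' $(( (0x$word >> 16) & 0xffff ))
    done
}

split_sid_words 07ab07b1 000007b4
```

This only makes the raw values easier to look at; it doesn’t tell you what the correct calibration would be.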

I initially went the “set up a kernel compile environment and compile the dtb files from there” route, however, that’s not required. Just mount the OS SD card, and go to its /boot/dts/. Decompile the SoPine file:

dtc -I dtb -O dts -f sun50i-a64-sopine-baseboard.dtb -o sun50i-a64-sopine-baseboard.dts

Search for “thermal” (Ctrl+F, or Ctrl+W if you’re in nano) and you’ll see a “trips {” section. Start with cpu_alert0: change its “temperature” property to something large, written in hexadecimal millidegrees, like 230000 (230 degrees) => 0x38270. Also raise the next trip points to, say, 240000 (0x3a980) and 250000 (0x3d090).
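If you’d rather pick your own temperatures, the decimal-to-hex conversion can be done with printf. A minimal sketch; the function name is just for illustration:

```shell
# Convert millidegrees Celsius to the hex form used in a decompiled .dts.
millideg_to_hex() {
    printf '0x%x\n' "$1"
}

# The three values used in this post.
for t in 230000 240000 250000; do
    millideg_to_hex "$t"
done
```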

Then, compile the dts file back into a dtb file:

dtc -I dts -O dtb -o sun50i-a64-sopine-baseboard.dtb sun50i-a64-sopine-baseboard.dts

You might need to run dtc with sudo since you’re replacing a file owned by root on the SD card’s filesystem. That should set the trip points to values larger than the bogus value returned by the sensor.
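To double-check the edit before putting the card back, the trip temperatures in the decompiled .dts can be listed in decimal. This is a rough sketch that assumes dtc wrote the values as “temperature = <0x...>”, which is what it did for me; the function name is hypothetical.

```shell
# Print every "temperature = <0x...>" value found in a .dts file,
# as hex and as decimal millidegrees Celsius.
list_trip_temps() {
    grep -o 'temperature = <0x[0-9a-fA-F]*>' "$1" |
    sed 's/.*<\(0x[0-9a-fA-F]*\)>/\1/' |
    while read -r hex; do
        printf '%s = %d\n' "$hex" "$hex"
    done
}

# Usage:
# list_trip_temps sun50i-a64-sopine-baseboard.dts
```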

In the end, I successfully booted into the latest OS!

root@pine64so:~# armbianmonitor -m
Stop monitoring using [ctrl]-[c]
Time CPU load %cpu %sys %usr %nice %io %irq CPU C.St.

13:16:25: 1008MHz 0.37 24% 2% 2% 0% 18% 0% 198.5°C 0/7
13:16:30: 1008MHz 0.34 1% 0% 0% 0% 1% 0% 198.2°C 0/7

That’s one hot CPU. Shame it can’t be used as a space heater – with how little SoPine consumes while heating up the CPU so much, I could really save on my heating bills. Oh, BTW – temps stayed at 199 even after an apt dist-upgrade that heated the CPU up quite a bit.

Oh, also – any OS upgrade (with apt dist-upgrade) might install a new version of dtb files that’ll set the trip points back to normal. I should consider some kind of long-term solution to this issue.
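One stopgap (not a real fix) would be to keep a copy of the patched .dtb somewhere safe and compare it against the installed one after every upgrade. A sketch, with a made-up function name and example paths that you’d need to adjust:

```shell
# Succeed if the installed dtb is still byte-identical to the patched copy.
# $1: saved patched copy, $2: dtb currently installed
dtb_still_patched() {
    cmp -s "$1" "$2"
}

# Hypothetical usage after an upgrade (both paths are assumptions):
# dtb_still_patched /root/sopine-patched.dtb /boot/dts/sun50i-a64-sopine-baseboard.dtb \
#     || echo "dtb was replaced, trip points are probably back to stock"
```

If the check fails, you’d have to patch the new .dtb again rather than copy the old one back, since the upgrade may have changed other things in it.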

Also, I took my .dts files from the mainline kernel tree, and with them, I2C wouldn’t work on SoPine – there was only /dev/i2c-0, and cat /sys/class/i2c-adapter/i2c-0/name showed that it’s the HDMI I2C port. I needed to 1) enable the I2C ports in the .dts files and recompile, and 2) add external I2C pull-up resistors, since my SoPine board didn’t have any pull-up resistors on the Pi header I2C pins. The symptom was that the “i2cdetect -y 1” command ran painfully slow and didn’t show any devices attached even if there were some sensors hooked up to the pins.

dpkg error processing package fuse3

Using Debian Bullseye Testing? Your fuse3 package might fail to install like this:

Setting up fuse3 (3.4.1-1)
dpkg: error processing package fuse3 (--configure):
 installed fuse3 package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 fuse3

No worries, however, you can just wget and install a newer version.

wget http://ftp.ee.debian.org/debian/pool/main/f/fuse3/libfuse3-3_3.9.0-1_amd64.deb
wget http://ftp.ee.debian.org/debian/pool/main/f/fuse3/fuse3_3.9.0-1_amd64.deb
dpkg -i libfuse3-3_3.9.0-1_amd64.deb
dpkg -i fuse3_3.9.0-1_amd64.deb

Substitute “amd64” with your architecture if needed, and use another Debian mirror if desired. Can’t remove the fuse3 package for some reason? Use this “nuclear option”:

dpkg --remove --force-remove-reinstreq fuse3
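Going back to the architecture substitution: here’s a small sketch that prints the two download URLs for a given architecture. The version and mirror are copied from the wget commands above; whether the mirror still carries that version for your architecture is an assumption you’d have to check, and the function name is made up.

```shell
# Print the two .deb URLs for the given Debian architecture.
fuse3_urls() {
    arch=$1
    base=http://ftp.ee.debian.org/debian/pool/main/f/fuse3
    printf '%s/libfuse3-3_3.9.0-1_%s.deb\n' "$base" "$arch"
    printf '%s/fuse3_3.9.0-1_%s.deb\n' "$base" "$arch"
}

fuse3_urls arm64
# On a Debian system, to use the local architecture:
# fuse3_urls "$(dpkg --print-architecture)"
```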

RAID sync speed slow when installing Debian

I’m installing Debian on a computer where the rootfs will be stored on two 250GB NVMe drives that are RAIDed (RAID1) together. I couldn’t figure out a good way to also RAID (and add some redundancy to) the ESP, unfortunately, and I don’t think that’s possible – though it would be cool to have a working ESP no matter which drive might fail. In the end, I followed this tutorial and RAIDed together two 249GB partitions, one on each of the drives, using the TUI (ncurses) interface of the Debian installer. Then, I decided not to install the system until mdadm finished the sync, pressed Ctrl-Alt-F2 to switch to a terminal, and ran cat /proc/mdstat – only to see a 1000K/s speed and a “30 hours left” estimate. Given that the drives were NVMe, this was very weird.

However, it seems like some settings in the Debian installer environment artificially limit the RAID sync speed to 1000K/s. Following this tutorial, I removed the limit using this command:

echo 1000000 > /proc/sys/dev/raid/speed_limit_max

Then, all went well and the array synced at full speed (1G/s in this case). Hope this helps you too!
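If you want to be able to put the old limit back afterwards, something like this sketch would print the previous value before overwriting it. The function name is made up, and the second argument only exists so it can be tried on an ordinary file first.

```shell
# Raise the md sync speed cap and print the previous value.
# $1: new limit (same unit as the echo command above)
# $2: file to write, defaults to the real sysctl file
raise_sync_limit() {
    new=$1
    file=${2:-/proc/sys/dev/raid/speed_limit_max}
    cat "$file"            # previous value, so it can be restored later
    echo "$new" > "$file"
}

# Usage in the installer shell (needs root):
# old=$(raise_sync_limit 1000000)
# cat /proc/mdstat
```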

‘python3 setup.py sdist bdist_wheel’ fail – possible fix

error: can't create or remove files in install directory
The following error occurred while trying to add or remove files in the installation directory:
[Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/test-easy-install-32566.write-test'
The installation directory you specified (via --install-dir, --prefix, or the distutils default setting) was: 
/usr/local/lib/python3.6/dist-packages/
Perhaps your account does not have write access to this directory? If the installation directory is a system-owned directory, you may need to sign in as the administrator or "root" account. If you do not have administrative access to this machine, you may wish to choose a different installation directory, preferably one that is listed in your PYTHONPATH environment variable. For information on other options, you may wish to consult the documentation at: https://setuptools.readthedocs.io/en/latest/easy_install.html Please make the appropriate changes for your system and try again.

You might want to update your setuptools – maybe also pip and so on:

sudo python3 -m pip install -U pip setuptools

That solved this for me.

Remove devices left after kpartx

In case you’ve used kpartx -a on a disk image to mount partitions from it, and then deleted that image (so you cannot kpartx -d anymore, as you would normally do), use this:

dmsetup remove /dev/dm-X

to remove each /dev/dm-X device left over. Use losetup -D to remove the /dev/loopX device left over.
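If there are many leftovers, the names can be picked out of the dmsetup ls output. A sketch that assumes the leftover mappings use the loopNpM naming that kpartx gave them on my system; the function name is hypothetical.

```shell
# Read `dmsetup ls` output on stdin and print the mapping names that look
# like kpartx partition mappings (loop0p1, loop3p2, ...).
kpartx_leftovers() {
    awk '$1 ~ /^loop[0-9]+p[0-9]+$/ { print $1 }'
}

# Usage (needs root); look at the list before removing anything:
# dmsetup ls | kpartx_leftovers
# dmsetup ls | kpartx_leftovers | while read -r name; do dmsetup remove "$name"; done
```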

ili9225 and fbtft

I was working on making an ILI9225-based display breakout work with the fbtft drivers. Here’s a page with pics for the exact make of the display breakout I worked with. Here’s a page that claims you can use the ILI9341 fbtft commandline, except it’s an obvious copy-paste: the init commands of ILI9341 aren’t even close to what’s needed, and the register width is wrong. Register addresses for ILI9225 are 1 byte and not 2 bytes wide (you should be able to verify that with a $6 logic analyzer, and if you don’t have one, buy one ASAP), and it should use the “0x20, 0x21, 0x22” addr set mechanism instead of something like “0x2A, 0x2B, 0x2C”. With a logic analyzer, PulseView and a working C ILI9225 library for RPi that we could test against, we sat down and analyzed the behaviour of the library.

[Photo: our work setup]

[Photo: RPi with the display and the splitter]

[Photo: the splitter that allowed us to connect a logic analyzer in a clean way]

[Photo: looking at the known-working data from the C library]

We had to lower the SPI clock – at 16MHz, the “Saleae clone” can’t keep up with 4 channels of data (CS, CLK, MISO and RS). Scoping RST is only useful in the beginning, just to make sure it works. The SPI decoder worked wonders, even on a netbook with an Intel Celeron 847 and 3GB of RAM (long live zram).

6 hours later, here’s a commandline that worked for us in the end, something that creates a framebuffer device that works with fbcp, con2fbmap and fbi:

sudo modprobe flexfb width=220 height=176 regwidth=8 init=-1,0x01,0x01,0x1C,-1,0x02,0x01,0x00,-1,0x03,0x10,0x30,-1,0x08,0x08,0x08,-1,0x0C,0x00,0x00,-1,0x0F,0x08,0x01,-1,0x20,0x00,0x00,-1,0x21,0x00,0x00,-2,50,-1,0x10,0x0A,0x00,-1,0x11,0x10,0x38,-2,50,-1,0x12,0x11,0x21,-1,0x13,0x00,0x66,-1,0x14,0x5F,0x60,-1,0x30,0x00,0x00,-1,0x31,0x00,0xDB,-1,0x32,0x00,0x00,-1,0x33,0x00,0x00,-1,0x34,0x00,0xDB,-1,0x35,0x00,0x00,-1,0x36,0x00,0xAF,-1,0x37,0x00,0x00,-1,0x38,0x00,0xDB,-1,0x39,0x00,0x00,-1,0x50,0x04,0x00,-1,0x51,0x06,0x0B,-1,0x52,0x0C,0x0A,-1,0x53,0x01,0x05,-1,0x54,0x0A,0x0C,-1,0x55,0x0B,0x06,-1,0x56,0x00,0x04,-1,0x57,0x05,0x01,-1,0x58,0x0E,0x00,-1,0x59,0x00,0x0E,-2,50,-1,0x07,0x10,0x17,-3 buswidth=8 setaddrwin=1 && sudo modprobe fbtft_device debug=3 name=flexfb rotate=1 speed=16000000 regwidth=8 buswidth=8 gpios=reset:25,dc:24,led:18,cs:8

There’s probably an easier way to do that – please leave a comment if you find one!
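The long init= string above is easier to edit if you generate it. In the fbtft init format, -1 starts a register write, -2 is a delay in milliseconds and -3 ends the sequence. The helpers below are my own sketch (the names are made up), assuming each ILI9225 register takes a 16-bit value sent as high byte, then low byte, which matches the string above.

```shell
# Emit one register write: -1, register address, value high byte, value low byte.
# $1: register address, $2: 16-bit value
reg() {
    printf '%s,0x%02X,0x%02X,0x%02X' -1 $(( $1 )) $(( ($2 >> 8) & 0xff )) $(( $2 & 0xff ))
}

# Emit a delay: -2, milliseconds.
delay_ms() {
    printf '%s,%d' -2 "$1"
}

# First two register writes from the command above, then a delay and the terminator.
init="$(reg 0x01 0x011C),$(reg 0x02 0x0100),$(delay_ms 50),-3"
echo "$init"
```

The result can be passed as init="$init" to modprobe flexfb in place of the literal string.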

Here’s a picture you can use with fbi in your testing: you can run wget http://eja.lv/3hm to download it from the command line. Use ImageMagick to rotate it if necessary.

The flexfb driver is getting removed from the kernel. You should be able to compile it in again – revert this commit and then just go as you’d go about compiling kernel modules, here’s an example for the sh1106 module.

Also, here’s the datasheet. Here’s a mirror of it: Datasheet_ILI9225DS_V022

The datasheet is good enough as far as datasheets go. For example, in case you need to flip the display (as we did), it’s easy to find that you need to flip the bits in the 0x01 register – to test that on-the-fly, just run

sudo modprobe -r fbtft_device && sudo modprobe -r flexfb

and rerun the init command tweaking the registers that you need, no need to even restart.
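Working out the new register value by hand is error-prone, so here’s a tiny sketch that toggles bits in a 16-bit value. The masks in the example are placeholders: which bits of the 0x01 register control the scan direction is something to look up in the datasheet, and the function name is made up.

```shell
# Toggle the bits given by a mask in a 16-bit register value.
# $1: current value, $2: mask of bits to flip
flip_bits() {
    printf '0x%04X\n' $(( $1 ^ $2 ))
}

flip_bits 0x011C 0x0100   # toggles bit 8
flip_bits 0x011C 0x0300   # toggles bits 8 and 9
```

The high and low bytes of the result then replace the two value bytes after 0x01 in the init string.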

Convert .CAP file into a BIOS (UEFI) image you can use with an SPI programmer

So, you got a .CAP file and you want to flash over SPI. CAP file format is a universal format for sharing UEFI BIOS images that people can program through a BIOS menu, DOS prompt, or using a manufacturer-approved flash tool – some manufacturers are using this format already, let’s hope it catches on since finally having some standards is good. What if your motherboard’s BIOS is already dead or doesn’t support the CPU you’re trying to boot with, though? You need to boot the computer to flash a new .CAP, however, you can’t boot your computer until you flash that .CAP. You can use an SPI programmer to flash it, all using free and open-source software (flashrom) – on the hardware side, a Raspberry Pi will work, so will a CH341-based programmer from eBay. I use my Pi Zero-powered ZeroPhone for this since it already has all the tools and breaks out all the SPI pins needed.

But first, you need to extract the firmware file from the .CAP file. You can do that through Linux command-line:

dd bs=1024 skip=2 if=YOURFILE.CAP of=image.bin

Some insight:

root@zerophone-prototype:/home/pi/z370# ls
190701-first.bin TUF-Z370-PRO-GAMING-ASUS-2102.CAP
# "first" is a working BIOS image dumped from the SPI flash
# let's run dd on the .CAP file
root@zerophone-prototype:/home/pi/z370# dd bs=1024 skip=2 if=TUF-Z370-PRO-GAMING-ASUS-2102.CAP of=trimmed.bin
16384+0 records in
16384+0 records out
16777216 bytes (17 MB, 16 MiB) copied, 0.922419 s, 18.2 MB/s
# trimmed file size in bytes
root@zerophone-prototype:/home/pi/z370# du -B1 trimmed.bin
16777216        trimmed.bin
# original file size in bytes
root@zerophone-prototype:/home/pi/z370# du -B1 190701-first.bin
16781312        190701-first.bin
# the CAP file size
root@zerophone-prototype:/home/pi/z370# du -B1 TUF-Z370-PRO-GAMING-ASUS-2102.CAP
16785408        TUF-Z370-PRO-GAMING-ASUS-2102.CAP
# Interesting, the trimmed image is said to be 8192 bytes smaller than .CAP.
# Also, it's said to be 4096 bytes smaller than the original image
# Can we trust the du output here?
# Let's strip 3 blocks instead of 2 and check.
root@zerophone-prototype:/home/pi/z370# dd bs=1024 skip=3 if=TUF-Z370-PRO-GAMING-ASUS-2102.CAP of=3.bin
16383+0 records in
16383+0 records out
16776192 bytes (17 MB, 16 MiB) copied, 0.818545 s, 20.5 MB/s
root@zerophone-prototype:/home/pi/z370# du -B1 3.bin
16777216        3.bin
# I guess the answer is no.
# Let's check the signature, at least?
root@zerophone-prototype:/home/pi/z370# xxd 190701-first.bin | head
00000000: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000010: 5aa5 f00f 0300 0400 0802 105a 3003 3100  Z..........Z0.1.
00000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000030: f500 5c12 2142 60ad b7b9 c4c7 ffff ffff  ..\.!B`.........
00000040: 0000 0000 8002 ff0f 0300 7f02 0100 0200  ................
00000050: ff7f 0000 ff7f 0000 ff7f 0000 ff7f 0000  ................
00000060: ff7f 0000 ff7f 0000 ffff ffff ffff ffff  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: 000f a000 000d 4000 0009 8000 0000 0000  ......@.........
00000090: 0001 0110 0000 0000 ffff ffff ffff ffff  ................
# This has the proper binary image signature. What about the trimmed file?
root@zerophone-prototype:/home/pi/z370# xxd trimmed.bin |head
00000000: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000010: 5aa5 f00f 0300 0400 0802 105a 3003 3100  Z..........Z0.1.
00000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000030: f500 5c12 2142 60ad b7b9 c4c7 ffff ffff  ..\.!B`.........
00000040: 0000 0000 8002 ff0f 0300 7f02 0100 0200  ................
00000050: ff7f 0000 ff7f 0000 ff7f 0000 ff7f 0000  ................
00000060: ff7f 0000 ff7f 0000 ffff ffff ffff ffff  ................
00000070: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000080: 000f a000 000d 4000 0009 8000 0000 0000  ......@.........
00000090: 0001 0110 0000 0000 ffff ffff ffff ffff  ................
# Looks like we have what we need!

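du issues notwithstanding (du reports the space a file occupies on disk, rounded up to whole filesystem blocks, rather than its length in bytes; ls -l or wc -c shows the latter), this file, once flashed into the chip using an SPI programmer, actually booted the motherboard. For good measure, I then used the BIOS built-in flasher tool to flash the .CAP over this file, just in case there are actually some differences.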
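To compare file lengths without the du confusion, wc -c gives the actual byte count. A minimal sketch with made-up function names, assuming the flash image sits at the very end of the .CAP file:

```shell
# Print a file's actual length in bytes (not the space it takes on disk).
byte_size() {
    echo $(( $(wc -c < "$1") ))
}

# Print how many bytes precede the flash image inside a .CAP file.
# $1: .CAP file, $2: extracted or dumped image of the expected size
cap_header_bytes() {
    echo $(( $(byte_size "$1") - $(byte_size "$2") ))
}

# Usage:
# cap_header_bytes TUF-Z370-PRO-GAMING-ASUS-2102.CAP trimmed.bin
```

For the files above, the dd record counts already imply that the .CAP is 16777216 + 2048 bytes long, which is why skip=2 with bs=1024 works here. Other boards may use a different header size, so it’s worth checking.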

Warning: if the motherboard works (i.e. you just can’t boot it using the current CPU and you don’t have another CPU), please dump the original flash image before proceeding. Another warning: you might lose your MAC address, but there are tutorials available showing you how to add it, and there are also tutorials showing how to extract it from the original image if you need that.

Interested to know more about .CAP format? This article helped me a lot, it’s in Russian, so if you don’t know it, use your online/browser-builtin translation service of choice.

Installing Replicant 4.2 on Galaxy S1

Yesterday, I decided to install Replicant on a Samsung Galaxy S1 (i9000, also referred to as galaxysmtd in the builds). However, it already had some version of CyanogenMod on it – apparently, not the one that was installed. Also, I had to blacklist some drivers for the heimdall tool to work on my Linux laptop.

When flashing recovery: if you connect a Galaxy S1 in download mode to a Linux laptop, it presents itself as a serial port. The cdc_acm driver then loads and, apparently, sends some AT commands (to check whether the device is a modem), which seems to crash something in the download mode code. The phone stops responding to the heimdall tool and you always get the “Protocol initialization failed” message; even if you get further, there are still problems.

To solve this, I added two lines: “blacklist cdc_acm” and “blacklist visor” to an /etc/modprobe.d/blacklist.conf file, and then rebooted. I don’t know if that could be solved without rebooting, only after reboot it seemed to be solved for me.
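For copy-paste purposes, the two lines can be generated like this. Just a sketch with a made-up function name; the file path is the one I used, and your distro may prefer a separate file under /etc/modprobe.d/.

```shell
# Print modprobe blacklist lines for the given module names.
blacklist_lines() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}

blacklist_lines cdc_acm visor
# To append them to the config (needs root):
# blacklist_lines cdc_acm visor | sudo tee -a /etc/modprobe.d/blacklist.conf
```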

After this, I could easily flash Replicant-provided recovery image to the phone. However, it’d give this error all the time:

DSCN4600

“Error in replicant-4.2-galaxysmtd.zip (Status 0)”. Not the most informative error message. Basically, the script returns “ran normally” but the install doesn’t continue. I opened the .zip and saw the updater.sh there, which ought to have been the file causing that error. After debugging the script execution path with some ui_print macros, I found a section in the middle of the script (in the section for “new mtd layouts” or something) that would stop the process, but exit 0 (erroneously assuming something IIRC). I removed that section, and it went further, but got stuck on something that would “exit 7”. I removed that exit statement too, and Replicant finally got installed. TODO: understand what that “exit 7” thing was complaining about.

However, Replicant is running great now, so all is well =)