I have two UART devices on an FPGA exposed to Linux on an Altera Cyclone V SoC. I have modified the DTS to incorporate these devices, and Linux picks them up on boot:

[    0.879942] (NULL device *): ttyAL0 at MMIO 0xff200400 (irq = 41, base_baud = 3125000) is a Altera UART
[    0.890050] (NULL device *): ttyAL1 at MMIO 0xff200420 (irq = 44, base_baud = 3125000) is a Altera UART

This results in ttyAL0 and ttyAL1 appearing in /dev/. The devices also appear in the relevant device subdirectory in /sys/devices/soc/ with the driver symlink present, for example:

lrwxrwxrwx    1 root     root             0 Jun 20 10:36 driver -> ../../../bus/platform/drivers/altera_uart
-rw-r--r--    1 root     root          4096 Jun 20 10:36 driver_override
-r--r--r--    1 root     root          4096 Jun 20 10:36 modalias
drwxr-xr-x    2 root     root             0 Jun 20 10:36 power
lrwxrwxrwx    1 root     root             0 Jun 20 10:36 subsystem -> ../../../bus/platform
-rw-r--r--    1 root     root          4096 Jun 20 10:36 uevent

However, if I try to open either port, whether programmatically or with cat or setserial, there is a stall of about 20 s before the kernel reports an RCU stall:

[  202.242133] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=2102 jiffies, g=124, c=123, q=254)
[  202.252516] INFO: Stall ended before state dump start
[  223.252109] INFO: rcu_sched self-detected stall on CPU { 0}  (t=2100 jiffies g=125 c=124 q=229)
[  223.260843] Task dump for CPU 0:
[  223.264066] klogd           R running      0   954      1 0x00000002
[  223.270566] [<c0017984>] (unwind_backtrace) from [<c00137e0>] (show_stack+0x20/0x24)
[  223.278319] [<c00137e0>] (show_stack) from [<c004b6cc>] (sched_show_task+0xb0/0x104)
[  223.286045] [<c004b6cc>] (sched_show_task) from [<c004e34c>] (dump_cpu_task+0x48/0x4c)
[  223.293941] [<c004e34c>] (dump_cpu_task) from [<c006ae60>] (rcu_dump_cpu_stacks+0xa0/0xcc)
[  223.302188] [<c006ae60>] (rcu_dump_cpu_stacks) from [<c006e520>] (rcu_check_callbacks+0x488/0x790)
[  223.311137] [<c006e520>] (rcu_check_callbacks) from [<c0072db0>] (update_process_times+0x50/0x70)
[  223.319982] [<c0072db0>] (update_process_times) from [<c0083258>] (tick_sched_timer+0x78/0x27c)
[  223.328656] [<c0083258>] (tick_sched_timer) from [<c00735f4>] (__run_hrtimer+0x90/0x1bc)
[  223.336719] [<c00735f4>] (__run_hrtimer) from [<c0073ef4>] (hrtimer_interrupt+0x140/0x31c)
[  223.344955] [<c0073ef4>] (hrtimer_interrupt) from [<c0016b58>] (twd_handler+0x40/0x50)
[  223.352867] [<c0016b58>] (twd_handler) from [<c00669bc>] (handle_percpu_devid_irq+0x90/0x124)
[  223.361364] [<c00669bc>] (handle_percpu_devid_irq) from [<c0062684>] (generic_handle_irq+0x3c/0x4c)
[  223.370377] [<c0062684>] (generic_handle_irq) from [<c0062948>] (__handle_domain_irq+0x6c/0xb4)
[  223.379042] [<c0062948>] (__handle_domain_irq) from [<c00086b0>] (gic_handle_irq+0x34/0x6c)
[  223.387362] [<c00086b0>] (gic_handle_irq) from [<c0014380>] (__irq_svc+0x40/0x54)
[  223.394811] Exception stack(0xded29cf8 to 0xded29d40)
[  223.399842] 9ce0:                                                       00000001 c06cb200
[  223.407986] 9d00: 00000000 00000000 c0687b34 00000000 00000082 00000001 df418800 c06c416c
[  223.416128] 9d20: ded28000 ded29d9c 00000000 ded29d40 c06cb200 c0029330 200f0113 ffffffff
[  223.424285] [<c0014380>] (__irq_svc) from [<c0029330>] (__do_softirq+0xc4/0x2f0)
[  223.431656] [<c0029330>] (__do_softirq) from [<c00297f8>] (irq_exit+0x88/0xc0)
[  223.438851] [<c00297f8>] (irq_exit) from [<c006294c>] (__handle_domain_irq+0x70/0xb4)
[  223.446649] [<c006294c>] (__handle_domain_irq) from [<c00086b0>] (gic_handle_irq+0x34/0x6c)
[  223.454965] [<c00086b0>] (gic_handle_irq) from [<c0014380>] (__irq_svc+0x40/0x54)
[  223.462412] Exception stack(0xded29e08 to 0xded29e50)
[  223.467443] 9e00:                   dfbd3540 df782ac0 00000000 0000996f df59d6c0 dfbd3540
[  223.475584] 9e20: c0695e20 00000000 df59c1c0 df59c540 ded28030 ded29e6c ded29e70 ded29e50
[  223.483725] 9e40: c047bad0 c004756c 600f0013 ffffffff
[  223.488762] [<c0014380>] (__irq_svc) from [<c004756c>] (finish_task_switch+0x78/0x11c)
[  223.496661] [<c004756c>] (finish_task_switch) from [<c047bad0>] (__schedule+0x230/0x5f4)
[  223.504726] [<c047bad0>] (__schedule) from [<c047bed4>] (schedule+0x40/0x8c)
[  223.511746] [<c047bed4>] (schedule) from [<c0061a58>] (do_syslog+0x51c/0x5a8)
[  223.518855] [<c0061a58>] (do_syslog) from [<c0061b00>] (SyS_syslog+0x1c/0x20)
[  223.525968] [<c0061b00>] (SyS_syslog) from [<c000f820>] (ret_fast_syscall+0x0/0x30)

I don't know why this is happening but I have noticed two interesting (i.e. wrong) things about how Linux sees my devices. The first is that their IRQs, even though correctly reported during boot and any bind/unbind operations, are not listed in /proc/interrupts (they would appear as ff200400.serial2 and ff200420.serial3):

           CPU0       CPU1
 29:      47565      47091       GIC  29  twd
 74:          0          0       GIC  74  0009
 75:          0          0       GIC  75  000A
 76:          0          0       GIC  76  000A
 77:          0          0       GIC  77  0004
 78:          0          0       GIC  78  0003
 79:          0          0       GIC  79  0006
 80:          0          0       GIC  80  0011
 81:          0          0       GIC  81  0011
 82:          0          0       GIC  82  0010
171:      10554          0       GIC 171  dw-mci
186:          0          0       GIC 186  dw_spi65535
190:          0          0       GIC 190  ffc04000.i2c
191:          0          0       GIC 191  ffc05000.i2c
192:          0          0       GIC 192  ffc06000.i2c
193:          0          0       GIC 193  ffc07000.i2c
194:        465          0       GIC 194  serial
199:          0          0       GIC 199  timer0
207:          0          0       GIC 207  fpga-mgr
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:        591       3015  Rescheduling interrupts
IPI3:          0          0  Function call interrupts
IPI4:          1          5  Single function call interrupts
IPI5:          0          0  CPU stop interrupts
IPI6:          0          0  IRQ work interrupts
IPI7:          0          0  completion interrupts
Err:          0

The other observation is that in /sys/class/tty, the ttyAL* entries are links to virtual devices instead of the physical ones:

...
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 tty8 -> ../../devices/virtual/tty/tty8
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 tty9 -> ../../devices/virtual/tty/tty9
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 ttyAL0 -> ../../devices/virtual/tty/ttyAL0
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 ttyAL1 -> ../../devices/virtual/tty/ttyAL1
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 ttyS0 -> ../../devices/soc/ffc02000.serial0/tty/ttyS0
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 ttyS1 -> ../../devices/soc/ffc03000.serial1/tty/ttyS1
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 ttyp0 -> ../../devices/virtual/tty/ttyp0
lrwxrwxrwx    1 root     root             0 Jun 20 10:49 ttyp1 -> ../../devices/virtual/tty/ttyp1
...

You can see the other two physical devices, ttyS0 and ttyS1 (the 'real' UARTs on the ARM part of the SoC). I expected my devices to appear in the same format. If you refer to the /sys/devices/soc/ device subdirectory listing above, you'll notice that it has no corresponding tty subdirectory, which is presumably part of the reason why a virtual TTY is associated with the device.

So my question is: Why is my physical serial device appearing as virtual, and is that the reason I'm suffering kernel stalls?

In case I am missing vital information in the DTS, here are my UART additions:

uart2: serial2@ff200400 {
    compatible = "altr,uart-1.0";
    reg = <0xff200400 0x20>;
    interrupts = <0 9 4>;
    clock-frequency = <50000000>;
    current-speed = <115200>;
};

uart3: serial3@ff200420 {
    compatible = "altr,uart-1.0";
    reg = <0xff200420 0x20>;
    interrupts = <0 12 4>;
    clock-frequency = <50000000>;
    current-speed = <115200>;
};

They are child nodes of a soc node where the interrupt controller is specified.

  • Looks like you copied the Device Tree node for Altera UART, and tried to reuse that for your FPGA. That might work only if the FPGA UART is really a silicon clone of the Altera peripheral, and can use the Altera device driver. Do you have a proper device driver for this FPGA UART? IOW the problem is that you've installed the wrong device driver for the FPGA UARTs. Commented Jun 20, 2016 at 18:38
  • @sawdust The FPGA UART is entirely Altera IP, and the sopcinfo for it uses a DTS "compatible" string of "altr,uart-1.0", which is the same as the altera_uart driver's. Commented Jun 21, 2016 at 7:33

1 Answer

I finally discovered the issue, and it's unsurprising judging from the RCU scheduler stack trace: my IRQs were wrong.

I don't quite understand the exact mechanics of it, as I'm not a firmware engineer, but the UART modules were at an IRQ offset of 40, so their IRQs were not 9 and 12 as I thought, but 49 and 52. Updating the DTS to match made everything work as expected.
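
For reference, here is a sketch of what the corrected nodes would look like, assuming everything except the interrupt numbers stays exactly as in the question. In the standard ARM GIC binding, the first cell (0) selects a shared peripheral interrupt, the second is the SPI number, and the third (4) marks it as active-high level-triggered. The comments showing where 49 and 52 come from just restate the offset of 40 described above; check the offset against your own hardware design before relying on it.

```dts
uart2: serial2@ff200400 {
    compatible = "altr,uart-1.0";
    reg = <0xff200400 0x20>;
    interrupts = <0 49 4>;   /* was <0 9 4>: FPGA IRQ 9 + offset 40 */
    clock-frequency = <50000000>;
    current-speed = <115200>;
};

uart3: serial3@ff200420 {
    compatible = "altr,uart-1.0";
    reg = <0xff200420 0x20>;
    interrupts = <0 52 4>;   /* was <0 12 4>: FPGA IRQ 12 + offset 40 */
    clock-frequency = <50000000>;
    current-speed = <115200>;
};
```

After rebuilding the device tree and rebooting, the two UARTs should be listed in /proc/interrupts once a port has been opened, which is a quick way to confirm the driver actually obtained its interrupt.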
