
Intel QAT: vyos-1.3-rolling-202011020217-amd64 kernel panic during configure
Closed, Resolved | Public | BUG

Description

Notes: the system boots fine with QAT disabled, and I can enable QAT after boot with no immediate crash. The panic only appears when QAT is applied during boot-up as part of the configuration process.

I was able to successfully boot 1.3-rolling-202010260327, suggesting https://github.com/vyos/vyos-1x/commit/4a83a9ac55caf463c876dd948445e09d316ca629 may be the culprit. (edit: see comments below)

The console output, including a kernel stacktrace, has been attached.

Details

Difficulty level
Easy (less than an hour)
Version
vyos-1.3-rolling-202011020217-amd64
Why the issue appeared?
Issues in third-party code
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)

Event Timeline

The commit you referenced only affects the validation logic that checks whether your hardware supports QAT. We now run kernel 4.19.155, so please retry with the latest rolling release.

A few updates... the failure still occurs on the latest rolling release. Similar outcome: the kernel panics and dumps a stacktrace during the initial boot-up configure process. However, this issue goes back further than I expected (and initially stated in the ticket). I goofed up in my testing of 1.3-rolling-202010260327 by booting with a default config file that did not include the QAT option.

At this point 1.3-rolling-202010091522 is the most recent image I have verified to boot without issue when QAT is enabled.

Any chance https://github.com/vyos/vyos-build/commit/b234558db422390ed4d995e9134fe91c37d6cc8f could be related? My CPU is an Intel(R) Atom(TM) CPU C3558 @ 2.20GHz, so it seems unlikely, since that file appears to be for a different platform.

I will perform a few additional tests tomorrow with the oldest available rolling releases (looks like October 13th as of writing). Will see if I can binary search my way to when things broke.
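The binary search over rolling images mentioned above could be sketched roughly as follows. This is only an illustration, not a VyOS tool: `first_bad_build` and `boots_ok` are hypothetical names, the build identifiers are placeholders rather than real images, and in practice `boots_ok` would be a manual step (boot the image with QAT enabled and note whether the kernel panics). It assumes the list is sorted oldest to newest, the first entry is known good, the last is known bad, and the failure persists once introduced.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], boots_ok: Callable[[str], bool]) -> str:
    """Return the earliest build for which boots_ok() is False.

    Assumes builds[0] is known good, builds[-1] is known bad, and every
    build after the first bad one is also bad.
    """
    if len(builds) < 2:
        raise ValueError("need at least one known-good and one known-bad build")

    lo, hi = 0, len(builds) - 1  # lo: newest known good, hi: oldest known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_ok(builds[mid]):
            lo = mid  # still good, so the breakage is later
        else:
            hi = mid  # already bad, so the breakage is here or earlier
    return builds[hi]


if __name__ == "__main__":
    # Placeholder identifiers, not real VyOS images.
    builds = [f"build-{n:02d}" for n in range(1, 9)]
    # Simulated test result: pretend everything from build-06 onwards panics.
    print(first_bad_build(builds, lambda b: b < "build-06"))
```

With eight candidate images this needs three boot tests instead of up to six for a linear walk, which matters when each test means installing and rebooting an image.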

@lucasec of course this commit could be related, and we can try reverting to the old version. Would you be willing to test a binary for us?

Sure, if you want to drop me an image I can try it out. I also have a working vyos-build, so I can try to produce my own image with that change backed out when I get some time towards the end of the week.

I have reverted the QAT driver update commit. Can you please try out this image:

https://helix.mybll.net/vyos-1.3-qat-test-202011131253-amd64.iso

Your revert appears to do the trick. Image booted fine with QAT enabled, and "show system acceleration qat status" shows the QAT device came up fine and is running happily.

Thank you for the feedback! Will incorporate this into the rolling releases. Looks like once again Intel did us a favor.

c-po changed the task status from Open to Needs testing. Nov 14 2020, 7:01 AM
c-po claimed this task.

Next rolling release will carry the revert

c-po triaged this task as High priority.
c-po changed Difficulty level from Unknown (require assessment) to Easy (less than an hour).
c-po changed Why the issue appeared? from Will be filled on close to Issues in third-party code.
c-po changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.
erkin set Issue type to Bug (incorrect behavior). Aug 29 2021, 12:22 PM
erkin removed a subscriber: Active contributors.