
HTTP API upstream task timeout (504 Gateway Timeout)
Open, High, Public, BUG

Description

Problem:
When committing very large configs over the HTTP API, nginx times out waiting for the upstream task, because the commit takes longer than 10 minutes to complete.

After the API connection has been idle for 10 min (waiting for the upstream task), nginx closes it.

What you see:
The nginx error log shows the upstream task timing out:

2025/01/17 00:32:03 [error] 86401#86401: *9 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xxx.xxx.xxx.42, server: xxx-fw-ha, request: "POST /configure-section HTTP/1.1", upstream: "http://unix:/run/api.sock/configure-section", host: "xxx.xxx.xx.43"
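To spot these entries in a longer log, a small filter helps. The following is a minimal sketch, assuming the error lines keep the layout shown above; the function name and the field names are illustrative and not part of nginx or VyOS.

```python
import re

# Matches the "upstream timed out" entries in the layout shown above.
# The group names are our own labels, not nginx terminology.
UPSTREAM_TIMEOUT = re.compile(
    r'^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[error\] .*?'
    r'upstream timed out \(110: Connection timed out\).*?'
    r'client: (?P<client>[^,]+), server: (?P<server>[^,]+), '
    r'request: "(?P<request>[^"]+)", upstream: "(?P<upstream>[^"]+)"'
)


def parse_upstream_timeout(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = UPSTREAM_TIMEOUT.search(line)
    return match.groupdict() if match else None


sample = (
    '2025/01/17 00:32:03 [error] 86401#86401: *9 upstream timed out '
    '(110: Connection timed out) while reading response header from upstream, '
    'client: xxx.xxx.xxx.42, server: xxx-fw-ha, '
    'request: "POST /configure-section HTTP/1.1", '
    'upstream: "http://unix:/run/api.sock/configure-section", '
    'host: "xxx.xxx.xx.43"'
)

fields = parse_upstream_timeout(sample)
print(fields["timestamp"], fields["request"])
```

Feeding it each line of /var/log/nginx/error.log and keeping the non-None results gives a list of timed-out API requests with their timestamps.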

You can reproduce this with config sync using the following steps (see attached files).

  1. baseline set commands primary-fw

  1. baseline set commands secondary-fw

  1. Script to commit x lines of firewall config on primary-fw. It would probably be much faster to generate a config file and load it, since set commands are very slow.


(Using set commands, which is very slow)

vyos@primary-fw:~$ sudo bash add_fw_lines.sh

How many lines do you want to add to the config? \nEnter a number between 10 and 20000:  10000
Valid input: 10000
Adding firewall rules from: 2 to 3334
 Starting with set commands at
Sun Jan 26 11:29:32 AM UTC 2025

Wait for it to fail. It takes up to 30 min to go through 10 000 set command lines.
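The rule range printed by the script ("from: 2 to 3334" for 10000 lines) is consistent with three set lines per firewall rule. This is an inference from the output, not something confirmed from the attached script. A minimal sketch of that arithmetic, with hypothetical names:

```python
# Assumption inferred from the script output: 10000 lines -> rules 2..3334,
# which works out to three set lines per rule, starting at rule number 2.
LINES_PER_RULE = 3
FIRST_RULE = 2


def rule_range(total_lines, lines_per_rule=LINES_PER_RULE, first_rule=FIRST_RULE):
    """Return (first_rule, last_rule) for the requested number of config lines."""
    rule_count = total_lines // lines_per_rule
    return first_rule, first_rule + rule_count - 1


first, last = rule_range(10000)
print(first, last)  # 2 3334
```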

NB! If the firewall config is already 10 000 lines long, you can save time by changing just one line and committing. The commit itself is then much shorter, but the commit-sync load still takes more than 10 min and times out.

Do you want to commit the changes? (yes/no): yes
Starting commit at
Sun Jan 26 12:08:37 PM UTC 2025

At 12:19:31, after 11 min, the commit completes successfully.

Commit-sync to secondary starts.

INFO:vyos_config_sync:Config synchronization: Mode=load, Secondary=10.0.1.2

It times out after 10 min, at 12:29:31 (if the load takes longer than 10 min).
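The timestamps in this run can be checked with a few lines of arithmetic. This sketch assumes the sync request was sent as soon as the commit finished at 12:19:31, as the log order suggests:

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps taken from the reproduction run above.
commit_start = datetime.strptime("12:08:37", FMT)
commit_end = datetime.strptime("12:19:31", FMT)    # commit done, sync starts
sync_timeout = datetime.strptime("12:29:31", FMT)  # nginx returns 504

commit_seconds = (commit_end - commit_start).total_seconds()
sync_seconds = (sync_timeout - commit_end).total_seconds()

print(commit_seconds)  # 654.0, just under 11 min
print(sync_seconds)    # 600.0, exactly 10 min
```

The sync request is cut off after exactly 600 seconds, which matches a fixed 10 min timeout on the nginx side rather than a failure of the load itself.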

Response on primary-fw

INFO:vyos_config_sync:Response status code: 504
INFO:vyos_config_sync:Response headers: {'Server': 'nginx/1.22.1', 'Date': 'Sun, 26 Jan 2025 12:29:31 GMT', 'Content-Type': 'text/html', 'Content-Length': '167', 'Connection': 'keep-alive'}
INFO:vyos_config_sync:Response content: b'<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body>\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx/1.22.1</center>\r\n</body>\r\n</html>\r\n'
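The response above is generated by nginx, not by the API backend. A caller can tell the two apart from the status code, the Server header, and the HTML body. The check below is a minimal sketch with a hypothetical helper name; it is not part of vyos_config_sync.py.

```python
def is_nginx_gateway_timeout(status_code, headers, content):
    """Return True if the response looks like nginx's own 504 error page."""
    server = headers.get("Server", "")
    return (
        status_code == 504
        and server.startswith("nginx")
        and b"504 Gateway Time-out" in content
    )


# Values copied from the logged response above (body shortened).
headers = {"Server": "nginx/1.22.1", "Content-Type": "text/html"}
content = b"<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n</html>\r\n"

print(is_nginx_gateway_timeout(504, headers, content))  # True
```

A True result means the proxy gave up waiting on the upstream task. It does not say whether the load on the secondary eventually succeeded or failed.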

The error response is not printed, but you can add verbosity to config sync at https://github.com/vyos/vyos-1x/blob/current/src/helpers/vyos_config_sync.py#L131:

logger.info(f"Response status code: {config.status_code}")
logger.info(f"Response headers: {config.headers}")
logger.info(f"Response content: {config.content}")

  1. Check the nginx error log on secondary-fw

vyos@secondary-fw:~$ tail /var/log/nginx/error.log

2025/01/26 12:29:31 [error] 3882#3882: *11 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.0.1.1, server: secondary-fw, request: "POST /configure-section HTTP/1.1", upstream: "http://unix:/run/api.sock/configure-section", host: "10.0.1.2"

Details

Version
1.4.0
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)
Forum thread
https://vyos-community.slack.com/archives/C01A6CJFW1F/p1737006851137659

Event Timeline

dmbaturin changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.