Description
I configured an upstream with two nodes and I am testing that the health check works. When queried directly, one of the nodes responds with a 200 OK status code and the other, for the moment, with a 404. With no health check enabled, the upstream correctly distributes requests to both nodes. However, after enabling the health check, both nodes are marked as unhealthy: in particular, the node that responds with 404 is marked as unhealthy due to tcp_failure, but shouldn't it be marked as such due to http_failure? The service is up and running; it simply responds with a 404 status code. Moreover, the exact same node is marked as healthy in another upstream configuration, where again it should be marked as unhealthy due to http_failure. How is this possible? Am I misjudging something?
Here is the upstream configuration:
{
  "nodes": [
    {
      "host": "webservice1",
      "port": xxx,
      "weight": 1
    },
    {
      "host": "webservice2",
      "port": yyy,
      "weight": 1
    }
  ],
  "timeout": {
    "connect": 6,
    "send": 6,
    "read": 30
  },
  "type": "roundrobin",
  "checks": {
    "active": {
      "concurrency": 10,
      "healthy": {
        "http_statuses": [200, 302],
        "interval": 10,
        "successes": 2
      },
      "http_path": "/status",
      "https_verify_certificate": false,
      "timeout": 5,
      "type": "https",
      "unhealthy": {
        "http_failures": 5,
        "http_statuses": [429, 404, 500, 501, 502, 503, 504, 505],
        "interval": 10,
        "tcp_failures": 2,
        "timeouts": 3
      }
    }
  },
  "hash_on": "vars",
  "scheme": "https",
  "pass_host": "node",
  "name": "upstream1",
  "keepalive_pool": {
    "idle_timeout": 60,
    "requests": 1000,
    "size": 320
  }
}
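For reference, the active probe defined above can be approximated from the APISIX host with curl. This is a minimal sketch; the placeholder hostnames and ports from the config must be filled in:

# Reproduce the active probe: HTTPS GET /status with certificate
# verification disabled, matching https_verify_certificate: false.
curl -vk https://webservice1:xxx/status
curl -vk https://webservice2:yyy/status

A failed TCP connection or TLS handshake at this step is what the checker records as tcp_failure, before any HTTP status is seen; a response whose status is listed in unhealthy.http_statuses (such as 404) is recorded as http_failure.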
Here is what the Control API shows:
{
  "name": "/apisix/upstreams/1",
  "type": "https",
  "nodes": [
    {
      "hostname": "webservice1",
      "counter": {
        "http_failure": 5,
        "success": 0,
        "timeout_failure": 0,
        "tcp_failure": 0
      },
      "status": "unhealthy"
    },
    {
      "hostname": "webservice2",
      "counter": {
        "http_failure": 0,
        "success": 0,
        "timeout_failure": 0,
        "tcp_failure": 2
      },
      "status": "unhealthy"
    }
  ]
},
{
  "name": "/apisix/upstreams/2",
  "type": "https",
  "nodes": [
    {
      "hostname": "webservice3",
      "counter": {
        "http_failure": 0,
        "success": 0,
        "timeout_failure": 0,
        "tcp_failure": 0
      },
      "status": "healthy"
    },
    {
      "hostname": "webservice2",
      "counter": {
        "http_failure": 0,
        "success": 0,
        "timeout_failure": 0,
        "tcp_failure": 0
      },
      "status": "healthy"
    }
  ]
}
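The output above can be fetched from the Control API healthcheck endpoint; a minimal example, assuming the default control address 127.0.0.1:9090:

# Per-upstream health check status from the Control API.
curl -s http://127.0.0.1:9090/v1/healthcheck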
Expected Behavior:
• Nodes responding with 200 OK should not be marked as unhealthy.
• A node responding with 404 should be marked unhealthy due to http_failure, not tcp_failure.
• A node marked as unhealthy in one upstream should not be marked as healthy in another upstream if the conditions are the same.
Is this a bug or am I missing something in the configuration? Any suggestion would be greatly appreciated.
Environment
• APISIX version: 3.10.0
• APISIX Dashboard version: 3.0.1
• Operating system: Linux
I see that you have configured http_path: /status, which means that the health check sends a request for this path to the upstream. Did you confirm that the upstream can respond to this request properly?
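If the service does not actually serve /status, one option is to point the probe at a path it does serve. A hypothetical sketch using the Admin API, assuming the default address 127.0.0.1:9180, an admin key in $ADMIN_KEY, and "/" standing in for whatever path the service really answers:

# Patch only the active check path of upstream 1, leaving the rest intact.
curl -s -X PATCH http://127.0.0.1:9180/apisix/admin/upstreams/1 \
  -H "X-API-KEY: $ADMIN_KEY" \
  -d '{"checks":{"active":{"http_path":"/"}}}'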