Improve vLLM upstream health checks to only pass when models are servable

As documented in #550, the default vLLM configuration could be improved and better documented. A startupProbe on /health is the right default for vLLM because it does not start its server until model loading, which can take a very long time, is complete. The probe tunables may still need to vary by deployment.
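
For illustration, a startupProbe along these lines might look like the sketch below. The port (8000, vLLM's default) and all timing values are assumptions chosen for the example, not necessarily the values in this change. The total time allowed for model load is roughly failureThreshold x periodSeconds.

```yaml
# Hypothetical sketch; tune the values for your model size and storage speed.
startupProbe:
  httpGet:
    path: /health
    port: 8000            # assumed: vLLM's default serving port
  periodSeconds: 10       # assumed polling interval
  failureThreshold: 360   # assumed: 360 x 10s allows up to ~1 hour of model load
  timeoutSeconds: 5       # assumed per-request timeout
```

Liveness and readiness probes do not run until the startup probe succeeds, so a generous startup budget here does not loosen health checking once the model is serving.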