Fix API request validation #1158

viktor-ferenczi
Contributor

Issue #1130

  • Added test coverage for a None prompt and an invalid SamplingParams attribute
  • Fixed request validation to handle the above cases
  • Minor refactoring to make the test case a bit more debuggable
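
A minimal sketch of the kind of validation described above, not the code from this pull request. The `validate_request` helper, the `ALLOWED_SAMPLING_KEYS` set and the error messages are hypothetical; the key names are taken from the request body shown in the traceback further down, and the real server may build and check its sampling parameters differently.

```python
from typing import Any, Dict, Tuple

# Hypothetical subset of sampling options, for illustration only.
ALLOWED_SAMPLING_KEYS = {"max_tokens", "temperature", "ignore_eos"}


def validate_request(body: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Split a request body into (prompt, sampling options) or raise ValueError."""
    params = dict(body)  # copy so the caller's dict is not mutated
    prompt = params.pop("prompt", None)
    if not isinstance(prompt, str):
        # Covers both a missing prompt and an explicit None.
        raise ValueError("prompt must be a string")
    unknown = sorted(set(params) - ALLOWED_SAMPLING_KEYS)
    if unknown:
        raise ValueError(f"unknown sampling parameter(s): {', '.join(unknown)}")
    return prompt, params
```

A handler could catch the `ValueError` and return a client error response instead of letting the exception reach the engine.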

@viktor-ferenczi viktor-ferenczi mentioned this pull request Sep 23, 2023
Member

@zhuohan123 zhuohan123 left a comment

Please also format the code with format.sh

@viktor-ferenczi
Contributor Author

Fixed and formatted the code

Member

@zhuohan123 zhuohan123 left a comment

LGTM! Thanks for your contribution!

@zhuohan123
Member

I'm getting the following error message when running $ pytest tests/async_engine/test_api_server.py. The same test works on main. Can you check what's going on?

============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.0.0
rootdir: /home/zhuohan/vllm/vllm
plugins: forked-1.6.0, xdist-3.3.1, anyio-3.5.0
collected 1 item

tests/async_engine/test_api_server.py F                                  [100%]

=================================== FAILURES ===================================
_______________________________ test_api_server ________________________________

self = <urllib3.connection.HTTPConnection object at 0x7f9e7f17a770>

    def _new_conn(self):
        """Establish a socket connection and set nodelay settings on it.
    
        :return: New socket connection.
        """
        extra_kw = {}
        if self.source_address:
            extra_kw["source_address"] = self.source_address
    
        if self.socket_options:
            extra_kw["socket_options"] = self.socket_options
    
        try:
>           conn = connection.create_connection(
                (self._dns_host, self.port), self.timeout, **extra_kw
            )

../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connection.py:174: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/util/connection.py:95: in create_connection
    raise err
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

address = ('localhost', 8000), timeout = None, source_address = None
socket_options = [(6, 1, 1)]

    def create_connection(
        address,
        timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
        source_address=None,
        socket_options=None,
    ):
        """Connect to *address* and return the socket object.
    
        Convenience function.  Connect to *address* (a 2-tuple ``(host,
        port)``) and return the socket object.  Passing the optional
        *timeout* parameter will set the timeout on the socket instance
        before attempting to connect.  If no *timeout* is supplied, the
        global default timeout setting returned by :func:`socket.getdefaulttimeout`
        is used.  If *source_address* is set it must be a tuple of (host, port)
        for the socket to bind as a source address before making the connection.
        An host of '' or port 0 tells the OS to use the default.
        """
    
        host, port = address
        if host.startswith("["):
            host = host.strip("[]")
        err = None
    
        # Using the value from allowed_gai_family() in the context of getaddrinfo lets
        # us select whether to work with IPv4 DNS records, IPv6 records, or both.
        # The original create_connection function always returns all records.
        family = allowed_gai_family()
    
        try:
            host.encode("idna")
        except UnicodeError:
            return six.raise_from(
                LocationParseError(u"'%s', label empty or too long" % host), None
            )
    
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            sock = None
            try:
                sock = socket.socket(af, socktype, proto)
    
                # If provided, set socket level options before connecting.
                _set_socket_options(sock, socket_options)
    
                if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
                    sock.settimeout(timeout)
                if source_address:
                    sock.bind(source_address)
>               sock.connect(sa)
E               ConnectionRefusedError: [Errno 111] Connection refused

../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/util/connection.py:85: ConnectionRefusedError

During handling of the above exception, another exception occurred:

self = <urllib3.connectionpool.HTTPConnectionPool object at 0x7f9e7f17a3b0>
method = 'POST', url = '/generate'
body = b'{"prompt": "Hello world", "max_tokens": 100, "temperature": 0, "ignore_eos": true}'
headers = {'User-Agent': 'python-requests/2.31.0', 'Accept-Encoding': 'gzip, deflate, br', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '82', 'Content-Type': 'application/json'}
retries = Retry(total=0, connect=None, read=False, redirect=None, status=None)
redirect = False, assert_same_host = False
timeout = Timeout(connect=None, read=None, total=None), pool_timeout = None
release_conn = False, chunked = False, body_pos = None
response_kw = {'decode_content': False, 'preload_content': False}
parsed_url = Url(scheme=None, auth=None, host=None, port=None, path='/generate', query=None, fragment=None)
destination_scheme = None, conn = None, release_this_conn = True
http_tunnel_required = False, err = None, clean_exit = False

    def urlopen(
        self,
        method,
        url,
        body=None,
        headers=None,
        retries=None,
        redirect=True,
        assert_same_host=True,
        timeout=_Default,
        pool_timeout=None,
        release_conn=None,
        chunked=False,
        body_pos=None,
        **response_kw
    ):
        """
        Get a connection from the pool and perform an HTTP request. This is the
        lowest level call for making a request, so you'll need to specify all
        the raw details.
    
        .. note::
    
           More commonly, it's appropriate to use a convenience method provided
           by :class:`.RequestMethods`, such as :meth:`request`.
    
        .. note::
    
           `release_conn` will only behave as expected if
           `preload_content=False` because we want to make
           `preload_content=False` the default behaviour someday soon without
           breaking backwards compatibility.
    
        :param method:
            HTTP request method (such as GET, POST, PUT, etc.)
    
        :param url:
            The URL to perform the request on.
    
        :param body:
            Data to send in the request body, either :class:`str`, :class:`bytes`,
            an iterable of :class:`str`/:class:`bytes`, or a file-like object.
    
        :param headers:
            Dictionary of custom headers to send, such as User-Agent,
            If-None-Match, etc. If None, pool headers are used. If provided,
            these headers completely replace any pool-specific headers.
    
        :param retries:
            Configure the number of retries to allow before raising a
            :class:`~urllib3.exceptions.MaxRetryError` exception.
    
            Pass ``None`` to retry until you receive a response. Pass a
            :class:`~urllib3.util.retry.Retry` object for fine-grained control
            over different types of retries.
            Pass an integer number to retry connection errors that many times,
            but no other types of errors. Pass zero to never retry.
    
            If ``False``, then retries are disabled and any exception is raised
            immediately. Also, instead of raising a MaxRetryError on redirects,
            the redirect response will be returned.
    
        :type retries: :class:`~urllib3.util.retry.Retry`, False, or an int.
    
        :param redirect:
            If True, automatically handle redirects (status codes 301, 302,
            303, 307, 308). Each redirect counts as a retry. Disabling retries
            will disable redirect, too.
    
        :param assert_same_host:
            If ``True``, will make sure that the host of the pool requests is
            consistent else will raise HostChangedError. When ``False``, you can
            use the pool on an HTTP proxy and request foreign hosts.
    
        :param timeout:
            If specified, overrides the default timeout for this one
            request. It may be a float (in seconds) or an instance of
            :class:`urllib3.util.Timeout`.
    
        :param pool_timeout:
            If set and the pool is set to block=True, then this method will
            block for ``pool_timeout`` seconds and raise EmptyPoolError if no
            connection is available within the time period.
    
        :param release_conn:
            If False, then the urlopen call will not release the connection
            back into the pool once a response is received (but will release if
            you read the entire contents of the response such as when
            `preload_content=True`). This is useful if you're not preloading
            the response's content immediately. You will need to call
            ``r.release_conn()`` on the response ``r`` to return the connection
            back into the pool. If None, it takes the value of
            ``response_kw.get('preload_content', True)``.
    
        :param chunked:
            If True, urllib3 will send the body using chunked transfer
            encoding. Otherwise, urllib3 will send the body using the standard
            content-length form. Defaults to False.
    
        :param int body_pos:
            Position to seek to in file-like body in the event of a retry or
            redirect. Typically this won't need to be set because urllib3 will
            auto-populate the value when needed.
    
        :param \\**response_kw:
            Additional parameters are passed to
            :meth:`urllib3.response.HTTPResponse.from_httplib`
        """
    
        parsed_url = parse_url(url)
        destination_scheme = parsed_url.scheme
    
        if headers is None:
            headers = self.headers
    
        if not isinstance(retries, Retry):
            retries = Retry.from_int(retries, redirect=redirect, default=self.retries)
    
        if release_conn is None:
            release_conn = response_kw.get("preload_content", True)
    
        # Check host
        if assert_same_host and not self.is_same_host(url):
            raise HostChangedError(self, url, retries)
    
        # Ensure that the URL we're connecting to is properly encoded
        if url.startswith("/"):
            url = six.ensure_str(_encode_target(url))
        else:
            url = six.ensure_str(parsed_url.url)
    
        conn = None
    
        # Track whether `conn` needs to be released before
        # returning/raising/recursing. Update this variable if necessary, and
        # leave `release_conn` constant throughout the function. That way, if
        # the function recurses, the original value of `release_conn` will be
        # passed down into the recursive call, and its value will be respected.
        #
        # See issue #651 [1] for details.
        #
        # [1] <https://github.com/urllib3/urllib3/issues/651>
        release_this_conn = release_conn
    
        http_tunnel_required = connection_requires_http_tunnel(
            self.proxy, self.proxy_config, destination_scheme
        )
    
        # Merge the proxy headers. Only done when not using HTTP CONNECT. We
        # have to copy the headers dict so we can safely change it without those
        # changes being reflected in anyone else's copy.
        if not http_tunnel_required:
            headers = headers.copy()
            headers.update(self.proxy_headers)
    
        # Must keep the exception bound to a separate variable or else Python 3
        # complains about UnboundLocalError.
        err = None
    
        # Keep track of whether we cleanly exited the except block. This
        # ensures we do proper cleanup in finally.
        clean_exit = False
    
        # Rewind body position, if needed. Record current position
        # for future rewinds in the event of a redirect/retry.
        body_pos = set_file_position(body, body_pos)
    
        try:
            # Request a connection from the queue.
            timeout_obj = self._get_timeout(timeout)
            conn = self._get_conn(timeout=pool_timeout)
    
            conn.timeout = timeout_obj.connect_timeout
    
            is_new_proxy_conn = self.proxy is not None and not getattr(
                conn, "sock", None
            )
            if is_new_proxy_conn and http_tunnel_required:
                self._prepare_proxy(conn)
    
            # Make the request on the httplib connection object.
>           httplib_response = self._make_request(
                conn,
                method,
                url,
                timeout=timeout_obj,
                body=body,
                headers=headers,
                chunked=chunked,
            )

../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connectionpool.py:714: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connectionpool.py:415: in _make_request
    conn.request(method, url, **httplib_request_kw)
../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connection.py:244: in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
../../anaconda3/envs/vllm/lib/python3.10/http/client.py:1283: in request
    self._send_request(method, url, body, headers, encode_chunked)
../../anaconda3/envs/vllm/lib/python3.10/http/client.py:1329: in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
../../anaconda3/envs/vllm/lib/python3.10/http/client.py:1278: in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
../../anaconda3/envs/vllm/lib/python3.10/http/client.py:1038: in _send_output
    self.send(msg)
../../anaconda3/envs/vllm/lib/python3.10/http/client.py:976: in send
    self.connect()
../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connection.py:205: in connect
    conn = self._new_conn()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib3.connection.HTTPConnection object at 0x7f9e7f17a770>

    def _new_conn(self):
        """Establish a socket connection and set nodelay settings on it.
    
        :return: New socket connection.
        """
        extra_kw = {}
        if self.source_address:
            extra_kw["source_address"] = self.source_address
    
        if self.socket_options:
            extra_kw["socket_options"] = self.socket_options
    
        try:
            conn = connection.create_connection(
                (self._dns_host, self.port), self.timeout, **extra_kw
            )
    
        except SocketTimeout:
            raise ConnectTimeoutError(
                self,
                "Connection to %s timed out. (connect timeout=%s)"
                % (self.host, self.timeout),
            )
    
        except SocketError as e:
>           raise NewConnectionError(
                self, "Failed to establish a new connection: %s" % e
            )
E           urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f9e7f17a770>: Failed to establish a new connection: [Errno 111] Connection refused

../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connection.py:186: NewConnectionError

During handling of the above exception, another exception occurred:

self = <requests.adapters.HTTPAdapter object at 0x7f9e7f17a140>
request = <PreparedRequest [POST]>, stream = False
timeout = Timeout(connect=None, read=None, total=None), verify = True
cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection(request.url, proxies)
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
>           resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )

../../anaconda3/envs/vllm/lib/python3.10/site-packages/requests/adapters.py:486: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/connectionpool.py:798: in urlopen
    retries = retries.increment(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Retry(total=0, connect=None, read=False, redirect=None, status=None)
method = 'POST', url = '/generate', response = None
error = NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9e7f17a770>: Failed to establish a new connection: [Errno 111] Connection refused')
_pool = <urllib3.connectionpool.HTTPConnectionPool object at 0x7f9e7f17a3b0>
_stacktrace = <traceback object at 0x7f9e7f1b1900>

    def increment(
        self,
        method=None,
        url=None,
        response=None,
        error=None,
        _pool=None,
        _stacktrace=None,
    ):
        """Return a new Retry object with incremented retry counters.
    
        :param response: A response object, or None, if the server did not
            return a response.
        :type response: :class:`~urllib3.response.HTTPResponse`
        :param Exception error: An error encountered during the request, or
            None if the response was received successfully.
    
        :return: A new ``Retry`` object.
        """
        if self.total is False and error:
            # Disabled, indicate to re-raise the error.
            raise six.reraise(type(error), error, _stacktrace)
    
        total = self.total
        if total is not None:
            total -= 1
    
        connect = self.connect
        read = self.read
        redirect = self.redirect
        status_count = self.status
        other = self.other
        cause = "unknown"
        status = None
        redirect_location = None
    
        if error and self._is_connection_error(error):
            # Connect retry?
            if connect is False:
                raise six.reraise(type(error), error, _stacktrace)
            elif connect is not None:
                connect -= 1
    
        elif error and self._is_read_error(error):
            # Read retry?
            if read is False or not self._is_method_retryable(method):
                raise six.reraise(type(error), error, _stacktrace)
            elif read is not None:
                read -= 1
    
        elif error:
            # Other retry?
            if other is not None:
                other -= 1
    
        elif response and response.get_redirect_location():
            # Redirect retry?
            if redirect is not None:
                redirect -= 1
            cause = "too many redirects"
            redirect_location = response.get_redirect_location()
            status = response.status
    
        else:
            # Incrementing because of a server error like a 500 in
            # status_forcelist and the given method is in the allowed_methods
            cause = ResponseError.GENERIC_ERROR
            if response and response.status:
                if status_count is not None:
                    status_count -= 1
                cause = ResponseError.SPECIFIC_ERROR.format(status_code=response.status)
                status = response.status
    
        history = self.history + (
            RequestHistory(method, url, error, status, redirect_location),
        )
    
        new_retry = self.new(
            total=total,
            connect=connect,
            read=read,
            redirect=redirect,
            status=status_count,
            other=other,
            history=history,
        )
    
        if new_retry.is_exhausted():
>           raise MaxRetryError(_pool, url, error or ResponseError(cause))
E           urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9e7f17a770>: Failed to establish a new connection: [Errno 111] Connection refused'))

../../anaconda3/envs/vllm/lib/python3.10/site-packages/urllib3/util/retry.py:592: MaxRetryError

During handling of the above exception, another exception occurred:

api_server = None

    def test_api_server(api_server):
        """
        Run the API server and test it.
    
        We run both the server and requests in separate processes.
    
        We test that the server can handle incoming requests, including
        multiple requests at the same time, and that it can handle requests
        being cancelled without crashing.
        """
        # Run a simple request
>       response = _query_server("Hello world")

tests/async_engine/test_api_server.py:50: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/async_engine/test_api_server.py:15: in _query_server
    response = requests.post("http://localhost:8000/generate",
../../anaconda3/envs/vllm/lib/python3.10/site-packages/requests/api.py:115: in post
    return request("post", url, data=data, json=json, **kwargs)
../../anaconda3/envs/vllm/lib/python3.10/site-packages/requests/api.py:59: in request
    return session.request(method=method, url=url, **kwargs)
../../anaconda3/envs/vllm/lib/python3.10/site-packages/requests/sessions.py:589: in request
    resp = self.send(prep, **send_kwargs)
../../anaconda3/envs/vllm/lib/python3.10/site-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <requests.adapters.HTTPAdapter object at 0x7f9e7f17a140>
request = <PreparedRequest [POST]>, stream = False
timeout = Timeout(connect=None, read=None, total=None), verify = True
cert = None, proxies = OrderedDict()

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection(request.url, proxies)
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )
    
        except (ProtocolError, OSError) as err:
            raise ConnectionError(err, request=request)
    
        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)
    
            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)
    
            if isinstance(e.reason, _ProxyError):
                raise ProxyError(e, request=request)
    
            if isinstance(e.reason, _SSLError):
                # This branch is for urllib3 v1.22 and later.
                raise SSLError(e, request=request)
    
>           raise ConnectionError(e, request=request)
E           requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /generate (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f9e7f17a770>: Failed to establish a new connection: [Errno 111] Connection refused'))

../../anaconda3/envs/vllm/lib/python3.10/site-packages/requests/adapters.py:519: ConnectionError
=============================== warnings summary ===============================
../../anaconda3/envs/vllm/lib/python3.10/site-packages/ray/_private/parameter.py:4
  /home/zhuohan/anaconda3/envs/vllm/lib/python3.10/site-packages/ray/_private/parameter.py:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    import pkg_resources

../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
  /home/zhuohan/anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
  /home/zhuohan/anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871
  /home/zhuohan/anaconda3/envs/vllm/lib/python3.10/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('zope')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

../../anaconda3/envs/vllm/lib/python3.10/site-packages/ray/tune/logger/tensorboardx.py:41
  /home/zhuohan/anaconda3/envs/vllm/lib/python3.10/site-packages/ray/tune/logger/tensorboardx.py:41: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
    VALID_NP_HPARAMS = (np.bool8, np.float32, np.float64, np.int32, np.int64)

../../anaconda3/envs/vllm/lib/python3.10/site-packages/ray/tune/logger/tensorboardx.py:169
  /home/zhuohan/anaconda3/envs/vllm/lib/python3.10/site-packages/ray/tune/logger/tensorboardx.py:169: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
    VALID_NP_HPARAMS = (np.bool8, np.float32, np.float64, np.int32, np.int64)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/async_engine/test_api_server.py::test_api_server - requests.exce...
======================== 1 failed, 11 warnings in 0.34s ========================

@viktor-ferenczi
Contributor Author

Reproduced the difference in test runs here, working on it.

@viktor-ferenczi
Contributor Author

Fixed the test case so that the test API server is up and running before the newly added tests execute.
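
For illustration, one common way to make sure a server is accepting connections before the first request is to poll its port. This is a sketch under assumptions, not the change made in this pull request: `wait_for_port` is a hypothetical helper, and the host and port in the usage note come from the traceback above (`localhost`, 8000).

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.1) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not up yet.
            time.sleep(interval)
    return False
```

A fixture could call `wait_for_port("localhost", 8000)` after starting the server process and fail early if it returns False, rather than letting the first request raise a ConnectionRefusedError.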

@zhuohan123
Member

@viktor-ferenczi Not sure why, but when I run pytest tests/async_engine/test_api_server.py, the test just hangs now.

@viktor-ferenczi
Contributor Author

viktor-ferenczi commented Oct 15, 2023

@zhuohan123 That test runs just fine for me. Rebased the branch to current master. Please try again.


This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

@github-actions github-actions bot added the stale label Oct 31, 2024

mergify bot commented Oct 31, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. @viktor-ferenczi please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork
