Wednesday, October 15, 2014

WebDriver fix for UnreachableBrowserException

This page summarizes three fixes for three different ways you can run into an UnreachableBrowserException while using WebDriver.

1. Caused by: java.net.SocketException: Software caused connection abort: recv failed


To fix this issue, replace PhantomJSDriver with the following FixedPhantomJSDriver:

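import org.openqa.selenium.Capabilities;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriverService;
import org.openqa.selenium.remote.Response;
import org.openqa.selenium.remote.UnreachableBrowserException;

import java.util.Map;

public class FixedPhantomJSDriver extends PhantomJSDriver {

    // number of retries allowed after the first failed attempt
    private final int retryCount = 2;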

    public FixedPhantomJSDriver() {
    }

    public FixedPhantomJSDriver(Capabilities desiredCapabilities) {
        super(desiredCapabilities);
    }

    public FixedPhantomJSDriver(PhantomJSDriverService service, Capabilities desiredCapabilities) {
        super(service, desiredCapabilities);
    }

    @Override
    protected Response execute(String driverCommand, Map<String, ?> parameters) {
        int retryAttempt = 0;

        while (true) {
            try {

                return super.execute(driverCommand, parameters);

            } catch (UnreachableBrowserException e) {
                // the connection dropped: retry the same command, rethrow once retryCount is exceeded
                retryAttempt++;
                if (retryAttempt > retryCount) {
                    throw e;
                }
            }
        }
    }
}
So in summary:
org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died.
caused by:
Caused by: java.net.SocketException: Software caused connection abort: recv failed
(the "recv failed" part is important)
happens (from my investigations) when the connection is closed prematurely. In our case it seemed to be caused by SSL certificate handling in Java (I'm still investigating this) and it is extremely random. Luckily, all HTTP traffic goes through the execute method, so overriding it and adding simple retry logic gives you a working workaround (it helped on our project: we never had failing or flaky tests again).
Although this implementation is specific to the PhantomJS driver, the same approach should work for other drivers as well.
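To show the retry idea on its own, here is a minimal sketch with no Selenium dependency. The class RetrySketch, the method callWithRetry, and the exception TransientFailure are all hypothetical names made up for this illustration. TransientFailure merely stands in for UnreachableBrowserException. For another driver you would presumably put the same loop into an overridden execute method, as above.

```java
import java.util.function.Supplier;

public class RetrySketch {

    /** Hypothetical stand-in for UnreachableBrowserException (not a Selenium class). */
    static class TransientFailure extends RuntimeException {
        TransientFailure(String message) {
            super(message);
        }
    }

    /**
     * Runs the call and retries it up to maxRetries more times
     * when it throws TransientFailure. The last failure is rethrown.
     */
    static <T> T callWithRetry(Supplier<T> call, int maxRetries) {
        int attempt = 0;
        while (true) {
            try {
                return call.get();
            } catch (TransientFailure e) {
                attempt++;
                if (attempt > maxRetries) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails on the first two calls, succeeds on the third.
        String result = callWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new TransientFailure("recv failed");
            }
            return "ok";
        }, 2);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

With two retries allowed, the call above is attempted three times in total, which matches retryCount = 2 in FixedPhantomJSDriver.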

2. Caused by: java.net.SocketTimeoutException: Read timed out


There is still a chance that you see different symptoms and your browser just hangs for some time until it throws:
Caused by: java.net.SocketTimeoutException: Read timed out
If you're working on a Windows machine, chances are that you've reached the limit on open connections. This normally happens because Selenium creates a lot of connections and Windows keeps them open/cached even after Java has closed them. To fix this, change the Windows registry values under:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
You need to set or create two DWORD values:
MaxUserPort = 32768
TcpTimedWaitDelay = 30
MaxUserPort raises the limit on open connections (you can choose any value between 5000 and 65534; the higher the better). TcpTimedWaitDelay makes Windows release stale connections (already closed by Java) after 30 seconds (it can't be set lower; without this setting the default is 4 minutes!). In most cases this should fix your "hang" issue.
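If you prefer the command line to regedit, the two values can also be written with the standard Windows reg add command. The sketch below only builds and prints the command lines, it does not execute them. The class TcpRegistryCommands and the method regAdd are hypothetical names for this illustration. You would run the printed commands in an elevated prompt, and a restart is typically needed before TCP/IP parameters take effect.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TcpRegistryCommands {

    // HKLM is the short form of HKEY_LOCAL_MACHINE accepted by reg.exe
    static final String KEY = "HKLM\\SYSTEM\\CurrentControlSet\\Services\\Tcpip\\Parameters";

    /** Builds a "reg add" command that sets one DWORD value under KEY (/f overwrites without prompting). */
    static String regAdd(String valueName, int value) {
        return "reg add \"" + KEY + "\" /v " + valueName + " /t REG_DWORD /d " + value + " /f";
    }

    public static void main(String[] args) {
        // The two values from the text above, in the same order.
        Map<String, Integer> values = new LinkedHashMap<>();
        values.put("MaxUserPort", 32768);
        values.put("TcpTimedWaitDelay", 30);

        values.forEach((name, value) -> System.out.println(regAdd(name, value)));
    }
}
```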

3. When your test still hangs for 3 hours until it fails


Unfortunately, there is still a small chance that your test gets stuck for 3 hours! The reason is that HttpClientFactory in Selenium has its socket timeout hardcoded to 3 hours, and although there is a proposed fix to the Selenium core code, until it is accepted there is no way to change it by normal means. For those who unfortunately must bear the pain, here is how to use my hacky but working workaround:
public class FixExample {

    public static void main(String[] args) {
    
        // this is my custom workaround

        HttpParamsSetter.setSoTimeout(60 * 1000); // set socket timeout to 1 minute
        
        // and here goes your custom code

        WebDriver driver = new PhantomJSDriver();
        
        ...

        driver.quit();
    }
}
Here is the code that does the magic (it works because HttpCommandExecutor.httpClientFactory and HttpClientFactory.client are static fields that are initialized only once):
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;
import org.openqa.selenium.remote.HttpCommandExecutor;
import org.openqa.selenium.remote.internal.HttpClientFactory;
import java.lang.reflect.Field;

public class HttpParamsSetter {

    @SuppressWarnings("deprecation")
    public static void setSoTimeout(int soTimeout) {
        HttpClientFactory factory = getStaticValue(HttpCommandExecutor.class, "httpClientFactory");
        if (factory == null) {
            factory = new HttpClientFactory();
        }

        DefaultHttpClient httpClient = (DefaultHttpClient) factory.getHttpClient();
        HttpParams params = httpClient.getParams();
        HttpConnectionParams.setSoTimeout(params, soTimeout);
        httpClient.setParams(params);

        setStaticValue(HttpCommandExecutor.class, "httpClientFactory", factory);
    }

    private static <T> T getStaticValue(Class<?> aClass, String fieldName) {
        Field field = null;
        Boolean isAccessible = null;
        try {
            field = aClass.getDeclaredField(fieldName);
            isAccessible = field.isAccessible();
            field.setAccessible(true);

            return (T) field.get(null);

        } catch (NoSuchFieldException e) {
            throw new RuntimeException(e);
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        } finally {
            if (field != null && isAccessible != null) {
                field.setAccessible(isAccessible);
            }
        }
    }

    private static void setStaticValue(Class<HttpCommandExecutor> aClass, String fieldName, Object value) {
        Field field = null;
        Boolean isAccessible = null;
        try {
            field = aClass.getDeclaredField(fieldName);
            isAccessible = field.isAccessible();
            field.setAccessible(true);

            field.set(null, value);

        } catch (NoSuchFieldException e) {
            throw new RuntimeException(e);
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        } finally {
            if (field != null && isAccessible != null) {
                field.setAccessible(isAccessible);
            }
        }
    }
}
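The reflection trick used above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical class Holder that plays the role of a library class with a private static field. None of these names (StaticFieldSketch, Holder, timeoutMillis, readStatic, writeStatic) come from Selenium. As in HttpParamsSetter, reflection errors are rethrown as unchecked exceptions.

```java
import java.lang.reflect.Field;

public class StaticFieldSketch {

    /** Hypothetical stand-in for a library class with a private static setting. */
    static class Holder {
        private static Integer timeoutMillis = 3 * 60 * 60 * 1000; // 3 hours

        static int current() {
            return timeoutMillis;
        }
    }

    /** Reads a static field by name, even when it is private. */
    static Object readStatic(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(null); // null target because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Overwrites a (non-final) static field by name, even when it is private. */
    static void writeStatic(Class<?> type, String fieldName, Object value) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(null, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("before: " + readStatic(Holder.class, "timeoutMillis"));
        writeStatic(Holder.class, "timeoutMillis", 60 * 1000); // 1 minute
        System.out.println("after:  " + Holder.current());
    }
}
```

Because the field is static, the new value is seen by every later user of the class, which is the property the workaround relies on. This sketch only covers non-final fields.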

19 comments:

  1. Hello! I wonder, have you decompiled org/openqa/selenium/phantomjs/PhantomJSDriver.class for that and compiled it back? Or somehow else?

    1. You don't have to do that as the code is available on github: https://github.com/detro/ghostdriver/blob/master/binding/java/src/main/java/org/openqa/selenium/phantomjs/PhantomJSDriver.java
      The problem is that the actual exception is thrown from the RemoteWebDriver class when it tries to open a socket connection (actually Java throws this exception). PhantomJSDriver extends RemoteWebDriver.

  2. Hi, I am trying to implement the HttpParamsSetter Class, but I am getting "Deprecated" message for: DefaultHttpClient, HttpParams and HttpConnectionParams, is that correct?

    1. I've updated the code a little to suppress the deprecation warnings. Unfortunately Selenium still uses these classes even though they are deprecated in the httpclient library that provides them, so not much can be done about this.

  3. Your workaround for (3) didn't work for us (httpclient version 4.4, selenium 2.44.0, using FirefoxDriver) because the call to "factory.getHttpClient()" returned an "InternalHttpClient" which is (a) package-private and (b) doesn't implement the "getParams" method.

    We were able to solve this instead with the code below (using Powermock for the reflection stuff).

    Thanks for pointing us in the right direction! :-)

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.openqa.selenium.remote.HttpCommandExecutor;
    import org.openqa.selenium.remote.internal.HttpClientFactory;
    import org.powermock.reflect.Whitebox;


    public final class HttpParamsSetter {

        private HttpParamsSetter() {
        }

        public static void setSoTimeout(int soTimeout) {
            injectIntoHttpCommandExecutor(createClientFactoryWithSoTimeout(soTimeout));
        }

        private static HttpClientFactory createClientFactoryWithSoTimeout(int soTimeout) {
            HttpClientFactory httpClientFactory = new HttpClientFactory();
            Whitebox.setInternalState(httpClientFactory, "TIMEOUT_THREE_HOURS", soTimeout);
            CloseableHttpClient httpClient = httpClientFactory.createHttpClient(null);
            Whitebox.setInternalState(httpClientFactory, "httpClient", httpClient);
            return httpClientFactory;
        }

        private static void injectIntoHttpCommandExecutor(HttpClientFactory httpClientFactory) {
            Whitebox.setInternalState(HttpCommandExecutor.class, "httpClientFactory", httpClientFactory);
        }
    }

    1. Thank you very much for this solution. And thank you to Matej for pointing us in the right direction!

  4. Thanks a lot for the workaround!
    However this wasn't working for me in selenium 2.45, due to this fix https://github.com/SeleniumHQ/selenium/commit/c523c76c73f9f85558db45aded7ecf0201d9912c, where the variable "HttpClientFactory httpClientFactory" was replaced by "HttpClient.Factory defaultClientFactory". This is the small tweak needed:

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.openqa.selenium.remote.HttpCommandExecutor;
    import org.openqa.selenium.remote.internal.ApacheHttpClient;
    import org.openqa.selenium.remote.internal.HttpClientFactory;
    import org.powermock.reflect.Whitebox;

    public final class HttpParamsSetter {

        private HttpParamsSetter() {
        }

        public static void setSoTimeout(int soTimeout) {
            injectIntoHttpCommandExecutor(createClientFactoryWithSoTimeout(soTimeout));
        }

        private static org.openqa.selenium.remote.http.HttpClient.Factory createClientFactoryWithSoTimeout(int soTimeout) {
            HttpClientFactory httpClientFactory = new HttpClientFactory();
            Whitebox.setInternalState(httpClientFactory, "TIMEOUT_THREE_HOURS", soTimeout);
            CloseableHttpClient httpClient = (CloseableHttpClient) httpClientFactory.createHttpClient(null);
            Whitebox.setInternalState(httpClientFactory, "httpClient", httpClient);
            return new ApacheHttpClient.Factory(httpClientFactory);
        }

        private static void injectIntoHttpCommandExecutor(org.openqa.selenium.remote.http.HttpClient.Factory httpClientFactory) {
            Whitebox.setInternalState(HttpCommandExecutor.class, "defaultClientFactory", httpClientFactory);
        }
    }

  5. For selenium 2.45 try this:

    public final class HttpParamsSetter {

        private HttpParamsSetter() {
        }

        public static void setSoTimeout(int soTimeout) {
            injectIntoHttpCommandExecutor(createClientFactoryWithSoTimeout(soTimeout));
        }

        private static HttpClientFactory createClientFactoryWithSoTimeout(int soTimeout) {
            HttpClientFactory httpClientFactory = new HttpClientFactory();
            Whitebox.setInternalState(httpClientFactory, "TIMEOUT_THREE_HOURS", soTimeout);
            CloseableHttpClient httpClient = httpClientFactory.createHttpClient(null);
            Whitebox.setInternalState(httpClientFactory, "httpClient", httpClient);
            return httpClientFactory;
        }

        private static void injectIntoHttpCommandExecutor(HttpClientFactory httpClientFactory) {
            Whitebox.setInternalState(Factory.class, "defaultClientFactory", httpClientFactory);
        }
    }

    The problem now is with 2.46: TIMEOUT_THREE_HOURS is now a static field and Whitebox.setInternalState does not work.

  6. For Selenium 2.46 use:
    Whitebox.setInternalStateFromContext(httpClientFactory, "TIMEOUT_THREE_HOURS", soTimeout);

  7. Could you add the import for Factory.class?

    1. I found org.openqa.selenium.remote.internal.ApacheHttpClient.Factory, but it is not working for Selenium 2.46.

  8. The import you are asking about is: org.openqa.selenium.remote.internal.ApacheHttpClient.Factory

    The complete code I use, which runs fine for me:

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.openqa.selenium.remote.internal.HttpClientFactory;
    import org.openqa.selenium.remote.internal.ApacheHttpClient;
    import org.openqa.selenium.remote.internal.ApacheHttpClient.Factory;

    import org.powermock.reflect.Whitebox;


    public final class HttpParamsSetter {

        private HttpParamsSetter() {
        }

        public static void setSoTimeout(int soTimeout) {
            injectIntoHttpCommandExecutor(createClientFactoryWithSoTimeout(soTimeout));
        }

        private static HttpClientFactory createClientFactoryWithSoTimeout(int soTimeout) {
            HttpClientFactory httpClientFactory = new HttpClientFactory();
            //Whitebox.setInternalState(httpClientFactory, "TIMEOUT_THREE_HOURS", soTimeout); //Selenium 2.45
            Whitebox.setInternalStateFromContext(httpClientFactory, "TIMEOUT_THREE_HOURS", soTimeout); //Selenium 2.46 (compatible with 2.48.2)
            CloseableHttpClient httpClient = httpClientFactory.createHttpClient(null);
            Whitebox.setInternalState(httpClientFactory, "httpClient", httpClient);
            return httpClientFactory;
        }

        private static void injectIntoHttpCommandExecutor(HttpClientFactory httpClientFactory) {
            Whitebox.setInternalState(Factory.class, "defaultClientFactory", httpClientFactory);
        }
    }

  9. For Selenium 2.45 it is working great for me.
    For Selenium 2.46+ it is still not working. I don't know what I am doing wrong.
    Here is my code:

    selenium-java 2.46.0
    org.powermock 1.6.4

    public static void main(String[] args) throws MalformedURLException {
        HttpParamsSetter.setSoTimeout(1000 * 3);
        for (int i = 0; i < 100; i++) {
            WebDriver driver = new RemoteWebDriver(new URL("http://127.0.0.1:4444/wd/hub"), DesiredCapabilities.firefox());
            driver.quit();
            System.out.println(i);
        }
    }

    I have run this code many times and the timeout is never applied.
    Do you have any idea why this does not work for me?

    1. You're right. Reviewing the execution, I observed that although the Powermock statement runs without errors, the socketTimeout is unchanged. Sorry, I don't know the cause of that problem.

      Despite this, in my case the 3-hour timeout problem has not occurred again with Selenium versions 2.46+.

  10. Which Java version do you have? Maybe that is the problem. I have Java 1.8.0_66.
