Version 2.60.0 / March 20, 2022
Download
For Maven, you would add:
<dependency>
    <groupId>net.sourceforge.htmlunit</groupId>
    <artifactId>htmlunit</artifactId>
    <version>2.60.0</version>
</dependency>
HtmlUnit is a "GUI-less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, and so on, just like you do in your "normal" browser.
It has fairly good JavaScript support (which is constantly improving) and is able to work even with quite complex AJAX libraries, simulating Chrome, Firefox or Internet Explorer depending on the configuration used.
HtmlUnit is typically used for testing purposes or to retrieve information from web sites.
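As a rough illustration of the API described above, here is a minimal sketch that loads a page with a simulated Chrome, clicks the first link and reads the title of the page it leads to. To keep it runnable without network access, it serves two canned pages through `MockWebConnection`; the `example.test` URLs, the page contents and the class and method names (`HtmlUnitQuickStart`, `followFirstLink`, `demo`) are assumptions made up for this example. Against a real site you would skip the mock connection and pass a real URL to `getPage`.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.MockWebConnection;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;
import java.net.URL;

// Hypothetical example class, not part of HtmlUnit.
class HtmlUnitQuickStart {

    // Made-up start URL; it is only ever answered by the mock connection below.
    static final String START_URL = "http://example.test/";

    /** Loads the page at the given URL, clicks its first link and returns the title of the target page. */
    static String followFirstLink(WebClient webClient, String url) throws IOException {
        HtmlPage page = webClient.getPage(url);
        HtmlAnchor firstLink = page.getAnchors().get(0);
        HtmlPage target = firstLink.click();
        return target.getTitleText();
    }

    /** Runs the sketch against two canned pages, so no real server is contacted. */
    static String demo() throws IOException {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            MockWebConnection connection = new MockWebConnection();
            connection.setResponse(new URL(START_URL),
                    "<html><head><title>Start</title></head>"
                    + "<body><a href='next.html'>next</a></body></html>");
            connection.setResponse(new URL(START_URL + "next.html"),
                    "<html><head><title>Next</title></head><body></body></html>");
            webClient.setWebConnection(connection);

            return followFirstLink(webClient, START_URL);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

The `WebClient` is opened in a try-with-resources block so that its windows and background JavaScript threads are released when the work is done.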
- Support for the HTTP and HTTPS protocols
- Support for cookies
- Ability to specify whether failing responses from the server should throw exceptions or should be returned as pages of the appropriate type (based on content type)
- Support for submit methods POST and GET (as well as HEAD, DELETE, ...)
- Ability to customize the request headers being sent to the server
- Support for HTML responses
- Wrapper for HTML pages that provides easy access to all information contained inside them
- Support for submitting forms
- Support for clicking links
- Support for walking the DOM model of the HTML document
- Proxy server support
- Support for basic and NTLM authentication
- Excellent JavaScript support
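A few of the features listed above can be sketched in code. The following example, offered as one possible illustration rather than a recommended setup, adds a custom request header and a cookie, then fetches a page that answers with status 404 twice: once with exceptions enabled for failing status codes and once with the failing response returned as a page. The 404 response is canned via `MockWebConnection` so the sketch runs offline; the URL, header name, cookie values and the class and method names are all hypothetical.

```java
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.MockWebConnection;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.util.Cookie;
import com.gargoylesoftware.htmlunit.util.NameValuePair;

import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical example class, not part of HtmlUnit.
class HtmlUnitOptionsSketch {

    // Made-up URL; it is only ever answered by the mock connection below.
    static final String MISSING_URL = "http://example.test/missing";

    /** Creates a client whose only known URL answers with a canned 404 response. */
    static WebClient newOfflineClient() throws IOException {
        WebClient webClient = new WebClient();

        List<NameValuePair> noExtraHeaders = Collections.emptyList();
        MockWebConnection connection = new MockWebConnection();
        connection.setResponse(new URL(MISSING_URL), "<html><body>gone</body></html>",
                404, "Not Found", "text/html", noExtraHeaders);
        webClient.setWebConnection(connection);

        // Custom header sent with every request made by this client (name and value are made up).
        webClient.addRequestHeader("X-Example-Header", "demo");
        // Cookie that is present before the first request (domain, name and value are made up).
        webClient.getCookieManager().addCookie(new Cookie("example.test", "session", "abc123"));
        // Do not log the body of failing responses.
        webClient.getOptions().setPrintContentOnFailingStatusCode(false);
        return webClient;
    }

    /** Fetches the missing page and reports whether the 404 surfaced as a page or as an exception. */
    static String fetch(WebClient webClient, boolean throwOnFailure) throws IOException {
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(throwOnFailure);
        try {
            Page page = webClient.getPage(MISSING_URL);
            return "page with status " + page.getWebResponse().getStatusCode();
        } catch (FailingHttpStatusCodeException e) {
            return "exception for status " + e.getStatusCode();
        }
    }

    public static void main(String[] args) throws IOException {
        try (WebClient webClient = newOfflineClient()) {
            System.out.println(fetch(webClient, true));
            System.out.println(fetch(webClient, false));
        }
    }
}
```

Returning failing responses as pages tends to be convenient when a test wants to inspect an error page, while the exception style suits scrapers that should stop on the first broken request.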
You can start here:
- Getting Started
- The Java Web Scraping Handbook: a nice tutorial about web scraping with a lot of background information and details about HtmlUnit.
Pull requests and all other community contributions are essential for open source software. Every contribution, from bug reports to feature requests, from typo fixes to full new features, is greatly appreciated.
The latest builds are available from our Jenkins CI build server.
If you use Maven, please add:
<dependency>
    <groupId>net.sourceforge.htmlunit</groupId>
    <artifactId>htmlunit</artifactId>
    <version>2.61.0-SNAPSHOT</version>
</dependency>
You have to add the Sonatype snapshot repository to the repositories section of your build configuration:
Maven
<repository>
    <id>OSS Sonatype snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots/</url>
    <snapshots>
        <enabled>true</enabled>
        <updatePolicy>always</updatePolicy>
    </snapshots>
    <releases>
        <enabled>false</enabled>
    </releases>
</repository>
Gradle
maven { url "https://oss.sonatype.org/content/repositories/snapshots"}
This project is licensed under the Apache 2.0 License.