Rakesh Vardan edited this page Jan 20, 2024 · 3 revisions

Introduction to Selenium WebDriver

What is Selenium WebDriver?

Selenium WebDriver is a tool for automating web browsers to perform various tasks, including testing web applications. It allows you to control a web browser programmatically and interact with web elements.

Why Selenium WebDriver?

Selenium WebDriver is preferred for web automation because:

  • It supports multiple programming languages, including Java.
  • It can work with various browsers like Chrome, Firefox, Edge, etc.
  • It allows you to automate repetitive tasks and perform functional testing.

Setting up the Environment

Java Setup

Before using Selenium WebDriver with Java, ensure that the Java Development Kit (JDK) is installed and that the environment variables are set up.

  • Install JDK: Download and install the JDK from the official Oracle website or a trusted source.
  • Set PATH Environment Variable: Add the JDK's bin directory to the system's PATH environment variable. This enables the command prompt to recognize Java commands.
# Example (on Windows):
set PATH=C:\path\to\jdk\bin;%PATH%
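
After changing PATH, it can help to confirm which Java runtime your code actually runs on. The following is only a small sketch: the class name JavaSetupCheck is made up for illustration, and it reads the standard java.version and java.home system properties.

```java
// Hypothetical helper: reports the Java runtime this program is running on.
class JavaSetupCheck {

    // Builds a one-line summary from standard JVM system properties.
    static String describeRuntime() {
        return "Java " + System.getProperty("java.version")
                + " (home: " + System.getProperty("java.home") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

If the version or home directory printed is not the JDK you installed, the PATH entry is probably pointing at a different installation.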

Selenium WebDriver Setup

To use Selenium WebDriver with Java, you need the Selenium Java client library (JAR files) on your project's build path, plus a driver executable for each browser you intend to automate.

  • Download the Selenium JAR Files and Browser Drivers: Download the Selenium Java client JAR files, and the driver executable for each browser you intend to automate (e.g., ChromeDriver for Chrome, GeckoDriver for Firefox).
  • Configure Java Project: In your Java project, add the JAR files to the build path. This allows your code to interact with the browsers through their drivers.
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class SeleniumExample {
    public static void main(String[] args) {
        // Set the system property to the location of ChromeDriver
        System.setProperty("webdriver.chrome.driver", "path/to/chromedriver.exe");

        // Initialize a WebDriver instance for Chrome
        WebDriver driver = new ChromeDriver();

        // Your automation code goes here

        // Close the browser when done
        driver.quit();
    }
}

Basics of Selenium WebDriver

Creating WebDriver Instances

To automate web browsers, you need to create instances of WebDriver for the specific browser you want to control.

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class SeleniumExample {
    public static void main(String[] args) {
        // Set the system property to the location of ChromeDriver
        System.setProperty("webdriver.chrome.driver", "path/to/chromedriver.exe");

        // Initialize a WebDriver instance for Chrome
        WebDriver driver = new ChromeDriver();
    }
}

Navigating to URLs

Use the get() method to navigate to a specific URL.

driver.get("https://www.example.com");

Document Object Model

The Document Object Model, commonly known as DOM, is a cross-platform and language-independent interface. In the context of HTML, it is a programming interface for web documents. It represents the structure of a document, in this case, a webpage, as a tree structure, where each node is an object representing a part of the document.

  1. The DOM allows programmers to manipulate the structure and content of a web document. By leveraging the DOM, scripts can add, remove, and modify elements within the HTML.
  2. In the DOM tree representation of an HTML document, everything is a node. For example, elements are element nodes, the text inside elements is a text node, and HTML attributes are attribute nodes.
  3. The DOM presents an HTML document as a hierarchical tree of nodes. The root node is the Document, which branches out to element nodes such as html, which in turn branches into further element nodes such as head and body, text nodes, and so on.
  4. You can select these nodes on the DOM tree with methods like getElementById(), getElementsByClassName(), getElementsByTagName(), querySelector(), and querySelectorAll(), among others.
  5. One of the powerful features of the DOM is the ability to handle events. These are actions such as a user clicking on an element, the webpage loading, hovering over an element, or pressing a key on the keyboard.
  6. Learning the DOM is crucial when learning Selenium for web automation testing because Selenium uses the DOM structure to navigate and locate elements on the page. Working through real examples and hands-on exercises is the most reliable way to get comfortable with it. It also helps to understand the relationship between JavaScript, HTML, and the DOM, since JavaScript is commonly used in web automation to manipulate the DOM.

Let's look at some basic examples, assuming we are dealing with the following HTML code represented as a DOM tree:

<!DOCTYPE html>
<html>
  <head>
    <title>My Title</title>
  </head>
  <body>
    <h1 id="heading1">Hello World</h1>
    <p class="myClass">This is a paragraph.</p>
    <p class="myClass">This is another paragraph.</p>
  </body>
</html>

The DOM representation would look like this:

[Figure: DOM tree of the HTML document above]
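
As a rough way to see that tree in text form, the sketch below parses the same markup with the XML DOM parser that ships with the JDK and prints one line per node. This is an approximation, not what a browser builds: the markup is written without the DOCTYPE and without whitespace between tags so that the XML parser accepts it and produces no whitespace-only text nodes. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: prints the node tree of the sample page using the JDK's XML DOM.
class DomTreeSketch {

    // The sample page, minus the DOCTYPE and inter-tag whitespace (an assumption of this sketch).
    static final String PAGE =
            "<html><head><title>My Title</title></head>"
          + "<body><h1 id=\"heading1\">Hello World</h1>"
          + "<p class=\"myClass\">This is a paragraph.</p>"
          + "<p class=\"myClass\">This is another paragraph.</p>"
          + "</body></html>";

    // Parses well-formed markup into a DOM Document.
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("Markup is not well-formed", e);
        }
    }

    // Returns an indented outline: element names as-is, text nodes in quotes.
    static String outline(Node node, int depth) {
        String label = node.getNodeType() == Node.TEXT_NODE
                ? "\"" + node.getNodeValue() + "\""
                : node.getNodeName();
        StringBuilder sb = new StringBuilder();
        sb.append("  ".repeat(depth)).append(label).append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(outline(children.item(i), depth + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(outline(parse(PAGE), 0));
    }
}
```

The outline starts at the #document root, followed by html, then head and body with their children, and the text nodes appear as quoted leaves under title, h1, and each p.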

Now, let's look at some examples of how JavaScript might interact with this HTML via the DOM:

  1. Changing the content of an element:

[Code listing not available]

After the script runs, the h1 element reads "New Heading" instead of "Hello World".

  2. Changing an attribute of an element:

[Code listing not available]

  3. Adding a new element to the document:
let para = document.createElement("p");
let node = document.createTextNode("This is new.");
para.appendChild(node);
let element = document.getElementById("div1");
element.appendChild(para);

This script creates a new paragraph element, adds text to that element, and then appends it to an existing element with the id "div1".

  4. Event handling:

const el = document.getElementById("heading1");
if (el) {
  el.addEventListener("click", function () {
    alert("You clicked the Heading!");
  });
}

This script listens for a 'click' event on the h1 element and, when it occurs, runs a function that alerts "You clicked the Heading!".

Remember that in real pages the DOM can be far more complex and deeply nested, which is why understanding it is crucial. These examples cover only a small part of what the DOM can do, but they are a good way to get a handle on the basics.
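
The first three operations (changing content, changing an attribute, adding an element) can also be tried from Java without a browser, using the org.w3c.dom API bundled with the JDK. This is a hedged analogue of the JavaScript above, not Selenium code. The markup, the class name, and the attribute being set (class="highlight") are assumptions chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch: DOM edits performed with the JDK's XML DOM instead of a browser.
class DomEditSketch {

    // Parses well-formed markup into a DOM Document.
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("Markup is not well-formed", e);
        }
    }

    // Builds a tiny page and applies the three edits described in the text.
    static Document editedPage() {
        Document doc = parse(
                "<body><h1 id=\"heading1\">Hello World</h1><div id=\"div1\"></div></body>");

        // 1. Change the content of an element.
        Element heading = (Element) doc.getElementsByTagName("h1").item(0);
        heading.setTextContent("New Heading");

        // 2. Change an attribute (the attribute and value are illustrative).
        heading.setAttribute("class", "highlight");

        // 3. Add a new element under the existing div.
        Element para = doc.createElement("p");
        para.appendChild(doc.createTextNode("This is new."));
        Element container = (Element) doc.getElementsByTagName("div").item(0);
        container.appendChild(para);

        return doc;
    }

    public static void main(String[] args) {
        Document doc = editedPage();
        System.out.println(doc.getElementsByTagName("h1").item(0).getTextContent());
        System.out.println(doc.getElementsByTagName("p").item(0).getTextContent());
    }
}
```

Event handling has no counterpart in this sketch, because events only exist once a page is loaded in a browser.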

Locating Web Elements

Selenium provides various methods to locate web elements, such as findElement() and findElements(). You can use various locators like ID, name, XPath, or CSS selectors.

import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;

// Find an element by ID
WebElement elementById = driver.findElement(By.id("elementId"));

// Find elements by XPath
List<WebElement> elementsByXPath = driver.findElements(By.xpath("//div[@class='example']"));

These code examples demonstrate how to create a WebDriver instance, navigate to a URL, and locate web elements using different locators.
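
To get a feel for what an XPath locator such as //div[@class='example'] selects, you can evaluate it against a small document with the XPath engine included in the JDK. This sketch does not use Selenium, and a browser evaluating the same expression on real HTML may behave differently in edge cases. The markup and the class name XPathLocatorSketch are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: shows which nodes an XPath expression matches.
class XPathLocatorSketch {

    // Made-up markup with two matching divs and one non-matching div.
    static final String PAGE =
            "<body>"
          + "<div class=\"example\">first</div>"
          + "<div class=\"other\">skipped</div>"
          + "<div class=\"example\">second</div>"
          + "</body>";

    // Returns the text of every node matched by the expression, in document order.
    static List<String> textsMatching(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(PAGE)),
                    XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textsMatching("//div[@class='example']"));
    }
}
```

The expression matches both divs whose class attribute is exactly "example", much as findElements() returns every matching element, while an expression that matches nothing yields an empty list.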

Interacting with Web Elements

Interacting with Text Boxes and Buttons

You can send keys to text boxes and click buttons using the sendKeys() and click() methods.

WebElement textBox = driver.findElement(By.id("username"));
textBox.sendKeys("myusername");

WebElement loginButton = driver.findElement(By.id("loginBtn"));
loginButton.click();

Working with Links

To click hyperlinks, locate the element and use the click() method.

WebElement link = driver.findElement(By.linkText("Click Me"));
link.click();

Dropdowns and Select Elements

Use the Select class to interact with dropdown menus.

import org.openqa.selenium.support.ui.Select;

WebElement dropdown = driver.findElement(By.id("countryDropdown"));
Select select = new Select(dropdown);

// Select by visible text
select.selectByVisibleText("United States");

// Select by index
select.selectByIndex(2);

Radio Buttons and Checkboxes

For radio buttons and checkboxes, use the click() method to select or deselect them.

WebElement radioButton = driver.findElement(By.id("radioButton"));
radioButton.click();

WebElement checkbox = driver.findElement(By.id("agreeCheckbox"));
if (!checkbox.isSelected()) {
    checkbox.click();
}

These code examples show how to interact with text boxes, buttons, links, dropdowns, radio buttons, and checkboxes using Selenium WebDriver.

Advanced WebDriver Techniques

WebDriver Waits

WebDriver waits are essential to handle synchronization issues. You can use implicit and explicit waits along with ExpectedConditions.

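import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

// Selenium 4 takes the timeout as a Duration (Selenium 3 accepted seconds as a long)
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
WebElement element = wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("elementId")));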
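
Conceptually, an explicit wait is a polling loop: check a condition, and if it is not yet satisfied, pause briefly and try again until a timeout expires. The sketch below shows that idea in plain Java. It is not Selenium's implementation, which does more (for example, around exception handling); the names PollingWaitSketch and waitUntil are hypothetical.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical sketch of the polling idea behind explicit waits.
class PollingWaitSketch {

    // Calls the condition repeatedly until it returns a non-null value,
    // or throws once the timeout has elapsed.
    static <T> T waitUntil(Supplier<T> condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T result = condition.get();
            if (result != null) {
                return result;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Simulates something that becomes available after about 300 ms.
        long readyAt = System.currentTimeMillis() + 300;
        String value = waitUntil(
                () -> System.currentTimeMillis() >= readyAt ? "ready" : null,
                Duration.ofSeconds(2),
                Duration.ofMillis(50));
        System.out.println(value);
    }
}
```

The benefit over a fixed sleep is that the loop returns as soon as the condition holds, and fails with a clear error if it never does.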

Handling Alerts and Pop-ups

You can interact with JavaScript alerts, confirms, and prompts using the Alert class.

import org.openqa.selenium.Alert;

Alert alert = driver.switchTo().alert();
String alertText = alert.getText();
alert.accept(); // or alert.dismiss() for canceling

Switching Between Windows and Frames

Selenium allows you to switch between browser windows and iframes.

// Switch to a new window
String mainWindow = driver.getWindowHandle();
Set<String> allWindows = driver.getWindowHandles();
for (String window : allWindows) {
    if (!window.equals(mainWindow)) {
        driver.switchTo().window(window);
    }
}

// Switch back to the main window
driver.switchTo().window(mainWindow);

// Switch to an iframe
WebElement iframeElement = driver.findElement(By.id("iframeId"));
driver.switchTo().frame(iframeElement);

// Switch back to the default content
driver.switchTo().defaultContent();

Mouse and Keyboard Actions

You can perform various mouse and keyboard actions, such as mouse hover and simulating keyboard shortcuts.

import org.openqa.selenium.Keys;
import org.openqa.selenium.interactions.Actions;

Actions actions = new Actions(driver);

// Mouse hover over an element
WebElement elementToHover = driver.findElement(By.id("hoverElement"));
actions.moveToElement(elementToHover).perform();

// Right-click on an element
WebElement elementToRightClick = driver.findElement(By.id("rightClickElement"));
actions.contextClick(elementToRightClick).perform();

// Simulate keyboard shortcuts
actions.keyDown(Keys.CONTROL).sendKeys("c").keyUp(Keys.CONTROL).perform(); // Ctrl+C

Page Object Model (POM)

What is POM?

The Page Object Model (POM) is a design pattern used in test automation to improve code organization and maintenance. It involves creating separate classes for each web page or component, encapsulating page-specific methods and locators.

Creating Page Objects

To implement POM, create a class for each web page. In this class, define locators and methods that interact with elements on that page.

public class LoginPage {
    private WebDriver driver;

    // Locators
    private By usernameField = By.id("username");
    private By passwordField = By.id("password");
    private By loginButton = By.id("loginBtn");

    // Constructor
    public LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    // Page-specific methods
    public void enterUsername(String username) {
        driver.findElement(usernameField).sendKeys(username);
    }

    public void enterPassword(String password) {
        driver.findElement(passwordField).sendKeys(password);
    }

    public void clickLogin() {
        driver.findElement(loginButton).click();
    }
}

Test Automation Using POM

In your test scripts, create an instance of the page object class for each page. Use these objects to interact with the page's elements.

public class LoginTest {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com/login");

        LoginPage loginPage = new LoginPage(driver);

        loginPage.enterUsername("myusername");
        loginPage.enterPassword("mypassword");
        loginPage.clickLogin();

        // Continue with test steps for the next page
    }
}

Using the Page Object Model, you create a clear separation between page-specific code and test script code, making it easier to maintain and update your test cases.

Handling File Uploads and Downloads

Uploading Files

To automate file uploads, locate the file input element and use the sendKeys() method to provide the file path.

WebElement fileInput = driver.findElement(By.id("fileInput"));
fileInput.sendKeys("path/to/file.txt");

Downloading Files

Handling file downloads can be tricky because Selenium cannot directly access the file system. You can configure the browser to download files to a specific directory and then access them using Java.

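import java.util.HashMap;
import java.util.Map;

// Set the download directory through Chrome preferences
// (download.default_directory is a preference, not a command-line switch)
Map<String, Object> prefs = new HashMap<>();
prefs.put("download.default_directory", "/path/to/download/dir");
ChromeOptions options = new ChromeOptions();
options.setExperimentalOption("prefs", prefs);
WebDriver driver = new ChromeDriver(options);

// Perform actions that trigger file downloads

// Use Java to access downloaded files
File downloadedFile = new File("/path/to/download/dir/filename.txt");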
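
A download usually takes some time to finish, so the file may not exist yet when your code looks for it. One possible approach is to poll the download directory until the expected file appears. The sketch below assumes the browser gives the file its final name only once the download is complete, which you should verify for your browser; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical sketch: waits for a downloaded file to show up on disk.
class DownloadWaitSketch {

    // True if the path is a regular, non-empty file right now.
    static boolean isPresent(Path file) {
        try {
            return Files.isRegularFile(file) && Files.size(file) > 0;
        } catch (IOException e) {
            return false; // treat a file that cannot be read yet as not present
        }
    }

    // Polls every 100 ms until the file is present or the timeout elapses.
    static boolean waitForFile(Path file, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isPresent(file)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Path expected = Path.of("/path/to/download/dir/filename.txt");
        System.out.println("Downloaded: " + waitForFile(expected, Duration.ofSeconds(30)));
    }
}
```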

Testing Frameworks

JUnit and TestNG

JUnit and TestNG are popular testing frameworks in the Java ecosystem. They provide annotations and assertions to structure and manage your test cases.

Test Annotations and Test Suites

In JUnit and TestNG, you use annotations to mark your test methods, and these frameworks handle test execution.

import org.junit.Test; // JUnit 4
// For TestNG, import org.testng.annotations.Test instead; importing both does not compile.

public class MyTest {
    @Test
    public void myTestMethod() {
        // Your test logic here
    }
}

You can create test suites by grouping multiple test classes together in a configuration XML file.

Data-Driven Testing

Parameterization

Data-driven testing involves running the same test with different sets of data. JUnit and TestNG support parameterized tests.

import org.testng.annotations.Parameters;
import org.testng.annotations.Test;

public class DataDrivenTest {

    @Parameters({"username", "password"})
    @Test
    public void loginTest(String username, String password) {
        // Your test logic using username and password
    }
}

You can pass data sets from an XML file or a data provider method.
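
The idea behind a data provider can be shown without any test framework: the data sets are rows, and the same check runs once per row. The sketch below is a plain-Java illustration of that pattern, not TestNG or JUnit code. The row layout mirrors the Object[][] shape that TestNG data providers return, and canLogIn is a made-up stand-in for the application under test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch of data-driven testing without a framework.
class DataDrivenSketch {

    // Each row is one data set: {username, password, expected result}.
    static Object[][] loginData() {
        return new Object[][] {
            {"alice", "correct-horse", true},
            {"alice", "", false},
            {"", "correct-horse", false},
        };
    }

    // Stand-in for the system under test: accepts any non-empty credentials.
    static boolean canLogIn(String username, String password) {
        return !username.isEmpty() && !password.isEmpty();
    }

    // Runs the same check against every row and describes the rows that fail.
    static List<String> runAll(Object[][] rows, BiPredicate<String, String> check) {
        List<String> failures = new ArrayList<>();
        for (Object[] row : rows) {
            String username = (String) row[0];
            String password = (String) row[1];
            boolean expected = (Boolean) row[2];
            if (check.test(username, password) != expected) {
                failures.add("username='" + username + "', password='" + password + "'");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> failures = runAll(loginData(), DataDrivenSketch::canLogIn);
        System.out.println(failures.isEmpty() ? "All data sets passed" : "Failed: " + failures);
    }
}
```

A framework adds reporting and isolation on top of this, so each row shows up as its own test result.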

Reporting and Logging

Generating Reports

To generate test reports, you can use reporting libraries like Extent Reports or TestNG Reports. These libraries provide detailed reports about test execution.

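import com.aventstack.extentreports.ExtentReports;
import com.aventstack.extentreports.ExtentTest;
import com.aventstack.extentreports.Status;
import com.aventstack.extentreports.reporter.ExtentSparkReporter;

public class ReportingExample {
    public static void main(String[] args) {
        ExtentReports extent = new ExtentReports();
        // Attach a reporter so flush() has somewhere to write
        // (assumes ExtentReports 5; the output path is an example)
        extent.attachReporter(new ExtentSparkReporter("target/extent-report.html"));
        ExtentTest test = extent.createTest("My Test");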

        // Your test logic here

        test.log(Status.PASS, "Test passed!");
        extent.flush(); // Generate the report
    }
}

Logging Test Activities

Logging is crucial for debugging and tracking the progress of your test scripts. You can use Java's built-in logging or external libraries like Log4j.

import java.util.logging.Logger;

public class LoggingExample {
    public static void main(String[] args) {
        Logger logger = Logger.getLogger("MyLogger");

        // Log messages
        logger.info("Informational message");
        logger.warning("Warning message");
    }
}

Best Practices

Maintainable Test Scripts

Writing maintainable test scripts is crucial for long-term success. Follow best practices to keep your code clean, readable, and easy to maintain.

Code Example (Best Practices):

public class LoginTest {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com/login");

        // Use meaningful variable and method names
        LoginPage loginPage = new LoginPage(driver);
        loginPage.enterUsername("myusername");
        loginPage.enterPassword("mypassword");
        loginPage.clickLogin();

        // Add comments to explain the purpose of the code
        // Wait for the login to complete
        // (Selenium 4 takes the timeout as a Duration; import java.time.Duration)
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
        wait.until(ExpectedConditions.urlToBe("https://www.example.com/dashboard"));

        // Perform assertions to validate the test
        String expectedTitle = "Dashboard - Example App";
        String actualTitle = driver.getTitle();
        Assert.assertEquals(actualTitle, expectedTitle);

        // Clean up resources
        driver.quit();
    }
}

Handling Synchronization Issues

Synchronization issues can lead to flaky tests. Use explicit waits and ExpectedConditions to handle synchronization problems.

Code Example (Handling Synchronization):

public class SynchronizationExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com/login");

        // Selenium 4 takes the timeout as a Duration (import java.time.Duration)
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));

        // Wait for the button to be present and clickable, then click it
        WebElement loginButton = wait.until(ExpectedConditions.elementToBeClickable(By.id("loginBtn")));
        loginButton.click();

        // Wait for a specific URL
        wait.until(ExpectedConditions.urlToBe("https://www.example.com/dashboard"));

        // Continue with the test
    }
}

Cross-Browser Testing

Cross-Browser Compatibility

Testing your web application on multiple browsers ensures a broader audience can use your application. Selenium supports various browsers.

Code Example (Cross-Browser Testing):

public class CrossBrowserTest {
    public static void main(String[] args) {
        WebDriver driver;

        // Test on Chrome
        driver = new ChromeDriver();
        performTest(driver);
        driver.quit(); // close Chrome before starting the next browser

        // Test on Firefox
        driver = new FirefoxDriver();
        performTest(driver);
        driver.quit();
    }

    public static void performTest(WebDriver driver) {
        driver.get("https://www.example.com");
        // Perform test actions
        // ...
    }
}

Parallel Execution

Parallel Test Execution

Parallel test execution allows you to run multiple test cases simultaneously, reducing test execution time.

Code Example (TestNG Parallel Execution):

import org.testng.annotations.Test;

public class ParallelTestExample {
    @Test
    public void testMethod1() {
        // Test logic for method 1
    }

    @Test
    public void testMethod2() {
        // Test logic for method 2
    }
}

You can configure parallel execution in the test framework you're using, such as TestNG or JUnit.
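
To illustrate what running tests in parallel amounts to, the sketch below submits two independent tasks to a thread pool and collects a pass or fail result for each. This is a framework-free illustration using java.util.concurrent, not how TestNG or JUnit implement parallelism, and the class and method names are hypothetical. With Selenium, each parallel test would typically need its own WebDriver instance.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: runs named test tasks concurrently and records the outcomes.
class ParallelRunSketch {

    // A test passes if it returns true; returning false or throwing counts as a failure.
    static Map<String, Boolean> runInParallel(Map<String, Callable<Boolean>> tests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so the tasks can run at the same time.
            Map<String, Future<Boolean>> pending = new LinkedHashMap<>();
            tests.forEach((name, test) -> pending.put(name, pool.submit(test)));

            // Then wait for each result in submission order.
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> entry : pending.entrySet()) {
                boolean passed;
                try {
                    passed = entry.getValue().get();
                } catch (ExecutionException e) {
                    passed = false; // the test threw an exception
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    passed = false;
                }
                results.put(entry.getKey(), passed);
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Callable<Boolean>> tests = new LinkedHashMap<>();
        tests.put("testMethod1", () -> true);
        tests.put("testMethod2", () -> true);
        System.out.println(runInParallel(tests, 2));
    }
}
```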

Continuous Integration (CI)

Introduction to CI/CD

Continuous Integration (CI) is a practice where code changes are automatically built, tested, and integrated into the main codebase. Popular CI/CD platforms include Jenkins, Travis CI, CircleCI, and GitLab CI/CD.

Jenkins CI Configuration:

  1. Set up Jenkins with the necessary plugins.
  2. Create a Jenkins job that fetches your code from a version control system (e.g., Git).
  3. Configure build and test steps.
  4. Schedule the job to run automatically.

Continuous integration ensures that your tests are run whenever code changes are committed to the repository.

Selenium Grid

Selenium Grid Setup

Selenium Grid allows you to distribute tests across multiple machines and browsers, providing scalability for test automation.

Selenium Grid Configuration:

  1. Set up a Selenium Grid hub and nodes on different machines.
  2. Configure your test scripts to use the hub's URL to distribute tests.
  3. Specify the desired browser and platform for each test.
// Define the WebDriver capabilities (browser and platform)
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setBrowserName("chrome");
capabilities.setPlatform(Platform.WINDOWS);

// Create a RemoteWebDriver instance by connecting to the hub
WebDriver driver = new RemoteWebDriver(new URL("http://grid-hub-url:4444/wd/hub"), capabilities);

Selenium Grid is essential for running tests in parallel on different environments and browsers.

Miscellaneous

Handling Cookies

Cookies are often used for session management in web applications. Selenium provides methods to interact with cookies.

Code Example (Handling Cookies):

public class CookieExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com");

        // Get all cookies
        Set<Cookie> cookies = driver.manage().getCookies();

        // Add a new cookie
        Cookie newCookie = new Cookie("myCookieName", "myCookieValue");
        driver.manage().addCookie(newCookie);

        // Delete a cookie
        driver.manage().deleteCookieNamed("myCookieName");

        // Delete all cookies
        driver.manage().deleteAllCookies();

        // Manipulate cookies as needed
    }
}

Handling JavaScript Alerts

Sometimes, web applications use JavaScript pop-ups and alerts. Selenium provides methods to handle these alerts.

Code Example (Handling JavaScript Alerts):

public class JavaScriptAlertExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com");

        // Trigger an alert
        WebElement alertButton = driver.findElement(By.id("alertButton"));
        alertButton.click();

        // Switch to the alert
        Alert alert = driver.switchTo().alert();

        // Get the alert text
        String alertText = alert.getText();

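        // Respond to the alert with exactly one of the following; once an alert
        // is accepted or dismissed it is closed and cannot be used again.
        alert.accept(); // click OK

        // Alternatively, dismiss it (click Cancel):
        // alert.dismiss();

        // For a prompt, enter text before accepting:
        // alert.sendKeys("Text to enter in the prompt");
        // alert.accept();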
    }
}

Troubleshooting and Debugging

Debugging Techniques

Debugging is an essential skill for identifying and fixing issues in your test scripts. You can use various tools and techniques for debugging.

Code Example (Basic Debugging Techniques):

public class DebuggingExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com");

        // Set a breakpoint in your IDE to pause execution
        // and inspect variables and step through code

        // Use System.out.println to print variable values
        String pageTitle = driver.getTitle();
        System.out.println("Page Title: " + pageTitle);

        // Use try-catch blocks to handle exceptions gracefully
        try {
            WebElement element = driver.findElement(By.id("nonExistentElement"));
            element.click();
        } catch (NoSuchElementException e) {
            System.err.println("Element not found: " + e.getMessage());
        }
    }
}

Tips for Efficient Test Automation

Code Example (Efficient Test Automation Practices):

public class AutomationTips {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.example.com");

        // Tip 1: Use Page Object Model for better code organization
        LoginPage loginPage = new LoginPage(driver);

        // Tip 2: Implement explicit waits for better synchronization
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10)); // Selenium 4 takes a Duration
        WebElement element = wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("elementId")));

        // Tip 3: Maintain a clean and clear code structure
        // ...

        // Tip 4: Use version control (e.g., Git) to track changes
        // ...

        // Tip 5: Regularly review and refactor your code for improvements
        // ...

        // Tip 6: Parallelize and distribute tests for faster execution
        // ...

        // Tip 7: Integrate with CI/CD pipelines for continuous testing
        // ...

        // Tip 8: Regularly update your automation tools and dependencies
        // ...

        // Continue with your test automation practices
    }
}