Tuesday, August 14, 2012

How to Read Excel Files in Java

Here in this post we will create a simple project to Read and Write to Excel files. Excel is a popular Microsoft application where in data can be entered in rows and columns in spread sheet and manipulated. We will be using Apache POI libraries to demonstrate Excel Handling.

Typical examples where in you can use this post is:
  • In a web application you might want to push the data in your DB to an Excel sheet and provide to end user. Your bank statement can be seen as an example for this.
  • QA(testing) automation scenario,The data required to perform automated execution might required to be read from Excel sheet.

Contents:

Download required Jar Files

We will be using Apache POI libraries to demonstrate. Apache POI is the Java API provided to handle Microsoft office documents.

  • Download Apache POI binary distribution from here.
  • Extract the zip file to a convenient location in your PC.
  • You should see below folder structure in extracted zip file apart from POI distribution jars.
    • docs -> API docs and Examples
    • lib -> Holds common lib like junit, Log4j etc.
    • ooxml-lib -> Dom, stax and XmlBeans jars (required to handle XLSX formats)

Create Project Setup & Folders





Create the Folder structure for the project as shown below.
  • src folder will hold simple wrapper classes we write.
  • lib folder will hold the required jar / libraries.
  • excel_sheets folder will have the excel sheets we manipulate.

Include the jar files in class path

We need to copy (a few) of the extracted Jar files to lib folder in project from extracted POI zip.

  • Navigate to ooxml-lib in POI root folder.
    • Copy dom4j-x.x.x.jar,xmlbeans-x.x.x.jar to lib folder in Eclipse Java project.
  • Navigate to POI root folder.
    • Copy poi-ooxml-schemas-3.x-xxxxxxxx.jar, poi-ooxml-3.x-xxxxxxxx.jar, poi-3.x-xxxxxxxx.jar files for lib folder in Eclipse Java project.
  • The final project setup and lib folder contents should look something like below.

    • Select all the external jar files in lib folder.
    • Right click on Jar files.
      • From Context menu select 'Build path'.
      • Select 'Add to build Path' in sub-context menu.

    •  External libraries included in project Build path should appear in referenced libraries section as below.


      Read Excel Sheet


      Java code to Read XLSX and XLS format Sheets


      
      /**
       * @author "Raghu Kiran"
       *
       * This program is free software; you can redistribute it and/or
       * modify it under the terms of the GNU General Public License
       * as published by the Free Software Foundation; either version 2
       * of the License, or (at your option) any later version.
       *
       * This program is distributed in the hope that it will be useful,
       * but WITHOUT ANY WARRANTY; without even the implied warranty of
       * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       * GNU General Public License for more details.
       *
       * Feel Free to use/modify/distribute this code without removing this note.
       * No warranty or Support will be provided what so ever.
       * 
       */
      package com.automation.java.excel;
      
      import java.io.FileInputStream;
      import java.io.FileNotFoundException;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.Date;
      import java.text.SimpleDateFormat;
      
      // Imports for XLS format Excel Sheet.
      import org.apache.poi.hssf.usermodel.HSSFCell;
      import org.apache.poi.hssf.usermodel.HSSFRow;
      import org.apache.poi.hssf.usermodel.HSSFSheet;
      import org.apache.poi.hssf.usermodel.HSSFWorkbook;
      
      // Imports required for XLSX format Excel sheet.
      import org.apache.poi.ss.usermodel.DateUtil;
      import org.apache.poi.xssf.usermodel.XSSFCell;
      import org.apache.poi.xssf.usermodel.XSSFRow;
      import org.apache.poi.xssf.usermodel.XSSFSheet;
      import org.apache.poi.xssf.usermodel.XSSFWorkbook;
      
      /**
       * @author "Raghu Kiran"
       * 
       *         This is a simple java program to demonstrate reading a simple Excel
       *         file using Apache POI libraries. We can read both XLS & XLSX formats
       *         with POI.
       * 
       */
      public class ReadExcel {
      
         /**
          * A simple method to read Excel Sheet with XLS format.
          */
         public static void readXlsWorkbook() {
            System.out.println("Reading from XLS format Excel sheet.");
            try {
               InputStream myxls = new FileInputStream(
               "excel_sheets/read/simple_xls_sheet.xls");
      
               // HSSFWorkbook is used to read Excel 97 format workbooks.
               HSSFWorkbook wb = new HSSFWorkbook(myxls);
               HSSFSheet sheet = wb.getSheetAt(0); // first sheet
               HSSFRow row = sheet.getRow(2); // second row
      
               // The contents in our excel sheet are present in 3rd row.
               // Some of the cell contents are empty and some have data
               // a simple do while loop to read contents of third row
               // until it encounters a empty/null cell.
               HSSFCell cell;
               int i = 1;
               do {
                  cell = row.getCell(i); // Start from first column
                  if (cell == null) {
                     break;
                  }
                  if (cell.getCellType() == HSSFCell.CELL_TYPE_STRING) {
                     // Plain String Cell.
                     System.out.println("The Cell is a String with value : "
                           + cell.getStringCellValue());
                  } else if (cell.getCellType() == HSSFCell.CELL_TYPE_NUMERIC) {
      
                     // Excel stores the Date as a Numeric Contents. POI provides
                     // a Date utility to check
                     // if a particular cell is formatted as a date.
                     if (DateUtil.isCellDateFormatted(cell)) {
                        Date date = DateUtil.getJavaDate((double) cell
                              .getNumericCellValue());
                        SimpleDateFormat df = new SimpleDateFormat("dd/MM/yyyy HH:mm z");
                        System.out.println("The cell is a Date : " + df.format(date));
                     } else {
                        // treat the cell as 'double' number
                        System.out.println("The cell is a number : "
                              + cell.getNumericCellValue());
                     }
      
                  } else if (cell.getCellType() == HSSFCell.CELL_TYPE_BLANK) {
                     // Some of the cells will be rendered blank when cells in
                     // Excel are merged... one such cell
                     System.out
                     .println("A Blank Cell Encountered.... any merging happened???");
                  } else {
                     System.out.println("The cell is nothing we're interested in ...");
                  }
                  i++; // increment cell counter!!!
               } while (cell != null);
      
            } catch (FileNotFoundException e) {
               // TODO Auto-generated catch block
               e.printStackTrace();
            } catch (IOException e) {
               // TODO Auto-generated catch block
               e.printStackTrace();
            }
         }
      
         /**
          * A simple method to read Excel Sheet with XLSX format.
          */
         public static void readXlsxWorkbook() {
            System.out.println("Reading from XLSX format Excel sheet.");
            try {
               InputStream my_xlsx = new FileInputStream(
               "excel_sheets/read/simple_xlsx_sheet.xlsx");
               XSSFWorkbook wb = new XSSFWorkbook(my_xlsx);
               XSSFSheet sheet = wb.getSheetAt(0); // first sheet
               XSSFRow row = sheet.getRow(2); // second row
               // The contents in our excel sheet are present in 3rd row.
               // Some of the cell contents are empty and some have data
               // a simple do while loop to read contents of third row
               // until it encounters a empty/null cell.
               XSSFCell cell;
               int i = 1;
               do {
                  cell = row.getCell(i); // Start from first column
                  if (cell == null) {
                     break;
                  }
                  if (cell.getCellType() == XSSFCell.CELL_TYPE_STRING) {
                     // Plain String Cell.
                     System.out.println("The Cell is a String with value : "
                           + cell.getStringCellValue());
                  } else if (cell.getCellType() == XSSFCell.CELL_TYPE_NUMERIC) {
      
                     // Excel stores the Date as a Numeric Contents. POI provides
                     // a Date utility to check
                     // if a particular cell is formatted as a date.
                     if (DateUtil.isCellDateFormatted(cell)) {
                        // Store the Date from Cell in a Java Date object.
                        Date date = DateUtil.getJavaDate((double) cell
                              .getNumericCellValue());
                        // Specify the format in which the dates have to be
                        // printed to console.
                        SimpleDateFormat df = new SimpleDateFormat("dd/MM/yyyy HH:mm z");
                        System.out.println("The cell is a Date : " + df.format(date));
                     } else {
                        // treat the cell as 'double' number
                        System.out.println("The cell is a number : "
                              + cell.getNumericCellValue());
                     }
                  } else if (cell.getCellType() == XSSFCell.CELL_TYPE_BLANK) {
                     // Some of the cells will be rendered blank when cells in
                     // Excel are merged... one such cell
                     System.out
                     .println("A Blank Cell Encountered .... any merging happened???");
                  } else {
                     System.out.println("The cell is nothing we're interested in ...");
                  }
                  i++; // increment cell counter!!!
               } while (cell != null);
            } catch (FileNotFoundException e) {
               // TODO Auto-generated catch block
               e.printStackTrace();
            } catch (IOException e) {
               // TODO Auto-generated catch block
               e.printStackTrace();
            }
         }
      
         /**
          * @param args
          */
         public static void main(String[] args) {
            // TODO Auto-generated method stub
            readXlsWorkbook();
      
            readXlsxWorkbook();
         }
      
      }
      
      

      Resources

      3 comments

      1. This comment has been removed by the author.

        ReplyDelete
      2. You can read excel file by using Aspose.Cells for Java API and you can also edit the file by using this API. Try it :

        http://www.aspose.com/java/excel-component.aspx

        ReplyDelete