File I/O

Introduction

There are a huge number of Java classes associated with input and output streams. This set of notes attempts to cover only a few key stream features. We will be looking at two fundamentally different ways of getting data to and from a file:
  • Byte streams
  • Character streams
We will also look at:
  • Random access files
  • The File class

Characters Versus Bytes

Data can be stored in a file in either a character format or in a binary format.

In character format it takes one byte of disk space to store the number 3, and five bytes of disk space to store the number 56789. The short integer, in the binary form 00000010 11111101, has the value 765. The same two bytes can hold the value 0 (00000000 00000000), 4096 (00010000 00000000), 32,769 (10000000 00000001) or 65,535 (11111111 11111111) when read as an unsigned number; Java's own short type is signed, so it reads the last two bit patterns as -32,767 and -1. The typical four-byte integer can represent about four billion different values (roughly negative two billion to positive two billion). Note that it takes ten bytes to store 2,000,000,000 in character format (one byte for each digit).
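The byte counts above can be checked with a short sketch. The class and method names here (CharVersusBinary, textSize, binarySize, unsigned16) are made up for illustration, and the sketch assumes one byte per character (ASCII digits):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CharVersusBinary {
    // Bytes needed to store n as decimal digits, one byte per character (ASCII).
    static int textSize(int n) {
        return Integer.toString(n).getBytes(StandardCharsets.US_ASCII).length;
    }

    // Bytes needed to store n as a fixed-width binary int (always 4 in Java).
    static int binarySize(int n) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(n).array().length;
    }

    // Combines two bytes (high byte first) into an unsigned 16-bit value.
    static int unsigned16(int high, int low) {
        return ((high & 0xFF) << 8) | (low & 0xFF);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 56789, 2_000_000_000}) {
            System.out.println(n + ": text=" + textSize(n)
                + " bytes, binary=" + binarySize(n) + " bytes");
        }
        System.out.println(unsigned16(0b00000010, 0b11111101));  // 765
        System.out.println((short) 0b1000_0000_0000_0001);       // -32767 (short is signed)
    }
}
```

The character form grows with the number of digits (1, 5, and 10 bytes here), while the binary int stays at 4 bytes regardless of the value.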

One of the advantages of having a file that contains character data is that the data can be viewed by a normal text editor, such as Microsoft Notepad or Wordpad. A disadvantage of storing numeric data in a character format is that a Java program that needs to manipulate that data must read it in as text then convert it into a binary integer form before being able to perform mathematical manipulations. Another disadvantage of character data is that the length of each data element is variable (the length depends on the number of characters) and some kind of delimiter character is usually required to separate one data element from another.

The advantages of binary data are:

  • Simple binary data elements are of uniform length
  • The data is in a form that is immediately useful to a Java program (conversely, since the data is not in a character format it is not immediately available for viewing by the user via a text editor, and must be interpreted and translated by a specialized computer program)
Reader and Writer-based objects are commonly used to interact with character files, and InputStream and OutputStream-based objects are commonly used to interact with binary files.

We will also look at a RandomAccessFile Java object, which is somewhat similar to the binary Input/OutputStream objects.

Useful Combinations of Objects

Typically you will not directly reference Reader- or Writer-based objects (or InputStream- or OutputStream-based objects) since these offer only a limited number of methods, and these methods are not all that easy to use. Instead, we'll create objects that offer some useful, easy-to-use methods, and connect these to the Reader- or Writer-based objects (or InputStream- or OutputStream-based objects). For example, if we want to write the words "it's a wonderful world" to a disk file named phrase.txt, then we first create a Writer-based object for writing text to a file:
	FileWriter fw = new FileWriter("phrase.txt");
A FileWriter object only offers methods such as write, an overloaded method which can write a single character via its integer code, an array of characters, or a String to the disk. I prefer to use PrintWriter, which has some nice methods for writing strings to the disk through the FileWriter object. It can do a formatted print of most of Java's built-in types (as well as Strings) via the heavily overloaded print and println methods.
	PrintWriter pw = new PrintWriter(fw);	// constructor is given a reference to
						// the previously constructed FileWriter
						// object

	pw.println("it's a wonderful world");

The last line of code could be rewritten as the following four lines of code and still yield the exact same file contents in phrase.txt:
	pw.print("it's a ");
	pw.print("wonderful ");
	pw.print("wor");
	pw.println("ld");
The method println writes a line separator at the end of the output string, whereas the print method only prints the string. The above example makes use of the fact that each open file has a current write position (the position in the file at which the next output will be written) and that a FileWriter object writes sequentially -- the file position is always at the spot where the last write left off.
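A small sketch can confirm that the four-call version produces the same characters as the single println. To make the result easy to inspect, it assumes a StringWriter as an in-memory stand-in for the FileWriter; the class and method names are invented for this illustration:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class PrintPieces {
    // Writes the phrase with a single println call.
    static String oneCall() {
        StringWriter sink = new StringWriter();   // in-memory stand-in for a FileWriter
        PrintWriter pw = new PrintWriter(sink);
        pw.println("it's a wonderful world");
        pw.close();
        return sink.toString();
    }

    // Writes the same phrase in four pieces; each call continues
    // at the position where the previous call stopped.
    static String fourCalls() {
        StringWriter sink = new StringWriter();
        PrintWriter pw = new PrintWriter(sink);
        pw.print("it's a ");
        pw.print("wonderful ");
        pw.print("wor");
        pw.println("ld");
        pw.close();
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(oneCall().equals(fourCalls()));  // true
    }
}
```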

Below is an example of the overloaded instances of println that convert (format) numeric values into character form before writing them to the disk file through a FileWriter object.

	PrintWriter pw = new PrintWriter(new FileWriter("phrase.txt"));
	int x = 35;
	double val = 93.1001;

	pw.println(x);
	pw.println(val);

In the example above, the integer 35 is converted to the two characters '3' and '5', and these characters are then written out to the disk. After each write, the file pointer (the current write position in the file) sits just past the last character written.

The last two lines would have a somewhat unfortunate result if rewritten as:

	PrintWriter pw = new PrintWriter(new FileWriter("phrase.txt"));
	int x = 35;
	double val = 93.1001;

	pw.print(x);	// print versus println, above
	pw.print(val);	// print versus println, above
What is the problem with the last two lines?
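One way to explore the question is to capture both versions and compare them. The sketch below is only an illustration: it uses a StringWriter in place of the FileWriter so the output can be examined directly, and the names RunTogether and write are hypothetical:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class RunTogether {
    // Writes x and val with print (no separator) or println (one value per line).
    static String write(int x, double val, boolean separateLines) {
        StringWriter sink = new StringWriter();   // in-memory stand-in for the file
        PrintWriter pw = new PrintWriter(sink);
        if (separateLines) {
            pw.println(x);
            pw.println(val);
        } else {
            pw.print(x);
            pw.print(val);
        }
        pw.close();
        return sink.toString();
    }

    public static void main(String[] args) {
        // With print, nothing marks where the int ends and the double begins.
        System.out.println(write(35, 93.1001, false));  // 3593.1001

        // With println, each value sits on its own line and can be parsed back.
        String[] lines = write(35, 93.1001, true).split("\\R");
        System.out.println(Integer.parseInt(lines[0]) + " and " + Double.parseDouble(lines[1]));
    }
}
```

A program reading the first version back has no way to tell whether the file held 35 and 93.1001, or 3 and 593.1001, or the single value 3593.1001.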

A Longer Example

The following code shows several of the features of writing information to a file in a character or binary format. The first part of the example writes four integers to two different files (one file in character format, one in binary format). The program then reopens the files and increments the 1st and 3rd integer found in each file, and reports this incremented value to the user.

The program also shows a sample usage of a RandomAccessFile object. Data can be both written to, and read from, a file when referenced via a RandomAccessFile object. The file pointer can also be moved forward or backward in the file by a byte offset. The example in the code below shows off the ability to move the file pointer by reading and reporting the values in the file backwards (i.e., starting from the end of the file and working to the file's front).

Also note that the main method throws IOException objects. When using many of the file I/O methods, you must handle some form of an IOException. You must either catch the exception (which is usually the preferred way of handling such an exception) or throw the exception further, which is what is done below to keep the focus on the file methods and not to get distracted by handling the exceptions.
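For contrast with throwing the exception onward, here is a minimal sketch of the catch route. The class name CatchDemo, the helper firstLineOrNull, and the file name are all made up for illustration, and returning null on failure is just one possible policy. The full program that follows still uses throws IOException.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class CatchDemo {
    // Returns the first line of the named file, or null if the file cannot be read.
    static String firstLineOrNull(String fileName) {
        // try-with-resources closes the reader even when an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            return br.readLine();
        } catch (IOException e) {
            System.err.println("Could not read " + fileName + ": " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        // "no-such-file.txt" is a hypothetical name chosen so that the open fails
        System.out.println(firstLineOrNull("no-such-file.txt"));
    }
}
```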

import java.io.*;
import java.util.*;

public class FilesDemo2 {
  public static void main(String[] args) throws IOException {
    int n1 = 37, n2 = 10003, n3 = 3987, n4 = 912;
    int back1 = 0, back2 = 0, back3 = 0, back4 = 0;

    PrintWriter pw = new PrintWriter (new FileWriter ("pw.txt"));
    pw.print("" + n1 + " " + n2 + " " + n3 + " " + n4);
    pw.flush();
    pw.close();

    DataOutputStream dos = new DataOutputStream(new FileOutputStream("dos.txt"));
    dos.writeInt(n1);
    dos.writeInt(n2);
    dos.writeInt(n3);
    dos.writeInt(n4);
    dos.flush();
    dos.close();

    BufferedReader br = new BufferedReader(new FileReader("pw.txt"));

    String s = br.readLine();
    StringTokenizer st = new StringTokenizer(s);
    back1 = Integer.parseInt(st.nextToken());
    back1++;
    st.nextToken();
    back2 = Integer.parseInt(st.nextToken());
    back2++;
    System.out.println("BR - 1st: " + back1 + " 3rd: " + back2);
    br.close();

    back1 = 0;
    back2 = 0;
    DataInputStream dis = new DataInputStream(new FileInputStream("dos.txt"));
    back1 = dis.readInt();
    back1++;
    dis.readInt();
    back2 = dis.readInt();
    back2++;
    System.out.println("DIS - 1st: " + back1 + " 3rd: " + back2);
    dis.close();

    File f = new File("dos.txt");
    long len = f.length();

    RandomAccessFile ras = new RandomAccessFile("dos.txt", "r");
    for (long i = len - 4; i >= 0; i -= 4) {
      ras.seek(i);
      int back = ras.readInt();
      back++;
      System.out.println("Backwards: " + back);
    }
    ras.close();
  }
}
The above example determines the length of the file via a File object. The length can also be determined via a RandomAccessFile object, but I wanted to show how to create and use a File object. File objects are not used to access or modify the contents of a file -- File objects report information about the file itself, such as its length, its path, whether or not it actually exists, and whether it is readable or writable. You can also delete the file via a File object (as well as create new files).
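The sketch below illustrates these points under a few assumptions: it works on a scratch file in the system temporary directory, and the names FileInfoDemo, demo, describe, and file-info-demo.dat are hypothetical. It compares the two ways of getting the length, overwrites one integer in place through a RandomAccessFile opened in "rw" mode, and deletes the file through the File object.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

public class FileInfoDemo {
    // Reports information about the file itself, not about its contents.
    static String describe(File f) {
        return f.getPath() + ": exists=" + f.exists() + ", length=" + f.length()
            + ", readable=" + f.canRead() + ", writable=" + f.canWrite();
    }

    // Writes four ints, overwrites the third in place, and returns
    // {File.length(), RandomAccessFile.length(), third value read back}.
    static long[] demo() {
        File f = new File(System.getProperty("java.io.tmpdir"), "file-info-demo.dat");
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(0);                    // start from an empty file
            for (int n : new int[] {37, 10003, 3987, 912}) {
                raf.writeInt(n);                 // 4 bytes each
            }
            raf.seek(2 * 4);                     // byte offset of the third int
            raf.writeInt(4000);                  // overwrite it in place
            raf.seek(2 * 4);                     // move back to read it again
            int third = raf.readInt();
            return new long[] {f.length(), raf.length(), third};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            f.delete();                          // a File object can remove the file
        }
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("File.length(): " + r[0]);
        System.out.println("RandomAccessFile.length(): " + r[1]);
        System.out.println("third int after update: " + r[2]);
        System.out.println(describe(new File("no-such-file.dat")));
    }
}
```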

Reading and Writing Objects

It is also possible to read and write objects to a file. The following example creates three Employee objects and writes them to a binary file named people.dat:

import java.io.*;
import java.util.*;

public class WriteObjs
{
   public static void main(String [] args) throws Exception
   {
      Employee tom = new Employee("cruise", 111000, 46);
      Employee bill = new Employee("clinton", 99000, 59);
      Employee shawn = new Employee("colvin", 85000, 32);
      
      ObjectOutputStream oos = 
         new ObjectOutputStream(new FileOutputStream("people.dat"));
      
      oos.writeObject(tom);
      oos.writeObject(bill);
      oos.writeObject(shawn);		

      oos.close();
   }
}

class Employee implements Serializable
{
   private String name;
   private int salary = 500000;
   private int age = 30;
   
   public Employee(String name, int salary, int age)
   {
      this.name = name;
      this.salary = salary;
      this.age = age;
   } 
   
   public String toString()
   {
      return name + ", " + salary + ", " + age;
   }
}

The people.dat file is 136 bytes long. It contains the three Employee objects, written to the file one after another (preceded by a description of the Employee class). Here is a crude interpretation of people.dat's binary data as characters.

This code pulls the three Employee objects from the people.dat file and into memory:

import java.io.*;
import java.util.*;

public class ReadObjs
{
   public static void main(String [] args) throws Exception
   {
      Employee e1, e2, e3;

      ObjectInputStream ois = new ObjectInputStream(new FileInputStream("people.dat"));

      e1 = (Employee) ois.readObject();
      e2 = (Employee) ois.readObject();
      e3 = (Employee) ois.readObject();
      ois.close();
      
      System.out.println(e1);
      System.out.println(e2);				
      System.out.println(e3);
   }
}

The above code created three Employee references, used those to reference three Employee objects read from the people.dat file, then printed each object's string representation (from its toString method) to the console:

cruise, 111000, 46
clinton, 99000, 59
colvin, 85000, 32
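ReadObjs works because it was written knowing that the file holds exactly three objects. When the count is not known, one common workaround is to keep calling readObject until it throws an EOFException. The following is a sketch of that idea, not part of the original programs: it uses Strings in place of Employee objects and in-memory byte arrays in place of the file streams, and the names ReadUntilEof, writeAll, and readAll are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadUntilEof {
    // Serializes each item with its own writeObject call, as WriteObjs does.
    static byte[] writeAll(List<String> items) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(); // stand-in for a FileOutputStream
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            for (String item : items) {
                oos.writeObject(item);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads objects until the stream runs out; the count is not known in advance.
    static List<Object> readAll(byte[] data) {
        List<Object> result = new ArrayList<>();
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                result.add(ois.readObject());
            }
        } catch (EOFException end) {
            return result;                       // normal termination: no more objects
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = writeAll(Arrays.asList("cruise", "clinton", "colvin"));
        System.out.println(readAll(data));       // [cruise, clinton, colvin]
    }
}
```

Using an exception to detect the end of the data works, but it is clumsy.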

Unfortunately, writing and reading the objects individually requires the programmer to know (or figure out) exactly how many objects are in the file and then read that exact number back into memory later on. It is much easier to put the objects into some kind of collection object (in this case an ArrayList held inside a Company object), then just write the containing object to disk with one write statement. The list will be written to disk along with all of the objects that it references:

import java.io.*;
import java.util.*;

public class WriteManyObjs
{
   public static void main(String [] args) throws Exception
   {
      Company ibm = new Company();
      ibm.addEmployee(new Employee("cruise", 111000, 46));
      ibm.addEmployee(new Employee("clinton", 99000, 59));
      ibm.addEmployee(new Employee("colvin", 85000, 32));
      ibm.addEmployee(new Employee("lincoln", 30000, 63));
      
      ObjectOutputStream oos = 
         new ObjectOutputStream(new FileOutputStream("ibm.dat"));
      
      oos.writeObject(ibm);
      oos.close();
   }
}

class Company implements Serializable
{
   ArrayList<Employee> staff = new ArrayList<>();   // typed list avoids raw-type warnings
   
   public void addEmployee(Employee emp)
   {
      staff.add(emp);
   }
   
   public String toString()
   {
      return "THIS COMPANY: " + staff.toString();
   }
}

Later the Company object, its list, and all of the objects the list references can be read back into memory with a single read statement:

import java.io.*;
import java.util.*;

public class ReadManyObjs
{
   public static void main(String [] args) throws Exception
   {		
      ObjectInputStream ois = new ObjectInputStream(new FileInputStream("ibm.dat"));
      
      Company company;
      company = (Company) ois.readObject();
      System.out.println(company);
      ois.close();
   }
}

The above program prints the following text on the console:

THIS COMPANY: [cruise, 111000, 46, clinton, 99000, 59, colvin, 85000, 32, lincoln, 30000, 63]
Copyright © 2006 by Kiowok, Ann Arbor, Michigan, USA