Creating Binary Files Using Visual Basic

Post by Cyclops » Wed Oct 07, 2009 12:13 pm

The essential point about a Binary file is that you can lay out the data in the file any way you want. When a binary file is opened, the file pointer is positioned at byte 1, the beginning of the file.

You can write as much as you want into the file while it is open. After each write operation is complete, the file pointer is positioned at the byte immediately after the last data that was written.

For example, if you open a new file and write 10 bytes of data, the data begins at byte 1 and ends at byte 10. The pointer is now at byte 11. You can change the pointer in code, but that's where it will be if you do nothing. The same thing occurs when reading the file. If you close the file with no further writes, the resulting file is exactly 10 bytes long: Binary access writes only the data you Put, with no terminator between items.
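As a rough illustration of the pointer behaviour described above, the following sketch writes 10 bytes and prints the pointer position before and after. The file name and the procedure name are made up for the example, and it assumes C:\Temp exists.

```vb
Sub ShowFilePointer()
  Dim nFileNum As Integer
  Dim bData(1 To 10) As Byte    ' 10 bytes of sample data (all zero)
  Dim bOne As Byte
  Dim sFilename As String

  sFilename = "C:\Temp\Pointer.dat"    ' hypothetical scratch file

  ' Remove any earlier copy so the file starts out empty
  If Dir(sFilename) <> "" Then Kill sFilename

  nFileNum = FreeFile
  Open sFilename For Binary As #nFileNum
    Debug.Print Seek(nFileNum)    ' next read/write position: 1
    Put #nFileNum, , bData        ' writes bytes 1 through 10
    Debug.Print Seek(nFileNum)    ' 11
    Debug.Print LOF(nFileNum)     ' file length in bytes: 10

    ' Move the pointer back and overwrite byte 5 only
    bOne = 255
    Seek #nFileNum, 5
    Put #nFileNum, , bOne
    Debug.Print Seek(nFileNum)    ' 6
  Close #nFileNum
End Sub
```

The Seek function reports where the next read or write will happen, while the Seek statement moves the pointer. Passing a byte position as the second argument of Put or Get has the same effect as the Seek statement.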

One thing that makes Binary access superior to Random access is that string fields do not need a fixed length. Apart from objects and User Defined Types, strings are Visual Basic's largest data type. They quickly eat up memory and disk space, so it's nice to be able to store only as many characters as necessary.

For example, a Random access file must be written using fixed-length records as follows:

Code: Select all

Option Explicit 

Private Type tType
  ID As Long
  Name As String * 25
  Address As String * 50
  City As String * 25
  State As String * 2
  ZIP As String * 10
End Type

Note that each record will require the full storage size defined for the string fields.

For Binary access we don't need to define the length of the string. If the string is empty, no character data is stored, only a small overhead (a 2-byte length descriptor when the string is a member of a User Defined Type). Variable-length strings written in Random access mode carry the same overhead, so on that level the files are the same.

Code: Select all

Option Explicit 

Private Type tType
  ID As Long
  Name As String
  Address As String
  City As String
  State As String
  ZIP As String
End Type

Now we're not only saving space, but a string field can also be longer if needed. Both advantages come from no longer fixing the length of the strings.
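To see the space difference for yourself, something like the following could be run from a standard module. It is only a sketch: the two cut-down types, the procedure name and the scratch file are invented for the example. It measures how far the file pointer moves for each Put rather than assuming a size.

```vb
Option Explicit

' Hypothetical, cut-down versions of the two record layouts
Private Type tFixedRec
  ID As Long
  Name As String * 25
End Type

Private Type tVarRec
  ID As Long
  Name As String
End Type

Sub CompareRecordSizes()
  Dim fixedRec As tFixedRec
  Dim varRec As tVarRec
  Dim nFileNum As Integer
  Dim nStart As Long
  Dim sFilename As String

  sFilename = "C:\Temp\Size.dat"    ' hypothetical scratch file
  If Dir(sFilename) <> "" Then Kill sFilename

  fixedRec.ID = 1
  fixedRec.Name = "Ann"    ' padded with spaces to 25 characters
  varRec.ID = 1
  varRec.Name = "Ann"      ' only the 3 characters plus the overhead

  nFileNum = FreeFile
  Open sFilename For Binary As #nFileNum
    ' Bytes written = pointer after the Put minus pointer before it
    nStart = Seek(nFileNum)
    Put #nFileNum, , fixedRec
    Debug.Print "Fixed-length record bytes:"; Seek(nFileNum) - nStart

    nStart = Seek(nFileNum)
    Put #nFileNum, , varRec
    Debug.Print "Variable-length record bytes:"; Seek(nFileNum) - nStart
  Close #nFileNum
End Sub
```

The fixed-length record should always take the same number of bytes, while the variable-length one grows and shrinks with the text stored in Name.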


Writing a Binary File in its Simplest Form

Code: Select all

Sub WriteBinaryFile()
  Dim i As Integer
  Dim nFileNum As Integer
  Dim sFilename As String

  sFilename = "C:\Temp\Temp.dat"

  ' Get an available file number from the system
  nFileNum = FreeFile

  ' Open the file in binary mode.  Locks are optional.
  ' Note that opening an existing file For Binary does not empty it
  Open sFilename For Binary Lock Read Write As #nFileNum

    ' Put the data in the file
    For i = 0 To 9
      ' No byte position is specified, so the first Put starts at byte 1
      ' and each later Put continues where the previous one ended
      Put #nFileNum, , i
    Next i

  Close #nFileNum

End Sub

' Reading the same file

Sub ReadBinaryFile()
  Dim iValue() As Integer
  Dim iCount As Integer
  Dim nFileNum As Integer
  Dim sFilename As String

  sFilename = "C:\Temp\Temp.dat"

  ' Get an available file number from the system
  nFileNum = FreeFile

  ' Open the file in binary mode.  Locks are optional
  Open sFilename For Binary Lock Read Write As #nFileNum

    ' Get the data from the file. In Binary mode EOF only becomes True
    ' after a Get has failed to read a complete value, so looping until
    ' EOF would add one extra, empty element. Compare Loc with LOF instead
    Do While Loc(nFileNum) < LOF(nFileNum)

      ' Make room in the array for the next value
      ReDim Preserve iValue(iCount)

      ' No byte position is specified, so the first Get starts at byte 1
      ' and each later Get continues where the previous one ended
      Get #nFileNum, , iValue(iCount)

      ' Increment counter
      iCount = iCount + 1
    Loop

  Close #nFileNum

End Sub

You'll notice that when writing the file we knew how much data was being written, so we could use a For...Next loop. When reading the file we don't know, so we have to redimension the array repeatedly. There are a couple of ways around this. The first is to divide FileLen by the size of one data item. That only works if all the data is the same size and type, which happens to be the case here, so we could have done that.
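The FileLen approach could look roughly like this. It is a sketch that assumes the file holds nothing but 2-byte Integer values, as written by WriteBinaryFile; the procedure name is made up.

```vb
Sub ReadBinaryFileAtOnce()
  Dim iValue() As Integer
  Dim nCount As Long
  Dim nFileNum As Integer
  Dim sFilename As String

  sFilename = "C:\Temp\Temp.dat"

  ' Each Integer occupies 2 bytes on disk
  nCount = FileLen(sFilename) \ 2
  If nCount = 0 Then Exit Sub    ' nothing to read

  ' Size the array once instead of growing it inside a loop
  ReDim iValue(nCount - 1)

  nFileNum = FreeFile
  Open sFilename For Binary Access Read As #nFileNum
    ' In Binary mode a single Get fills the whole array
    Get #nFileNum, , iValue
  Close #nFileNum
End Sub
```

Besides avoiding ReDim Preserve on every pass, this reads the data with a single Get, which tends to be quicker for larger files.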

If we want our file to really be flexible then we probably won't be storing data that is always the same size and type. What if we want to store some byte data, a few records, an array of doubles as well as copyright information? We need a way to define what's in the file and the best place to put that information is in the file itself.

For VB DB, I created a header that contains descriptive information. If the file contained only one type of record and nothing else, there would be no need to save the number of records. We would just loop through the file until there was no more data. However, if there is more than one type of data, we need to know when to stop reading one data type and start reading another. This is much easier to accomplish than you might imagine.

The first record stored in the file is the header. Simply create fields in the header to save information about what's in the file.

For example, if you wanted to store three different kinds of UDTs, you would have three RecordCount fields. Each field would save the number of records in the corresponding array of UDTs.

When the file is read these numbers are extracted from the header and you've got all the information you need to read the rest of the file.
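A header with three counts might be read back along these lines. Every type, field and procedure name here is invented for illustration, and the sketch assumes the file was written in the same order: header first, then each group of records.

```vb
Option Explicit

' All names below are hypothetical
Private Type tCustomer
  ID As Long
  Name As String
End Type

Private Type tOrder
  ID As Long
  Amount As Double
End Type

Private Type tProduct
  ID As Long
  Title As String
End Type

Private Type tMixedHeader
  CustomerCount As Long
  OrderCount As Long
  ProductCount As Long
End Type

Sub ReadMixedFile(sFilename As String)
  Dim hdr As tMixedHeader
  Dim customers() As tCustomer
  Dim orders() As tOrder
  Dim products() As tProduct
  Dim i As Long
  Dim nFileNum As Integer

  nFileNum = FreeFile
  Open sFilename For Binary Access Read As #nFileNum

    ' The header comes first and says how many of each record follow
    Get #nFileNum, 1, hdr

    If hdr.CustomerCount > 0 Then
      ReDim customers(hdr.CustomerCount - 1)
      For i = 0 To hdr.CustomerCount - 1
        Get #nFileNum, , customers(i)
      Next i
    End If

    If hdr.OrderCount > 0 Then
      ReDim orders(hdr.OrderCount - 1)
      For i = 0 To hdr.OrderCount - 1
        Get #nFileNum, , orders(i)
      Next i
    End If

    If hdr.ProductCount > 0 Then
      ReDim products(hdr.ProductCount - 1)
      For i = 0 To hdr.ProductCount - 1
        Get #nFileNum, , products(i)
      Next i
    End If

  Close #nFileNum
  ' The arrays are local here; a real program would keep them somewhere
End Sub
```

Because each count is known before its group is read, no end-of-group marker is needed in the file.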

For VB DB, there are three pieces of information stored in the file. We already talked about the header being the first data written. Next are the field definitions.

Field definitions specify what data type each field is, along with other properties whose purpose should be clear from the type definition. The last things stored in the file are the actual records.


In Practice

The first step is to define what will be saved in the file:

Code: Select all

' Header definition
Private Type tHeader
  VersionString As String
  VersionNumber As String
  RecordCount As Long
  MaxRecordID As Long
End Type
Private Header As tHeader

' Field definitions. One field definition record is saved for each field
Private Type tFieldDefinition
  Index As Long
  SystemField As Boolean
  FieldName As String
  ArrayField As Boolean
  DataType As Long
  DefaultValue As Variant
  Required As Boolean
  RequireUniqueEntry As Boolean
End Type
' FIELD_INDEX_MAX is a constant defined earlier in the module (not shown)
Private m_FieldDefinitions(FIELD_INDEX_MAX) As tFieldDefinition

' The records
Private Type tRecord
  Field(FIELD_INDEX_MAX) As Variant
End Type
' Records held in memory; the array is sized when the file is read
Private arr_Record() As tRecord

You'll notice in the header UDT that nothing tells us how many fields there are. That information would be needed if the number of fields could vary from file to file. In VB DB, the number of fields is defined in advance, and once a file is created, fields cannot be added or deleted.

Now let's look at how the file is read. Note: error handling code is not shown here for brevity.

Code: Select all

Private Function Read(sFilename As String) As Long
  Dim i As Long
  Dim nFileNum As Integer

  ' Open the file
  nFileNum = FreeFile
  Open sFilename For Binary Lock Read Write As #nFileNum

    ' Retrieve header
    Get #nFileNum, 1, Header

    ' Retrieve the field definitions
    Get #nFileNum, , m_FieldDefinitions

    ' Size the array to hold the records, then retrieve them.
    ' ReDim with an upper bound of -1 fails, so skip a file with no records
    If Header.RecordCount > 0 Then
      ReDim arr_Record(Header.RecordCount - 1)
      For i = 0 To Header.RecordCount - 1
        Get #nFileNum, , arr_Record(i)
      Next i
    End If

  ' Close the file
  Close #nFileNum

End Function

Saving is just as easy.

Code: Select all

Private Function WriteFile() As Long
  Dim i As Long
  Dim nFileNum As Integer

  On Error GoTo errHandler

  ' Delete contents of Filename because the entire file is in memory.
  ' Filename is assumed to be declared at module level (not shown)
  nFileNum = FreeFile
  Open Filename For Output As #nFileNum
  Close #nFileNum ' Close the file before opening in Binary mode

  ' Open the file
  nFileNum = FreeFile
  Open Filename For Binary Lock Read Write As #nFileNum

    ' Save header information
    Put #nFileNum, 1, Header

    ' Save field definitions
    Put #nFileNum, , m_FieldDefinitions

    ' Save records
    For i = 0 To Header.RecordCount - 1
      Put #nFileNum, , arr_Record(i)
    Next i

  ' Close the file
  Close #nFileNum
  Exit Function

' Placeholder handler so the label exists; the real error handling is
' not shown. Returning the error number is an assumption
errHandler:
  Close #nFileNum
  WriteFile = Err.Number

End Function

It is never a good idea to wipe out the contents of a file before its replacement has been written, as the above example does. If anything caused your program or computer to crash in between, the data would be permanently lost.

In practice, VB DB provides a great deal of functionality to protect data, such as creating a backup file before saving. If an error occurs while writing the file and the program is still running, the backup file is restored. If the program quits instead, the backup is still on disk where it was created.
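The backup idea can be sketched as follows. This is not VB DB's actual code: the procedure names, the ".bak" naming and the stand-in WriteDataFile routine are all assumptions made for the example.

```vb
Option Explicit

' Hypothetical stand-in for the routine that writes the real file
Private Sub WriteDataFile(sFilename As String)
  Dim nFileNum As Integer
  Dim nValue As Long

  nValue = 12345    ' sample data

  ' Empty the file, then reopen it in Binary mode
  nFileNum = FreeFile
  Open sFilename For Output As #nFileNum
  Close #nFileNum

  nFileNum = FreeFile
  Open sFilename For Binary As #nFileNum
    Put #nFileNum, , nValue
  Close #nFileNum
End Sub

' Returns True if the file was saved, False if the save failed
Function SaveWithBackup(sFilename As String) As Boolean
  Dim sBackup As String
  Dim bHaveBackup As Boolean

  sBackup = sFilename & ".bak"    ' hypothetical naming scheme

  On Error GoTo errHandler

  ' Copy the current file aside before touching it
  If Dir(sFilename) <> "" Then
    FileCopy sFilename, sBackup
    bHaveBackup = True
  End If

  WriteDataFile sFilename

  SaveWithBackup = True
  Exit Function

errHandler:
  ' Close every file opened with Open; FileCopy fails on an open file
  Close
  ' Put the old contents back. An error raised here is passed to the caller
  If bHaveBackup Then FileCopy sBackup, sFilename
  SaveWithBackup = False
End Function
```

If the program dies partway through the save, the handler never runs, but the .bak copy made at the start is still on disk.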

Ultimately what compels me to use binary file storage is flexibility to save what I want, how I want and that it uses disk space efficiently. You can store whatever you want and because you know how you stored it, you can retrieve it.