soalib.parser
Class Parser

java.lang.Object
  extended by soalib.parser.Parser
All Implemented Interfaces:
IParser
Direct Known Subclasses:
BinaryParser, CsvParser, DelimitedParser, XbaseParser

public abstract class Parser
extends java.lang.Object
implements IParser

The Parser class provides a default implementation of the IParser interface. In most cases the default implementation is sufficient, but the following methods, which depend directly on the kind of parser, must be overridden:
1. save() method
2. createTableAndFetch()
3. parse()

Only one parser is associated with each extension, and each extension must be associated with a different parser. Also, each ParserTable has one Parser associated with it. Each ParserDatabase may have one or more SqlParser objects.

IMPORTANT:

SqlParser objects are created or deleted when SQL statements are executed. Manipulating SQL statements does not cause an automatic state change in the Parser, ParserTable and ParserDatabase classes. It is therefore necessary for the Parser object to always point to the SqlParser tables in memory.

Author:
Ejaz Jamil

Field Summary
protected  java.lang.String checksum_
           
protected  java.lang.String grammer_
           
protected  boolean littleEndian_
           
 
Fields inherited from interface soalib.parser.IParser
CHECKSUM_COLUMN_NAME, CHECKSUM_COLUMN_SIZE
 
Constructor Summary
Parser()
           
 
Method Summary
 void allowChecksum(boolean yes)
          Computes the checksum if set to true.
static void attachParser(java.lang.String type, java.lang.Class parser)
           
static Parser createParser(java.lang.String extension)
          Creates an instance of a registered parser.
protected  Table createTable(Table table, int maxColumns)
          This method provides a default implementation.
protected abstract  MemoryResultSet fetchData(Table table, java.util.Vector rows)
          Fetches data into a memory result set.
 java.lang.String getChecksum()
          Gets the checksum value as string.
 java.lang.String getChecksumColumnName()
          The last column of the Table is the checksum column, if a checksum was defined.
 Column[] getColumns()
           
 soalib.parser.grammer.Grammer getGrammer()
          Returns the grammar associated with this parser.
 MemoryTable getMemoryTable()
           
static java.lang.Class getParser(java.lang.String type)
           
static java.lang.Class[] getRegisteredClasses()
           
static java.lang.String[] getRegisteredExtensions()
           
 MemoryResultSet getResultSet()
          Gets the ResultSet for this parser.
 long getRowCount()
          Gets row count of the table.
protected abstract  java.util.Vector getRows(java.lang.String source, int[] columnCount)
          This method is called by the parser engine when data is to be read.
 boolean hasChecksum()
          Returns true if checksum computation is set to true.
static boolean hasParser(java.lang.String type)
           
 boolean isBlank()
          Checks if the table is blank.
 boolean isLittleEndian()
          Gets the endianness mode.
 boolean parse(java.lang.String source, Table initTable)
          Parses the text from the source and initializes the table initTable given in the second argument.
abstract  boolean save(java.lang.String url)
          Saves the content of the parser data into a file.
 void setBlankTable()
           
 void setChecksumColumnName(java.lang.String checksumColumnName)
          Sets the checksum column name if anything other than the default is used.
 void setLittleEndian(boolean littleEndian)
          Sets endianness to little endian.
 void setParserGrammer(java.lang.String grammer)
          Sets parser attributes.
 void setResultSet(MemoryResultSet resultSet)
          Sets an external memory result set.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

grammer_

protected java.lang.String grammer_

checksum_

protected java.lang.String checksum_

littleEndian_

protected boolean littleEndian_
Constructor Detail

Parser

public Parser()
Method Detail

allowChecksum

public final void allowChecksum(boolean yes)
Description copied from interface: IParser
Computes the checksum if set to true.

Specified by:
allowChecksum in interface IParser
Parameters:
yes - a value of true means that the checksum will be allowed.

getChecksum

public final java.lang.String getChecksum()
Description copied from interface: IParser
Gets the checksum value as a string. If the checksum is numeric, it is converted to a string.

Specified by:
getChecksum in interface IParser
Returns:
a checksum string.

getChecksumColumnName

public final java.lang.String getChecksumColumnName()
Description copied from interface: IParser
The last column of the Table is the checksum column, if a checksum was defined. If a column with that name already exists (which is unlikely), the name of the checksum column is changed. The new name may be obtained by calling this method. If the name was not changed, the default name is returned.

Specified by:
getChecksumColumnName in interface IParser
Returns:
the column name of the checksum column.
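
The renaming described above can be pictured with a small standalone sketch. The page does not say how the replacement name is chosen, so the numeric-suffix scheme, the class name ChecksumNameSketch and the method uniqueName are all assumptions made for illustration. The sketch does not use any soalib class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not the soalib implementation.
class ChecksumNameSketch {

    // Returns the preferred name if it is free, otherwise the first
    // "<preferred>_<n>" (n = 1, 2, ...) that no existing column uses.
    // The suffix scheme is an assumption, not documented behaviour.
    static String uniqueName(List<String> existingColumns, String preferred) {
        Set<String> taken = new HashSet<>(existingColumns);
        String candidate = preferred;
        int suffix = 1;
        while (taken.contains(candidate)) {
            candidate = preferred + "_" + suffix;
            suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        // "CHK" stands in for the default checksum column name.
        System.out.println(uniqueName(Arrays.asList("ID", "NAME"), "CHK"));
        System.out.println(uniqueName(Arrays.asList("ID", "CHK"), "CHK"));
    }
}
```

A caller that needs the checksum column should therefore ask the parser for the name after parsing rather than rely on the default constant.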

getResultSet

public final MemoryResultSet getResultSet()
Description copied from interface: IParser
Gets the ResultSet for this parser. A result set is always created even if the table or file does not exist. In this case, the result set is an empty result set.

Specified by:
getResultSet in interface IParser
Returns:
the result set (in memory).

getMemoryTable

public final MemoryTable getMemoryTable()

getRowCount

public final long getRowCount()
Description copied from interface: IParser
Gets row count of the table.

Specified by:
getRowCount in interface IParser
Returns:
row count.

getGrammer

public soalib.parser.grammer.Grammer getGrammer()
Description copied from interface: IParser
Returns the grammar associated with this parser.

Specified by:
getGrammer in interface IParser
Returns:
the associated Grammer instance.

hasChecksum

public final boolean hasChecksum()
Description copied from interface: IParser
Returns true if checksum computation is set to true.

Specified by:
hasChecksum in interface IParser
Returns:
true, if the checksum was enabled.

isBlank

public final boolean isBlank()
Description copied from interface: IParser
Checks if the table is blank. This means that there are no rows in the table. It may also mean that no such table exists at all.

Specified by:
isBlank in interface IParser
Returns:
true, if the table was blank.

parse

public final boolean parse(java.lang.String source,
                           Table initTable)
Description copied from interface: IParser
Parses the text from the source and initializes the table initTable given in the second argument. Returns true if parsing was successful.

Specified by:
parse in interface IParser
Parameters:
source - name of the file which should be read into the table.
initTable - a table into which data is to be read.
Returns:
true, if parsed successfully.

fetchData

protected abstract MemoryResultSet fetchData(Table table,
                                             java.util.Vector rows)
Fetches data into a memory result set. The data is bound to the table given in the first argument. The second argument contains the data in the format returned by the getRows(String, int[]) method.

Parameters:
table - table object to bind the data to.
rows - rows obtained by a call to getRows(String, int[]).
Returns:
a memory based result set.

getRows

protected abstract java.util.Vector getRows(java.lang.String source,
                                            int[] columnCount)
                                     throws java.lang.Exception
This method is called by the parser engine when data is to be read. Combined with fetchData(Table, Vector), it completes the parsing of the parser database. The method reads a source (first argument) whose meaning depends entirely on the target parser; it may be a path to a text file, for example. The second argument is a one-element array, initialized to zero by the parser engine, that acts as an output parameter: the implementer of this method has to store the number of columns of the table in it. This array is always of size 1.

The format of the returned vector is fully dependent on the target parser. The parser engine does not do anything with the return value: it does some intermediate processing without using it and then passes it to the fetchData(Table, Vector) method, which should know how to interpret the data stored in the vector.

Parameters:
source - a text file path, or another source that depends on the target database.
columnCount - an array of size 1 to get the column count value of the parsed table.
Returns:
an implementation-dependent vector, in which each element preferably represents one row of the table. At present, however, the parser engine makes no use of this structure.
Throws:
java.lang.Exception
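
The division of work between getRows and fetchData, and the one-element array used as an output parameter, can be sketched as follows. This is a simplified standalone illustration, not soalib code: the class TwoPhaseSketch is hypothetical, the source is taken to be comma-separated text held in a string rather than a file path, and a plain String[][] stands in for MemoryResultSet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the two-phase contract; not soalib code.
class TwoPhaseSketch {

    // Phase 1: read the source into rows and report the column count
    // through the one-element array supplied by the caller.
    static List<String[]> getRows(String source, int[] columnCount) {
        List<String[]> rows = new ArrayList<>();
        for (String line : source.split("\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            // Record the widest row seen so far as the column count.
            columnCount[0] = Math.max(columnCount[0], fields.length);
            rows.add(fields);
        }
        return rows;
    }

    // Phase 2: interpret the rows produced by phase 1. Short rows are
    // padded with empty strings so that every row has the same width.
    static String[][] fetchData(int columns, List<String[]> rows) {
        String[][] data = new String[rows.size()][columns];
        for (int r = 0; r < rows.size(); r++) {
            String[] row = rows.get(r);
            for (int c = 0; c < columns; c++) {
                data[r][c] = c < row.length ? row[c] : "";
            }
        }
        return data;
    }

    // Plays the role of the parser engine: it creates the array holding
    // a zero, calls phase 1, then hands the rows to phase 2 unchanged.
    static String[][] parse(String source) {
        int[] columnCount = {0};
        List<String[]> rows = getRows(source, columnCount);
        return fetchData(columnCount[0], rows);
    }

    public static void main(String[] args) {
        String[][] data = parse("a,b,c\n1,2\n");
        System.out.println(data.length + " rows, " + data[0].length + " columns");
    }
}
```

Because the engine never inspects the rows, the two methods are free to agree on any row representation, as long as both use the same one.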

setParserGrammer

public final void setParserGrammer(java.lang.String grammer)
                            throws ParserException
Description copied from interface: IParser
Sets parser attributes. The attributes depend on the parser. In general, the parser may throw an exception if the attributes do not match.

Specified by:
setParserGrammer in interface IParser
Parameters:
grammer - the grammar of the table.
Throws:
ParserException

setResultSet

public final void setResultSet(MemoryResultSet resultSet)
Description copied from interface: IParser
Sets an external memory result set. This method may be used to save values into the text table.

Specified by:
setResultSet in interface IParser
Parameters:
resultSet - new result set in memory to set.

setBlankTable

public final void setBlankTable()

getColumns

public final Column[] getColumns()
                          throws ParserException
Throws:
ParserException

save

public abstract boolean save(java.lang.String url)
Saves the content of the parser data into a file.

Specified by:
save in interface IParser
Parameters:
url - filename in URL form to store the data.
Returns:
true, if the data saved correctly.

createTable

protected Table createTable(Table table,
                            int maxColumns)
                     throws ParserException
This method provides a default implementation. It creates a table based on the grammar. If the maxColumns argument is greater than the field length, the remaining columns are treated as VARCHAR type. If maxColumns is smaller than the field length, only maxColumns columns are used. The default implementation is sufficient in most cases, but the method may be overridden if needed.

Parameters:
table -
maxColumns -
Throws:
ParserException
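
The column rule described for createTable can be illustrated with a short standalone sketch. It assumes that "field length" means the number of fields declared by the grammar, and it represents column types as plain strings. The class ColumnPlanSketch and the method resolveTypes are hypothetical and are not part of soalib.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the documented column rule; not soalib code.
class ColumnPlanSketch {

    // Produces exactly maxColumns column types: the grammar's types first,
    // then VARCHAR for any columns the grammar does not describe. If the
    // grammar declares more fields than maxColumns, the extras are dropped.
    static List<String> resolveTypes(List<String> grammarTypes, int maxColumns) {
        List<String> types = new ArrayList<>();
        for (int i = 0; i < maxColumns; i++) {
            types.add(i < grammarTypes.size() ? grammarTypes.get(i) : "VARCHAR");
        }
        return types;
    }

    public static void main(String[] args) {
        List<String> grammar = Arrays.asList("INTEGER", "DATE");
        System.out.println(resolveTypes(grammar, 4)); // two extra VARCHAR columns
        System.out.println(resolveTypes(grammar, 1)); // only the first field is kept
    }
}
```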

setChecksumColumnName

public final void setChecksumColumnName(java.lang.String checksumColumnName)
Sets the checksum column name if anything other than the default is used. The default name is given by IParser.CHECKSUM_COLUMN_NAME.

Parameters:
checksumColumnName - the new column name to use.

attachParser

public static final void attachParser(java.lang.String type,
                                      java.lang.Class parser)
                               throws ParserException
Throws:
ParserException

getParser

public static final java.lang.Class getParser(java.lang.String type)

hasParser

public static final boolean hasParser(java.lang.String type)

getRegisteredExtensions

public static final java.lang.String[] getRegisteredExtensions()

getRegisteredClasses

public static final java.lang.Class[] getRegisteredClasses()

createParser

public static final Parser createParser(java.lang.String extension)
                                 throws java.lang.Exception
Creates an instance of a registered parser. If the class is not found, an exception is thrown.

Parameters:
extension - file extension with which the parser is to be created.
Returns:
instance of an appropriate parser object for the extension.
Throws:
java.lang.Exception
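
The static methods attachParser, hasParser and createParser suggest a registry that maps an extension to a parser class. The following standalone sketch shows one way such a registry might work. Every name in it (ParserRegistrySketch, SketchParser, CsvSketchParser, attach, has, create) is hypothetical, rejecting a second registration for the same extension is an assumption based on the one-parser-per-extension rule, and unchecked exceptions are used for brevity where the real methods declare ParserException or java.lang.Exception.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an extension-to-parser registry; not soalib code.
class ParserRegistrySketch {

    // Minimal parser base type used only by this sketch.
    static abstract class SketchParser {
        abstract String kind();
    }

    static class CsvSketchParser extends SketchParser {
        String kind() {
            return "csv";
        }
    }

    private static final Map<String, Class<? extends SketchParser>> REGISTRY = new HashMap<>();

    // Registers a parser class for an extension. A second registration
    // for the same extension is rejected (assumed behaviour).
    static void attach(String extension, Class<? extends SketchParser> parserClass) {
        if (REGISTRY.containsKey(extension)) {
            throw new IllegalStateException("Extension already registered: " + extension);
        }
        REGISTRY.put(extension, parserClass);
    }

    static boolean has(String extension) {
        return REGISTRY.containsKey(extension);
    }

    // Instantiates the registered class through its no-argument constructor.
    static SketchParser create(String extension) {
        Class<? extends SketchParser> cls = REGISTRY.get(extension);
        if (cls == null) {
            throw new IllegalArgumentException("No parser registered for: " + extension);
        }
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
        }
    }

    public static void main(String[] args) {
        attach("csv", CsvSketchParser.class);
        System.out.println(has("csv"));
        System.out.println(create("csv").kind());
    }
}
```

Storing the class rather than an instance means that every call to create returns a fresh parser, which fits the rule that each table has its own Parser.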

isLittleEndian

public boolean isLittleEndian()
Gets the endianness mode. If not little endian, then it is big endian.

Returns:
true, if little endian.

setLittleEndian

public void setLittleEndian(boolean littleEndian)
Sets endianness to little endian. Note that this setting does not ensure that the derived parser will use it. Some parsers, such as text parsers, do not depend on endianness, whereas parsers that store data in binary format depend on it strongly. By default, Java stores data in big-endian format, while Intel architectures are little endian. So if you have a file written by a C program on such a machine and would like to read it with a binary parser, you need to set the mode to little endian. Endianness affects the storage of int, short, float, double and other multi-byte primitive data types. It does not affect single bytes.

Parameters:
littleEndian - true, if little endian format is to be used.
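
The effect of the byte order on a multi-byte value can be seen with the standard java.nio.ByteBuffer class. The sketch below is independent of soalib and does not show how BinaryParser reads data. The class EndianDemo and the method readInt are hypothetical names used only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Standalone illustration of byte order; it does not use soalib classes.
class EndianDemo {

    // Decodes the first four bytes as an int in the requested byte order.
    static int readInt(byte[] bytes, boolean littleEndian) {
        ByteOrder order = littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        // The int value 1 as a C program would write it on a little-endian machine.
        byte[] fromC = {0x01, 0x00, 0x00, 0x00};
        System.out.println(readInt(fromC, true));  // prints 1
        System.out.println(readInt(fromC, false)); // prints 16777216
    }
}
```

Reading the same four bytes with the wrong byte order yields 16777216 (0x01000000) instead of 1, which is why the mode has to match the program that wrote the file.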