Processing Single Excel File with Single Sheet

Steps to create Excel Schema

  1. On the Develop tab, go to Services > Schema Excel.

     

  2. Click Create New and do the following:
    1. Type the name and description of new excel schema in the Name and Description text boxes respectively.
    2. Select the Secure check box if the excel file that you are using is a password protected file.
    3. Type the password for the protected excel file and confirm it in the Password and Confirm Password text boxes respectively.

      The Password and Confirm Password text box will be enable, if and only if, the Secure check box is selected.
    4. Select the Data Header Present check box if the file contains Headers. Data Header usually contains the name of the fields in an excel file. If schema is used at the source end and data header is present in the file. If the schema is used at the target end, and the Data Header Present check box is checked, the Header will be written in the target excel file.
       

      Name of the Headers in the data file and the Field Names in schema must be same and in same order. If they are not same, then you need to use Dynamic Header Support option. To know how to use Dynamic Header Support, refer to the Using Dynamic Header Support section.
    5. Define the schema either using definition file or entering the fields sequentially.
      Using Definition File
      1. Select Import Definition File. 
      2. Select the type of file from the Definition File drop down list box and click Upload File to select the required file. 



      3. Click Choose File to select the excel file and click Upload file. The list of sheets will be displayed.

      4. Click Finish.


      Using Entering the Fields Sequentially
      1. Type the name of the sheet of the excel file in the Sheet Name text box.
      2. Type the name of each field in the Field Name text box .
      3. Select the type of data from the Type drop-down list box. The supported data types by Excel schema are:

        Data Type

        Description

        String

        Select this data type if the field accepts a string value.

        Number

        Select this data type if the field accepts a numeric value.

        Date

        Select this data type if the field accepts a Date or Date and Time value.

        Currency

        Select this data type if the field accepts a currency value.

        Excel Schema supports the Currency data type.
      4. If data type is Date, select the format of date and time from the Format and SubFormat drop-down list boxes respectively.
          

        In case you define a date format in any column of the Excel file then, please ensure that every row of that column should have the same definition of date format as in the first row otherwise the Excel schema generates an error record.
        In case, you select a date format as *3/14/2001, and use the schema at the source end then, the Excel schema parses the date as per the local date format. The Adeptia Suite does not support this date format when you use the schema at the target end.
        You should define the time format as hh:mm:ss.
      5. If data type is Currency, select the required currency from the Format drop-down list box. The currencies supported by excel schema are:
         
        • Dollar($)
        • USD
        • CAD
        • CNY
        • EUR
        • GBP
        • JPY
      6. Select the mode of data, whether Encrypted or Plain Text from the Data Mode drop-down list box. If the schema is used at source end and encrypted data is coming from the source, select Encrypted from the Data Mode drop-down list box. If the schema is used at the target end and you want to send the encrypted data to the target, select Encrypted from the Data Mode drop-down list box.

        • If you select Encrypted in the Data Mode then you must select Encryption/Decryption secret key from the Advanced Properties.
        • If you select Encrypted in the Data Mode then you must set the Data Action property in the Process Designer, while creating the process flow. For information on Process Designer, refer to Working with Process Flow.
        • To insert rows, you need to specify the number and position of the rows that you want to add in the Number of Rows and at Position text boxes respectively and then click Add Row. You can add a maximum of 99 rows at a time.
        • To remove rows, you need to specify the number and position of the rows that you want to delete in the Number of Rows and at Position  text boxes respectively and then click Remove Row.

        If you delete the last fieldname of a record and save the schema, this action will delete the entire field. For example, a record has three fields – Name, Description, and Age. If you delete Age then, the schema will delete the entire field.

         

    6. Expand Advanced Properties.



    7. Type the row start position in the Row Start Position text box. Row Start Position specifies which row of the Excel Sheet is counted as first row. For example, if you type 5 in the Row Start Position, 5th row of the Excel Sheet is counted as the first row. If this schema is used at source end, the data from the 5th row onward is taken for processing. If this schema is used at the target end, data is copied into the 5th row onward. 1st to 4th row of the target excel sheet will remain blank.
    8. Similarly, type the column start position in the Column Start Position text box.
    9. Click Optimize XLSX File Processing for processing an .xlsx file in streaming mode. Selecting this option would process large *.xlsx files without consuming much memory. To use this feature, Define Hierarchy option must be unchecked. 
    10. Dynamic Header Support check box is an advanced feature of Excel and Text Schema, which is used to parse an excel or text file, if:

      FieldNames defined in the source schema and the Data Headers (Column Name) of the excel/text file are same but not in the same order.
      FieldNames defined in the source schema and the Headers of the excel/text file are not same. They may or may not be in the same order.

      If the headers are not matched between Schema Definition and Source File then xml (output of schema) will only have Root rather than any Record entry. For instance, if the headers of the source schema are Id, Name, and Age and the headers of the data file are Emp Id, Emp Name, and Emp Age then the output of the schema will be blank.                     

      On selecting the Dynamic Header Support check box, Allow Ignore case check box appears. Select this checkbox to ignore the casing mismatches in the header of the schema and the data file.

    11. In case, the number of fields in the data file is less than the number of fields defined in schema, select Allow Less Fields check box to process the data.

       

      • When the number of fields in the data file are less than the number of fields defined in the schema then, the schema does not processes the data and gives an error during execution.

      • If you still want to process the data, then you need to enable Allow Less Field checkbox.
      • When you enable the Allow Less Field option then the schema generates an empty tag for fields that are not present in the data file.
      • This option is applicable only when you use the schema at source end.
    12. In case, the input data contains some characters that are invalid in XML, then this may result in the mapping getting aborted. You can filter these invalid XML characters by selecting the Filter Invalid XML Characters check box.
    13. Similarly, if the data contains blank rows, you can filter them by selecting the Filter Blank Row(s) check box. This option is applicable only when you use the schema at source end. 
    14. Type the enclosing character that you want to remove from the data file in the Handle Enclosing Character text box. The supported characters are (" , < , > ).
    15. If schema is used at the target end and you want to convert the data from Plain Text to encrypted mode, select the secret key activity from the Encryption Secret Key drop-down list box. The selected secret key activity is used to encrypt the data.
    16. If schema is used at the source end and you want to convert the data from Encrypted Mode to Plain Text, select the secret key activity from the Decryption Secret Key drop-down list box. The selected secret activity is used to decrypt the data.

      For information on how to create secret key activity, refer to Creating Secret Key Activity.
      For information on Advanced Properties, refer to Changing Advanced Properties.

    17. Click Save.

Using Excel Schema to parse XLSX file
When you use excel schema to parse XLSX file, then you need to do some additional setting while creating process using this schema. 
While creating the process flow, in Process Designer, double click the Excel schema that you are using in the process flow. Properties of the Excel Schema is displayed in the properties panel.





Select Excel Workbook (*.xlsx) as the value in the File Type property. When the value File Type property is set as Excel Workbook (*.xlsx), this schema can parse both types of excel file: XLS as well as XLSX file.

Defining Field Hierarchy
You can define hierarchy (parent-child relationship) between the records of an excel file using Define Hierarchy option of excel schema. To understand how to define hierarchy let us assume that you have an excel file which contains records of insurance policies of families.



As you can see in the above figure, for ID 7812, there are three policies and similarly for ID 2311 there are four policies. While defining schema, you can specify the criteria based on which the schema can define the hierarchy. For example, you can select if the ID field is blank, merge the record in the previous record, which is having ID. Similarly, you can define the criteria for the matching record. For example, if value of the ID field is matching with that of the previous record, merge the record with the previous record. 
 

Steps to define hierarchy

  1. While creating schema, select Import Definition File and then select Data from the Definition Files drop-down list box..
  2. Click Upload File. The Schema File Upload screen is displayed.
  3. Click Browse and select the file, you want to upload. Path of the selected file is shown in the Browse File text box.
  4. Now click Upload File. Name of the uploaded file is shown in the File Name list and list of sheets of the excel file are displayed.
  5. Specify the Start Row No. and Start Column No. of the sheets in respective fields and then click Process Sheet(s). This will read the sheets and field names. 

     

    The Start Row No. and Start Column No. fields specify the row and column from where the schema should start fetching data. For example, if in an excel file the first 4 rows of sheet1 are blank and you want to exclude them from your selection then, you have to specify 5 in the Start Row No. field for sheet1. Similarly, there can be some data in the first 4 rows, but you do not want to fetch those records.
  6. Once the processing is done, click Finish to close the Schema File Upload screen and return to create schema page.
  7. Once the file is uploaded the Sheet Name field is converted into the drop-down list and all the sheet names of the selected excel file are populated in this drop-down list.
  8. Select the sheet name from the Sheet Name drop-down list box. All the fields of the selected sheet are populated.
  9. To define the hierarchy, select Define Hierarchy checkbox and select the merge criteria from the Merge Criteria drop-down list box.

     

    Currently two merge criteria are supported: 
    Matching Child Record: The schema merges the records, in case, the values in the Key fields are matching. 
    Blank Child Record: In case, the value of the key field is blank then the schema merges the record with a previous record whose key field is not blank.
  10. Select the field name from the Merge Key drop-down list on the basis of which record are merged.
  11. Click Save to save the excel schema.

Prerequisites

  • Data Header must be present in the Excel/text file.
  • Data Type must be same in the Excel/text Schema and the excel/text file.

To parse an excel/text file, whose Headers are same but not in order with the FieldNames of the Excel/text Schema, check the Dynamic Header Support check box in the Advanced Properties. Now the Excel/text Schema will parse the data from the respective columns. To parse an excel/text file, whose headers are different and not even in order with the fieldnames of the Excel/text Schema, select the Dynamic Header Support check box in the Advanced Properties. Apart from this an XML file is used by the schema, which contains the mapping between the headers of the excel/text file and the fieldnames of the Excel/text Schema. Following is the sample XML.

 

<?xml version="1.0" encoding="ISO-8859-1"?>
 <HeaderMap type="Name">
 <Map>
 <SchemaHeader>Name</SchemaHeader>
 <DataFileHeader>EMP_Name</DataFileHeader>
 </Map>
 <Map>
 <SchemaHeader>DOB</SchemaHeader>
 <DataFileHeader>EMP_DOB</DataFileHeader>
 </Map>
 <Map>
 <SchemaHeader>Age</SchemaHeader>
 <DataFileHeader>EMP_Age</DataFileHeader>
 </Map>
 <Map>
 <SchemaHeader>Address</SchemaHeader>
 <DataFileHeader>EMP_Address</DataFileHeader>
 </Map>
 </HeaderMap>

 

where:
SchemaHeader is the FieldName defined in the Excel Schema.
DataFileHeader is the name of the Header in excel file.

Excel Schema reads this XML file from process flow context. Therefore, you have to pass this XML file to the process flow context.

In the process flow, a file source activity is used to read an excel file. An Excel Schema is used to parse the data from the excel file. FieldNames defined in the excel Schema and Headers in Excel file are different. To parse the data from the excel file, an XML file, which contains the mapping between FieldName and the Header is passed to the process flow context. To pass the XML file to process flow context, another File source activity (File_Src_ExcelHeaderXML) is used.

This file source activity reads the XML file from a specified location and passes it to the context target activity. In Context Target activity, you need to define the value of the parameterName property.




In this process flow, the value of parameterName field is FieldName_Header_Mapping
After context target, schema uses the PutContextVar action. This action sets the value of the variable Service.<ActivityName>.excelHeaderMappingXML in the XML Schema.




  1. Click Edit to define the variable name and the value to be set. The Edit Context Variables screen is displayed.

     

  2. Click New (  ). The Context Variable Information dialog box is displayed.

     

  3. Enter Service.<ActivityName>.excelHeaderMappingXML in the Variable Name. For example, Service.Excel_Schema.excelHeaderMappingXML
    where 
    Activity Name is name of the Excel Schema Activity. For Example FieldName_Header_Mapping.
  4. In the Variable Value, type the value that you defined in the parameterName in the Context Target activity.
  5. Click Done.
  6. Make sure to create a stream from file source to Excel Schema activity.

     

    For information on how to create a process flow, refer to Creating Process Flow.