Creating Schema Activity

You can use a schema to parse data from any of the supported data formats and transform it into any of the supported data types. A schema reads the data from the source, transforms it into XML, and then writes it to the target in the requested format. To transform data from one format to another, you need to use different types of schemas at the source end and the target end of your Process Flow. Figure 196 shows how a schema works.

Figure 196: Process For Data Conversion Using Schema

Business Example
Your organization has asked you to compile all the information about its inactive users, which is saved on a common server, and convert it into database entries that will be used for archiving. Your job is to fetch all the relevant data from this store of information and convert it into acceptable database entries. To do this, you need the following knowledge:

An understanding of fetching data from different file types
An understanding of converting the fetched data into the target file format

Types of Schema Activities
Adeptia Suite provides the following types of schema activities:

Adeptia Suite allows you to create schemas in two ways:

Using Definition File
Entering Fields Sequentially

Using Definition File
You can create a schema using a Definition File in three ways:

Using Data File
Using Field File
Using XSD File

These methods vary across schemas. Their compatibility with each schema is outlined in the table below.
Table 1: Definition File Methods Used for Creating Schemas

Schema	Data File	Field File	XSD File
Advance Database Schema			√
Advance Positional Schema		√	√
Advance Text Schema		√	√
Database Schema			√
Excel Schema	√	√	√
Positional Schema		√	√
Text Schema	√	√	√

Using Data File
A data file contains the actual data that is used as the source or target during the execution of a process flow. It can be the same file that is used in the process flow or another sample file of the same format.

Using Field File
A field file is a Comma Separated Values (CSV) file that contains the field names and their definitions, separated by commas. This option is helpful when the source or target data file has a very large number of fields. All the field names are picked up from this CSV file. If the data type is Date, the date format must be specified after the data type, separated by a comma.

If a field is defined as Date or Time type and its format is not defined, the default date format is MM/dd/yy and the time format is blank.
When copying a field file, verify that the field format is correct and that the document contains no extra lines. Otherwise, the schema will generate an error while converting it to HTML.

Field File format for Advance Positional Schema
The Field file format for Advance Positional Schema can be of two types:

Based on Start Position and End Position
Based on Field Length

Field File format for Advance Positional Schema based on Start and End Position
<Record Identifier1>:<Value>,<Record Identifier2>:<Value>
<RecordIdentifier Value>,<FieldName>,<Description>,<DataType>,[DateFormat],[TimeFormat],<StartPosition>,<EndPosition>,<Alignment>,<Skip>
For an Advance Positional Schema, the Record Identifier and Value pairs must be specified at the beginning of the CSV file, as shown below:

a:first,b:second
first,a,first_field,string,,,1,11,L,F
first,name,name_of_employee,string,,,12,21,L,T
first,empid,employee_ID,int,,,22,36,L,F
second,b,second_field,string,,,1,11,L,F
second,DOB,date of birth,date,yyyy/dd/MM,hh:mm:ss,12,24,L,F
second,Address,Address of employee,string,,,25,44,L,T

where: L means left alignment
R means right alignment
T means True
F means False
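
The layout above can be read mechanically. The following Python sketch is only an illustration and is not part of Adeptia Suite. It assumes that the first line holds the identifier:value pairs and that every later line has exactly the ten columns listed in the format. The function name and the dictionary keys are hypothetical.

```python
import csv
import io

# Sample field file copied from the example above.
SAMPLE = """a:first,b:second
first,a,first_field,string,,,1,11,L,F
first,name,name_of_employee,string,,,12,21,L,T
first,empid,employee_ID,int,,,22,36,L,F
second,b,second_field,string,,,1,11,L,F
second,DOB,date of birth,date,yyyy/dd/MM,hh:mm:ss,12,24,L,F
second,Address,Address of employee,string,,,25,44,L,T
"""

def parse_advance_positional(text):
    """Return (identifiers, fields) from a start/end based field file."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # First line: <Record Identifier>:<Value> pairs, for example "a:first".
    identifiers = dict(item.split(":", 1) for item in rows[0])
    fields = []
    for row in rows[1:]:
        if len(row) != 10:
            raise ValueError(f"expected 10 columns, got {len(row)}: {row}")
        (record, name, description, data_type,
         date_format, time_format, start, end, alignment, skip) = row
        fields.append({
            "record": record,
            "name": name,
            "description": description,
            "type": data_type,
            "date_format": date_format or None,  # optional column
            "time_format": time_format or None,  # optional column
            "start": int(start),
            "end": int(end),
            "alignment": alignment,              # L or R
            "skip": skip == "T",                 # T means True, F means False
        })
    return identifiers, fields

identifiers, fields = parse_advance_positional(SAMPLE)
```

Here identifiers maps a to first and b to second, and each entry in fields carries the record value it belongs to.
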
Field File format for Advance Positional Schema based on Field Length
<RecordIdentifier Value>,<FieldName>,<Description>,<DataType>,[DateFormat],[TimeFormat],<Length>,<Alignment>,<Skip>
For an Advance Positional Schema, the Record Identifier and Value pairs must be specified at the beginning of the CSV file, as shown below:

a:first,b:second
first,a,first_field,string,,,11,L,F
first,name,name_of_employee,string,,,10,L,T
first,empid,employee_ID,int,,,15,L,F
second,b,second_field,string,,,11,L,F
second,DOB,date of birth,date,yyyy/dd/MM,hh:mm:ss,13,L,F
second,Address,Address of employee,string,,,20,L,T

where: L means left alignment
R means right alignment
T means True
F means False
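
The two Advance Positional samples describe the same layout. The Python sketch below shows how field lengths map to start and end positions. It assumes that positions are 1-based and inclusive and that fields follow each other without gaps. Both assumptions agree with the sample values above, but they are inferred from the samples and are not stated as a rule. The function name is hypothetical.

```python
def lengths_to_positions(lengths):
    """Convert consecutive field lengths to (start, end) pairs.

    Assumes 1-based, inclusive positions and no gaps between fields.
    """
    positions = []
    start = 1
    for length in lengths:
        end = start + length - 1
        positions.append((start, end))
        start = end + 1
    return positions

# Lengths of the fields in the "first" record of the length-based sample.
first_record = lengths_to_positions([11, 10, 15])
# first_record == [(1, 11), (12, 21), (22, 36)]
```

The result matches the start and end positions given for the same three fields in the start/end based sample.
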
Field File format for Excel and Text Schema
<Field Name>,<Data Type>,[Date Format],[Time Format]

NAME,string,,
PHONE_NO,number,,
DOB,date,MM/dd/yy,hh:mm:ss
DOJ,date,MM/dd/yy,
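
As an illustration of the rules described above, the following Python sketch reads a field file for an Excel or Text schema. It rejects blank lines and applies the default date format MM/dd/yy when a Date field has no format. It is not Adeptia Suite code. The function name, the dictionary keys, and the choice to raise an exception are assumptions made for this example.

```python
import csv
import io

DEFAULT_DATE_FORMAT = "MM/dd/yy"

# Sample field file copied from the example above.
SAMPLE = """NAME,string,,
PHONE_NO,number,,
DOB,date,MM/dd/yy,hh:mm:ss
DOJ,date,MM/dd/yy,
"""

def parse_field_file(text):
    """Parse <Field Name>,<Data Type>,[Date Format],[Time Format] lines."""
    fields = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            # Extra (blank) lines are treated as an error in this sketch.
            raise ValueError(f"blank line at line {line_no}")
        if len(row) < 2:
            raise ValueError(f"line {line_no}: field name and data type are required")
        # Pad the optional columns so that short rows still unpack.
        name, data_type, date_format, time_format = (row + ["", ""])[:4]
        if data_type.lower() == "date" and not date_format:
            date_format = DEFAULT_DATE_FORMAT
        fields.append({
            "name": name,
            "type": data_type,
            "date_format": date_format,
            "time_format": time_format,  # stays blank when not given
        })
    return fields

fields = parse_field_file(SAMPLE)
```
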

Field File format for Positional Schema
The Field file format for Positional Schema can be of two types:

Based on Start Position and End Position
Based on Field Length

Field File format for Positional Schema based on Start and End Position
<Field Name>,<Description>,<Data Type>,[Date Format],[Time Format],<Start Position>,<End Position>,<Alignment>,<Skip>
The following is the content of a sample CSV file used to create a Positional schema:

name,name of employee,string,,,1,10,L,F
empid,employee ID,int,,,11,30,L,T
DOB,Date of birth,date,yyyy-dd-MM,hh:mm,31,60,L,F

where: L means left alignment
R means right alignment
T means True
F means False
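
To show what start and end positions mean for a line of positional data, the Python sketch below slices one fixed-width line using the three fields from the sample above. It assumes 1-based, inclusive positions. It also assumes that a field whose Skip value is T is left out of the result, which is an interpretation made for this example only. The function name and the sample data line are hypothetical.

```python
# (name, start, end, skip) taken from the sample field file above.
FIELDS = [
    ("name", 1, 10, False),
    ("empid", 11, 30, True),
    ("DOB", 31, 60, False),
]

def parse_line(line, fields=FIELDS):
    """Slice one fixed-width line into a dictionary of field values."""
    record = {}
    for name, start, end, skip in fields:
        if skip:
            continue  # assumption: skipped fields are not emitted
        # Convert 1-based inclusive positions to a Python slice.
        record[name] = line[start - 1:end].strip()
    return record

# Hypothetical data line: 10 + 20 + 30 characters, padded on the right.
line = "John Smith" + "12345".ljust(20) + "1990-25-12".ljust(30)
record = parse_line(line)
# record == {"name": "John Smith", "DOB": "1990-25-12"}
```
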
Field File format for Positional Schema based on Field Length
<Field Name>,<Description>,<Data Type>,[Date Format],[Time Format],<Length>,<Alignment>,<Skip>
The following is the content of a sample CSV file used to create a Positional schema:

name,name of employee,string,,,10,L,F
empid,employee ID,int,,,20,L,T
DOB,Date of birth,date,yyyy-dd-MM,hh:mm,30,L,F

where: L means left alignment
R means right alignment
T means True
F means False

Using XSD File
The XML Schema Definition (XSD) file describes the elements in an XML document. The XSD file that you use to create a schema must be compliant with the Adeptia Suite format. To get an Adeptia Suite compliant XSD file, you can edit any existing schema and download its XSD file. You can also edit the fields in that XSD file and use it to create the schema. For example, suppose you have created a schema of 100 fields and you want to create another schema of only 90 fields based on it. You can download the XSD file of the existing schema, delete the 10 extra fields by editing the XSD file, and use the edited file to create a schema of just 90 fields.

If the schema definition contains characters that do not fall in the default character set encoding, you should define the character set encoding to be used at the schema creation level before uploading the XSD file. For details, refer to the Setting Character Set Encoding While Designing Schema section.

Entering Fields Sequentially
This is a manual way of creating a schema. If you select this option, you have to manually enter each field's name and data type in the correct sequence.
When you create a schema (except an XML schema), the schema automatically creates a Record Number attribute at the record level. It is available for each record. If you use a schema at the source level, the schema populates this attribute in the intermediate XML file at the record level. The Record Number attribute always starts at 1. If the schema detects an error, it writes this attribute to the Error File. For example, if an error is found at record number 5 in the source file, the Error File will display Record Number 5.

Error Records
When a process flow is executed, some of the records in the source file may not conform to the schema definition. The schema treats them as error records when it parses the source data. Consider an Excel schema whose field format is shown in Figure 209:

Figure 209: Excel Schema Example

The corresponding data file for the Excel schema is shown in Figure 210:

Figure 210: Data File Example

Note that in the schema definition, the data type of the Account_Number field is Number. However, in the source data file, there are two records where the Account_Number field contains a string value. When the schema parses this file, the data in these two records will not match the schema definition, so these two records will be treated as error records.
Similarly, a schema can also generate error records when you use it at the target side.
You may want to handle these error records according to your requirements. You can do this while creating the Process Flow in which you use this schema.
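
The idea of error records can be sketched in Python as follows. This is not how Adeptia Suite implements parsing. It only illustrates separating records whose Account_Number is not numeric, and numbering records from 1 in the same way as the Record Number attribute. The function name, the Name field, and the sample values are hypothetical.

```python
def split_records(rows, numeric_field="Account_Number"):
    """Separate valid records from error records.

    Each result entry is (record_number, row); numbering starts at 1.
    """
    valid, errors = [], []
    for record_number, row in enumerate(rows, start=1):
        value = row.get(numeric_field, "")
        try:
            float(value)  # the field is defined as Number in the schema
        except (TypeError, ValueError):
            errors.append((record_number, row))
        else:
            valid.append((record_number, row))
    return valid, errors

# Hypothetical source data: records 2 and 4 hold a string value.
rows = [
    {"Name": "A", "Account_Number": "1001"},
    {"Name": "B", "Account_Number": "abc"},
    {"Name": "C", "Account_Number": "1003"},
    {"Name": "D", "Account_Number": "N/A"},
]
valid, errors = split_records(rows)
error_numbers = [number for number, _ in errors]
# error_numbers == [2, 4]
```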

To learn how to handle these error records, refer to the Handling Error Records section.