Comparison

Information about comparing documents with the database

Documents can be used to compare a file to entities in the database. This is configured by document definitions.

Prerequisites

To fully understand this documentation, you need to have prior knowledge about definitions in general.

Β 

Introduction

To configure a comparison you need 2 document definitions:

  • A definition which creates the comparison file as a source
  • A definition which creates the comparison report as a target
Β 

Following steps are executed when comparing a input file to the database:

  1. The user creates a document with the definition for the comparison file
      • In the background a comparison service is started and waits for the file to be uploaded
  1. The user uploads a file (blob) to the document
      • In the background the comparison service is going to execute query requests and process the file according to the configuration stored on the document
      • When all data is queried, it is compared to the comparison file and mismatches are determined
  1. The comparison service creates a document with the definition for the comparison report
  1. The comparison services uploads the file containing the mismatches
  1. The comparison report contains 3 types of mismatches:
      • IN_FILE_NOT_DB
      • IN_DB_NOT_FILE
      • MISMATCH_ATTRIBUTES
Β 

Comparison File Definition

This document definition defines how the comparison file document is created.

The definition contains following properties.

Property
Description
inputDefinitions
Contains the required inputs when creating the document from a definition
document
Contains the template for creating the document, using the inputs as variables

Until now it is the same as any other definition, but for comparison there is more configuration required on the document template.

Β 

Document configuration

The document template contains following additional properties in order to configure comparison.

Β 
Property
Description
type
Must be COMPARISON_FILE
extensions
If the encoding of the file is not utf-8, it need to be specified as a keyword If the input file is a csv, some extensions need to be specified how the csv is structured (separator, quoted, headers). This is mainly for visualisation/preview purposes in the screen
processActions
Contains an api request to the document comparison service with an identification of the document
data
Firstly, it contains the query request for the database and mapping of database entries. Secondly, it contains the mapping of the input file. Thirdly, it contains configuration for the report to be created.

Extensions

Example extensions
"extensions": {
	"keywords": {
		"csvSeparator": ",",
		"encoding": "ISO-8859-1"
	},
	"booleans": {
		"csvQuoted": true,
		"csvHeaders": false
	}
}
Β 

Process Actions

Example process actions
"processActions": {
	"apiRequest": {
		"api": "documentcomparison",
		"method": "POST",
		"body": {
			"identification": {
				"id": "{{response.identification.id}}"
			}
		}
	}
}
Β 

Data

The data part contains the β€˜logic’ for the comparison. The

Property
Description
api
required Contains the api of the entity to be compared.
database.request
required Contains the request to be send. Pagination and sorting are important here, because the request will be send in iterations. To optimise queries, you can provide metadata with a custom nuql query to retrieve the data. But the regular request properties are also possible.
database.mappingtype
The type of mapping to use. For now handlebars is supported.
database.mapping
required Contains the mapping of the request response (dbEntries) into a flat list.
inputfile.skiprow
Determines which rows of the csv file to skip.
inputfile.mappingtype
The type of mapping to use. For handlebars is supported.
inputfile.mapping
required Contains the mapping of the input file (rows) into a flat list.
report.definition
required Contains the definition to be used to create the comparison report.
report.mimeType
required Defines the mime type of the report file. Possible values: text/csv application/json
report.text/csv
required Contains additional properties to configure the csv (separator, quoted, headers).

It is very important to keep the output mapping of the database and the input file synchronised. Because both need to produce comparable output.

Also the mapping requires a key property, this is used to do the actual comparison.

  • If a key is present in the input file but not in the database β†’ mismatch type IN_FILE_NOT_DB is reported
  • If a key is present in the database but not in the input file β†’ mismatch type IN_DB_NOT_IN_FILE is reported
  • If a key is present in both the database and in the input file, all properties in the file are compared
    • If then the attributes are not matching β†’ mismatch type MISMATCH_ATTRIBUTES is reported
    • If all attributes are matching, no mismatch is reported.
Β 
Example data configuration
"data": {
  "api": "accesspoints",
  "database": {
    "request": {
      "metadata": {
        "nuql": "AccessPoint:?accesspoint link:isDefinedIn #IndexSet:?indexSet. AccessPoint:?accesspoint attribute:accessPointRole ['accessPointRole#METERING_POINT']. AccessPoint:?accesspoint attribute:externalId value:any.  AccessPoint:?accesspoint entity:type'AccessPoint'. %%sorting%% %%pagination%% Party:?filterPartyAccessPointRelation01Party attribute:externalId ['{{marketPartyExternalId}}']; link:isDefinedIn ?indexSet.  PartyAccessPointRelation:?filterPartyAccessPointRelation01 link:hasTarget AccessPoint:?accesspoint. PartyAccessPointRelation:?filterPartyAccessPointRelation01 link:hasSource #Party:?filterPartyAccessPointRelation01Party. PartyAccessPointRelation:?filterPartyAccessPointRelation01 attribute:partyRole ['partyRole#{{marketPartyRole.id}}']. PartyAccessPointRelation:?filterPartyAccessPointRelation01 attribute:fromDate lte:'{{referenceDate}}'. PartyAccessPointRelation:?filterPartyAccessPointRelation01 attribute:toDate gt:'{{referenceDate}}'. PartyAccessPointRelation:?filterPartyAccessPointRelation01 attribute:active 1. PartyAccessPointRelation:?filterPartyAccessPointRelation01 if:exists reset. Party:?filterPartyAccessPointRelation01Party if:exists reset. AccessPointDetail:?accesspointDetail link:isAssignedTo #AccessPoint:?accesspoint. AccessPointDetail:?accesspointDetail attribute:fromDate lte:'{{referenceDate}}'. AccessPointDetail:?accesspointDetail attribute:toDate gt:'{{referenceDate}}'. AccessPointDetail:?accesspointDetail attribute:active 1. AccessPointDetail:?accesspointDetail limit 20000. AccessPointDetail:?accesspointDetail offset 0. PartyAccessPointRelation:?accesspointPartyAccessPointRelation link:hasTarget #AccessPoint:?accesspoint. PartyAccessPointRelation:?accesspointPartyAccessPointRelation limit 400. PartyAccessPointRelation:?accesspointPartyAccessPointRelation offset 0. PartyAccessPointRelation:?accesspointPartyAccessPointRelation link:hasSource Party:?accesspointPartyAccessPointRelationParty. PartyAccessPointRelation:?accesspointPartyAccessPointRelation attribute:toDate gt:'{{referenceDate}}'. PartyAccessPointRelation:?accesspointPartyAccessPointRelation attribute:active 1. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget link:hasSource #AccessPoint:?accesspoint. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget limit 400. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget offset 0. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget link:hasTarget AccessPoint:?accesspointAccessPointAccessPointRelationTargetAccessPoint. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget attribute:toDate gt:'{{referenceDate}}'. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget attribute:active 1. AccessPointAccessPointRelation:?accesspointAccessPointAccessPointRelationTarget attribute:accessPointRole 'accessPointRole#GRID_AREA'."
      },
      "pagination": {
        "offset": 0,
        "limit": 1000
      },
      "sorting": {
        "order": [
          "externalId+"
        ]
      }
    },
    "mappingtype": "handlebars",
    "mapping": "[<%#each dbEntries%>    {   \"key\": \"<%identification.externalId%>\", \"eanId\": \"<%identification.externalId%>\",   \"productType\": \"<%extensions.tenantCodes.edsnProductType.0%>\",    \"gridArea\": \"<%#each targets%><%#eq role 'GRID_AREA'%><%accessPoint.externalId%><%/eq%><%/each%>\", \"gridOperatorId\": \"<%#each relatedParties%><%#eq targetRole 'GRID_OPERATOR'%><%party.externalId%><%/eq%><%/each%>\", \"meteringResponsibleId\": \"<%#each relatedParties%><%#eq targetRole 'METERING_RESPONSIBLE'%><%party.externalId%><%/eq%><%/each%>\",   \"balanceSupplierId\": \"<%#each relatedParties%><%#eq targetRole 'BALANCE_SUPPLIER'%><%party.externalId%><%/eq%><%/each%>\",   \"balanceResponsibleId\": \"<%#each relatedParties%><%#eq targetRole 'BALANCE_RESPONSIBLE'%><%party.externalId%><%/eq%><%/each%>\", \"allocationMethod\": \"<%details.0.extensions.tenantCodes.edsnAllocationMethod.0%>\",    \"energyDeliveryStatus\": \"<%details.0.extensions.tenantCodes.edsnEnergyDeliveryStatus.0%>\",    \"energyFlowDirection\": \"<%details.0.extensions.tenantCodes.edsnEnergyFlowDirection.0%>\",  \"marketSegment\": \"<%details.0.extensions.tenantCodes.edsnMarketSegment.0%>\",  \"meteringMethod\": \"<%details.0.extensions.tenantCodes.edsnMeteringMethod.0%>\", \"physicalCapacity\": \"<%details.0.extensions.tenantCodes.edsnPhysicalCapacity.0%>\",\"physicalStatus\": \"<%details.0.extensions.tenantCodes.edsnPhysicalStatus.0%>\",    \"profileCategory\": \"<%details.0.extensions.tenantCodes.edsnProfileCategory.0%>\", \"contractCapacity\": \"<%details.0.extensions.keywords.edsnContractCapacity.0%>\",   \"eaEnergyConsumptionNettedPeak\": \"<%details.0.extensions.keywords.edsnEaEnergyConsumptionNettedPeak.0%>\", \"eaEnergyConsumptionNettedOffPeak\": \"<%details.0.extensions.keywords.edsnEaEnergyConsumptionNettedOffPeak.0%>\",   \"eaEnergyProductionNettedPeak\": \"<%details.0.extensions.keywords.edsnEaEnergyProductionNettedPeak.0%>\",   \"eaEnergyProductionNettedOffPeak\": \"<%details.0.extensions.keywords.edsnEaEnergyProductionNettedOffPeak.0%>\"  }<%#unless @last%>,<%/unless%>  <%/each%>]"
  },
  "inputfile": {
    "skiprow": [
      0
    ],
    "mappingtype": "handlebars",
    "mapping": "[<%#each rows as |row|%>{\"key\": \"<%row.0%>\",\"eanId\": \"<%row.0%>\", \"productType\": \"<%row.18%>\", \"gridArea\": \"<%row.5%>\", \"gridOperatorId\": \"<%row.1%>\", \"meteringResponsibleId\": \"<%row.3%>\", \"balanceSupplierId\": \"<%row.2%>\", \"balanceResponsibleId\": \"<%row.4%>\", \"allocationMethod\": \"<%row.39%>\", \"energyDeliveryStatus\": \"<%row.21%>\", \"energyFlowDirection\": \"<%row.20%>\", \"marketSegment\": \"<%row.19%>\", \"meteringMethod\": \"<%row.24%>\", \"physicalCapacity\": \"<%row.28%>\", \"physicalStatus\": \"<%row.22%>\", \"profileCategory\": \"<%row.31%>\", \"contractedCapacity\": \"<%row.27%>\", \"eaEnergyConsumptionNettedPeak\": \"<%row.33%>\", \"eaEnergyConsumptionNettedOffPeak\": \"<%row.32%>\", \"eaEnergyProductionNettedPeak\": \"<%row.35%>\", \"eaEnergyProductionNettedOffPeak\": \"<%row.34%>\" }<%#unless @last%>,<%/unless%><%/each%>]"
  },
  "report": {
    "definition": {
      "externalId": "NL_CAR_METERING_POINT_EXTRACT_REPORT_CSV"
    },
    "mimeType": "text/csv",
    "text/csv": {
      "separator": ";",
      "quoted": true,
      "headers": true
    }
  }
}
Β 

Optimise queries with custom nuQL

In order to optimise queries custom nuQL queries can be written to get only the response required and also to order the nuQL statements in specific order to improve query performance for your use case.

A couple of things are important to keep in mind for query optimisation:

  1. Limited the nuQL query as much as possible, only query the data which is really necessary.
      • For example: only retrieve entities where attribute externalId has a value by using this statement: attribute:externalId value:any.
      • For example: limiting number of relations by filtering on fromDate, toDate and roles.
      • β€˜Throw away’ temporary filter variables which are not necessary for the response, once the constraint is applied by using this statement: if:exists reset. This prevents to keep unnecessary data all the time in memory.
  1. The variables used in the nuQL query need to comply with a convention to be able to translate the result in an API response.
      • You can execute a β€˜regular’ query request on the API with the postfix /nuql in the URL to see the generated nuQL with variables:
      • For example: /accesspoints/_query/nuql
  1. The sorting and pagination are included as dynamic parameters such that requests can be iterated per page.
      • Using this syntax: %%sorting%% %%pagination%%
Β 

Comparison Report Definition

This document definition defines how the comparison report document is created. It does not contain any advanced logic, like the comparison file.

The definition contains following properties.

Property
Description
inputDefinitions
Contains the required inputs when creating the document from a definition
document
Contains the template for creating the document, using the inputs as variables
Β 

Input Definitions

For the report document there is a required set of input definitions which need to be available to ensure the comparison service can create the report document with the necessary data.

Input
Type
Description
comparisonFileDocument
entity (entityType Document)
This input provides a reference to the source document which produced this report
countInFileNotInDb
number
This input represents the number of mismatches with type IN_FILE_NOT_IN_DB
countInDbNotInFile
number
This input represents the number of mismatches with type IN_DB_NOT_IN_FILE
countMismatchAttributes
number
This input represents the number of mismatches with type MISMATCH_ATTRIBUTEES
Β 
Required input definitions for comparison report
"inputDefinitions": [
    {
      "id": "comparisonFileDocument",
      "label": "Source for comparison report",
      "type": "entity",
      "entityType": "Document",
      "validations": [
        {
          "name": "required",
          "config": "true"
        },
        {
          "name": "min",
          "config": "1"
        },
        {
          "name": "max",
          "config": "1"
        },
        {
          "name": "filter",
          "config": {
            "filter": {
              "attributes": [
                {
                  "attribute": "type",
                  "operator": "equals",
                  "values": [
                    "COMPARISON_FILE"
                  ]
                }
              ]
            }
          }
        }
      ]
    },
    {
      "id": "countInFileNotInDb",
      "label": "Count of AP's in file but not in DB",
      "type": "number",
      "validations": [
        {
          "name": "required",
          "config": "true"
        }
      ]
    },
    {
      "id": "countInDbNotInFile",
      "label": "Count of AP's in DB but not in file",
      "type": "number",
      "validations": [
        {
          "name": "required",
          "config": "true"
        }
      ]
    },
    {
      "id": "countMismatchAttributes",
      "label": "Count of AP's with attribute mismatches",
      "type": "number",
      "validations": [
        {
          "name": "required",
          "config": "true"
        }
      ]
    }
  ]
Β 

Document

The document part contains the template for creating the report document. This is freely configurable. Below an example which uses the input definitions.

  • The inputs with number of mismatches per type are stored as extensions
  • Some CSV extensions are stored, such that the attached file can be previewed
  • The comparison file document input is related as a source to this report document
  • And when the report document is successfully created, a business event will be sent containing the counts and a reference to the report document
Β 
Example document template
"document": {
    "idGeneration": [
      "publicId"
    ],
    "identification": {
      "externalId": "{{comparisonFileDocument.0.identification.externalId}}_REPORT"
    },
    "name": "NL C-AR Metering Point Extract Report (csv)",
    "type": "COMPARISON_REPORT",
    "roles": [
      "TARGET"
    ],
    "fileTimestamp": "{{now}}",
    "extensions": {
      "values": {
        "countInFileNotInDb": "{{countInFileNotInDb}}",
        "countInDbNotInFile": "{{countInDbNotInFile}}",
        "countMismatchAttributes": "{{countMismatchAttributes}}"
      },
      "keywords": {
        "csvSeparator": ";"
      },
      "booleans": {
        "csvQuoted": true,
        "csvHeaders": true
      }
    },
    "sources": [
      {
        "document": {
          "id": "{{comparisonFileDocument.0.identification.id}}"
        },
        "role": "SOURCE",
        "fromDate": "{{today}}"
      }
    ],
    "processActions": {
      "businessEvent": [
        {
          "order": {
            "businessKey": "{{comparisonFileDocument.0.identification.publicId}}"
          },
          "name": "NL_CAR_METERING_POINT_EXTRACT_REPORT_MESSAGE",
          "type": "message",
          "correlationRequired": false,
          "variables": {
            "countInFileNotInDb": "{{countInFileNotInDb}}",
            "countInDbNotInFile": "{{countInDbNotInFile}}",
            "countMismatchAttributes": "{{countMismatchAttributes}}",
            "reportDocument": "{{response.identification.id}}"
          }
        }
      ]
    }
  }
Β 
Did this answer your question?
😞
😐
🀩

Last updated on January 5, 2023