After you create your EDM schema in the Purview portal by following this documentation, you can also watch this recorded session about Sensitive Information Types (SITs), custom SITs, and Exact Data Match, where all the steps are explained in detail.
About 30 minutes after the schema is created, it becomes available through the EDM Upload Agent (this link downloads the version for commercial tenants; all the clients can be found in this link). Several tasks must then be executed, as shown in the next image.
EDM Post tasks overview
The related tasks are:
- Create the EDM schema in the Purview portal.
- Connect and validate the connection using EdmUploadAgent.exe /Authorize (your user needs to be a member of the EDM_DataUploaders security group).
- Request the schema file using the datastore name collected in the first step.
- Export your database to a file in CSV (comma-separated), pipe ('|'), or tab ('{Tab}') format.
- Make sure that file is accessible from the computer where the EDM Upload Agent is installed.
- Validate the file from step 4 against the schema file generated in step 3.
- Create a hash from the data file generated in step 4.
- Choose one of two options:
  - Upload the data directly to your Microsoft 365 tenant, or
  - Copy the hash to a remote server so it can be uploaded from another computer without access to the original data.
- Upload your data from the remote server.
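Before handing the exported file to the agent, it can help to check locally that every row has the same number of columns as the header. The following is a minimal Python sketch and is not part of EDM Upload Agent or EDM PT. The function name `bad_line_percentage` and the sample data are hypothetical, and the sketch assumes that a "bad line" is simply a row whose column count differs from the header's.

```python
import csv
import io

# Separators named in the export step: comma, pipe, or tab.
SEPARATORS = {"comma": ",", "pipe": "|", "tab": "\t"}


def bad_line_percentage(text, separator):
    """Return the percentage of data rows whose column count differs from the header's."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if len(rows) < 2:
        return 0.0  # header only (or empty input): nothing to check
    expected = len(rows[0])
    data_rows = rows[1:]
    bad = sum(1 for row in data_rows if len(row) != expected)
    return 100.0 * bad / len(data_rows)


# Hypothetical pipe-separated export: the second data row is missing a column.
sample = "SSN|FirstName|LastName\n111-22-3333|Ana|Diaz\n222-33-4444|Luis\n"
pct = bad_line_percentage(sample, SEPARATORS["pipe"])
print(f"{pct:.1f}% bad lines")
```

A result above the tolerance you are willing to accept suggests fixing the export before validating it against the schema.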
The EDM Upload Agent is normally installed at C:\Program Files\Microsoft\EdmUploadAgent. When we run it, it lists several commands, and those commands let us perform all the tasks described above.
EDM Upload Agent
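To make the sequence concrete, the sketch below builds one command line per task without running anything. Only `/Authorize` is named in this text; every other switch (`/GetDataStore`, `/SaveSchema`, `/ValidateData`, `/CreateHash`, `/UploadHash`, `/GetSession`) is an assumption based on the agent's public documentation and should be confirmed against the command list the agent prints. The function name `edm_commands`, the datastore name, and all paths are hypothetical.

```python
# Install path taken from the text; adjust if the agent lives elsewhere.
AGENT = r"C:\Program Files\Microsoft\EdmUploadAgent\EdmUploadAgent.exe"


def edm_commands(datastore, data_file, schema_dir, schema_file, hash_dir, hash_file):
    """Build (but do not run) one argument list per task, in task order.

    Switch names other than /Authorize are assumptions; verify them
    against the help output of your installed agent version.
    """
    return [
        [AGENT, "/Authorize"],                                   # connect and validate
        [AGENT, "/GetDataStore"],                                # list datastore names
        [AGENT, "/SaveSchema", "/DataStoreName", datastore,
         "/OutputDir", schema_dir],                              # request the schema file
        [AGENT, "/ValidateData", "/DataFile", data_file,
         "/Schema", schema_file],                                # validate data vs. schema
        [AGENT, "/CreateHash", "/DataFile", data_file,
         "/HashLocation", hash_dir, "/Schema", schema_file],     # hash the data file
        [AGENT, "/UploadHash", "/DataStoreName", datastore,
         "/HashFile", hash_file],                                # upload the hash
        [AGENT, "/GetSession", "/DataStoreName", datastore],     # check upload progress
    ]


# Hypothetical names and paths, for illustration only.
commands = edm_commands(
    datastore="PatientRecords",
    data_file=r"C:\Edm\Data\PatientRecords.csv",
    schema_dir=r"C:\Edm\Data",
    schema_file=r"C:\Edm\Data\PatientRecords.xml",
    hash_dir=r"C:\Edm\Hash",
    hash_file=r"C:\Edm\Hash\PatientRecords.EdmHash",
)
for cmd in commands:
    print(" ".join(cmd))
```

Keeping each command as an argument list rather than one string makes it straightforward to pass to a process launcher later without quoting problems around the space in "Program Files".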
Using this application can be a real challenge, even more so if we want to automate these tasks. The EDM PT solution simplifies all of them.
With EDM PT we can cover all the tasks that appear in the image "EDM Post tasks overview". These tasks are:
- Set up a configuration file with all the information needed to run EDM PT, including the credentials used to automate this activity.
- Encrypt the credentials.
- Sign the scripts. Some organizations do not allow "Set-ExecutionPolicy" to be set to "Bypass" and require at least "RemoteSigned". This option lets you reuse an existing code-signing certificate or create a new self-signed one.
- Connect to EDM to validate the connection.
- Set other variables such as "AllowedBadLinesPercentage" or "ColumnSeparator"; the latter identifies whether your data is separated by comma, pipe, or tab.
- Get the names of all the datastores available in your tenant.
- Get the schema file associated with the selected datastore.
- Set the location of your original data and validate that it matches the schema created in the Purview portal.
- Create the hash file for that data.
- Upload the hash file to your Microsoft 365 tenant.
- Check the progress status of the upload.
- Copy the required data to a remote server so that the upload can run from a remote machine.
- On the remote machine we can:
  - Review the configuration and change it if required.
  - Change the credentials used.
  - Encrypt the password.
  - Sign the scripts.
  - Upload the hash file to your Microsoft 365 tenant.
  - Check the progress status of the upload process.
- The tasks that can be automated with Task Scheduler are:
  - Create the hash file if the data file is new.
  - Upload the hash file if it is new.
  - Copy the hash file to the remote server if it is new.
  - Upload the hash file from the remote server if it is new.
(EDM PT compares the Last Write Time of each file to determine whether it is working with a new file. This check exists because an EDM refresh can be done only 5 times per day.)
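The "new file" check described above can be sketched as follows. This is a minimal Python illustration and not the EDM PT implementation: the `state` dictionary, the function names, and the way the daily counter is stored are all hypothetical. It assumes a file counts as new when its last write time is later than the one recorded at the previous run, and it uses the limit of 5 refreshes per day stated in the text.

```python
import os
import tempfile
import time

MAX_REFRESHES_PER_DAY = 5  # limit stated in the text


def is_new_file(path, state):
    """True when the file's last write time is later than the one recorded in state."""
    return os.path.getmtime(path) > state.get("last_write_time", 0.0)


def try_refresh(path, state, today):
    """Record a refresh if the file is new and today's quota is not used up."""
    if state.get("day") != today:
        state["day"], state["count"] = today, 0  # new day: reset the counter
    if not is_new_file(path, state) or state["count"] >= MAX_REFRESHES_PER_DAY:
        return False
    # A real run would create or upload the hash file here.
    state["last_write_time"] = os.path.getmtime(path)
    state["count"] += 1
    return True


with tempfile.TemporaryDirectory() as tmp:
    data_file = os.path.join(tmp, "data.csv")
    with open(data_file, "w") as f:
        f.write("a|b\n1|2\n")
    state = {}
    first = try_refresh(data_file, state, "2024-01-01")   # file never seen before
    second = try_refresh(data_file, state, "2024-01-01")  # unchanged since last run
    later = time.time() + 60
    os.utime(data_file, (later, later))                   # simulate a fresh export
    third = try_refresh(data_file, state, "2024-01-01")
    print(first, second, third)
```

Skipping unchanged files this way keeps a scheduled task from spending the daily refresh quota on data that has not changed.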