A classification entity type is a custom classification model that is plugged into the system. DryvIQ has the following pre-installed classification entity types:

 Document Type

The Document Type Classifier is a built-in entity type. It is trained to identify and classify a document as one of approximately 100 different document types. You can upload individual documents to the classifier, and it will identify the document type and provide a confidence score. You can also include this entity type in the policies you create to identify the document types in the content of a data source.

Some documents are categorized in document groups. When a document matches one of these document types, the group name will be displayed rather than the individual document type.
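The group behavior described above can be pictured as a simple lookup. The following is a minimal sketch only: the mapping, the group name, and the `display_label` function are hypothetical and are not part of DryvIQ; the actual group membership is defined by the product.

```python
# Hypothetical mapping of individual document types to a document group.
# These names are illustrative only; they are not DryvIQ's real groups.
DOCUMENT_GROUPS = {
    "Invoice": "Financial Documents",
    "Purchase Order": "Financial Documents",
}


def display_label(document_type: str) -> str:
    """Return the group name if the type belongs to a group, else the type itself."""
    return DOCUMENT_GROUPS.get(document_type, document_type)
```

With this sketch, `display_label("Invoice")` yields the group name, while a type that is not in any group is shown unchanged.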

 File Name

The File Name Classifier is a built-in entity type. It is trained to identify and classify a document as one of over 300 different document types based on the file name. You can upload individual documents to the classifier, and it will identify the document type and provide a confidence score. You also have the option of including this entity type in the policies you create to identify documents based on file names.

 Language Detection

The Language Detection classifier identifies the language of a document. The built-in module currently detects over 150 languages. When you upload a file to the module, DryvIQ also reports the confidence level for the detected language. You can also include this entity type in the policies you create to identify the languages in the content of a data source.

 PII Extraction Module

The Personally Identifiable Information (PII) Extraction Module is a pre-trained artificial intelligence (AI) model that can reliably identify and extract PII elements contained in unstructured data. You can include this entity type in the policies you create to identify PII in the content of a data source and specify rules to classify files based on the results. For example, if a Person Name or Address is found in a file in a “public” folder, the file can be classified as “Restricted.” You can also upload individual documents to the classifier, and it will identify any PII found in them and provide a confidence score.
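The rule in the example above can be sketched in a few lines. This is an illustration of the logic only, not DryvIQ's policy engine or API: the `PiiFinding` type, the `classify_file` function, and the `min_confidence` threshold of 0.8 are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PiiFinding:
    """Hypothetical shape of one PII result: entity name plus confidence score."""
    entity: str        # for example "Person Name" or "Address"
    confidence: float  # assumed to range from 0.0 to 1.0


def classify_file(folder: str, findings: List[PiiFinding],
                  min_confidence: float = 0.8) -> Optional[str]:
    """Return "Restricted" when a name or address is found in a public folder."""
    trigger_entities = {"Person Name", "Address"}
    if folder.lower() != "public":
        return None  # the rule only applies to the "public" folder
    for finding in findings:
        if finding.entity in trigger_entities and finding.confidence >= min_confidence:
            return "Restricted"
    return None  # no qualifying PII found, so no classification is assigned
```

In the product itself, rules like this are configured in a policy rather than written as code; the sketch only shows how the folder condition and the PII result combine.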

 Sensitive Object Detection

Sensitive Object Detection is a built-in entity type. It is trained to identify and classify images of sensitive data, such as identification cards, fingerprints, and license plates. You can upload individual documents to the classifier, and it will identify any images of sensitive information. If an image contains multiple sensitive objects, all items will be identified. For example, if the document contains an image of a driver’s license, the scan will identify both the ID card and signature as detected sensitive objects. You can also include this entity type in the policies you create to identify sensitive objects in the content of a data source.

Only the following image types will be scanned:

  • BM

  • BMP

  • GIF

  • ICB

  • JFIF

  • JPEG

  • JPG

  • PBM

  • PDF

  • PNG

  • TGA

  • TIFF (See note below.)

  • VDA

  • VST

  • WEBP

A TIFF is a complex image file made up of multiple parts, so not every TIFF file can be scanned successfully. If a TIFF file is found but can't be scanned, an error is logged identifying why it couldn't be scanned.
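If you want to estimate ahead of time which files are eligible for this scan, a check against the list above might look like the following. It is a sketch under the assumption that eligibility can be judged from the file extension, compared case-insensitively; DryvIQ may determine the image type differently, and `is_scannable` is a hypothetical helper, not a product function.

```python
from pathlib import Path

# Extensions taken from the supported image type list above.
SCANNABLE_EXTENSIONS = {
    "bm", "bmp", "gif", "icb", "jfif", "jpeg", "jpg", "pbm",
    "pdf", "png", "tga", "tiff", "vda", "vst", "webp",
}


def is_scannable(file_name: str) -> bool:
    """Return True if the file extension is in the supported image type list."""
    extension = Path(file_name).suffix.lower().lstrip(".")
    return extension in SCANNABLE_EXTENSIONS
```

A TIFF file that passes this check can still fail to scan for the reasons described above.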

 Form Matcher

The built-in DryvIQ Form Matcher currently supports over 5,000 government forms and other commonly used organization forms. The Form Matcher matches a “query” document to an indexed document: it compares the query document against all indexed documents and returns the indexed document with the highest similarity score. When you upload a file to the matcher, DryvIQ also reports the confidence level for the matched form. You can also include this entity type in the policies you create to identify forms in the content of a data source.
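The selection step described above, returning the indexed document with the highest similarity score, can be sketched as follows. How DryvIQ computes similarity is not covered here; the sketch assumes the scores are already available as a dictionary, and `best_match` and the form names are hypothetical.

```python
from typing import Dict, Optional, Tuple


def best_match(scores: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (form name, score) for the highest-scoring indexed form, or None if empty."""
    if not scores:
        return None
    form_name = max(scores, key=scores.get)  # key with the largest similarity score
    return form_name, scores[form_name]


# Illustrative scores for one query document against three indexed forms.
example_scores = {"Form A": 0.42, "Form B": 0.91, "Form C": 0.77}
match = best_match(example_scores)
```

Here `match` holds the name and score of the single best candidate, which mirrors the one matched form and confidence level that the matcher reports.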

Refer to Uploading Samples to learn how to upload individual files for analysis against any entity type.
