public class Classification
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
java.util.Map<java.lang.String,Category> | categories - Dictionary for all categories and their parents; their name is used as the key in searching. |
java.util.Map<java.lang.String,java.lang.String> | categoryUUIDs - Auxiliary dictionary for searching the artificial UUID (value) of a given original category identifier (key). |
private boolean | classifyByName |
private Configuration | currentConfig |
private char | indent |
private static int | MAX_LEVELS |
(package private) Assistant | myAssistant |
(package private) Converter | myConverter |
(package private) EmbeddedClassifier | myEmbeddedClassifier |
private int | numTiers |
private java.lang.String | outputFile |
private java.lang.String | splitter |
Constructor and Description |
---|
Classification(Configuration config, java.lang.String classFile, java.lang.String outFile) - Constructor of the classification hierarchy representation to be used in transformation. |
Classification(java.lang.String classFile, boolean classifyFlag) - Constructor of the classification hierarchy representation used for testing its validity. |
Modifier and Type | Method and Description |
---|---|
int | countCategories() - Returns the number of categories in the dictionary representation of the classification scheme. |
int | countTiers() - Returns the depth, i.e., the maximum number of levels in any path of the given multi-tier classification hierarchy. |
private void | executeParser4Graph() - Parses each item in the classification hierarchy and streamlines the resulting triples according to the given YML mapping. |
private void | executeParser4RML() - Parses each item in the classification hierarchy and streamlines the resulting triples according to the given RML mapping. |
private void | executeParser4Stream() - Parses each item in the classification hierarchy and streamlines the resulting triples according to the given YML mapping. |
java.lang.String | findUUID(java.lang.String categoryId) - For a given category identifier in the classification scheme, finds its respective UUID. |
java.lang.String | getEmbeddedCategory(java.lang.String categoryName) - For a given category name, identifies its respective category in the embedded (default) classification scheme. |
java.lang.String | getUUID(java.lang.String categoryName) - For a given category name, identifies its respective UUID in the classification scheme. |
void | parseCSVFile(java.lang.String classificationFile) - Parses the input CSV file containing the classification hierarchy. |
void | parseYMLFile(java.lang.String classificationFile) - Parses the input YML file containing the classification hierarchy. |
private void | printDescendants(java.lang.String parent_id, int level) - Given a parent identifier, recursively finds all its descendants and prints them in a YML-like fashion. |
void | printHierarchyYML() - Prints the entire classification scheme representation to the standard output in a YML-like fashion. |
Category | searchById(java.lang.String categoryId) - For a given category identifier, finds the respective entry in the classification scheme. |
Category | searchByName(java.lang.String categoryName) - For a given category name, finds the respective entry in the classification scheme. |
Converter myConverter
Assistant myAssistant
EmbeddedClassifier myEmbeddedClassifier
private java.lang.String outputFile
private Configuration currentConfig
private boolean classifyByName
private int numTiers
private static final int MAX_LEVELS
private java.lang.String splitter
private char indent
public java.util.Map<java.lang.String,Category> categories
public java.util.Map<java.lang.String,java.lang.String> categoryUUIDs
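The two public dictionaries complement each other: `categories` is keyed by category name, while `categoryUUIDs` maps an original identifier to its artificial UUID. The following is only a minimal sketch of how such a pair of maps could be filled and queried. `CategoryStub`, `register`, and `uuidForName` are hypothetical names introduced for illustration, and deriving a name-based UUID from the identifier is an assumption, since this page does not say how the UUIDs are generated.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class DictionarySketch {

    // Hypothetical stand-in for the real Category class, whose fields are not documented here.
    static final class CategoryStub {
        final String id;
        final String name;
        final String parentId;   // null for a top-tier category

        CategoryStub(String id, String name, String parentId) {
            this.id = id;
            this.name = name;
            this.parentId = parentId;
        }
    }

    // Mirrors `categories`: category name -> entry.
    static final Map<String, CategoryStub> categories = new HashMap<>();
    // Mirrors `categoryUUIDs`: original identifier -> artificial UUID.
    static final Map<String, String> categoryUUIDs = new HashMap<>();

    // Registers a category in both dictionaries.
    // ASSUMPTION: the UUID is name-based and derived from the identifier; the real scheme may differ.
    static void register(String id, String name, String parentId) {
        categories.put(name, new CategoryStub(id, name, parentId));
        String uuid = UUID.nameUUIDFromBytes(id.getBytes(StandardCharsets.UTF_8)).toString();
        categoryUUIDs.put(id, uuid);
    }

    // Resolves a name to a UUID through both maps; returns null if the name is unknown.
    static String uuidForName(String name) {
        CategoryStub c = categories.get(name);
        return (c == null) ? null : categoryUUIDs.get(c.id);
    }

    public static void main(String[] args) {
        register("1", "Food", null);
        register("103", "Restaurant", "1");
        System.out.println(uuidForName("Restaurant"));
        System.out.println(uuidForName("Museum"));   // unknown name
    }
}
```

Because names are the keys of the first map, two categories sharing a name would overwrite each other, which is why the method descriptions insist on unique names across all levels.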
public Classification(java.lang.String classFile, boolean classifyFlag)
classFile
- Input file (CSV or YML) containing the user-specified classification scheme.
classifyFlag
- Boolean value: true if the actual name of the category is used in the hierarchy; false if identifiers are used instead.
public Classification(Configuration config, java.lang.String classFile, java.lang.String outFile)
config
- User-specified configuration for the transformation process.
classFile
- Input file (CSV or YML) containing the user-specified classification scheme.
outFile
- Output file containing the RDF triples resulting from the transformation of the classification scheme.
public void parseYMLFile(java.lang.String classificationFile)
classificationFile
- Path to YML file specifying the classification scheme.
ASSUMPTION: Each line in the YML file corresponds to a category; levels are marked with a number of indentation characters at the beginning of each line; no indentation signifies a top-tier category.
public void parseCSVFile(java.lang.String classificationFile)
classificationFile
- Path to CSV file specifying the classification scheme.
ASSUMPTION: Each line (record) corresponds to a full path from the top-most to the bottom-most category; at each level, two attributes are given: first a (usually numeric) identifier, then the name of the category.
ASSUMPTION: This CSV file must use ',' as the delimiter character between attribute values enclosed in double quotes ("...").
EXAMPLE ROW for a 3-tier classification: 1,"Food",103,"Restaurant",103005,"Chinese Restaurant".
public Category searchByName(java.lang.String categoryName)
categoryName
- The category name to search in the classification hierarchy. CAUTION! Category names must be unique amongst all levels and are used as keys.
public Category searchById(java.lang.String categoryId)
categoryId
- The category identifier to search in the classification hierarchy. Category identifiers must be unique amongst all levels.
public java.lang.String getUUID(java.lang.String categoryName)
categoryName
- The category name to search in the classification hierarchy. Category names must be unique amongst all levels.
public java.lang.String findUUID(java.lang.String categoryId)
categoryId
- The original category identifier to search in the classification hierarchy. Category identifiers must be unique amongst all levels.
public java.lang.String getEmbeddedCategory(java.lang.String categoryName)
categoryName
- The category name to search in the user-defined classification hierarchy. Category names must be unique amongst all levels.
public int countCategories()
public int countTiers()
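For the indentation-based YML layout described under `parseYMLFile`, the depth reported by `countTiers()` can be pictured as the deepest indentation level plus one. The sketch below is illustrative only: `levelOf` and the two-argument `countTiers` are hypothetical helpers, and it assumes exactly one indentation character per level, which this page does not state.

```java
import java.util.Arrays;
import java.util.List;

final class TierDepthSketch {

    // Level of a line = number of leading indentation characters (0: top-tier).
    // ASSUMPTION: one indentation character marks one level.
    static int levelOf(String line, char indent) {
        int level = 0;
        while (level < line.length() && line.charAt(level) == indent) {
            level++;
        }
        return level;
    }

    // Depth = deepest level found plus one, because levels are zero-based.
    // Blank lines are skipped; an empty input yields a depth of 0.
    static int countTiers(List<String> lines, char indent) {
        int maxLevel = -1;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            maxLevel = Math.max(maxLevel, levelOf(line, indent));
        }
        return maxLevel + 1;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Food",
                " Restaurant",
                "  Chinese Restaurant",
                "Shop");
        System.out.println(countTiers(lines, ' '));   // prints 3
    }
}
```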
private void printDescendants(java.lang.String parent_id, int level)
parent_id
- Original identifier for the parent of a category.
level
- The level of the category in the classification scheme (0: top-tier).
public void printHierarchyYML()
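The interplay between `printHierarchyYML()` and the recursive `printDescendants` can be sketched as a depth-first walk that prefixes each name with one indentation character per level. This is an assumption-laden illustration, not the class's actual code: the `children`, `names`, and `roots` structures and the `render` method are hypothetical, and the output is built as a string rather than printed directly so that it is easy to inspect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PrintHierarchySketch {

    // Hypothetical bookkeeping: parent id -> child ids, id -> name, and top-tier ids.
    private final Map<String, List<String>> children = new LinkedHashMap<>();
    private final Map<String, String> names = new HashMap<>();
    private final List<String> roots = new ArrayList<>();

    void add(String id, String name, String parentId) {
        names.put(id, name);
        if (parentId == null) {
            roots.add(id);
        } else {
            children.computeIfAbsent(parentId, k -> new ArrayList<>()).add(id);
        }
    }

    // Recursively appends all descendants of parentId, indenting each by its level.
    private void appendDescendants(String parentId, int level, char indent, StringBuilder out) {
        for (String childId : children.getOrDefault(parentId, Collections.<String>emptyList())) {
            for (int i = 0; i < level; i++) {
                out.append(indent);
            }
            out.append(names.get(childId)).append('\n');
            appendDescendants(childId, level + 1, indent, out);
        }
    }

    // Renders the whole hierarchy; top-tier categories carry no indentation.
    String render(char indent) {
        StringBuilder out = new StringBuilder();
        for (String rootId : roots) {
            out.append(names.get(rootId)).append('\n');
            appendDescendants(rootId, 1, indent, out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        PrintHierarchySketch h = new PrintHierarchySketch();
        h.add("1", "Food", null);
        h.add("103", "Restaurant", "1");
        h.add("103005", "Chinese Restaurant", "103");
        h.add("2", "Shop", null);
        System.out.print(h.render(' '));
    }
}
```

With a space as the indentation character, the sample prints `Food`, then `Restaurant` indented by one space, `Chinese Restaurant` indented by two, and finally `Shop` with no indentation.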
private void executeParser4RML()
private void executeParser4Graph()
private void executeParser4Stream()
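As an illustration of the CSV layout that `parseCSVFile` assumes (an identifier followed by a name at each level, ',' as delimiter, names in double quotes), the sketch below turns one record into a root-to-leaf path. It is not the parser used by this class: `splitRecord` and `toPath` are hypothetical helpers, and the code assumes values contain no escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvRowSketch {

    // Splits one record on ',' while keeping commas that appear inside double quotes.
    // ASSUMPTION: values never contain escaped quotes ("").
    static List<String> splitRecord(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : record.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;            // toggle quoted state, drop the quote itself
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim());
        return fields;
    }

    // Pairs fields as (identifier, name) per level and links each level to its parent.
    // Each element is {id, name, parentId}; parentId is null for the top-tier category.
    static List<String[]> toPath(String record) {
        List<String> fields = splitRecord(record);
        if (fields.size() % 2 != 0) {
            throw new IllegalArgumentException("Expected identifier/name pairs: " + record);
        }
        List<String[]> path = new ArrayList<>();
        String parentId = null;
        for (int i = 0; i < fields.size(); i += 2) {
            String id = fields.get(i);
            path.add(new String[] { id, fields.get(i + 1), parentId });
            parentId = id;
        }
        return path;
    }

    public static void main(String[] args) {
        String row = "1,\"Food\",103,\"Restaurant\",103005,\"Chinese Restaurant\"";
        for (String[] level : toPath(row)) {
            System.out.println(level[0] + " | " + level[1] + " | parent=" + level[2]);
        }
    }
}
```

For the example row given under `parseCSVFile`, this yields three levels, with `103005` ("Chinese Restaurant") pointing to parent `103` and the top-tier `1` ("Food") having no parent.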