Documentation
Functions
func SplitBySampling
SplitBySampling splits a dataset file into two by taking out every Nth entry. It is taken from the original splitter without modifications.
Note that this function assumes that all statements for a given subject appear on contiguous lines.
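The sampling idea can be illustrated with a small sketch. This is not the package's implementation: it assumes that an "entry" is one subject together with all of its contiguous lines, that each line is a statement whose first whitespace-separated token is the subject, and the helper names subjectOf and splitBySampling are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// subjectOf returns the first whitespace-separated token of a line.
// Assumption: one statement per line, with the subject first.
func subjectOf(line string) string {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return ""
	}
	return fields[0]
}

// splitBySampling is a hypothetical helper. It moves every nth subject
// block (a run of contiguous lines sharing a subject) into sampled and
// keeps all other blocks in kept.
func splitBySampling(lines []string, n int) (kept, sampled []string) {
	if n < 1 {
		// Nothing sensible to sample; keep everything.
		return lines, nil
	}
	blocks := 0 // number of subject blocks seen so far
	prev := ""
	first := true
	for _, line := range lines {
		s := subjectOf(line)
		if first || s != prev {
			// A change of subject starts a new block. This is only
			// correct when subjects occupy contiguous lines.
			blocks++
			prev = s
			first = false
		}
		if blocks%n == 0 {
			sampled = append(sampled, line)
		} else {
			kept = append(kept, line)
		}
	}
	return kept, sampled
}

func main() {
	lines := []string{
		"<s1> <p> <a> .",
		"<s1> <p> <b> .",
		"<s2> <p> <c> .",
		"<s3> <p> <d> .",
	}
	kept, sampled := splitBySampling(lines, 2)
	fmt.Println("kept:", len(kept), "sampled:", len(sampled))
}
```

If the lines of one subject were scattered through the file, each scattered run would be counted as a new block, which is why the contiguity assumption matters.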
Types
type FilterStats
FilterStats holds the statistics of a filter operation. TODO: Maybe these kinds of Stats return values should also tell us where the files have been stored.
func FilterForEvaluation
func FilterForEvaluation(filePath string) (*FilterStats, error)
FilterForEvaluation creates a filtered version of a dataset so that the evaluation runs faster.
func FilterForGlossary
func FilterForGlossary(filePath string) (*FilterStats, error)
FilterForGlossary creates a filtered version of a dataset that is better suited to building glossaries.
TODO: In the future it could use a filter-in mechanism, where only specific predicates are written to the generated file, instead of filter-out, which includes all statements except the ones listed.
Filter-in should use the same predicates as the glossary building step, which gives the user a better picture of what the glossary actually uses. With filter-in, every statement in the generated file is also used in the construction of the glossary. With filter-out, there can still be many statements that the building step silently ignores.
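The difference between filter-out and filter-in can be shown with a minimal sketch. It is an illustration only: the predicate names, the assumption that the predicate is the second whitespace-separated token of each line, and the helpers predicateOf, filterOut, and filterIn are all hypothetical and not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// predicateOf returns the second whitespace-separated token of a line.
// Assumption: one statement per line in subject, predicate, object order.
func predicateOf(line string) string {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return ""
	}
	return fields[1]
}

// filterOut keeps every statement except those whose predicate is in deny.
func filterOut(lines []string, deny map[string]bool) []string {
	var out []string
	for _, l := range lines {
		if !deny[predicateOf(l)] {
			out = append(out, l)
		}
	}
	return out
}

// filterIn keeps only the statements whose predicate is in allow.
func filterIn(lines []string, allow map[string]bool) []string {
	var out []string
	for _, l := range lines {
		if allow[predicateOf(l)] {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	// Placeholder predicates, chosen only for illustration.
	lines := []string{
		"<s1> <label> <a> .",
		"<s1> <sitelink> <b> .",
		"<s1> <unrelated> <c> .",
	}
	// Filter-out drops <sitelink> but still passes <unrelated> through.
	fmt.Println(len(filterOut(lines, map[string]bool{"<sitelink>": true})))
	// Filter-in passes only what a later step is known to consume.
	fmt.Println(len(filterIn(lines, map[string]bool{"<label>": true})))
}
```

In this sketch the filter-out result still contains the `<unrelated>` statement, which mirrors the concern above that a later step may silently ignore part of the generated file.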
func FilterForSchematree
func FilterForSchematree(filePath string) (*FilterStats, error)
FilterForSchematree creates a filtered version of a dataset that is better suited to building schematrees.
TODO: In the future, such hard-coded predicates should probably not exist.
type SplitByPrefixStats
SplitByPrefixStats holds the statistics of a split operation. TODO: Maybe these kinds of Stats return values should also tell us where the files have been stored.
func SplitByPrefix
func SplitByPrefix(filePath string) (*SplitByPrefixStats, error)
SplitByPrefix takes a dataset and decides where to send each entry based on a match against the beginning of its subject. The possible matches are: item, property, other/miscellaneous.
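Routing by subject prefix might look roughly like the sketch below. The two prefixes are assumptions (Wikidata-style entity IRIs chosen only for illustration; the prefixes this package actually matches are not documented here), and classifyByPrefix is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed prefixes, for illustration only. The subject is taken to be
// the first token on the line, so the line itself starts with it.
const (
	itemPrefix     = "<http://www.wikidata.org/entity/Q"
	propertyPrefix = "<http://www.wikidata.org/entity/P"
)

// classifyByPrefix returns "item", "property", or "misc" depending on
// how the line (and therefore its subject) begins.
func classifyByPrefix(line string) string {
	switch {
	case strings.HasPrefix(line, itemPrefix):
		return "item"
	case strings.HasPrefix(line, propertyPrefix):
		return "property"
	default:
		return "misc"
	}
}

func main() {
	lines := []string{
		"<http://www.wikidata.org/entity/Q42> <p> <o> .",
		"<http://www.wikidata.org/entity/P31> <p> <o> .",
		"<http://example.org/other> <p> <o> .",
	}
	// Group the lines by their classification. A real splitter would
	// write each group to its own output file instead.
	buckets := map[string][]string{}
	for _, l := range lines {
		k := classifyByPrefix(l)
		buckets[k] = append(buckets[k], l)
	}
	fmt.Println(len(buckets["item"]), len(buckets["property"]), len(buckets["misc"]))
}
```

Because each line is classified on its own, this approach needs no lookahead and does not depend on subjects being contiguous.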
type SplitByTypeStats
SplitByTypeStats holds the statistics of a split operation. TODO: Maybe these kinds of Stats return values should also tell us where the files have been stored.
func SplitByType
func SplitByType(filePath string) (*SplitByTypeStats, error)
SplitByType takes a dataset and generates a smaller dataset for each subject type it finds. The possible types are: item, property, other/miscellaneous.
func SplitByTypeInBlocks
func SplitByTypeInBlocks(filePath string) (*SplitByTypeStats, error)
SplitByTypeInBlocks is a faster implementation of SplitByType that uses only a single pass, but it assumes that all statements for a subject are found on contiguous lines.
TODO: Maybe there is a need to remove the type-classifying predicates. If that happens, it should be offered as an optional argument.
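A single-pass, block-based split can be sketched as follows. This is a hedged illustration of the general technique, not the package's code: the type-classifying predicate `<type>`, its objects `<Item>` and `<Property>`, and the helpers typeOf and splitInBlocks are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// typeOf reports the subject type declared by one statement, if any.
// Assumption: a hypothetical predicate "<type>" with the objects
// "<Item>" or "<Property>" classifies the subject.
func typeOf(fields []string) (string, bool) {
	if len(fields) >= 3 && fields[1] == "<type>" {
		switch fields[2] {
		case "<Item>":
			return "item", true
		case "<Property>":
			return "property", true
		}
	}
	return "", false
}

// splitInBlocks buffers the contiguous lines of one subject, learns the
// subject's type while buffering, and flushes the whole block to the
// matching group when the subject changes. Each line is read once.
func splitInBlocks(lines []string) map[string][]string {
	out := map[string][]string{}
	var block []string
	current := ""
	kind := "misc" // default when no type statement is seen
	flush := func() {
		if len(block) > 0 {
			out[kind] = append(out[kind], block...)
		}
		block = nil
		kind = "misc"
	}
	for _, line := range lines {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if fields[0] != current {
			flush()
			current = fields[0]
		}
		if k, ok := typeOf(fields); ok {
			kind = k
		}
		block = append(block, line)
	}
	flush() // emit the final block
	return out
}

func main() {
	out := splitInBlocks([]string{
		"<a> <label> <x> .",
		"<a> <type> <Item> .",
		"<b> <type> <Property> .",
		"<c> <label> <y> .",
	})
	fmt.Println(len(out["item"]), len(out["property"]), len(out["misc"]))
}
```

The block buffer is what makes a single pass possible: the type statement may come after other statements of the same subject, so the block is held back until the subject changes. If a subject's lines were not contiguous, a later run without its type statement would fall into the miscellaneous group.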