Utils
JudiLing.iscorrect
Function
Check whether the predictions are correct.
JudiLing.display_pred
Function
Display predictions in a readable format.
JudiLing.translate
Function
Translate indices into words or utterances.
JudiLing.translate_path
Function
Append indices together to form a path.
JudiLing.is_truly_sparse
Function
Check whether a matrix is truly sparse regardless of its format, where M is originally stored in a sparse matrix format.
Check whether a matrix is truly sparse regardless of its format, where M is originally stored in a dense matrix format.
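The distinction between storage format and actual sparsity can be illustrated with a short sketch. It assumes that "truly sparse" means a low fraction of nonzero entries; the helper names `nonzero_ratio` and `looks_sparse` and the 0.25 threshold are hypothetical and are not taken from JudiLing.

```julia
using SparseArrays

# Hypothetical helper: fraction of nonzero entries, independent of storage format.
nonzero_ratio(M::AbstractMatrix) = count(!iszero, M) / length(M)

# Assumed criterion: a matrix counts as sparse when few entries are nonzero.
# The threshold value is illustrative only.
looks_sparse(M::AbstractMatrix; threshold = 0.25) = nonzero_ratio(M) < threshold

# Stored densely, but mostly zeros.
dense_format = [1.0 0 0 0; 0 0 0 0; 0 0 2.0 0; 0 0 0 0]

# Stored in a sparse format, but every entry is nonzero.
sparse_format = sparse(ones(3, 3))

println(looks_sparse(dense_format))
println(looks_sparse(sparse_format))
```

Under these assumptions the storage type does not decide the result; only the share of nonzero entries does.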
JudiLing.isattachable
Function
Check whether a gram can attach to another gram.
JudiLing.iscomplete
Function
Check whether a path is complete.
JudiLing.isstart
Function
Check whether a gram can start a path.
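The three checks above (attaching, completing, starting) can be pictured with a small sketch on character trigrams. Everything here is an assumption made for illustration: the helper names `can_attach`, `starts_path` and `ends_path` are hypothetical, `#` is assumed to mark a word boundary, and two grams are assumed to attach when they overlap in all but one symbol. JudiLing's own functions may use different signatures and rules.

```julia
# Hypothetical helpers, not JudiLing's implementation.
# Assumption: gram `a` can be followed by gram `b` when the tail of `a`
# (all but its first symbol) equals the head of `b` (all but its last symbol).
can_attach(a::AbstractString, b::AbstractString) =
    collect(a)[2:end] == collect(b)[1:end-1]

# Assumption: a path starts with a gram that begins with the boundary symbol
# and is complete once a gram ends with the boundary symbol.
starts_path(g::AbstractString; boundary = '#') = first(g) == boundary
ends_path(g::AbstractString; boundary = '#') = last(g) == boundary

# Trigrams of the form "vocoo" with boundary markers added.
grams = ["#vo", "voc", "oco", "coo", "oo#"]

# Every neighbouring pair overlaps, so the grams form one chain.
is_chain = all(can_attach(grams[i], grams[i + 1]) for i in 1:length(grams) - 1)

println(is_chain && starts_path(first(grams)) && ends_path(last(grams)))
```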
JudiLing.isnovel
Function
Check whether a predicted path is in the training data.
JudiLing.check_used_token
Function
Check whether any tokens are already used in the dataset as n-gram components.
JudiLing.cal_max_timestep
Function
function cal_max_timestep(
    data_train::DataFrame,
    data_val::DataFrame,
    target_col::Union{String, Symbol};
    tokenized::Bool = false,
    sep_token::Union{Nothing, String, Char} = "",
)
Calculate the max timestep given training and validation datasets.
Obligatory Arguments
data_train::DataFrame: the training dataset
data_val::DataFrame: the validation dataset
target_col::Union{String, Symbol}: the column with the target word forms
Optional Arguments
tokenized::Bool = false: whether the word forms in the target_col are already tokenized
sep_token::Union{Nothing, String, Char} = "": the token with which the word forms are tokenized
Examples
JudiLing.cal_max_timestep(latin_train, latin_val, :Word)
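As a rough illustration of what a maximum timestep depends on, the sketch below finds the longest word form across a training and a validation list, counting characters for untokenized forms and separator-delimited tokens for tokenized ones. It uses plain string vectors instead of DataFrames, and the helpers `token_count` and `longest_form` are hypothetical. The value returned by `cal_max_timestep` itself may differ, for example by an offset for boundary markers, which is not modelled here.

```julia
# Hypothetical helper, not part of JudiLing: number of tokens in one word form.
function token_count(form::AbstractString; tokenized::Bool = false, sep_token = "")
    if tokenized && sep_token !== nothing && sep_token != ""
        # Tokenized form: count the pieces between separators.
        return length(split(form, sep_token))
    end
    # Untokenized form: count characters.
    return length(form)
end

# Hypothetical helper: longest form in a collection, measured in tokens.
longest_form(forms; kwargs...) = maximum(f -> token_count(f; kwargs...), forms)

train_forms = ["vocoo", "vocaas", "vocat"]
val_forms = ["clamaamus"]

# Take the longer of the two datasets, as the two-dataset method suggests.
longest = max(longest_form(train_forms), longest_form(val_forms))
println(longest)

# Tokenized forms with "." as the separator.
println(longest_form(["a.b.c", "d.e"]; tokenized = true, sep_token = "."))
```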
function cal_max_timestep(
    data::DataFrame,
    target_col::Union{String, Symbol};
    tokenized::Bool = false,
    sep_token::Union{Nothing, String, Char} = "",
)
Calculate the max timestep given a single dataset.
Obligatory Arguments
data::DataFrame: the dataset
target_col::Union{String, Symbol}: the column with the target word forms
Optional Arguments
tokenized::Bool = false: whether the word forms in the target_col are already tokenized
sep_token::Union{Nothing, String, Char} = "": the token with which the word forms are tokenized
Examples
JudiLing.cal_max_timestep(latin, :Word)