Deep learning in JudiLing
JudiLing.predict_from_deep_model — Method

    predict_from_deep_model(model::Chain,
                            X::Union{SparseMatrixCSC,Matrix})

Generates output of a model given input `X`.

Obligatory arguments

- `model::Chain`: Model of type `Flux.Chain`, as generated by `get_and_train_model`
- `X::Union{SparseMatrixCSC,Matrix}`: Input matrix of size (number_of_samples, inp_dim), where inp_dim is the input dimension of `model`
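A minimal sketch of calling this function, using a toy random matrix and a hand-built `Chain` in place of a model from `get_and_train_model` (the matrix dimensions here are made up for illustration):

```julia
using Flux, JudiLing

# Toy input: 5 samples, each a 10-dimensional cue vector (hypothetical sizes)
X = rand(Float32, 5, 10)

# Same architecture as the default model in get_and_train_model
model = Chain(Dense(10 => 8, relu), Dense(8 => 3))

# One output row per sample, with the model's output dimension (here 3)
Yhat = JudiLing.predict_from_deep_model(model, X)
```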
JudiLing.predict_shat — Method

    predict_shat(model::Chain,
                 ci::Vector{Int})

Predicts the semantic vector shat given a deep learning comprehension model `model` and a list of indices of ngrams `ci`.

Obligatory arguments

- `model::Chain`: Deep learning comprehension model as generated by `get_and_train_model`
- `ci::Vector{Int}`: Vector of indices of ngrams in a c vector. Essentially, this is a vector indicating which ngrams in a c vector are absent and which are present.
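As a sketch, with a hand-built comprehension model standing in for one trained with `get_and_train_model` (the dimensions and indices are hypothetical):

```julia
using Flux, JudiLing

# Hypothetical comprehension model: c vectors with 10 possible ngrams,
# mapped to 3 semantic dimensions
model = Chain(Dense(10 => 8, relu), Dense(8 => 3))

# Indices of the ngrams that are present in this word's c vector
ci = [1, 4, 7]

shat = JudiLing.predict_shat(model, ci)
```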
JudiLing.get_and_train_model — Method

    get_and_train_model(X_train::Union{SparseMatrixCSC,Matrix},
                        Y_train::Union{SparseMatrixCSC,Matrix},
                        X_val::Union{SparseMatrixCSC,Matrix,Missing},
                        Y_val::Union{SparseMatrixCSC,Matrix,Missing},
                        data_train::Union{DataFrame,Missing},
                        data_val::Union{DataFrame,Missing},
                        target_col::Union{Symbol,String,Missing},
                        model_outpath::String;
                        hidden_dim::Int=1000,
                        n_epochs::Int=100,
                        batchsize::Int=64,
                        loss_func::Function=Flux.mse,
                        optimizer=Flux.Adam(0.001),
                        model::Union{Missing, Chain}=missing,
                        early_stopping::Union{Missing, Int}=missing,
                        optimise_for_acc::Bool=false,
                        return_losses::Bool=false,
                        verbose::Bool=true,
                        measures_func::Union{Missing, Function}=missing,
                        return_train_acc::Bool=false,
                        kargs...)
Trains a deep learning model from `X_train` to `Y_train`, saving the model with either the highest validation accuracy or the lowest validation loss (depending on `optimise_for_acc`) to `model_outpath`.

The default model looks like this:

    inp_dim = size(X_train, 2)
    out_dim = size(Y_train, 2)
    Chain(Dense(inp_dim => hidden_dim, relu), Dense(hidden_dim => out_dim))

Any other model with the same input and output dimensions can be provided to the function with the `model` argument. The default loss function is mean squared error, but any other loss function can be provided, as long as it fits with the model architecture.

By default the Adam optimizer (Kingma and Ba, 2015) with learning rate 0.001 is used. You can provide any other optimizer. If you want to use a different learning rate, e.g. 0.01, provide `optimizer=Flux.Adam(0.01)`. If you do not want to use an adaptive optimizer at all, and simply use plain gradient descent, provide `optimizer=Descent(0.001)`, again replacing the learning rate with the learning rate of your preference.
Returns a named tuple with the following values:

- `model`: the trained model
- `data_train`: the training data, including any measures if computed by `measures_func`
- `data_val`: the validation data, including any measures if computed by `measures_func`
- `losses_train`: the losses of the training data for each epoch
- `losses_val`: the losses of the validation data after each epoch
- `accs_train`: the accuracies of the training data after each epoch, if `return_train_acc=true`
- `accs_val`: the accuracies of the validation data after each epoch
Obligatory arguments

- `X_train::Union{SparseMatrixCSC,Matrix}`: training input matrix of dimension m x n
- `Y_train::Union{SparseMatrixCSC,Matrix}`: training output/target matrix of dimension m x k
- `X_val::Union{SparseMatrixCSC,Matrix}`: validation input matrix of dimension l x n
- `Y_val::Union{SparseMatrixCSC,Matrix}`: validation output/target matrix of dimension l x k
- `data_train::DataFrame`: training data
- `data_val::DataFrame`: validation data
- `target_col::Union{Symbol, String}`: column with target wordforms in `data_train` and `data_val`
- `model_outpath::String`: filepath to where the final model should be stored (in .bson format)
Optional arguments

- `hidden_dim::Int=1000`: hidden dimension of the model
- `n_epochs::Int=100`: number of epochs for which the model should be trained
- `batchsize::Int=64`: batchsize during training
- `loss_func::Function=Flux.mse`: loss function. By default this is the mse loss, but another option might be a crossentropy loss (`Flux.crossentropy`). Make sure the model makes sense with the loss function!
- `optimizer=Flux.Adam(0.001)`: optimizer to use for training
- `model::Union{Missing, Chain}=missing`: a custom model can be provided for training. It has to correspond to the input and output size of the training and validation data
- `early_stopping::Union{Missing, Int}=missing`: if `missing`, no early stopping is used. Otherwise `early_stopping` indicates how many epochs have to pass without improvement in validation accuracy before training is stopped
- `optimise_for_acc::Bool=false`: if true, keep the model with the highest validation accuracy; if false, keep the model with the lowest validation loss
- `return_losses::Bool=false`: whether, in addition to the model, the per-epoch losses for the training and validation data as well as the per-epoch accuracy on the validation data should be returned
- `verbose::Bool=true`: turn on verbose mode
- `measures_func::Union{Missing, Function}=missing`: a measures function which is run at the end of every epoch. For more information see The `measures_func` argument. If a measure is tagged for each epoch, the one tagged with "final" will be the one for the finally returned model
- `return_train_acc::Bool=false`: if true, a vector with training accuracies is returned at the end of training
- `kargs...`: any additional keyword arguments are passed to the `measures_func`
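A minimal sketch of a full call with validation data. The matrices here are random toy data with made-up dimensions; in a real analysis the cue and semantic matrices would typically come from `JudiLing.make_cue_matrix` and `JudiLing.make_S_matrix`:

```julia
using JudiLing, DataFrames, Flux

# Hypothetical toy dataset: 3 training words, 1 validation word
data_train = DataFrame(Word = ["walk", "walks", "walked"])
data_val   = DataFrame(Word = ["talked"])

# Toy cue (20-dim) and semantic (10-dim) matrices, one row per word
X_train, Y_train = rand(Float32, 3, 20), rand(Float32, 3, 10)
X_val,   Y_val   = rand(Float32, 1, 20), rand(Float32, 1, 10)

res = JudiLing.get_and_train_model(X_train, Y_train, X_val, Y_val,
                                   data_train, data_val, :Word,
                                   "toy_model.bson";
                                   hidden_dim = 50, n_epochs = 5,
                                   batchsize = 2, verbose = false)

res.model  # the trained Flux.Chain, also saved to toy_model.bson
```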
JudiLing.get_and_train_model — Method

    get_and_train_model(X_train::Union{SparseMatrixCSC,Matrix},
                        Y_train::Union{SparseMatrixCSC,Matrix},
                        model_outpath::String;
                        data_train::Union{Missing, DataFrame}=missing,
                        target_col::Union{Missing, Symbol, String}=missing,
                        hidden_dim::Int=1000,
                        n_epochs::Int=100,
                        batchsize::Int=64,
                        loss_func::Function=Flux.mse,
                        optimizer=Flux.Adam(0.001),
                        model::Union{Missing, Chain}=missing,
                        return_losses::Bool=false,
                        verbose::Bool=true,
                        measures_func::Union{Missing, Function}=missing,
                        return_train_acc::Bool=false,
                        kargs...)
Trains a deep learning model from `X_train` to `Y_train`, saving the model after `n_epochs` epochs. The default model looks like this:

    inp_dim = size(X_train, 2)
    out_dim = size(Y_train, 2)
    Chain(Dense(inp_dim => hidden_dim, relu), Dense(hidden_dim => out_dim))

Any other model with the same input and output dimensions can be provided to the function with the `model` argument. The default loss function is mean squared error, but any other loss function can be provided, as long as it fits with the model architecture.

By default the Adam optimizer (Kingma and Ba, 2015) with learning rate 0.001 is used. You can provide any other optimizer. If you want to use a different learning rate, e.g. 0.01, provide `optimizer=Flux.Adam(0.01)`. If you do not want to use an adaptive optimizer at all, and simply use plain gradient descent, provide `optimizer=Descent(0.001)`, again replacing the learning rate with the learning rate of your preference.
Returns a named tuple with the following values:

- `model`: the trained model
- `data_train`: the data, including any measures if computed by `measures_func`
- `data_val`: missing for this function
- `losses_train`: the losses of the training data for each epoch
- `losses_val`: missing for this function
- `accs_train`: the accuracies of the training data after each epoch, if `return_train_acc=true`
- `accs_val`: missing for this function
Obligatory arguments

- `X_train::Union{SparseMatrixCSC,Matrix}`: training input matrix of dimension m x n
- `Y_train::Union{SparseMatrixCSC,Matrix}`: training output/target matrix of dimension m x k
- `model_outpath::String`: filepath to where the final model should be stored (in .bson format)
Optional arguments

- `data_train::Union{Missing, DataFrame}=missing`: the training data. Only necessary if a `measures_func` is included or `return_train_acc=true`
- `target_col::Union{Missing, Symbol, String}=missing`: the column with target word forms in the training data. Only necessary if a `measures_func` is included or `return_train_acc=true`
- `hidden_dim::Int=1000`: hidden dimension of the model
- `n_epochs::Int=100`: number of epochs for which the model should be trained
- `batchsize::Int=64`: batchsize during training
- `loss_func::Function=Flux.mse`: loss function. By default this is the mse loss, but another option might be a crossentropy loss (`Flux.crossentropy`). Make sure the model makes sense with the loss function!
- `optimizer=Flux.Adam(0.001)`: optimizer to use for training
- `model::Union{Missing, Chain}=missing`: a custom model can be provided for training. It has to correspond to the input and output size of the training data
- `return_losses::Bool=false`: whether, in addition to the model, the per-epoch losses for the training data should be returned
- `verbose::Bool=true`: turn on verbose mode
- `measures_func::Union{Missing, Function}=missing`: a measures function which is run at the end of every epoch. For more information see The `measures_func` argument
- `return_train_acc::Bool=false`: if true, a vector with training accuracies is returned at the end of training
- `kargs...`: any additional keyword arguments are passed to the `measures_func`
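A sketch of training on the full data without a validation split, again with toy random matrices of made-up dimensions standing in for real cue and semantic matrices:

```julia
using JudiLing, Flux

# Toy matrices: 3 words, 20 cue dimensions, 10 semantic dimensions
X_train = rand(Float32, 3, 20)
Y_train = rand(Float32, 3, 10)

res = JudiLing.get_and_train_model(X_train, Y_train, "toy_model.bson";
                                   hidden_dim = 50, n_epochs = 5,
                                   batchsize = 2, verbose = false)

# Predictions of the trained model on the training data
Yhat = JudiLing.predict_from_deep_model(res.model, X_train)
```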
JudiLing.fiddl — Method

    fiddl(X_train::Union{SparseMatrixCSC,Matrix},
          Y_train::Union{SparseMatrixCSC,Matrix},
          learn_seq::Vector,
          data::DataFrame,
          target_col::Union{Symbol, String},
          model_outpath::String;
          hidden_dim::Int=1000,
          batchsize::Int=64,
          loss_func::Function=Flux.mse,
          optimizer=Flux.Adam(0.001),
          model::Union{Missing, Chain}=missing,
          return_losses::Bool=false,
          verbose::Bool=true,
          n_batch_eval::Int=100,
          compute_accuracy::Bool=true,
          measures_func::Union{Function, Missing}=missing,
          kargs...)
Trains a deep learning model using the FIDDL method (frequency-informed deep discriminative learning). Optionally, after every `n_batch_eval` batches, `measures_func` can be run to compute any measures, which are then added to the data.

If you get an OutOfMemory error, chances are that this is due to the `eval_SC` function being evaluated after every `n_batch_eval` batches. Setting `compute_accuracy=false` disables computing the mapping accuracy.
Returns a named tuple with the following values:

- `model`: the trained model
- `data`: the data, including any measures if computed by `measures_func`
- `losses_train`: the losses of the data the model is trained on within each `n_batch_eval` batches
- `losses`: the losses of the full dataset after each `n_batch_eval` batches
- `accs`: the accuracies of the full dataset after each `n_batch_eval` batches
Obligatory arguments

- `X_train::Union{SparseMatrixCSC,Matrix}`: training input matrix of dimension m x n
- `Y_train::Union{SparseMatrixCSC,Matrix}`: training output/target matrix of dimension m x k
- `learn_seq::Vector`: list of indices in the order that the vectors in `X_train` and `Y_train` should be presented to the model for training
- `data::DataFrame`: the full data
- `target_col::Union{Symbol, String}`: the column with target word forms in the data
- `model_outpath::String`: filepath to where the final model should be stored (in .bson format)
Optional arguments

- `hidden_dim::Int=1000`: hidden dimension of the model
- `batchsize::Int=64`: batchsize during training
- `loss_func::Function=Flux.mse`: loss function. By default this is the mse loss, but another option might be a crossentropy loss (`Flux.crossentropy`). Make sure the model makes sense with the loss function!
- `optimizer=Flux.Adam(0.001)`: optimizer to use for training
- `model::Union{Missing, Chain}=missing`: a custom model can be provided for training. It has to correspond to the input and output size of the training data
- `return_losses::Bool=false`: whether, in addition to the model, the losses and accuracies computed every `n_batch_eval` batches should be returned
- `verbose::Bool=true`: turn on verbose mode
- `n_batch_eval::Int=100`: loss, accuracy and `measures_func` are evaluated every `n_batch_eval` batches
- `compute_accuracy::Bool=true`: whether accuracy should be computed every `n_batch_eval` batches
- `measures_func::Union{Missing, Function}=missing`: a measures function which is run every `n_batch_eval` batches. For more information see The `measures_func` argument
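A sketch of a FIDDL call on toy data. The matrices and dimensions are made up for illustration; `learn_seq` would normally encode a frequency-informed presentation order, e.g. with high-frequency words repeated more often:

```julia
using JudiLing, DataFrames, Flux

# Toy data: 3 words with toy cue (20-dim) and semantic (10-dim) matrices
data = DataFrame(Word = ["walk", "walks", "walked"])
X = rand(Float32, 3, 20)
Y = rand(Float32, 3, 10)

# Frequency-informed learning sequence: row indices into X/Y, with the
# more frequent word (index 1) presented more often
learn_seq = [1, 2, 1, 3, 1, 2]

res = JudiLing.fiddl(X, Y, learn_seq, data, :Word, "fiddl_model.bson";
                     hidden_dim = 50, batchsize = 2,
                     n_batch_eval = 2, verbose = false)
```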