Deep learning in JudiLing

JudiLing.predict_from_deep_model - Method
predict_from_deep_model(model::Chain,
                        X::Union{SparseMatrixCSC,Matrix})

Generates output of a model given input X.

Obligatory arguments

  • model::Chain: Model of type Flux.Chain, as generated by get_and_train_model
  • X::Union{SparseMatrixCSC,Matrix}: Input matrix of size (number_of_samples, inp_dim), where inp_dim is the input dimension of model
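
The expected shapes can be illustrated with a small sketch that uses only Flux. The toy model is built by hand rather than by get_and_train_model, and the helper forward_rows is hypothetical: it assumes that the input is transposed internally, because Flux layers expect one sample per column, which is an assumption about the implementation rather than something stated above.

```julia
using Flux

# Toy model with inp_dim = 4 and out_dim = 3 (sizes chosen arbitrarily).
model = Chain(Dense(4 => 8, relu), Dense(8 => 3))

# Input with one sample per row: (number_of_samples, inp_dim) = (5, 4).
X = rand(Float32, 5, 4)

# Hypothetical helper: Flux expects one sample per column, so transpose
# on the way in and again on the way out.
forward_rows(m, X) = permutedims(m(permutedims(X)))

Yhat = forward_rows(model, X)
size(Yhat)  # (5, 3): one predicted row per input row
```

With a model trained by get_and_train_model, the corresponding call would be predict_from_deep_model(model, X).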
JudiLing.predict_shat - Method
predict_shat(model::Chain,
             ci::Vector{Int})

Predicts semantic vector shat given a deep learning comprehension model model and a list of indices of ngrams ci.

Obligatory arguments

  • model::Chain: Deep learning comprehension model as generated by get_and_train_model
  • ci::Vector{Int}: Vector of indices of ngrams in c vector. Essentially, this is a vector indicating which ngrams in a c vector are absent and which are present.
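
To make the role of ci concrete, the following sketch builds the kind of binary c vector that the index list describes. The number of ngrams and the indices are made up, and the encoding (1 for present, 0 for absent) is an assumption based on the description above.

```julia
# Number of distinct ngrams, i.e. the input dimension of the model
# (arbitrary value for illustration).
n_ngrams = 6

# Indices of the ngrams that are present in the word form.
ci = [2, 5]

# Binary row vector: 1 where an ngram is present, 0 where it is absent.
c = zeros(Float32, 1, n_ngrams)
c[1, ci] .= 1

c  # 1×6 matrix with ones in columns 2 and 5
```

predict_shat(model, ci) takes only the indices; building c by hand is shown here purely to illustrate the encoding.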
JudiLing.get_and_train_model - Method
get_and_train_model(X_train::Union{SparseMatrixCSC,Matrix},
                    Y_train::Union{SparseMatrixCSC,Matrix},
                    X_val::Union{SparseMatrixCSC,Matrix,Missing},
                    Y_val::Union{SparseMatrixCSC,Matrix,Missing},
                    data_train::Union{DataFrame,Missing},
                    data_val::Union{DataFrame,Missing},
                    target_col::Union{Symbol,String,Missing},
                    model_outpath::String;
                    hidden_dim::Int=1000,
                    n_epochs::Int=100,
                    batchsize::Int=64,
                    loss_func::Function=Flux.mse,
                    optimizer=Flux.Adam(0.001),
                    model::Union{Missing, Chain}=missing,
                    early_stopping::Union{Missing, Int}=missing,
                    optimise_for_acc::Bool=false,
                    return_losses::Bool=false,
                    verbose::Bool=true,
                    measures_func::Union{Missing, Function}=missing,
                    return_train_acc::Bool=false,
                    kargs...
                    )

Trains a deep learning model from X_train to Y_train, saving the model with either the highest validation accuracy or the lowest validation loss (depending on optimise_for_acc) to model_outpath.

The default model looks like this:

inp_dim = size(X_train, 2)
out_dim = size(Y_train, 2)
Chain(Dense(inp_dim => hidden_dim, relu), Dense(hidden_dim => out_dim))

Any other model with the same input and output dimensions can be provided to the function with the model argument. The default loss function is mean squared error, but any other loss function can be provided, as long as it fits the model architecture.
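
As an illustration of a loss function that "fits" the architecture, the sketch below pairs a custom model ending in softmax with Flux.crossentropy, which expects probabilities as predictions. The dimensions, layer sizes and one-hot targets are assumptions chosen only for this example.

```julia
using Flux

inp_dim, out_dim = 50, 20   # assumed sizes; normally taken from the data

# Custom model whose outputs are probabilities, to pair with cross-entropy.
prob_model = Chain(
    Dense(inp_dim => 128, relu),
    Dense(128 => out_dim),
    softmax,
)

x = rand(Float32, inp_dim, 3)                 # 3 samples, one per column
y = Flux.onehotbatch([1, 5, 20], 1:out_dim)   # one-hot targets for the 3 samples

yhat = prob_model(x)                # each column sums to 1
loss = Flux.crossentropy(yhat, y)   # scalar loss
```

Such a model and loss would be passed as model=prob_model and loss_func=Flux.crossentropy. Without the final softmax, the outputs would not be probabilities and the cross-entropy loss would not be meaningful.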

By default, the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 0.001 is used, but any other optimizer can be provided. To use a different learning rate, e.g. 0.01, provide optimizer=Flux.Adam(0.01). To use plain gradient descent instead of an adaptive optimizer, provide optimizer=Descent(0.001), replacing 0.001 with the learning rate of your choice.
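
For readers unfamiliar with the difference, the following sketch shows what plain gradient descent does on a toy one-parameter problem. It is written in plain Julia and is not JudiLing or Flux code; the function name and the objective are made up for illustration.

```julia
# Plain gradient descent on f(w) = (w - 3)^2, whose gradient is 2(w - 3).
function gradient_descent(w0, eta, n_steps)
    w = w0
    for _ in 1:n_steps
        grad = 2 * (w - 3)
        w -= eta * grad   # the whole update rule: no momentum, no adaptive scaling
    end
    return w
end

w_final = gradient_descent(0.0, 0.1, 100)   # converges towards the minimum at w = 3
```

Here eta plays the role of the learning rate passed to Descent. Adam additionally rescales each step using running averages of past gradients.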

Returns a named tuple with the following values:

  • model: the trained model
  • data_train: the training data, including any measures if computed by measures_func
  • data_val: the validation data, including any measures if computed by measures_func
  • losses_train: The losses of the training data for each epoch.
  • losses_val: The losses of the validation data after each epoch.
  • accs_train: The accuracies of the training data after each epoch, if return_train_acc=true.
  • accs_val: The accuracies of the validation data after each epoch.

Obligatory arguments

  • X_train::Union{SparseMatrixCSC,Matrix}: training input matrix of dimension m x n
  • Y_train::Union{SparseMatrixCSC,Matrix}: training output/target matrix of dimension m x k
  • X_val::Union{SparseMatrixCSC,Matrix}: validation input matrix of dimension l x n
  • Y_val::Union{SparseMatrixCSC,Matrix}: validation output/target matrix of dimension l x k
  • data_train::DataFrame: training data
  • data_val::DataFrame: validation data
  • target_col::Union{Symbol, String}: column with target wordforms in data_train and data_val
  • model_outpath::String: filepath to where final model should be stored (in .bson format)
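
A call with all obligatory arguments might look like the following sketch. The data are random stand-ins, the column name Word and the temporary output path are hypothetical, and the small hidden_dim and n_epochs values are chosen only to keep the run short. It assumes that JudiLing, Flux and DataFrames are installed and that the function is reachable as JudiLing.get_and_train_model once Flux is loaded.

```julia
using JudiLing, DataFrames, Flux

# Synthetic stand-in data: 40 training and 10 validation items,
# 30 input dimensions and 12 output dimensions (all arbitrary).
X_train, Y_train = rand(Float32, 40, 30), rand(Float32, 40, 12)
X_val, Y_val = rand(Float32, 10, 30), rand(Float32, 10, 12)

# One row per item; :Word is a hypothetical target column name.
data_train = DataFrame(Word = ["train_$i" for i in 1:40])
data_val = DataFrame(Word = ["val_$i" for i in 1:10])

res = JudiLing.get_and_train_model(
    X_train, Y_train, X_val, Y_val,
    data_train, data_val, :Word,
    joinpath(mktempdir(), "model.bson");   # temporary output path
    hidden_dim=64, n_epochs=5, batchsize=8,
    return_losses=true, verbose=false,
)

res.model        # the trained model
res.losses_val   # validation loss after each epoch
```

The rows of X_train, Y_train and data_train are assumed to be aligned, i.e. row i of each refers to the same word form, and likewise for the validation set.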

Optional arguments

  • hidden_dim::Int=1000: hidden dimension of the model
  • n_epochs::Int=100: number of epochs for which the model should be trained
  • batchsize::Int=64: batchsize during training
  • loss_func::Function=Flux.mse: Loss function. By default this is the mean squared error loss, but other options are possible, for example a cross-entropy loss (Flux.crossentropy). Make sure the loss function is compatible with the model!
  • optimizer=Flux.Adam(0.001): optimizer to use for training
  • model::Union{Missing, Chain} = missing: A custom model can be provided for training. It must match the input and output dimensions of the training and validation data.
  • early_stopping::Union{Missing, Int}=missing: If missing, no early stopping is used. Otherwise early_stopping indicates how many epochs have to pass without improvement in validation accuracy before the training is stopped.
  • optimise_for_acc::Bool=false: if true, keep model with highest validation accuracy. If false, keep model with lowest validation loss.
  • return_losses::Bool=false: whether to return, in addition to the model, the per-epoch losses for the training and test data as well as the per-epoch accuracy on the validation data
  • verbose::Bool=true: Turn on verbose mode
  • measures_func::Union{Missing, Function}=missing: A measures function which is run at the end of every epoch. For more information see The measures_func argument. If a measure is tagged for each epoch, the value tagged with "final" is the one belonging to the model that is finally returned.
  • return_train_acc::Bool=false: If true, a vector with training accuracies is returned at the end of the training.
  • kargs...: any additional keyword arguments are passed to the measures_func
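
The early_stopping behaviour described above can be sketched in plain Julia as follows. This is an illustration of the general patience mechanism, not JudiLing's implementation; the function name is hypothetical, and details such as whether a tie counts as an improvement are assumptions.

```julia
# Given per-epoch validation accuracies and a patience value, return the
# epoch at which training would stop and the epoch of the best model.
function early_stopping_epoch(accs_val::Vector{Float64}, patience::Int)
    best_acc, best_epoch = -Inf, 0
    for (epoch, acc) in enumerate(accs_val)
        if acc > best_acc
            best_acc, best_epoch = acc, epoch   # improvement: remember this epoch
        elseif epoch - best_epoch >= patience
            # `patience` epochs have passed without improvement: stop here
            return (stopped_at=epoch, best_epoch=best_epoch)
        end
    end
    # Never ran out of patience: training runs for all epochs
    return (stopped_at=length(accs_val), best_epoch=best_epoch)
end

accs = [0.50, 0.62, 0.70, 0.69, 0.70, 0.68, 0.71]
result = early_stopping_epoch(accs, 3)   # (stopped_at = 6, best_epoch = 3)
```

In this example the accuracy of 0.71 in epoch 7 is never reached, which is the usual trade-off of a small patience value.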
JudiLing.get_and_train_model - Method
get_and_train_model(X_train::Union{SparseMatrixCSC,Matrix},
                    Y_train::Union{SparseMatrixCSC,Matrix},
                    model_outpath::String;
                    data_train::Union{Missing, DataFrame}=missing,
                    target_col::Union{Missing, Symbol, String}=missing,
                    hidden_dim::Int=1000,
                    n_epochs::Int=100,
                    batchsize::Int=64,
                    loss_func::Function=Flux.mse,
                    optimizer=Flux.Adam(0.001),
                    model::Union{Missing, Chain} = missing,
                    return_losses::Bool=false,
                    verbose::Bool=true,
                    measures_func::Union{Missing, Function}=missing,
                    return_train_acc::Bool=false,
                    kargs...)

Trains a deep learning model from X_train to Y_train, saving the model after n_epochs epochs. The default model looks like this:

inp_dim = size(X_train, 2)
out_dim = size(Y_train, 2)
Chain(Dense(inp_dim => hidden_dim, relu), Dense(hidden_dim => out_dim))

Any other model with the same input and output dimensions can be provided to the function with the model argument. The default loss function is mean squared error, but any other loss function can be provided, as long as it fits the model architecture.

By default, the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 0.001 is used, but any other optimizer can be provided. To use a different learning rate, e.g. 0.01, provide optimizer=Flux.Adam(0.01). To use plain gradient descent instead of an adaptive optimizer, provide optimizer=Descent(0.001), replacing 0.001 with the learning rate of your choice.

Returns a named tuple with the following values:

  • model: the trained model
  • data_train: the data, including any measures if computed by measures_func
  • data_val: missing for this function
  • losses_train: The losses of the training data for each epoch.
  • losses_val: missing for this function
  • accs_train: The accuracies of the training data after each epoch, if return_train_acc=true.
  • accs_val: missing for this function

Obligatory arguments

  • X_train::Union{SparseMatrixCSC,Matrix}: training input matrix of dimension m x n
  • Y_train::Union{SparseMatrixCSC,Matrix}: training output/target matrix of dimension m x k
  • model_outpath::String: filepath to where final model should be stored (in .bson format)

Optional arguments

  • data_train::Union{Missing, DataFrame}=missing: The training data. Only necessary if a measures_func is included or return_train_acc=true.
  • target_col::Union{Missing, Symbol, String}=missing: The column with target word forms in the training data. Only necessary if a measures_func is included or return_train_acc=true.
  • hidden_dim::Int=1000: hidden dimension of the model
  • n_epochs::Int=100: number of epochs for which the model should be trained
  • batchsize::Int=64: batchsize during training
  • loss_func::Function=Flux.mse: Loss function. By default this is the mean squared error loss, but other options are possible, for example a cross-entropy loss (Flux.crossentropy). Make sure the loss function is compatible with the model!
  • optimizer=Flux.Adam(0.001): optimizer to use for training
  • model::Union{Missing, Chain} = missing: A custom model can be provided for training. It must match the input and output dimensions of the training data.
  • return_losses::Bool=false: whether to return, in addition to the model, the per-epoch losses for the training and test data as well as the per-epoch accuracy on the validation data
  • verbose::Bool=true: Turn on verbose mode
  • measures_func::Union{Missing, Function}=missing: A measures function which is run at the end of every epoch. For more information see The measures_func argument.
  • return_train_acc::Bool=false: If true, a vector with training accuracies is returned at the end of the training.
  • kargs...: any additional keyword arguments are passed to the measures_func
JudiLing.fiddl - Method
fiddl(X_train::Union{SparseMatrixCSC,Matrix},
      Y_train::Union{SparseMatrixCSC,Matrix},
      learn_seq::Vector,
      data::DataFrame,
      target_col::Union{Symbol, String},
      model_outpath::String;
      hidden_dim::Int=1000,
      batchsize::Int=64,
      loss_func::Function=Flux.mse,
      optimizer=Flux.Adam(0.001),
      model::Union{Missing, Chain} = missing,
      return_losses::Bool=false,
      verbose::Bool=true,
      n_batch_eval::Int=100,
      compute_accuracy::Bool=true,
      measures_func::Union{Function, Missing}=missing,
      kargs...)

Trains a deep learning model using the FIDDL method (frequency-informed deep discriminative learning). Optionally, after each n_batch_eval batches measures_func can be run to compute any measures which are then added to the data.

Note

If you get an OutOfMemory error, chances are that this is due to the eval_SC function being evaluated after each n_batch_eval batches. Setting compute_accuracy=false disables computing the mapping accuracy.

Returns a named tuple with the following values:

  • model: the trained model
  • data: the data, including any measures if computed by measures_func
  • losses_train: The losses of the data the model is trained on within each n_batch_eval batches.
  • losses: The losses of the full dataset after each n_batch_eval batches.
  • accs: The accuracies of the full dataset after each n_batch_eval batches.

Obligatory arguments

  • X_train::Union{SparseMatrixCSC,Matrix}: training input matrix of dimension m x n
  • Y_train::Union{SparseMatrixCSC,Matrix}: training output/target matrix of dimension m x k
  • learn_seq::Vector: List of indices in the order that the vectors in X_train and Y_train should be presented to the model for training.
  • data::DataFrame: The full data.
  • target_col::Union{Symbol, String}: The column with target word forms in the data.
  • model_outpath::String: filepath to where final model should be stored (in .bson format)
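
One plausible way to construct a frequency-informed learn_seq is sketched below: each row index is repeated as often as the corresponding word form's frequency, and the result is shuffled. The frequencies and the seed are made up, and this construction is an assumption about how such a sequence could be built, not a procedure prescribed by the function.

```julia
using Random

# Hypothetical token frequencies for four word forms (rows 1 to 4 of X_train).
freqs = [5, 1, 3, 2]

# Repeat each row index as often as its frequency ...
repeated = reduce(vcat, [fill(i, f) for (i, f) in enumerate(freqs)])

# ... and shuffle, so that tokens of the same word form are spread out.
rng = MersenneTwister(42)
learn_seq = shuffle(rng, repeated)

length(learn_seq)  # 11, the sum of the frequencies
```

Frequent word forms then appear more often in the sequence and are presented to the model more often during training.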

Optional arguments

  • hidden_dim::Int=1000: hidden dimension of the model
  • n_epochs::Int=100: number of epochs for which the model should be trained
  • batchsize::Int=64: batchsize during training
  • loss_func::Function=Flux.mse: Loss function. By default this is the mean squared error loss, but other options are possible, for example a cross-entropy loss (Flux.crossentropy). Make sure the loss function is compatible with the model!
  • optimizer=Flux.Adam(0.001): optimizer to use for training
  • model::Union{Missing, Chain} = missing: A custom model can be provided for training. It must match the input and output dimensions of the training data.
  • return_losses::Bool=false: whether to return, in addition to the model, the per-epoch losses for the training and test data as well as the per-epoch accuracy on the validation data
  • verbose::Bool=true: Turn on verbose mode
  • n_batch_eval::Int=100: Loss, accuracy and measures_func are evaluated every n_batch_eval batches.
  • compute_accuracy::Bool=true: Whether accuracy should be computed every n_batch_eval batches.
  • measures_func::Union{Missing, Function}=missing: A measures function which is run each n_batch_eval batches. For more information see The measures_func argument.
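
How batchsize and n_batch_eval interact can be illustrated with a small plain-Julia sketch. The sequence and the parameter values are arbitrary, and whether the actual implementation also evaluates after the final, possibly shorter batch is not covered here.

```julia
# Hypothetical learning sequence of 1000 row indices.
learn_seq = rand(1:50, 1000)
batchsize = 64
n_batch_eval = 5

# Split the sequence into consecutive batches; the last one may be shorter.
batches = collect(Iterators.partition(learn_seq, batchsize))

# Batch numbers after which loss, accuracy and measures would be evaluated.
eval_points = [b for b in eachindex(batches) if b % n_batch_eval == 0]

length(batches)  # 16
eval_points      # [5, 10, 15]
```

A smaller n_batch_eval gives a finer-grained learning curve but makes training slower and, if accuracy is computed, more memory-hungry.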