Python
BBNYA, 2020-11-19 08:46:47

How to run MALLET on macOS?

I am trying to train a model with MALLET through gensim:

model = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus, num_topics=num_topics, id2word=id2word)


I get this error:
CalledProcessError: Command '/Users/username/mallet/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /var/folders/4z/5nc32mdj54ldk9f9znvpjw500000gn/T/c1ffd6_corpus.txt --output /var/folders/4z/5nc32mdj54ldk9f9znvpjw500000gn/T/c1ffd6_corpus.mallet' returned non-zero exit status 1.


When I run this command directly in the console, I get:

Error: Could not find or load main class cc.mallet.classify.tui.Csv2Vectors
Caused by: java.lang.ClassNotFoundException: cc.mallet.classify.tui.Csv2Vectors
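
The CalledProcessError above shows only the exit status, so the Java message had to be recovered by re-running the command by hand. A minimal sketch of doing the same from Python follows. `run_and_capture` is a hypothetical helper name, and the demo command is a stand-in child process, not the real MALLET invocation:

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in for the failing command: writes to stderr and exits with status 1.
    # Replace it with the argument list of the command from the error message.
    demo = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
    code, out, err = run_and_capture(demo)
    print("exit status:", code)
    print("stderr:", err)
```

Passing the command as a list of arguments avoids shell quoting problems with options such as the token regex.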


The file cc/mallet/classify/tui/Csv2Vectors.java does exist at this location, but apparently the paths in the script have to be specified differently for Java to find the class. I have tried several variants and none of them work.

Here is the code of the MALLET launch script:
#!/bin/bash


malletdir=`dirname $0`
malletdir=`dirname $malletdir`
echo $malletdir

# CLASSPATH=/Users/bobon/mallet/src/cc/mallet/classify/tui/

# original cp
# cp=$malletdir/class:$malletdir/lib/mallet-deps.jar:$CLASSPATH
#echo $cp
# cp below: I experimented with the paths
cp=$malletdir/:$malletdir/lib/mallet-deps.jar:$CLASSPATH
echo $cp


MEMORY=1g

CMD=$1
shift

help()
{
cat <<EOF
Mallet 2.0 commands: 

  import-dir         load the contents of a directory into mallet instances (one per file)
  import-file        load a single file into mallet instances (one per line)
  import-svmlight    load SVMLight format data files into Mallet instances
  info               get information about Mallet instances
  train-classifier   train a classifier from Mallet data files
  classify-dir       classify the contents of a directory with a saved classifier
  classify-file      classify data from a single file with a saved classifier
  classify-svmlight  classify data from a single file in SVMLight format
  train-topics       train a topic model from Mallet data files
  infer-topics       use a trained topic model to infer topics for new documents
  evaluate-topics    estimate the probability of new documents under a trained model
  prune              remove features based on frequency or information gain
  split              divide data into testing, training, and validation portions
  bulk-load          for big input files, efficiently prune vocabulary and import docs

Include --help with any option for more information
EOF
}

CLASS=

case $CMD in
  import-dir) CLASS=cc.mallet.classify.tui.Text2Vectors;;
  import-file) CLASS=cc.mallet.classify.tui.Csv2Vectors;;
  import-svmlight) CLASS=cc.mallet.classify.tui.SvmLight2Vectors;;
  info) CLASS=cc.mallet.classify.tui.Vectors2Info;;
  train-classifier) CLASS=cc.mallet.classify.tui.Vectors2Classify;;
  classify-dir) CLASS=cc.mallet.classify.tui.Text2Classify;;
  classify-file) CLASS=cc.mallet.classify.tui.Csv2Classify;;
  classify-svmlight) CLASS=cc.mallet.classify.tui.SvmLight2Classify;;
  train-topics) CLASS=cc.mallet.topics.tui.TopicTrainer;;
  infer-topics) CLASS=cc.mallet.topics.tui.InferTopics;;
  evaluate-topics) CLASS=cc.mallet.topics.tui.EvaluateTopics;;
  prune) CLASS=cc.mallet.classify.tui.Vectors2Vectors;;
  split) CLASS=cc.mallet.classify.tui.Vectors2Vectors;;
  bulk-load) CLASS=cc.mallet.util.BulkLoader;;
  run) CLASS=$1; shift;;
  *) echo "Unrecognized command: $CMD"; help; exit 1;;
esac


echo "$cp" $CLASS 
echo "$@"
java -Xmx$MEMORY -ea -Djava.awt.headless=true -Dfile.encoding=UTF-8 -server -classpath "$cp" $CLASS "$@"
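
A ClassNotFoundException means the JVM found no compiled `Csv2Vectors.class` under any classpath entry; a `.java` source file does not count. The sketch below checks a directory the way the JVM resolves a class name, assuming the layout implied by the script's original `cp` line (compiled classes under `$malletdir/class`). `find_compiled_class` is a hypothetical helper, and whether `class/` exists in this install is an assumption to verify, since a source checkout may need to be built first.

```python
from pathlib import Path


def find_compiled_class(classpath_entry, class_name):
    """Return the .class file for class_name under a classpath directory, or None."""
    parts = class_name.split(".")
    # cc.mallet.classify.tui.Csv2Vectors -> cc/mallet/classify/tui/Csv2Vectors.class
    relative = Path(*parts[:-1], parts[-1] + ".class")
    candidate = Path(classpath_entry) / relative
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Install path taken from the error message; adjust to the actual location.
    mallet_dir = Path("/Users/username/mallet")
    # First entry mirrors the edited cp line, second mirrors the original one.
    for entry in (mallet_dir, mallet_dir / "class"):
        found = find_compiled_class(entry, "cc.mallet.classify.tui.Csv2Vectors")
        print(entry, "->", found)
```

If the lookup returns `None` for both entries, no compiled class is present at those locations, which would match the error above.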
