Description:
Sometimes we need to parse a paragraph of text, or pick out its nouns, verbs, and so on. Sorting the words of a paragraph by part of speech calls for Natural Language Processing (NLP), which is a part of Artificial Intelligence. A number of tools are available for this task, and they can be used from programming languages. Apache OpenNLP provides several tools for processing text according to your requirements, and its documentation describes how to use them in different situations. Today we show how to use Apache OpenNLP from Java to process a paragraph with its parser.
Requirements:
- Download the Apache OpenNLP jars from the OpenNLP website.
- Download JDK 7 (this example was tested on JDK 7).
- Download the parser-chunking model (en-parser-chunking.bin) from here (many other models are available there as well).
The code of the program follows:
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.HashSet;
import java.util.Set;
import opennlp.tools.cmdline.parser.ParserTool;
import opennlp.tools.parser.Parse;
import opennlp.tools.parser.Parser;
import opennlp.tools.parser.ParserFactory;
import opennlp.tools.parser.ParserModel;

public class ParserTest {
    static Set<String> nounPhrases = new HashSet<>();
    static Set<String> adjectivePhrases = new HashSet<>();
    static Set<String> verbPhrases = new HashSet<>();

    private static String line = "The Moon is a barren, rocky world without air and water. It has dark lava plain on its surface. " +
        "The Moon is filled with craters. It has no light of its own. It gets its light from the Sun. The Moon keeps changing its " +
        "shape as it moves round the Earth. It spins on its axis in 27.3 days stars were named after the Edwin Aldrin were the " +
        "first ones to set their foot on the Moon on 21 July 1969 They reached the Moon in their space craft named Apollo II";

    // Despite its name, this method walks the parse tree and collects nouns, adjectives and verbs.
    public void getNounPhrases(Parse p) {
        String type = p.getType();
        // Noun tags: NN, NNS, NNP, NNPS
        if (type.equals("NN") || type.equals("NNS") || type.equals("NNP") || type.equals("NNPS")) {
            nounPhrases.add(p.getCoveredText());
        }
        // Adjective tags: JJ, JJR, JJS
        if (type.equals("JJ") || type.equals("JJR") || type.equals("JJS")) {
            adjectivePhrases.add(p.getCoveredText());
        }
        // Verb tags: VB, VBP, VBG, VBD, VBN, VBZ
        if (type.equals("VB") || type.equals("VBP") || type.equals("VBG") || type.equals("VBD") || type.equals("VBN") || type.equals("VBZ")) {
            verbPhrases.add(p.getCoveredText());
        }
        // Recurse into the children so every node of the tree is visited.
        for (Parse child : p.getChildren()) {
            getNounPhrases(child);
        }
    }

    public void parserAction() throws Exception {
        // try-with-resources closes the model file once the model is loaded.
        try (InputStream is = new FileInputStream("en-parser-chunking.bin")) {
            ParserModel model = new ParserModel(is);
            Parser parser = ParserFactory.create(model);
            // Ask for the single best parse of the text.
            Parse[] topParses = ParserTool.parseLine(line, parser, 1);
            for (Parse p : topParses) {
                //p.show();
                getNounPhrases(p);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        new ParserTest().parserAction();
        System.out.println("List of Noun Parse : " + nounPhrases);
        System.out.println("List of Adjective Parse : " + adjectivePhrases);
        System.out.println("List of Verb Parse : " + verbPhrases);
    }
}
In this program, "NN", "NNP", etc. are the part-of-speech tags (Penn Treebank codes) used to find nouns, adjectives, verbs, and so on. Here is the list of all codes.
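As a rough illustration of how these codes group together, here is a minimal sketch that maps a tag to a coarse category by its prefix. It relies on the Penn Treebank convention that noun tags start with "NN", adjective tags with "JJ", and verb tags with "VB". The class name TagCategories and the method categoryOf are made up for this example and are not part of OpenNLP.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: groups Penn Treebank tags by prefix.
public class TagCategories {

    // Maps a part-of-speech tag to a coarse category.
    public static String categoryOf(String tag) {
        if (tag.startsWith("NN")) return "noun";      // NN, NNS, NNP, NNPS
        if (tag.startsWith("JJ")) return "adjective"; // JJ, JJR, JJS
        if (tag.startsWith("VB")) return "verb";      // VB, VBD, VBG, VBN, VBP, VBZ
        return "other";                               // everything else, e.g. DT, CC, IN
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("NN", "NNPS", "JJR", "VBZ", "DT");
        for (String tag : tags) {
            System.out.println(tag + " -> " + categoryOf(tag));
        }
    }
}
```

A prefix check like this could stand in for the long chains of equals() calls, at the cost of also matching any future tag that happens to share the prefix.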
Download The Example Code From here
May I know how to extract noun phrases, verb phrases, and prepositional phrases when using the chunker ("en-chunker.bin")?
These are the codes to extract nouns, verbs, etc.:
http://bulba.sdsu.edu/jeanette/thesis/PennTags.html
Hello, I am really new to OpenNLP! I copied and pasted your code into Eclipse and added the external jars, but I am still getting an error:
"Exception in thread "main" java.lang.NoClassDefFoundError: opennlp/model/DataReader
at opennlp.tools.util.model.BaseModel.createArtifactSerializers(BaseModel.java:343)
at opennlp.tools.util.model.BaseModel.createBaseArtifactSerializers(BaseModel.java:374)
at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:211)
at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:181)
at opennlp.tools.parser.ParserModel.<init>(ParserModel.java:152)
at ParserTest.parserAction(ParserTest.java:47)
at ParserTest.main(ParserTest.java:57)
Caused by: java.lang.ClassNotFoundException: opennlp.model.DataReader
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 7 more
"
I would be most grateful if you could please help me :) Thank you.
Are your OpenNLP jars on the classpath?
Please extend the ParserTest class in a new class and write the main method there, using inheritance.
I get this exception in Eclipse (Android):
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at java.io.DataInputStream.readUTF(Unknown Source)
at java.io.DataInputStream.readUTF(Unknown Source)
at opennlp.maxent.io.BinaryGISModelReader.readUTF(Unknown Source)
at opennlp.maxent.io.SuffixSensitiveGISModelReader.readUTF(Unknown Source)
at opennlp.maxent.io.GISModelReader.getPredicates(Unknown Source)
at opennlp.maxent.io.GISModelReader.getModel(Unknown Source)
Increase the heap size for Java, for example:
java -XmxSIZE com.mycompany.MyClass
where SIZE may be, for example,
-Xmx1g
Note: g stands for GB, m for MB, and k for KB.
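To confirm that the -Xmx setting actually reached the JVM, a small check like the following may help. It is only a sketch: the class name HeapInfo is made up for this example, and the value reported by Runtime.maxMemory() can be slightly lower than the figure passed to -Xmx.

```java
// Hypothetical helper: prints the maximum heap the running JVM is allowed to use.
public class HeapInfo {

    // Maximum heap size in megabytes, as reported by the JVM.
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it once with and once without the flag (for example java -Xmx1g HeapInfo) shows whether the larger heap is in effect before the parser model is loaded.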
How do I open a text file and then find the nouns and adjectives in it?
Please help.
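import java.io.*;

public class ReadText {
    public static void main(String[] args) throws Exception {
        File file = new File("PATH_OF_FILE\\test.txt");
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String st;
            int line = 0;
            // Read and print the file one line at a time
            while ((st = br.readLine()) != null) {
                line++;
                System.out.println("Line :: " + line + " " + st);
            }
        }
        System.out.println("----------------------");
    }
}
You can pass the text read into the st variable to the line variable of ParserTest, and it will return the result you want. Note that line is declared private in ParserTest, so it has to be made accessible or the text has to be passed in some other way, for example as a method parameter.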
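Since the parser in the post works on a single String, it may be easier to read the whole file into one String first. The following is a minimal sketch under that assumption. The class name ReadWholeFile, the method readAsSingleLine, and the default path test.txt are placeholders invented for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: reads a text file into one String for the parser.
public class ReadWholeFile {

    // Reads every non-empty line of the file and joins them with single spaces.
    public static String readAsSingleLine(Path path) throws IOException {
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        StringBuilder text = new StringBuilder();
        for (String l : lines) {
            String trimmed = l.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            if (text.length() > 0) {
                text.append(' ');
            }
            text.append(trimmed);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "test.txt" is a placeholder; pass the real path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "test.txt");
        String text = readAsSingleLine(path);
        System.out.println(text);
        // The resulting text could then be given to ParserTool.parseLine(text, parser, 1)
        // in place of the hard-coded line variable of ParserTest.
    }
}
```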
Hello,
For the above program, when I initialise
private static String line ="ANDROID android Android";
i get the output as
(TOP (NP (DT android) (NN android) (CC android)))
If I put only "android" as the line value, it does not identify the word, but when I put "ANDROID android " it identifies only the second android (though it is case-sensitive). The same happens for every word.
I have also added line=line.toLowerCase(); to ignore case.
private static String line ="synful Synful";
(TOP (NP (JJ synful) (NN synful)))
List of Noun Parse : [synful]
List of Adjective Parse : [synful]
List of Verb Parse : []
Why is the output different?
Hello Monica,
The result depends on the OpenNLP tools (models) in use. There are many of them and I am not sure which one you are using, so please verify.
I need noun phrases like "The Moon" or "Convention centre", but the program splits them into two words. My name is Regnath Franco, and it splits my name as well. How can I get whole phrases?
Please help me.
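One possible approach to whole phrases: the program in the post collects word-level tags (NN, NNP, ...), while the parse tree also contains phrase-level nodes such as NP that cover several words. With OpenNLP, the recursive walk could test p.getType().equals("NP") and store p.getCoveredText(). The sketch below shows the same idea without OpenNLP, on a bracketed parse string in the format printed earlier in this thread. The class BracketPhrases and the sample tree are written by hand for illustration and are not real parser output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the text covered by every node with a given label
// in a well-formed bracketed parse such as "(TOP (NP (DT The) (NNP Moon)))".
public class BracketPhrases {
    private final String s;
    private final String wanted;
    private final List<String> found = new ArrayList<>();
    private int pos = 0;

    private BracketPhrases(String s, String wanted) {
        this.s = s;
        this.wanted = wanted;
    }

    public static List<String> phrasesOfType(String tree, String type) {
        BracketPhrases walker = new BracketPhrases(tree, type);
        walker.node();
        return walker.found;
    }

    // Parses one "(LABEL child child ...)" node and returns the text it covers.
    private String node() {
        pos++; // skip '('
        int start = pos;
        while (s.charAt(pos) != ' ') pos++;
        String label = s.substring(start, pos);
        StringBuilder covered = new StringBuilder();
        while (s.charAt(pos) != ')') {
            if (s.charAt(pos) == ' ') { pos++; continue; }
            String piece;
            if (s.charAt(pos) == '(') {
                piece = node(); // nested node: recurse
            } else {
                int w = pos;    // plain word: read up to the next space or ')'
                while (s.charAt(pos) != ')' && s.charAt(pos) != ' ') pos++;
                piece = s.substring(w, pos);
            }
            if (covered.length() > 0) covered.append(' ');
            covered.append(piece);
        }
        pos++; // skip ')'
        if (label.equals(wanted)) found.add(covered.toString());
        return covered.toString();
    }

    public static void main(String[] args) {
        // Hand-written sample tree, assumed for illustration only.
        String tree = "(TOP (S (NP (DT The) (NNP Moon)) (VP (VBZ is) (NP (DT a) (JJ barren) (NN world)))))";
        System.out.println(phrasesOfType(tree, "NP")); // prints [The Moon, a barren world]
    }
}
```

Whether a name such as "Regnath Franco" ends up inside a single NP depends on the parse the model produces, so results will vary with the input text.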
Hello Harmeet,
I am currently using the OpenNLP models you mentioned, i.e. http://opennlp.sourceforge.net/models-1.5/, with the en-parser-chunking.bin file.
I am still facing the issue.
Thanks, Monica
Hello Harmeet,
I got exceptions when I ran this program, even though I added the external jars. The exceptions are as follows.
Exception in thread "main" java.lang.NoClassDefFoundError: opennlp/model/DataReader
at opennlp.tools.util.model.BaseModel.createArtifactSerializers(BaseModel.java:343)
at opennlp.tools.util.model.BaseModel.createBaseArtifactSerializers(BaseModel.java:374)
at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:211)
at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:181)
at opennlp.tools.parser.ParserModel.<init>(ParserModel.java:152)
at ParserTest.parserAction(ParserTest.java:37)
at ParserTest.main(ParserTest.java:46)
Caused by: java.lang.ClassNotFoundException: opennlp.model.DataReader
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
I am not able to resolve it. Please help me.
Hi Swedha,
Before running your program, please ensure that the OpenNLP jar files are on the classpath and that you are using the right jars. This error occurs when the classes are not on the classpath, when your jars conflict with other jars, or when you are not using the appropriate jars.
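A quick way to see which classes are missing is to try loading them by name before running the parser. This is a minimal sketch: the class name ClasspathCheck is made up for this example, and the two class names checked are taken from the program's imports and from the stack trace above. As far as I know, in the OpenNLP 1.5.x releases the opennlp.model package came in a separate opennlp-maxent jar, so a missing opennlp.model.DataReader may mean that jar was not added alongside opennlp-tools; please verify this against your download.

```java
// Hypothetical diagnostic: reports whether the classes the parser needs can be loaded.
public class ClasspathCheck {

    // Returns true if the named class can be found on the current classpath.
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "opennlp.tools.parser.ParserModel", // imported by the program in the post
            "opennlp.model.DataReader"          // the class named in the stack trace
        };
        for (String name : required) {
            System.out.println(name + " : " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```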
Hi Harmeet,
Thank you for your reply. I was able to fix the problem. The program executed and produced the output. Thanks a lot.
Hi,
In Android Studio, with the jar file "opennlp-tools-1.5.3.jar",
I am getting this error:
java.lang.NoClassDefFoundError: opennlp.model.GenericModelReader
opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:35)
Hi Harmeet,
Thanks for the nice example.
Hi Harmeet,
I have implemented this code and imported the jar files into an Android app. The app shows the result, but it takes 3-4 minutes to produce it, and the APK size grows to 38.6 MB. Also, the app is running on a single device. Please suggest a solution.