java

woobm2wo  于 2021-07-03  发布在  Java
关注(0)|答案(1)|浏览(304)

我是一个新的java开发人员。
我和jsoup有点问题。有几天,我做了一个网络爬虫(唯一的目的是练习和学习一些新的东西),它是工作的,由于代码中的错误,我必须解决,但至少它运行良好,通过控制台。我的意思是,它没有达到预期的效果,而是在运行。但现在我不明白发生了什么,我有麻烦有关的汇编。
这是输出中的错误:

Scanning for projects...

--------------< com.webcrawler.jsoupexample:jsoupexample >--------------
Building jsoupexample 1.0-SNAPSHOT
--------------------------------[ jar ]---------------------------------

--- maven-resources-plugin:2.6:resources (default-resources) @ jsoupexample ---
Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
Copying 0 resource

--- maven-compiler-plugin:3.1:compile (default-compile) @ jsoupexample ---
Changes detected - recompiling the module!
File encoding has not been set, using platform encoding Cp1252, i.e. build is platform dependent!
Compiling 2 source files to C:\Users\**\**\**\**\**\web-crawler-jsoup-example-master\webcrawler\target\classes
-------------------------------------------------------------
COMPILATION ERROR : 
-------------------------------------------------------------
com/webcrawler/jsoupexample/ParserEngine.java:[3,17] package org.jsoup does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[4,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[5,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[6,24] package org.jsoup.select does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[22,9] cannot find symbol
  symbol:   class Document
  location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[22,24] cannot find symbol
  symbol:   variable Jsoup
  location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[23,9] cannot find symbol
  symbol:   class Elements
  location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[25,14] cannot find symbol
  symbol:   class Element
  location: class com.webcrawler.jsoupexample.ParserEngine
8 errors 
-------------------------------------------------------------
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time:  3.465 s
Finished at: 2020-12-06T20:00:55-03:00
------------------------------------------------------------------------
Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project jsoupexample: Compilation failure: Compilation failure: 
com/webcrawler/jsoupexample/ParserEngine.java:[3,17] package org.jsoup does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[4,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[5,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[6,24] package org.jsoup.select does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[22,9] cannot find symbol
  symbol:   class Document
  location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[22,24] cannot find symbol
  symbol:   variable Jsoup
  location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[23,9] cannot find symbol
  symbol:   class Elements
  location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[25,14] cannot find symbol
  symbol:   class Element
  location: class com.webcrawler.jsoupexample.ParserEngine
-> [Help 1]

To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.

For more information about the errors and possible solutions, please read the following articles:
[Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

这是我的 pom.mxl ```


4.0.0

<groupId>com.webcrawler.jsoupexample</groupId>
<artifactId>jsoupexample</artifactId>
<version>1.0-SNAPSHOT</version>

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.11.3</version>
    </dependency>
</dependencies>

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements; //this 4 imports has error now, days ago didn't have error

import java.io.IOException;
import java.util.ArrayList;

public class ParserEngine {

private String baseUrl;
private ArrayList<String> urlList;

public ParserEngine(String baseUrl){
    this.baseUrl = baseUrl;
    this.urlList = new ArrayList<String>();
}

public void crawl(String url) throws IOException {
    Document doc = Jsoup.connect(url).ignoreContentType(true).get();
    Elements links = doc.select("a[href]");

    //here I found the problem why the crawler doesn't work as I want,         
    //but it isn't my actual issue, i want to be able to run it again in console
    for (Element link : links) {
        String actualUrl = link.attr("abs:href"); 

        if (!urlList.contains(actualUrl) & actualUrl.startsWith(baseUrl)){
            print(" * a: <%s>  (%s)", actualUrl, trim(link.text(), 35));
            urlList.add(actualUrl);
            crawl(actualUrl);
        }
    }
}

private static void print(String msg, Object... args) {
    System.out.println(String.format(msg, args));
}

private static String trim(String s, int width) {
    if (s.length() > width)
        return s.substring(0, width-1) + ".";
    else
        return s;
}

public String getBaseUrl(){
    return baseUrl;
}

public void setBaseUrl(String url){
    baseUrl = url;
}

public ArrayList<String> getUrlList(){
    return urlList;
}

}

这里是 `Main.Java` ```
package com.webcrawler.jsoupexample;

import java.io.IOException;

public class Main {

    public static void main(String[] args) throws IOException {
        String url = "http://elfreneticoinformatico.com";
        ParserEngine parser = new ParserEngine(url);
        parser.crawl(parser.getBaseUrl());
        System.out.println("Crawler finished. Total URLs: " + parser.getUrlList().size());
    }

}

有人能帮忙吗?

hvvq6cgz

hvvq6cgz1#

如果您没有使用一个像样的javaide(集成开发环境),我建议您使用netbeans、intellijidea,或者eclipse。netbeans更容易启动,因为它可以直接与maven项目一起工作。ide甚至可以在您构建之前检测到问题。
…我试着用maven来建造它 clean 以及 compile 目标,并用 IntelliJ IDEA Community 2020.3 而且在 NetBeans 12.2 ,和。。。它的构造和运行都很好,所以我不确定是怎么回事。netbeans生成输出为:

cd P:\Users\infernoz\Documents\Java_Projects\infernoz\jsoup-question; "JAVA_HOME=C:\\Program Files\\AdoptOpenJDK\\jdk-8.0.275.1-hotspot" cmd /c "\"C:\\Program Files\\NetBeans-12.2\\netbeans\\java\\maven\\bin\\mvn.cmd\" -Dmaven.ext.class.path=\"C:\\Program Files\\NetBeans-12.2\\netbeans\\java\\maven-nblib\\netbeans-eventspy.jar\" --errors --errors clean compile"
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] --------------< com.webcrawler.jsoupexample:jsoupexample >--------------
[INFO] Building jsoupexample 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jsoupexample ---
[INFO] Deleting P:\Users\infernoz\Documents\Java_Projects\rwperrott\jsoup-question\target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ jsoupexample ---
[WARNING] Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 0 resource
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ jsoupexample ---
[INFO] Changes detected - recompiling the module!
[WARNING] File encoding has not been set, using platform encoding Cp1252, i.e. build is platform dependent!
[INFO] Compiling 2 source files to P:\Users\infernoz\Documents\Java_Projects\rwperrott\jsoup-question\target\classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  4.496 s
[INFO] Finished at: 2020-12-22T02:06:04Z
[INFO] ------------------------------------------------------------------------

intellij运行输出为:

"C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\bin\java.exe" -javaagent:C:\Users\infernoz\AppData\Local\JetBrains\Toolbox\apps\IDEA-C\ch-1\203.5981.155\lib\idea_rt.jar=65151:C:\Users\infernoz\AppData\Local\JetBrains\Toolbox\apps\IDEA-C\ch-1\203.5981.155\bin -Dfile.encoding=UTF-8 -classpath "C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\charsets.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\access-bridge-64.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\cldrdata.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\dnsns.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\jaccess.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\localedata.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\nashorn.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunec.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunjce_provider.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunmscapi.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunpkcs11.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\zipfs.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\jce.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\jfr.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\jsse.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\management-agent.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\resources.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\rt.jar;P:\Users\infernoz\Documents\Java_Projects\rwperrott\jsoup-question\target\classes;M:\Repository\org\jsoup\jsoup\1.13.1\jsoup-1.13.1.jar" com.webcrawler.jsoupexample.Main
Crawler finished. Total URLs: 0

Process finished with exit code 0

相关问题