jackson JSON解析异常错误:无法识别的标记“http”:必须是('true '、'false'或' null ')

tzcvj98z  于 2022-11-09  发布在  其他
关注(0)|答案(6)|浏览(86)

下面的字符串是写入HDFS文件的有效JSON。

{  
  "id":"tag:search.twitter.com,2005:564407444843950080",
  "objectType":"activity",
  "actor":{  
    "objectType":"person",
    "id":"id:twitter.com:2302910022",
    "link":"http%3A%2F%2Fwww.twitter.com%2Fme7me4610012",
    "displayName":"",
    "postedTime":"2014-01-21T11:06:06.000Z",
    "image":"https%3A%2F%2Fpbs.twimg.com%2Fprofile_images%2F563125491159162881%2FfypkHK3M_normal.jpeg",
    "summary":"‏‏‏‏‏‏‏‏ضًـأّيِّعٌهّ أّروٌأّحًنِأّ تٌـشُـتٌـهّـيِّ مًنِ يِّفُـهّـمًهّـأّ فُـقُط  حسابي بالإنستقرام lloooo_20",
    "links":[  
      {  
        "href":null,
        "rel":"me"
      }
    ],
    "friendsCount":10503,
    "followersCount":10325,
    "listedCount":12,
    "statusesCount":84957,
    "twitterTimeZone":null,
    "verified":false,
    "utcOffset":null,
    "preferredUsername":"me7me4610012",
    "languages":[  
      "ar"
    ],
    "favoritesCount":17
  },
  "verb":"share",
  "postedTime":"2015-02-08T12:56:35.000Z",
  "generator":{  
    "displayName":"Twitter for Android",
    "link":"http%3A%2F%2Ftwitter.com%2Fdownload%2Fandroid"
  },
  "provider":{  
    "objectType":"service",
    "displayName":"Twitter",
    "link":"http%3A%2F%2Fwww.twitter.com"
  },
  "link":"http%3A%2F%2Ftwitter.com%2Fme7me4610012%2Fstatuses%2F564407444843950080",
  "body":"RT @sckud1: فيديو: إمام يرفض بغضب الصلاة على أحد قتلى حزب الله في سوريا بسبب إطلاق النار: ماعاد  http%3A%2F%2Ft.co%2FC55SaQKmUV http%3A%2F%2Ft.co%2Ft5TjIln…",
  "object":{  
    "id":"tag:search.twitter.com,2005:564407126526013440",
    "objectType":"activity",
    "actor":{  
      "objectType":"person",
      "id":"id:twitter.com:462268717",
      "link":"http%3A%2F%2Fwww.twitter.com/sckud1",
      "displayName":"صفق الهوى",
      "postedTime":"2012-01-12T19:24:17.000Z",
      "image":"https%3A%2F%2Fpbs.twimg.com%2Fprofile_images%2F508424482885615616%2FmPBGZBPx_normal.jpeg",
      "summary":"اعلانك في سوق الخليج يحقق لك الوصول الى اكثر من مليون متابع خليجي  http%3A%2F%2Fmarketgulf.com",
      "links":[  
        {  
          "href":"http%3A%2F%2Fmarketgulf.com",
          "rel":"me"
        }
      ],
      "friendsCount":435237,
      "followersCount":464951,
      "listedCount":708,
      "statusesCount":1071685,
      "twitterTimeZone":"Riyadh",
      "verified":false,
      "utcOffset":"10800",
      "preferredUsername":"sckud1",
      "languages":[  
        "ar"
      ],
      "location":{  
        "objectType":"place",
        "displayName":"Made in K S A"
      },
      "favoritesCount":77
    },
    "verb":"post",
    "postedTime":"2015-02-08T12:55:19.000Z",
    "generator":{  
      "displayName":"Tweet Old Post",
      "link":"http%3A%2F%2Fwww.ajaymatharu.com%2F"
    },
    "provider":{  
      "objectType":"service",
      "displayName":"Twitter",
      "link":"http%3A%2F%2Fwww.twitter.com"
    },
    "link":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatuses%2F564407126526013440",
    "body":"فيديو: إمام يرفض بغضب الصلاة على أحد قتلى حزب الله في سوريا بسبب إطلاق النار: ماعاد  http%3A%2F%2Ft.co%2FC55SaQKmUV http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
    "object":{  
      "objectType":"note",
      "id":"object:search.twitter.com,2005:564407126526013440",
      "summary":"فيديو: إمام يرفض بغضب الصلاة على أحد قتلى حزب الله في سوريا بسبب إطلاق النار: ماعاد  http%3A%2F%2Ft.co%2FC55SaQKmUV http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
      "link":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatuses%2F564407126526013440",
      "postedTime":"2015-02-08T12:55:19.000Z"
    },
    "favoritesCount":0,
    "twitter_entities":{  
      "hashtags":[  

      ],
      "trends":[  

      ],
      "urls":[  
        {  
          "url":"http%3A%2F%2Ft.co%2FC55SaQKmUV",
          "expanded_url":"http%3A%2F%2Fwww.hasterya.com%2Farchives%2F34688utm_source%3DReviveOldPost%26utm_medium%3Dsocial%26utm_campaign%3DReviveOldPost",
          "display_url":"hasterya.com/archives/34688…",
          "indices":[  
            85,
            107
          ]
        }
      ],
      "user_mentions":[  

      ],
      "symbols":[  

      ],
      "media":[  
        {  
          "id":564407126341468160,
          "id_str":"564407126341468160",
          "indices":[  
            108,
            130
          ],
          "media_url":"http%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
          "media_url_https":"https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
          "url":"http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
          "display_url":"pic.twitter.com/t5TjIlnZgN",
          "expanded_url":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatus%2F564407126526013440%2Fphoto%2F1",
          "type":"photo",
          "sizes":{  
            "large":{  
              "w":320,
              "h":180,
              "resize":"fit"
            },
            "thumb":{  
              "w":150,
              "h":150,
              "resize":"crop"
            },
            "small":{  
              "w":320,
              "h":180,
              "resize":"fit"
            },
            "medium":{  
              "w":320,
              "h":180,
              "resize":"fit"
            }
          }
        }
      ]
    },
    "twitter_extended_entities":{  
      "media":[  
        {  
          "id":564407126341468160,
          "id_str":"564407126341468160",
          "indices":[  
            108,
            130
          ],
          "media_url":"http%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
          "media_url_https":"https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
          "url":"http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
          "display_url":"pic.twitter.com/t5TjIlnZgN",
          "expanded_url":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatus%2F564407126526013440%2Fphoto%2F1",
          "type":"photo",
          "sizes":{  
            "large":{  
              "w":320,
              "h":180,
              "resize":"fit"
            },
            "thumb":{  
              "w":150,
              "h":150,
              "resize":"crop"
            },
            "small":{  
              "w":320,
              "h":180,
              "resize":"fit"
            },
            "medium":{  
              "w":320,
              "h":180,
              "resize":"fit"
            }
          }
        }
      ]
    },
    "twitter_filter_level":"low",
    "twitter_lang":"ar"
  },
  "favoritesCount":0,
  "twitter_entities":{  
    "hashtags":[  

    ],
    "trends":[  

    ],
    "urls":[  
      {  
        "url":"http%3A%2F%2Ft.co%2FC55SaQKmUV",
        "expanded_url":"http%3A%2F%2Fwww.hasterya.com%2Farchives%2F34688utm_source%3DReviveOldPost%26utm_medium%3Dsocial%26utm_campaign%3DReviveOldPost",
        "display_url":"hasterya.com/archives/34688…",
        "indices":[  
          97,
          119
        ]
      }
    ],
    "user_mentions":[  
      {  
        "screen_name":"sckud1",
        "name":"صفق الهوى",
        "id":462268717,
        "id_str":"462268717",
        "indices":[  
          3,
          10
        ]
      }
    ],
    "symbols":[  

    ],
    "media":[  
      {  
        "id":564407126341468160,
        "id_str":"564407126341468160",
        "indices":[  
          139,
          140
        ],
        "media_url":"http%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
        "media_url_https":"https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
        "url":"http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
        "display_url":"pic.twitter.com/t5TjIlnZgN",
        "expanded_url":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatus%2F564407126526013440%2Fphoto%2F1",
        "type":"photo",
        "sizes":{  
          "large":{  
            "w":320,
            "h":180,
            "resize":"fit"
          },
          "thumb":{  
            "w":150,
            "h":150,
            "resize":"crop"
          },
          "small":{  
            "w":320,
            "h":180,
            "resize":"fit"
          },
          "medium":{  
            "w":320,
            "h":180,
            "resize":"fit"
          }
        },
        "source_status_id":564407126526013440,
        "source_status_id_str":"564407126526013440"
      }
    ]
  },
  "twitter_extended_entities":{  
    "media":[  
      {  
        "id":564407126341468160,
        "id_str":"564407126341468160",
        "indices":[  
          139,
          140
        ],
        "media_url":"http%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
        "media_url_https":"https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FB9UtSoJIQAA07-r.jpg",
        "url":"http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
        "display_url":"pic.twitter.com/t5TjIlnZgN",
        "expanded_url":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatus%2F564407126526013440%2Fphoto%2F1",
        "type":"photo",
        "sizes":{  
          "large":{  
            "w":320,
            "h":180,
            "resize":"fit"
          },
          "thumb":{  
            "w":150,
            "h":150,
            "resize":"crop"
          },
          "small":{  
            "w":320,
            "h":180,
            "resize":"fit"
          },
          "medium":{  
            "w":320,
            "h":180,
            "resize":"fit"
          }
        },
        "source_status_id":564407126526013440,
        "source_status_id_str":"564407126526013440"
      }
    ]
  },
  "twitter_filter_level":"low",
  "twitter_lang":"ar",
  "retweetCount":1,
  "gnip":{  
    "matching_rules":[  
      {  
        "tag":"ISIS66"
      }
    ],
    "urls":[  
      {  
        "url":"http%3A%2F%2Ft.co%2Ft5TjIlnZgN",
        "expanded_url":"http%3A%2F%2Ftwitter.com%2Fsckud1%2Fstatus%2F564407126526013440%2Fphoto%2F1",
        "expanded_status":200
      },
      {  
        "url":"http%3A%2F%2Ft.co%2FC55SaQKmUV",
        "expanded_url":"http%3A%2F%2Fwww.hasterya.com%2Farchives%2F34688utm_source%3DReviveOldPost%26utm_medium%3Dsocial%26utm_campaign%3DReviveOldPost",
        "expanded_status":200
      }
    ],
    "klout_score":50,
    "language":{  
      "value":"ar"
    }
  }
}

编辑

我们配置了一个flume代理,它从该文件中读取数据并将其传递给Solr sink,但不幸的是,标题中的此异常被抛出。
这是堆栈跟踪

org.kitesdk.morphline.api.MorphlineRuntimeException: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'http': was expecting ('true', 'false' or 'null')
 at [Source: java.io.ByteArrayInputStream@20d7aa52; line: 1, column: 9]
    at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:98)
    at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
    at org.kitesdk.morphline.stdlib.TryRulesBuilder$TryRules.doProcess(TryRulesBuilder.java:120)
    at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
    at org.kitesdk.morphline.base.AbstractCommand.doProcess(AbstractCommand.java:181)
    at org.kitesdk.morphline.base.AbstractCommand.process(AbstractCommand.java:156)
    at org.apache.flume.sink.solr.morphline.MorphlineHandlerImpl.process(MorphlineHandlerImpl.java:128)
    at org.apache.flume.sink.solr.morphline.MorphlineSink.process(MorphlineSink.java:141)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:744)
Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'http': was expecting ('true', 'false' or 'null')
 at [Source: java.io.ByteArrayInputStream@20d7aa52; line: 1, column: 9]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1524)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:557)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3095)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2340)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:818)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:698)
    at com.fasterxml.jackson.databind.MappingIterator.hasNextValue(MappingIterator.java:159)
    at org.kitesdk.morphline.json.ReadJsonBuilder$ReadJson.doProcess(ReadJsonBuilder.java:109)
    at org.kitesdk.morphline.stdio.AbstractParser.doProcess(AbstractParser.java:96)
    ... 10 more
l2osamch

l2osamch1#

我们有以下字符串,它是一个有效的JSON...
显然JSON解析器不同意!
但是,该异常表明错误位于“第1行:column 9”,并且JSON开头附近没有“http”标记,所以我怀疑出现错误时解析器试图解析的字符串与此字符串不同。
你需要找出JSON实际上是什么被解析的。在调试器中运行应用程序,在JsonParseException的相关构造函数上设置断点...然后找出它试图解析的ByteArrayInputStream中的内容。

3vpjnl9f

3vpjnl9f2#

这可能是显而易见的,但是要确保你发送给解析器的URL对象不是一个包含www地址的String。这工作:

ObjectMapper mapper = new ObjectMapper();
    String www = "www.sample.pl";
    Weather weather = mapper.readValue(www, Weather.class);

但这将:

ObjectMapper mapper = new ObjectMapper();
    URL www = new URL("http://www.oracle.com/");
    Weather weather = mapper.readValue(www, Weather.class);
iyfjxgzm

iyfjxgzm3#

我面对这个异常很长一段时间,无法准确定位问题。异常说第1行第9列。我所做的错误是得到了flume正在处理的文件的第一行。
Apache flume以补丁的形式处理文件的内容。因此,当flume抛出这个异常并说第1行时,它意味着当前补丁中的第一行。
如果您的flume代理配置为使用批处理大小= 100,并且(例如)文件包含400行,则这意味着在以下行之一引发异常:1、101、201、301。

如何发现导致问题的线路?

你有三种方法可以做到。
1-提取源代码并在调试模式下运行代理。2如果你是一个像我一样的普通开发人员,不知道如何进行此操作,请检查其他两个选项。
2-尝试根据批处理大小拆分文件并再次运行flume代理。如果您将文件拆分为4个文件,并且在第301行和第400行之间存在无效的json,flume代理将处理前3个文件,并在第四个文件处停止。2取第四个文件,并再次将其拆分为更小的文件。继续该过程,直到到达仅包含一行的文件,并且flume在处理该文件时失败。
3-将flume代理的批处理大小减少到只有一个,并比较您正在使用的接收器输出中已处理事件的数量。例如,在我使用Solr接收器的情况下。文件包含400行。flume代理配置为批处理大小=100。当我运行flume代理时,它会在某个点失败并抛出异常。此时检查Solr中摄取了多少文档。如果在第346行存在无效的json,则索引到Solr中的文档数将是345,因此下一行是导致问题的行。
在我的情况下,我遵循了第三种选择,幸运的是,我精确地指出了导致问题的线路。
这是一个很长的答案,但它实际上并没有解决这个异常。我如何克服这个异常?
我不知道为什么Jackson库在解析一个包含转义字符\n \r \t的json字符串时会抱怨。我认为(但我不确定)Jackson解析器默认转义这些字符,这会将json字符串拆分为两行(在\n的情况下),然后将每行作为一个单独的json字符串处理。
在我的例子中,我们使用了一个自定义的拦截器,在Flume代理处理这些字符之前将其删除。这就是我们解决这个问题的方法。

yx2lnoni

yx2lnoni4#

在@请求Map中添加produces = "application/json"

9rygscc1

9rygscc15#

您尝试将包含http地址的字符串作为普通字符串传递,JSON解析器对此抛出异常:JsonParseException需要通过URL对象或JSON String对象正确地传递字符串(JSON.toString(url)在我例子中出现了类似的错误)。

1aaf6o9v

1aaf6o9v6#

您必须将正文转换为json格式才能符合内容类型。
所以在ruby中总是这样使用to_json函数:

response = HTTParty.get("https://myurl.com",
    headers: {
        "X-API-Version" => "2",
        "Content-type" => "application/json",
    },
    body: {
        "field1" => "super content",
        "field 2" => ["content0", "content1"]
        }.to_json
    )

相关问题