Usage and code examples of the org.htmlparser.Attribute class

x33g5p2x, reposted on 2022-01-16 in Other

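This article collects Java code examples for the org.htmlparser.Attribute class and shows how the class is used. The examples come mainly from GitHub, Stack Overflow, Maven and similar platforms, and were extracted from selected projects, so they should serve as a useful reference. Details of the Attribute class:
Package path: org.htmlparser.Attribute
Class name: Attribute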

About Attribute

An attribute within a tag. Holds the name, assignment string, value and quote character.

This class was made deliberately simple. Except for #setRawValue, the properties are completely orthogonal, that is: each property is independent of the others. This means you have enough rope here to hang yourself, and it's very easy to create malformed HTML. Where it's obvious, warnings and notes have been provided in the setters' javadocs, but it is up to you -- the programmer -- to ensure that the contents of the four fields will yield valid HTML (if that's what you want).

Be especially mindful of quotes and assignment strings. These are handled by the constructors where it's obvious, but in general, you need to set them explicitly when building an attribute. For example, to construct the attribute label="A multi word value." you could use:

attribute = new Attribute (); 
attribute.setName ("label"); 
attribute.setAssignment ("="); 
attribute.setValue ("A multi word value."); 
attribute.setQuote ('"');

or

attribute = new Attribute (); 
attribute.setName ("label"); 
attribute.setAssignment ("="); 
attribute.setRawValue ("A multi word value.");

or

attribute = new Attribute ("label", "A multi word value.");

Note that the assignment value and quoting need to be set separately when building the attribute from scratch using the properties.
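
To make the point about orthogonal fields concrete, here is a minimal sketch that does not use org.htmlparser at all. The class AttributeRenderSketch and its render method are hypothetical names; the method only assumes that the text form is the name, the assignment string, and the value wrapped in the quote character (when one is set), concatenated in that order.

```java
/**
 * Sketch only: a hypothetical stand-in for the four Attribute fields.
 * It is NOT the org.htmlparser implementation; it just concatenates
 * name, assignment, quote, value, quote, skipping anything unset.
 */
final class AttributeRenderSketch {

    static String render(String name, String assignment, String value, char quote) {
        StringBuilder buffer = new StringBuilder();
        if (name != null) {
            buffer.append(name);
        }
        if (assignment != null) {
            buffer.append(assignment);
        }
        // A quote of zero means "no quote character".
        if (quote != 0) {
            buffer.append(quote);
        }
        if (value != null) {
            buffer.append(value);
        }
        if (quote != 0) {
            buffer.append(quote);
        }
        return buffer.toString();
    }
}
```

Under these assumptions, render("label", "=", "A multi word value.", '"') gives label="A multi word value.", while the same call with a zero quote gives label=A multi word value., which a browser would read as an attribute named label followed by several stray tokens. Nothing in the four fields prevents that combination, which is why the quote has to be set deliberately.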

Valid States for Attributes.

| Description | toString() | Name | Assignment | Value | Quote |
|---|---|---|---|---|---|
| whitespace attribute | value | null | null | "value" | 0 |
| standalone attribute | name | "name" | null | null | 0 |
| empty attribute | name= | "name" | "=" | null | 0 |
| empty single quoted attribute | name='' | "name" | "=" | null | ' |
| empty double quoted attribute | name="" | "name" | "=" | null | " |
| naked attribute | name=value | "name" | "=" | "value" | 0 |
| single quoted attribute | name='value' | "name" | "=" | "value" | ' |
| double quoted attribute | name="value" | "name" | "=" | "value" | " |

In words:
If Name is null, and Assignment is null, and Quote is zero, it's whitespace and Value has the whitespace text -- value
If Name is not null, and both Assignment and Value are null it's a standalone attribute -- name
If Name is not null, and Assignment is an equals sign, and Quote is zero it's an empty attribute -- name=
If Name is not null, and Assignment is an equals sign, and Value is "" or null, and Quote is ' it's an empty single quoted attribute -- name=''
If Name is not null, and Assignment is an equals sign, and Value is "" or null, and Quote is " it's an empty double quoted attribute -- name=""
If Name is not null, and Assignment is an equals sign, and Value is something, and Quote is zero it's a naked attribute -- name=value
If Name is not null, and Assignment is an equals sign, and Value is something, and Quote is ' it's a single quoted attribute -- name='value'
If Name is not null, and Assignment is an equals sign, and Value is something, and Quote is " it's a double quoted attribute -- name="value"
All other states are invalid HTML.
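
The rules above can be read as a small decision procedure. The following is a rough sketch of it in plain Java, independent of org.htmlparser: AttributeStateSketch and classify are hypothetical names, the four parameters simply mirror the Name, Assignment, Value and Quote fields, and two simplifications are assumed (the assignment must be exactly "=", and an empty string counts as "no value" for the unquoted case as well).

```java
/**
 * Sketch only: classifies a combination of the four fields into one of the
 * eight valid states, or "invalid". Hypothetical helper, not library code.
 */
final class AttributeStateSketch {

    static String classify(String name, String assignment, String value, char quote) {
        if (name == null) {
            // Whitespace: no name, no assignment, no quote; Value holds the text.
            return (assignment == null && quote == 0 && value != null)
                    ? "whitespace attribute" : "invalid";
        }
        if (assignment == null) {
            // Standalone: a bare name with no assignment, value or quote.
            return (value == null && quote == 0) ? "standalone attribute" : "invalid";
        }
        if (!"=".equals(assignment)) {
            // Assumption: only a plain equals sign is accepted here.
            return "invalid";
        }
        boolean empty = (value == null || value.isEmpty());
        switch (quote) {
            case 0:
                return empty ? "empty attribute" : "naked attribute";
            case '\'':
                return empty ? "empty single quoted attribute" : "single quoted attribute";
            case '"':
                return empty ? "empty double quoted attribute" : "double quoted attribute";
            default:
                // Any other quote character falls outside the table.
                return "invalid";
        }
    }
}
```

For instance, classify("name", "=", "value", (char) 0) falls through to the naked attribute case, while a name with a value but no assignment matches none of the rows and is reported as invalid.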

From the HTML 4.01 Specification, W3C Recommendation 24 December 1999 http://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.2:

3.2.2 Attributes

Elements may have associated properties, called attributes, which may have values (by default, or set by authors or scripts). Attribute/value pairs appear before the final ">" of an element's start tag. Any number of (legal) attribute value pairs, separated by spaces, may appear in an element's start tag. They may appear in any order.

In this example, the id attribute is set for an H1 element:

<H1 id="section1">
This is an identified heading thanks to the id attribute
</H1>

By default, SGML requires that all attribute values be delimited using either double quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39). Single quote marks can be included within the attribute value when the value is delimited by double quote marks, and vice versa. Authors may also use numeric character references to represent double quotes (&#34;) and single quotes (&#39;). For double quotes authors can also use the character entity reference &quot;.

In certain cases, authors may specify the value of an attribute without any quotation marks. The attribute value may only contain letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), periods (ASCII decimal 46), underscores (ASCII decimal 95), and colons (ASCII decimal 58). We recommend using quotation marks even when it is possible to eliminate them.
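
The quoting rules quoted above translate fairly directly into code. This is a small sketch under stated assumptions, not part of org.htmlparser: QuotingSketch, canBeUnquoted and quote are hypothetical names, and the fallback of escaping double quotes as &quot; when a value contains both kinds of quote is one reasonable choice among several.

```java
/**
 * Sketch only: helpers that follow the HTML 4.01 quoting rules quoted above.
 * Hypothetical code, not taken from the library.
 */
final class QuotingSketch {

    /** True if the value consists only of letters, digits, '-', '.', '_' and ':'. */
    static boolean canBeUnquoted(String value) {
        if (value == null || value.isEmpty()) {
            return false;
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean allowed = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == ':';
            if (!allowed) {
                return false;
            }
        }
        return true;
    }

    /** Always quotes, as the specification recommends, picking a delimiter that is safe. */
    static String quote(String value) {
        if (value.indexOf('"') < 0) {
            return "\"" + value + "\"";
        }
        if (value.indexOf('\'') < 0) {
            return "'" + value + "'";
        }
        // Both kinds of quote occur: escape the double quotes and delimit with them.
        return "\"" + value.replace("\"", "&quot;") + "\"";
    }
}
```

A value such as section1 passes canBeUnquoted, whereas A multi word value. does not because of the spaces. The quote helper never returns an unquoted value, in line with the recommendation to use quotation marks even where they could be omitted.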

Attribute names are always case-insensitive.
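
One way to honor case-insensitive names in client code is to keep attributes in a map whose keys compare without regard to case. The sketch below is an illustration only and does not reflect how org.htmlparser stores attributes; CaseInsensitiveNamesSketch and newAttributeMap are hypothetical names, while TreeMap and String.CASE_INSENSITIVE_ORDER are standard Java.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Sketch only: a name-to-value map in which "ID", "Id" and "id" are the same key.
 * Hypothetical helper, not library code.
 */
final class CaseInsensitiveNamesSketch {

    static Map<String, String> newAttributeMap() {
        // The comparator makes lookups and insertions ignore letter case.
        return new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    }
}
```

With such a map, putting a value under ID and reading it back under id returns the same value, and a later put under Id overwrites it rather than adding a second entry.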

Attribute values are generally case-insensitive. The definition of each attribute in the reference manual indicates whether its value is case-insensitive.

All the attributes defined by this specification are listed in the attribute index.

Code examples

Code example source: origin: org.fitnesse/fitnesse

private static Vector<Attribute> cloneAttributes(Vector<Attribute> attributes) {
 Vector<Attribute> newAttributes = new Vector<>(attributes.size());
 for (Attribute a : attributes) {
  newAttributes.add(new Attribute(a.getName(), a.getAssignment(), a.getValue(), a.getQuote()));
 }
 return newAttributes;
}

Code example source: origin: org.htmlparser/htmllexer

/**
 * Create an attribute with the name, assignment, value and quote given.
 * If the quote value is zero, assigns the value using {@link #setRawValue}
 * which sets the quote character to a proper value if necessary.
 * @param name The name of this attribute.
 * @param assignment The assignment string of this attribute.
 * @param value The value of this attribute.
 * @param quote The quote around the value of this attribute.
 */
public Attribute (String name, String assignment, String value, char quote)
{
  setName (name);
  setAssignment (assignment);
  if (0 == quote)
    setRawValue (value);
  else
  {
    setValue (value);
    setQuote (quote);
  }
}

Code example source: origin: org.htmlparser/htmllexer

/**
 * Get a text representation of this attribute.
 * @param buffer The accumulator for placing the text into.
 * @see #toString()
 */
public void toString (StringBuffer buffer)
{
  getName (buffer);
  getAssignment (buffer);
  getRawValue (buffer);
}

Code example source: origin: org.htmlparser/htmllexer

/**
 * Predicate to determine if this attribute has no equals sign (or value).
 * @return <code>true</code> if this attribute is a standalone attribute.
 * <code>false</code> if has an equals sign.
 */
public boolean isStandAlone ()
{
  return ((null != getName ()) && (null == getAssignment ()));
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Predicate to determine if this attribute has an equals sign but no value.
 * @return <code>true</code> if this attribute is an empty attribute.
 * <code>false</code> if has an equals sign and a value.
 */
public boolean isEmpty ()
{
  return ((null != getAssignment ()) && (null == getValue ()));
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Predicate to determine if this attribute is whitespace.
 * @return <code>true</code> if this attribute is whitespace,
 * <code>false</code> if it is a real attribute.
 */
public boolean isWhitespace ()
{
  return (null == getName ());
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Returns the value of an attribute.
 * @param name Name of attribute, case insensitive.
 * @return The value associated with the attribute or null if it does
 * not exist, or is a stand-alone or
 */
public String getAttribute (String name)
{
  Attribute attribute;
  String ret;
  ret = null;
  if (name.equalsIgnoreCase (SpecialHashtable.TAGNAME))
    ret = ((Attribute)getAttributesEx ().get (0)).getName ();
  else
  {
    attribute = getAttributeEx (name);
    if (null != attribute)
      ret = attribute.getValue ();
  }
  return (ret);
}

Code example source: origin: org.springframework.extensions.surf/spring-webscripts

for (Attribute attr : attrs)
  String name = attr.getName();
  if (name != null)
        String value = attr.getRawValue();
          if (attr.getValue() != null)
            String test = attr.getValue().trim().toUpperCase();
            Set<String> blackValues = attrValueBlackList.get(safeName);
            if (blackValues.contains(test))
          if (attr.getValue() != null)
            String test = attr.getValue().trim();
            if (test.length() > 2)

Code example source: origin: org.htmlparser/htmllexer

attribute.setValue (value);
if (0 != quote)
  attribute.setQuote (quote);
name = attribute.getName ();
for (int i = 1; i < attributes.size (); i++)
  test_name = test.getName ();
  if (null != test_name)
    if (test_name.equalsIgnoreCase (name))
  test_name = test.getName ();
  if (null != test_name)
        attributes.addElement (new Attribute (test_name.substring (0, size - 1), null));
        attributes.addElement (new Attribute (" "));
        attributes.addElement (new Attribute ("/", null));
        length += 2;
      else if ((1 != length) && !((Attribute)attributes.elementAt (length - 2)).isWhitespace ())
        attributes.insertElementAt (new Attribute (" "), length - 1);
        length ++;
      if ((null != attribute.getValue ()) && (0 == attribute.getQuote ()))
        attributes.insertElementAt (new Attribute (" "), length - 1);
      replaced = true;
if ((0 != length) && !((Attribute)attributes.elementAt (length - 1)).isWhitespace ())

Code example source: origin: org.htmlparser/htmllexer

/**
 * Predicate to determine if this attribute has a value.
 * @return <code>true</code> if this attribute has a value.
 * <code>false</code> if it is empty or standalone.
 */
public boolean isValued ()
{
  return (null != getValue ());
}

Code example source: origin: brix-cms/brix-cms

@SuppressWarnings("unchecked")
private Map<String, String> getAttributes(org.htmlparser.Tag tag) {
  Map<String, String> result = new HashMap<String, String>();

  // Element 0 of getAttributesEx() holds the tag name, so skip it.
  List<?> original = tag.getAttributesEx();
  List<Attribute> list = new ArrayList<Attribute>(
      (Collection<? extends Attribute>) original.subList(1, original.size()));

  for (Attribute a : list) {
    // Ignore whitespace pseudo-attributes and a lone "/" entry.
    if (a.getName() != null && !a.getName().equals("/") && !a.isWhitespace()) {
      result.put(a.getName(), a.getValue());
    }
  }

  return result;
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

name = super.getName ();
if (null != name)
  ret += name.length ();
else if ((null != mPage) && (0 <= mNameStart) && (0 <= mNameEnd))
  ret += mNameEnd - mNameStart;
assignment = super.getAssignment ();
if (null != assignment)
  ret += assignment.length ();
else if ((null != mPage) && (0 <= mNameEnd) && (0 <= mValueStart))
  ret += mValueStart - mNameEnd;
value = super.getValue ();
if (null != value)
  ret += value.length ();

Code example source: origin: com.bbossgroups/bboss-htmlparser

Attribute zeroth;
attribute = new Attribute (name, null, (char)0);
attributes = getAttributesEx ();
if (null == attributes)
  if ((null == zeroth.getValue ()) && (0 == zeroth.getQuote ()))
    attributes.set(0,attribute);
  else

Code example source: origin: org.htmlparser/htmllexer

/**
 * Set attribute with given key, value pair where the value is quoted by quote.
 * @param key The name of the attribute.
 * @param value The value of the attribute.
 * @param quote The quote character to be used around value.
 * If zero, it is an unquoted value.
 */
public void setAttribute (String key, String value, char quote)
{
  setAttribute (new Attribute (key, value, quote));
}

Code example source: origin: org.htmlparser/htmllexer

/**
 * Get the name of this attribute.
 * The part before the equals sign, or the contents of the
 * stand-alone attribute.
 * @return The name, or <code>null</code> if it's just a whitespace
 * 'attribute'.
 */
public String getName ()
{
  String ret;
  ret = super.getName ();
  if (null == ret)
  {
    if ((null != mPage) && (0 <= mNameStart))
    {
      ret = mPage.getText (mNameStart, mNameEnd);
      setName (ret); // cache the value
    }
  }
  return (ret);
}

Code example source: origin: org.htmlparser/htmlparser

/**
 * Set the <code>CONTENT</code> attribute.
 * @param metaTagContents The new value of the <code>CONTENT</code> attribute.
 */
public void setMetaTagContents (String metaTagContents)
{
  Attribute content;
  content = getAttributeEx ("CONTENT");
  if (null != content)
    content.setValue (metaTagContents);
  else
    getAttributesEx ().add (new Attribute ("CONTENT", metaTagContents));
}

Code example source: origin: org.htmlparser/htmllexer

String ret;
if (isValued ())
  quote = getQuote ();
  if (0 != quote)
    getValue (buffer);
    buffer.append (quote);
    ret = buffer.toString ();
    ret = getValue ();

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Look up an attribute's XML qualified (prefixed) name by index.
 *
 * @param index The attribute index (zero-based).
 * @return The XML qualified name, or the empty string
 *         if none is available, or null if the index
 *         is out of range.
 * @see #getLength
 */
public String getQName (int index)
{
  Attribute attribute;
  String ret;
  
  attribute = (Attribute)(mTag.getAttributesEx ().get (index + 1));
  if (attribute.isWhitespace ())
    ret = "#text";
  else
    ret = attribute.getName ();
  
  return (ret);
}

Code example source: origin: com.bbossgroups/bboss-htmlparser

/**
 * Get the raw value of the attribute.
 * The part after the equals sign, or the text if it's just a whitespace
 * 'attribute'. This includes the quotes around the value if any.
 * @param buffer The string buffer to append the attribute value to.
 * @see #getRawValue()
 */
public void getRawValue (StringBuilder buffer)
{
  getQuote (buffer);
  getValue (buffer);
  getQuote (buffer);
}

Code example source: origin: org.opencms/opencms-core

/**
 * Adds a tag that will be preserved by <code>{@link #stripHtml(String)}</code>.<p>
 *
 * @param tagName the name of the tag to keep (case insensitive)
 *
 * @return true if the tagName was added correctly to the internal engine
 */
public boolean addPreserveTag(final String tagName) {
  Vector<Attribute> attributeList = new Vector<Attribute>(1);
  Attribute tagNameAttribute = new Attribute();
  tagNameAttribute.setName(tagName.toLowerCase());
  attributeList.add(tagNameAttribute);
  Tag keepTag = m_nodeFactory.createTagNode(null, 0, 0, attributeList);
  boolean result = m_nodeFactory.addTagPreserve(keepTag);
  return result;
}
