Dominator

Parses XHTML, builds the node hierarchy, and analyses it.

class Dominator {

this();

this(string haystack);

Dominator load(string haystack);

Node[] getNodes();

string getStartElement(Node node);

string getElement(Node node);

string getInner(Node node);

string stripTags(Node node);

string stripTags();

}

Constructors

this this(): Instantiates an empty Dominator
this this(string haystack): Instantiates the object and loads the document
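
The two constructors above can be compared with a minimal sketch. It assumes the package is importable as `import libdominator;` (an assumption, adjust to your build setup) and that `this(string haystack)` is equivalent to constructing an empty object and then calling `load`, as the descriptions suggest.

```d
import libdominator; // assumed package import; not confirmed by this page

void main()
{
    const string html = `<p>Hello</p>`;

    // Variant 1: construct and load in one step.
    Dominator direct = new Dominator(html);

    // Variant 2: construct empty, then load the document later.
    Dominator deferred = new Dominator();
    deferred.load(html);

    // Both objects parsed the same document, so they should
    // report the same number of nodes.
    assert(direct.getNodes().length == deferred.getNodes().length);
}
```

Since `load` returns a `Dominator`, the second variant can presumably also be written as a chained call such as `new Dominator().load(html)`.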

Members

Functions

getElement string getElement(Node node): gets the part of the loaded document from the node's beginning to its end
getInner string getInner(Node node): gets the inner HTML of the given node
getNodes Node[] getNodes(): returns all nodes found. Note that nodes found inside comments are returned as well. Use isComment() to check whether a node is inside a comment, or use libdominator.Filter.filterComments() to drop such nodes
getStartElement string getStartElement(Node node): gets the Tag Name of the Node
load Dominator load(string haystack): loads a Document
stripTags string stripTags(Node node): Removes tags and returns plain inner content
stripTags string stripTags(): Removes tags and returns plain inner content
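
Because getNodes() also returns nodes that sit inside comments, callers usually need to skip them. The following is a rough sketch of that check, using only calls that appear on this page (`getNodes`, `isComment`, `getTag`). The helper name `countLive` and the `import libdominator;` line are assumptions made for illustration and are not part of the library.

```d
import libdominator; // assumed package import; not confirmed by this page

/// Hypothetical helper: counts the nodes with the given tag name
/// that are NOT located inside a comment.
size_t countLive(Dominator dom, string tag)
{
    size_t n;
    foreach (node; dom.getNodes())
    {
        // Skip nodes that were found inside <!-- ... -->.
        if (node.isComment())
            continue;
        if (node.getTag() == tag)
            ++n;
    }
    return n;
}

void main()
{
    const string html =
`<ul>
    <li>one</li>
    <!-- <li>two</li> -->
    <li>three</li>
</ul>`;

    Dominator dom = new Dominator(html);

    // The commented-out <li> is skipped, so two remain.
    assert(countLive(dom, "li") == 2);
}
```

When the nodes come from filterDom() rather than getNodes(), the filterComments call shown in the examples on this page does the same job in a single step.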

Examples

Get the descendants of a specific node and apply further filtering to the result.

const string content = `<div data-function="<some weird> stuff">
    <span>
        <span>
            <span>bäm!</span>
        </span>
        <span>boing!</span>
    </span>
    <ol id="ol-1">
      <li id="li-1-ol-1">li-1-ol-1 Inner</li>
      <li id="li-2-ol-1">li-2-ol-1 Inner</li>
      <li id="li-3-ol-1">li-3-ol-1 Inner</li>
    </ol>
  </div>`;

Dominator dom = new Dominator(content);
// Take the first matching <div> and collect all of its descendants.
Node[] descendants = (*dom.filterDom("div").ptr).getDescendants();
assert(descendants.filterDom("span").length == 4);
assert(descendants.filterDom("li").length == 3);
assert(descendants.filterDom("ol").length == 1);

Basic example

const string html =
`<div>
    <p>Here comes a list!</p>
    <ul>
        <li class="wanted">one</li>
        <!-- <li>two</li> -->
        <li class="wanted hard">three</li>
        <li id="item-4">four</li>
        <li checked>five</li>
        <li id="item-6">six</li>
    </ul>
    <p>another list</p>
    <ol>
        <li>eins</li>
        <li>zwei</li>
        <li>drei</li>
    </ol>
    <p>have a nice day</p>
</div>`;
import std.conv : to; // needed for to!(string) in the assertion message below

Dominator dom = new Dominator(html);

foreach(node ; dom.filterDom("ul.li")) {
    // do something more useful with the node than:
    assert(node.getParent.getTag() == "ul");
}

Node[] nodes = dom.filterDom("ul.li");
assert(dom.getInner( nodes[0] ) == "one" );
assert(nodes[0].getAttributes() == [ Attribute("class","wanted") ] , to!(string)(nodes[0].getAttributes()) );
assert(Attribute("class","wanted").matches(nodes[0]));
assert(Attribute("class","wanted").matches(nodes[2]));
assert(Attribute("class",["wanted","hard"]).matches(nodes[2]));
assert(nodes[1].isComment());

assert(dom.filterDom("ul.li").length == 6);
assert(dom.filterDom("ul.li").filterComments.length == 5);
assert(dom.filterDom("li").length == 9);
assert(dom.filterDom("li[1]").length == 1); //the first li in the dom
assert(dom.filterDom("*.li[1]").length == 2); //the first li in ul and first li in ol
assert(dom.getInner( (*dom.filterDom("*{checked:}").ptr) ) == "five");

Find nodes with a specific href. In HTML5, attribute values may be written without quotation marks.

import std.file : readText; // reads the whole file into a string

Dominator dom = new Dominator(readText("tests/dummy.html"));
foreach(node ; dom.filterDom("scpdurl"))
{
    assert( dom.getInner(node) == "/timeSCPD.xml" );
}
