NAME
HTML::Embedded::Turtle - embedding RDF in HTML the crazy way
SYNOPSIS
    use HTML::Embedded::Turtle;

    my $het = HTML::Embedded::Turtle->new($html, $base_uri);
    foreach my $graph ($het->endorsements)
    {
        # An endorsement may name a graph that is not defined on the page.
        my $model = $het->graph($graph);
        next unless defined $model;
        # $model is an RDF::Trine::Model. Do something with it.
    }
DESCRIPTION
RDF can be embedded in (X)HTML using simple <script> tags. This is
described at <http://esw.w3.org/N3inHTML>. This gives you a file format
that can contain multiple (optionally named) graphs. The document as a whole
can "endorse" a graph by including:
    <link rel="meta" href="#foo" />
where "#foo" is a fragment identifier pointing to a graph:
    <script type="text/turtle" id="foo"> ... </script>
The rel="meta" stuff is parsed using an RDFa parser, so equivalent
RDFa works too.
This module parses HTML files containing graphs like these, and allows you to
access them each individually; as a union of all graphs on the page; or as a
union of just the endorsed graphs.
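As a minimal sketch of the three access styles just described, the following parses a small page holding two graphs, of which only one is endorsed. The markup, the example.com URIs and the FOAF triples are made up for illustration, and the sketch assumes HTML::Embedded::Turtle and RDF::Trine are installed.

```perl
use strict;
use warnings;
use HTML::Embedded::Turtle;

# Hypothetical page: two named graphs, only "#foo" is endorsed.
my $html = <<'HTML';
<html>
  <head>
    <title>Example</title>
    <link rel="meta" href="#foo" />
    <script type="text/turtle" id="foo">
      <http://example.com/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
    </script>
    <script type="text/turtle" id="bar">
      <http://example.com/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
    </script>
  </head>
  <body><p>Hello</p></body>
</html>
HTML

my $het = HTML::Embedded::Turtle->new($html, 'http://example.com/doc');

# Each graph individually: a hashref of graph name => RDF::Trine::Model.
my $all = $het->graphs;
printf "%s: %d statement(s)\n", $_, $all->{$_}->size for sort keys %$all;

# Union of every graph on the page.
printf "union: %d statement(s)\n", $het->union_graph->size;

# Union of just the endorsed graphs.
printf "endorsed union: %d statement(s)\n", $het->endorsed_union_graph->size;
```

The sizes come from RDF::Trine::Model's size method; comparing the two union sizes is a quick way to check whether any graph on a page is unendorsed.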
Despite the module name, this module supports a variety of <script type>s:
text/turtle, application/turtle, application/x-turtle (Turtle), text/plain
(N-Triples), text/n3 (Notation 3), application/x-rdf+json (RDF/JSON),
application/json (RDF/JSON), and application/rdf+xml (RDF/XML).
The deprecated attribute "language" is also supported:
    <script language="Turtle" id="foo"> ... </script>
Languages supported are (case insensitive): "Turtle", "NTriples", "RDFJSON",
"RDFXML" and "Notation3".
Constructor
- "HTML::Embedded::Turtle->new($markup, $base_uri, \%opts)"
- Create a new object. $markup is the HTML or XHTML markup to
parse; $base_uri is the base URI to use for relative references.
Options include:
- markup
Choose which parser to use: 'html' or 'xml'. The former chooses
HTML::HTML5::Parser, which can handle tag soup; the latter chooses
XML::LibXML, which cannot. Defaults to 'html'.
- rdfa_options
A set of options to be passed to RDF::RDFa::Parser when looking for
endorsements. See RDF::RDFa::Parser::Config. The default is probably
sensible.
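As a rough sketch of passing options, the call below selects the XML parser for a well-formed XHTML document. The markup and URIs are hypothetical. Because XML::LibXML cannot handle tag soup, the Turtle is wrapped in a CDATA section so that its angle brackets do not break well-formedness. The rdfa_options key is left out so that the default applies.

```perl
use strict;
use warnings;
use HTML::Embedded::Turtle;

# Hypothetical well-formed XHTML input; CDATA protects the Turtle's "<" and ">".
my $xhtml = <<'XHTML';
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example</title>
    <link rel="meta" href="#foo" />
    <script type="text/turtle" id="foo"><![CDATA[
      <http://example.com/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
    ]]></script>
  </head>
  <body><p>Hello</p></body>
</html>
XHTML

# The third argument is the options hashref; 'xml' chooses XML::LibXML.
my $het = HTML::Embedded::Turtle->new(
    $xhtml,
    'http://example.com/doc',
    { markup => 'xml' },
);

print "$_\n" for $het->endorsements;
```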
Public Methods
- "union_graph"
- A union graph of all graphs found in the document, as an
RDF::Trine::Model. Note that the returned model contains quads.
- "endorsed_union_graph"
- A union graph of only the endorsed graphs, as an
RDF::Trine::Model. Note that the returned model contains quads.
- "graph($name)"
- A single graph from the page, as an RDF::Trine::Model. The result is
undefined if the page contains no graph with that name.
- "graphs"
- "all_graphs"
- A hashref where the keys are graph names and the values are
RDF::Trine::Models. Some graph names will be URIs, and others may be blank
nodes (e.g. "_:foobar").
"graphs" and "all_graphs" are aliases for each other.
- "endorsed_graphs"
- Like "all_graphs", but only returns endorsed
graphs. Note that all endorsed graphs will have graph names that are
URIs.
- "endorsements"
- Returns a list of URIs which are the names of endorsed
graphs. Note that the presence of a URI $x in this list does not imply
that "$het->graph($x)" will be defined.
- "dom"
- Returns the page DOM.
- "uri"
- Returns the page URI.
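The caveat about "endorsements" can be illustrated with a short sketch. The hypothetical page below endorses "#missing", for which no <script> element exists, so the loop checks each result of "graph" before using it. Serializing with RDF::Trine::Serializer is just one possible use of the model; the markup and URIs are invented for the example.

```perl
use strict;
use warnings;
use HTML::Embedded::Turtle;
use RDF::Trine;

# Hypothetical page: "#foo" is endorsed and defined, "#missing" is only endorsed.
my $html = <<'HTML';
<html>
  <head>
    <title>Example</title>
    <link rel="meta" href="#foo" />
    <link rel="meta" href="#missing" />
    <script type="text/turtle" id="foo">
      <http://example.com/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
    </script>
  </head>
  <body><p>Hello</p></body>
</html>
HTML

my $het        = HTML::Embedded::Turtle->new($html, 'http://example.com/doc');
my $serializer = RDF::Trine::Serializer->new('ntriples');

foreach my $name ($het->endorsements)
{
    my $model = $het->graph($name);
    unless (defined $model)
    {
        # Endorsed, but no graph with this name exists on the page.
        warn "endorsed but not defined: $name\n";
        next;
    }
    print "# $name\n", $serializer->serialize_model_to_string($model);
}
```

Calling "endorsed_graphs" instead avoids the check, at the cost of not learning which endorsements were dangling.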
BUGS
Please report any bugs to <http://rt.cpan.org/>.
Please forgive me in advance for inflicting this module upon you.
SEE ALSO
RDF::RDFa::Parser, RDF::Trine, RDF::TriN3.
<http://www.perlrdf.org/>.
AUTHOR
Toby Inkster <tobyink@cpan.org>.
COPYRIGHT AND LICENSE
Copyright (C) 2010-2011 by Toby Inkster.
This is free software; you can redistribute it and/or modify it under the same
terms as the Perl 5 programming language system itself.
DISCLAIMER OF WARRANTIES
THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.