Mar 4th 2011 06:56 pm
an apply style extension method for the html agility pack
On a recent project I’ve been working with the Html Agility Pack, an open source HTML parser. It has been able to parse all of the HTML I’ve given it, namely Microsoft Word HTML exports, and it has helped me replace a lot of totally unreadable regular expression code. Here is an ApplyStyles extension method I wrote for it:
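// requires: using System; using System.Collections.Generic; using System.IO; using System.Text;
// (declare this inside a static class, like any extension method)

/// Applies all CSS declarations to the HTML element as an inline style attribute
public static HtmlAgilityPack.HtmlNode ApplyStyles(this HtmlAgilityPack.HtmlNode element, IEnumerable<CssDeclaration> styles, Predicate<CssDeclaration> ignored)
{
    if (element == null) throw new ArgumentNullException("element");
    if (styles == null) throw new ArgumentNullException("styles");
    if (ignored == null) throw new ArgumentNullException("ignored");

    // write every declaration that is not filtered out into one string
    var stringBuilder = new StringBuilder();
    var stringWriter = new StringWriter(stringBuilder);
    foreach (var style in styles)
    {
        if (!ignored(style))
        {
            style.Write(stringWriter);
        }
    }

    // note: this replaces any style attribute the element already has
    var inlineStyle = stringBuilder.ToString();
    if (!string.IsNullOrWhiteSpace(inlineStyle))
    {
        element.SetAttributeValue("style", inlineStyle);
    }
    return element;
}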
I needed a way to ignore some styles, but the requirements could change. I figured I’d just provide an overload so the developer can supply a filter. Perfect, I just introduced a predicate parameter. I’m really enjoying working with lambdas ever since I returned to server code after my extended visit to JavaScript.
You may be wondering what I’m using for my style object. Well, it doesn’t really matter as long as your style has a ToString() or some type of Write() method. I’m actually using the CssParser from JSONFX, which is an awesome parser with CSS 3 grammar support.
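To illustrate that point, here is a minimal sketch with a stand-in style type. `SimpleDeclaration` and `InlineStyle.Build` are hypothetical names made up for this example; they are not part of JSONFX or the Html Agility Pack, and the output format of the real parser's Write() may differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for a parsed CSS declaration (not the JSONFX type).
public sealed class SimpleDeclaration
{
    public string Property { get; set; }
    public string Value { get; set; }

    // All the extension needs: something that writes "property:value;"
    public void Write(TextWriter writer)
    {
        writer.Write("{0}:{1};", Property, Value);
    }
}

public static class InlineStyle
{
    // Builds the text of a style attribute, skipping declarations
    // for which the 'ignored' predicate returns true.
    public static string Build(IEnumerable<SimpleDeclaration> styles,
                               Predicate<SimpleDeclaration> ignored)
    {
        var writer = new StringWriter();
        foreach (var style in styles)
        {
            if (!ignored(style))
            {
                style.Write(writer);
            }
        }
        return writer.ToString();
    }
}
```

Calling `InlineStyle.Build` with a `color:red` and a `margin:0` declaration and the filter `s => s.Property == "margin"` yields `color:red;`.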
Here is how I’m using these tools:
// a dictionary with each CSS selector as a key and the collection of styles as the value
var selectorDictionary = (from rule in StyleSheet.Statements.OfType<CssRuleSet>()
                          from selector in rule.Selectors
                          select new { selector, rule.Declarations })
                         .ToDictionary(k => k.selector, v => v.Declarations);
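// apply all styles inline
foreach (var ruleSet in selectorDictionary)
{
    if (ruleSet.Key.ToXPathSupported())
    {
        // SelectNodes returns null (not an empty collection) when nothing matches
        var nodes = HtmlDoc.DocumentNode.SelectNodes(ruleSet.Key.ToXPath());
        if (nodes == null) continue;

        foreach (var node in nodes)
        {
            node.ApplyStyles(ruleSet.Value, s => IgnoredCssProperties.Any(css => string.Equals(css, s.Property, StringComparison.OrdinalIgnoreCase)));
        }
    }
}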
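The ignore check above scans the list of ignored properties for every declaration. If that list grows, one option is a case-insensitive set. This is only a sketch: the `IgnoreList` class, the `IsIgnored` helper, and the property names in the set are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class IgnoreList
{
    // Hypothetical contents; the comparer makes lookups ignore case,
    // so no ToLower() calls are needed at the call site.
    public static readonly HashSet<string> IgnoredCssProperties =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "margin",
            "font-family"
        };

    // True when the property name is in the ignore set, regardless of case.
    public static bool IsIgnored(string property)
    {
        return property != null && IgnoredCssProperties.Contains(property);
    }
}
```

The predicate passed to ApplyStyles would then shrink to something like `s => IgnoreList.IsIgnored(s.Property)`.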
Yeah, I’ve also created a CSS grammar extension that converts a simple selector to XPath:
...
var parts = selector.Value.Split('.');
var element = string.IsNullOrWhiteSpace(parts.First()) ? "*" : parts.First();
var classname = parts.ElementAt(1);
xpath = string.Format(@"//{0}[contains(@class,'{1}')]", element, classname);
...
This is the meat of the extension, and it gives me a nicer syntax. I’m thinking about switching over to a TryXPath(input, out output) type approach. The extension helps me encapsulate the code, and I can extend the method as I need to. Right now this simple implementation works wonderfully for me.
I wrote some unit tests and some additional extensions like RemoveClass, and I’m also working on a MergeStyles extension. I’ll put them up in time unless someone asks specifically.
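A TryXPath-style version might look roughly like the sketch below. It takes the selector as a plain string rather than the parser's selector type, and the name `TryToXPath` and the list of rejected characters are my assumptions. It handles only "element.class" and ".class" selectors, the same cases as the snippet above.

```csharp
using System;

public static class SelectorExtensions
{
    // Characters this sketch does not know how to translate.
    private static readonly char[] Unsupported =
        { ' ', '>', '+', '~', ':', '[', '#', ',', '\'' };

    // Returns true and sets xpath when the selector is a simple
    // "element.class" or ".class" selector; otherwise returns false.
    public static bool TryToXPath(string selector, out string xpath)
    {
        xpath = null;
        if (string.IsNullOrWhiteSpace(selector)) return false;

        var trimmed = selector.Trim();
        if (trimmed.IndexOfAny(Unsupported) >= 0) return false;

        // exactly one '.', followed by a non-empty class name
        var parts = trimmed.Split('.');
        if (parts.Length != 2 || parts[1].Length == 0) return false;

        var element = parts[0].Length == 0 ? "*" : parts[0];
        xpath = string.Format("//{0}[contains(@class,'{1}')]", element, parts[1]);
        return true;
    }
}
```

This folds the "is it supported?" check and the conversion into one call, so the caller cannot convert a selector it forgot to check. One caveat that applies to both versions: contains(@class,'note') also matches class="footnote". A stricter test is contains(concat(' ', normalize-space(@class), ' '), ' note ').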