XMLBeam: Snippets and Examples

XMLBeam is an interesting library that projects parts of an XML DOM tree into Java using simple interfaces, annotations and XPath expressions. In the following article, I’d like to share three experiments of mine with this library: reading XML, writing XML and parsing a live RSS feed. RSS Feed Projection Interface Dependencies Using Maven, we need to add only one dependency to our pom.xml: ...

July 22, 2014 · 6 min · 1126 words · Micha Kops

Creating Grammar Parsers in Java and Scala with Parboiled

Parboiled is a modern, lightweight and easy-to-use library for parsing expression grammars in Java or Scala, and in my humble opinion it is perfect for use cases where you need something between regular expressions and a complex parser generator like ANTLR. In the following tutorial we’re going to create a simple grammar to specify a task list and write a parser implementation as well as unit tests for each grammar rule in Java. ...

January 26, 2014 · 11 min · 2176 words · Micha Kops

Content Detection, Metadata and Content Extraction with Apache Tika

If you want to extract metadata or content from a file, be it an office document, a spreadsheet or even an MP3 or an image, or you’d like to detect the content type of a given file, then Apache Tika might be a helpful tool for you. Apache Tika supports a variety of document formats and has a nice, extensible parser and detection API with a lot of built-in parsers available. ...

December 2, 2012 · 11 min · 2222 words · Micha Kops

Screenscraping made easy using jsoup and Maven

Sometimes in a developer’s life there is no clean API available to gather information from a web application: no SOAP, no XML-RPC and no REST, just a website hiding the information we’re looking for somewhere in its DOM hierarchy, so the only solution is screenscraping. Screenscraping always leaves me with a bad feeling, but luckily there is a tool that makes this job at least a bit easier for a developer: jsoup to the rescue! ...

August 30, 2011 · 3 min · 526 words · Micha Kops

jq Snippets

Sample JSON file containing well-known programmers that we use for the following examples (coders.json):

[
  { "name": "Bjarne Stroustrup", "languages": [ { "name": "C++", "year_created": 1983 } ], "details": { "nationality": "Danish", "awards": ["IEEE Computer Society Computer Pioneer Award", "Charles Stark Draper Prize"] } },
  { "name": "Guido van Rossum", "languages": [ { "name": "Python", "year_created": 1991 } ], "details": { "nationality": "Dutch", "awards": ["Free Software Foundation Award for the Advancement of Free Software", "NLUUG Award"] } },
  { "name": "James Gosling", "languages": [ { "name": "Java", "year_created": 1995 } ], "details": { "nationality": "Canadian", "awards": ["Order of Canada", "The Economist Innovation Award"] } },
  { "name": "Dennis Ritchie", "languages": [ { "name": "C", "year_created": 1972 }, { "name": "Unix", "year_created": 1969 } ], "details": { "nationality": "American", "awards": ["Turing Award", "National Medal of Technology"] } },
  { "name": "Brendan Eich", "languages": [ { "name": "JavaScript", "year_created": 1995 } ], "details": { "nationality": "American", "awards": ["Webby Award"] } }
]

...
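To give a feel for the kind of query such a sample file invites, here is a minimal sketch in Python. It is an illustration only and not necessarily what the article goes on to show: the data is a trimmed copy of the sample (awards omitted), the function names are made up for this example, and the jq filters quoted in the docstrings are my own rough equivalents.

```python
import json

# Trimmed copy of the coders.json sample (awards omitted for brevity).
CODERS_JSON = """
[
  {"name": "Bjarne Stroustrup",
   "languages": [{"name": "C++", "year_created": 1983}],
   "details": {"nationality": "Danish"}},
  {"name": "Guido van Rossum",
   "languages": [{"name": "Python", "year_created": 1991}],
   "details": {"nationality": "Dutch"}},
  {"name": "James Gosling",
   "languages": [{"name": "Java", "year_created": 1995}],
   "details": {"nationality": "Canadian"}},
  {"name": "Dennis Ritchie",
   "languages": [{"name": "C", "year_created": 1972},
                 {"name": "Unix", "year_created": 1969}],
   "details": {"nationality": "American"}},
  {"name": "Brendan Eich",
   "languages": [{"name": "JavaScript", "year_created": 1995}],
   "details": {"nationality": "American"}}
]
"""

coders = json.loads(CODERS_JSON)


def names_by_nationality(coders, nationality):
    """Hypothetical helper, roughly:
    jq '.[] | select(.details.nationality == "American") | .name'
    """
    return [c["name"] for c in coders
            if c["details"]["nationality"] == nationality]


def languages_created_before(coders, year):
    """Hypothetical helper, roughly:
    jq '.[].languages[] | select(.year_created < 1990) | .name'
    """
    return [lang["name"]
            for c in coders
            for lang in c["languages"]
            if lang["year_created"] < year]


if __name__ == "__main__":
    print(names_by_nationality(coders, "American"))
    print(languages_created_before(coders, 1990))
```

Both helpers keep the document order of the input, just as a jq filter streaming over `.[]` would.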

March 1, 2010 · 2 min · 223 words · Micha Kops