[LRUG] Testing PDFs

Jay Caines-Gooby jay at gooby.org
Wed Aug 2 01:41:55 PDT 2017


Roland,

Thanks for this; I hadn't heard of perceptual hashing before, and I've gone
down a bit of an internet rabbit hole now!

On 1 August 2017 at 23:31, Roland Swingler <roland.swingler at gmail.com>
wrote:

> I don't know whether any of the tools already mentioned do something like
> this internally, but if not you could investigate perceptual hashing. I
> think pHash is a well-known standard: http://www.phash.org/
>
> Something like this probably isn't really the right fit, because phashes
> are designed to be robust against transformations such as rotation etc.
> which you probably care about in this context; also it would probably be a
> lot of work to get implemented in a ruby test suite.
>
> However, throwing it out there because you may find it an interesting
> approach/something to google more about - even if for its own sake and it
> turns out to be useless for your problem.
>
> R
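
For anyone curious what a perceptual hash looks like in practice, here is a minimal sketch in plain Ruby. It is an "average hash", a much simpler relative of pHash and not pHash itself, and it assumes something else (ImageMagick, say) has already reduced the rendered page to a small grayscale grid. The method names and sample data are made up for illustration.

```ruby
# Average hash ("aHash"): a simpler relative of pHash, for illustration only.
# `pixels` is assumed to be a small grayscale grid (values 0..255) produced
# elsewhere from a rendered page.
def average_hash(pixels)
  flat = pixels.flatten
  mean = flat.sum.to_f / flat.size
  # One bit per pixel: 1 if the pixel is at least as bright as the mean.
  flat.reduce(0) { |bits, value| (bits << 1) | (value >= mean ? 1 : 0) }
end

# Number of bit positions in which two hashes differ.
def hamming_distance(a, b)
  (a ^ b).to_s(2).count("1")
end

page     = Array.new(4) { [10, 10, 200, 200] }
brighter = page.map { |row| row.map { |v| v + 20 } } # uniform brightness shift
mirrored = page.map(&:reverse)                       # layout flipped left to right

hamming_distance(average_hash(page), average_hash(brighter)) # => 0
hamming_distance(average_hash(page), average_hash(mirrored)) # => 16
```

The brightness-shifted page hashes identically to the original, which shows the caveat above in miniature: a hash that is deliberately robust to some transformations will not flag them, so it only suits a test suite if those are changes you are happy to ignore.
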
>
>
> On Tue, Aug 1, 2017 at 8:39 PM, Josh McMillan <josh at joshmcmillan.co.uk>
> wrote:
>
>> I'm currently writing a bunch of smoke tests that involve checking the
>> validity of machine generated PDFs. We use a multi-layer approach depending
>> on how fast we want the suite to run:
>>
>>    - Basic tests for the content in the document – checking the right
>>    text boxes etc are rendered in the right place. This is done with Prawn's
>>    pdf-inspector package: https://github.com/prawnpdf/pdf-inspector
>>    - Pixel-by-pixel comparison tests using ImageMagick (the `convert`
>>    tool can handle PDFs as if they were images) with a level of tolerance:
>>    https://www.imagemagick.org/script/compare.php
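
To make "pixel-by-pixel with a level of tolerance" concrete, here is a rough sketch in plain Ruby of a root-mean-square error check on two grayscale pixel grids. It approximates the idea behind an RMSE metric rather than reproducing ImageMagick's exact calculation; the method names, the 0.01 tolerance and the sample grids are all assumptions for illustration.

```ruby
# Root-mean-square error between two equally sized grayscale pixel grids,
# normalised to 0.0..1.0 by dividing by the maximum channel value.
def normalised_rmse(expected, actual, max_value: 255)
  a = expected.flatten
  b = actual.flatten
  raise ArgumentError, "images differ in size" unless a.size == b.size

  squared_error = a.zip(b).sum { |x, y| (x - y)**2 }
  Math.sqrt(squared_error.to_f / a.size) / max_value
end

# True when the two grids differ by no more than the (assumed) tolerance.
def within_tolerance?(expected, actual, tolerance: 0.01)
  normalised_rmse(expected, actual) <= tolerance
end

expected = [[0, 0], [255, 255]]
noisy    = [[1, 0], [255, 254]]   # tiny anti-aliasing style noise
broken   = [[255, 0], [255, 255]] # one pixel completely wrong

within_tolerance?(expected, expected) # => true
within_tolerance?(expected, noisy)    # => true
within_tolerance?(expected, broken)   # => false
```

The tolerance is the knob that matters: too tight and rendering noise fails the build, too loose and a missing text box slips through.
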
>>
>> In the event that there's a major difference between two PDFs as flagged
>> by ImageMagick, we output a load of visual diffs (which can be done via
>> `compare -verbose -metric RMSE -highlight-color <color> <actual> <expected>
>> <diff>`, see the above link) for validation by a human.
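
A minimal sketch of driving that `compare` call from a Ruby test helper follows. It assumes ImageMagick is on the PATH (plus Ghostscript if you feed it PDFs directly), drops `-verbose` so the output is a single metric line, and assumes the metric arrives on stderr in the form "absolute (normalised)". The helper names, the red highlight colour and the 0.01 tolerance are illustrative, not anything from the thread.

```ruby
require "open3"

# Build the argument list for ImageMagick's `compare`. The highlight colour
# is an illustrative choice; -highlight-color needs a colour argument.
def compare_command(actual, expected, diff, highlight: "red")
  ["compare", "-metric", "RMSE", "-highlight-color", highlight, actual, expected, diff]
end

# Parse output assumed to look like "1225.48 (0.0187)": the absolute error
# followed by the normalised error in parentheses.
def parse_rmse(output)
  match = output.match(/([\d.e+-]+)\s+\(([\d.e+-]+)\)/i)
  match && { absolute: match[1].to_f, normalised: match[2].to_f }
end

# Runs the comparison and writes the visual diff to `diff`.
# Not exercised here because it needs ImageMagick installed.
def visual_diff(actual, expected, diff, tolerance: 0.01)
  _stdout, stderr, _status = Open3.capture3(*compare_command(actual, expected, diff))
  result = parse_rmse(stderr) || raise("could not read metric from: #{stderr.inspect}")
  result.merge(passed: result[:normalised] <= tolerance)
end

parse_rmse("1225.48 (0.0187)") # => {absolute: 1225.48, normalised: 0.0187}
```

Passing the command as an array avoids shell quoting problems with odd file names, and keeping the parser separate means it can be unit tested without ImageMagick present.
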
>>
>> The validity of these PDFs is "mission critical" though (they get printed
>> and sent to customers as a physical product that they've paid money for) so
>> this is probably overkill for most scenarios.
>>
>> On Tue, Aug 1, 2017 at 8:29 PM, Mark Burns <markthedeveloper at gmail.com>
>> wrote:
>>
>>> Yeah that's the kind of thing I was thinking.
>>>
>>> I guess I may have been a bit too hopeful. No magical silver bullet
>>> shortcuts.
>>>
>>> Just about getting as close as possible to automating the actual
>>> eyeballing of the doc.
>>>
>>> diff-pdf sounded promising then:
>>>
>>> ```
>>> $ diff-pdf book-1.pdf book-2.pdf
>>> $ diff-pdf book-1.pdf book-2.pdf --verbose
>>> page 1 differs
>>> page 4 differs
>>> ```
>>>
>>> Much better than nothing though :)
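
If you want to fold that into a test suite, a small sketch like the one below could turn the verbose output into page numbers to assert on. It assumes diff-pdf is on the PATH and that the output lines look like the "page N differs" lines quoted above; the method names are invented for illustration.

```ruby
require "open3"

# Pull page numbers out of lines such as "page 4 differs".
def differing_pages(output)
  output.scan(/^page (\d+) differs/).flatten.map(&:to_i)
end

# Runs diff-pdf and returns the pages it reports as different.
# Not exercised here because it needs diff-pdf installed.
def pdf_differences(expected_path, actual_path)
  stdout, stderr, _status = Open3.capture3("diff-pdf", "--verbose", expected_path, actual_path)
  differing_pages(stdout + stderr)
end

differing_pages("page 1 differs\npage 4 differs\n") # => [1, 4]
```

A spec can then expect `pdf_differences(...)` to be empty and print the offending page numbers when it is not.
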
>>>
>>> On Tue, Aug 1, 2017 at 8:21 PM Gerhard Lazu <gerhard at lazu.co.uk> wrote:
>>>
>>>> A visual diff sounds most reasonable. Never used it myself, but
>>>> https://github.com/vslavik/diff-pdf is worth a try. And guess what? brew
>>>> install diff-pdf
>>>>
>>>> On Tue, Aug 1, 2017 at 8:00 PM, Mark Burns <markthedeveloper at gmail.com>
>>>> wrote:
>>>>
>>>>> Has anyone any recommendations or suggestions for testing PDF
>>>>> generation?
>>>>>
>>>>> I'm working on a side project and using Prawn. Which is great. I can
>>>>> programmatically generate large aspects of the content I want.
>>>>>
>>>>> But so far I've been tweaking then looking at the result in the
>>>>> browser.
>>>>> It's not an absolute nightmare - a few seconds to render. But it's
>>>>> hard to know whether the result is working without actually looking at it.
>>>>>
>>>>> The DSL is nice, but very imperative. Mocking method calls out would
>>>>> be insane.
>>>>>
>>>>> I'm managing to refactor into small objects to represent the
>>>>> components and layout, pages, typography aspects etc of the document. Which
>>>>> brings the complexity back down to manageable chunks.
>>>>>
>>>>> But ultimately everything just calls underlying prawn DSL methods. So
>>>>> I can test little bits of logic that I have in my objects, but ultimately
>>>>> whether it works or not comes down to "have a look and see".
>>>>>
>>>>> Perhaps the best I can hope for is screenshotting when I'm happy and
>>>>> using approvals to verify each major change hasn't radically borked
>>>>> everything.
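
The approval idea can be sketched in a few lines of plain Ruby. This is a hand-rolled illustration rather than the API of any approvals gem, and the file naming (`.approved` / `.received`) and method name are assumptions. Note that raw PDF bytes may vary between runs, so in practice you would probably approve something derived from the PDF (extracted text or a rendered image) rather than the file itself.

```ruby
require "fileutils"
require "tmpdir"

# Minimal approval check: compare `received` against a stored approved file.
# On first run or on mismatch, write a ".received" file for a human to inspect
# and, if happy, rename over the approved one.
def approved?(name, received, dir:)
  approved_path = File.join(dir, "#{name}.approved")
  received_path = File.join(dir, "#{name}.received")

  if File.exist?(approved_path) && File.binread(approved_path) == received.b
    FileUtils.rm_f(received_path)
    true
  else
    File.binwrite(received_path, received)
    false
  end
end

Dir.mktmpdir do |dir|
  approved?("invoice", "Total: 10.00", dir: dir) # => false, nothing approved yet
  FileUtils.mv(File.join(dir, "invoice.received"), File.join(dir, "invoice.approved"))
  approved?("invoice", "Total: 10.00", dir: dir) # => true
  approved?("invoice", "Total: 99.00", dir: dir) # => false, leaves invoice.received behind
end
```
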
>>>>>
>>>>> It seems like there are tools to test which strings get into the
>>>>> document, but that seems like the easiest part. It's also probably the
>>>>> only part where I'd be happy using test doubles for Prawn and setting
>>>>> expectations on the text-generating methods.
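
As a rough illustration of the test-double approach for the text part, here is a hand-rolled recorder standing in for the Prawn document. `RecordingDocument` and `InvoiceHeader` are hypothetical names; the only thing borrowed from Prawn is the `text(string, options)` call shape.

```ruby
# A stand-in for the Prawn document that just records text calls.
class RecordingDocument
  attr_reader :texts

  def initialize
    @texts = []
  end

  # Mirrors the shape of Prawn's `text(string, options)`.
  def text(string, options = {})
    @texts << [string, options]
  end
end

# Hypothetical component of the kind described above: it only talks to the
# document it is given, so a test can pass in the recorder instead of Prawn.
class InvoiceHeader
  def initialize(customer_name)
    @customer_name = customer_name
  end

  def render(pdf)
    pdf.text "Invoice", size: 24
    pdf.text "For: #{@customer_name}"
  end
end

doc = RecordingDocument.new
InvoiceHeader.new("Ada").render(doc)
doc.texts.map(&:first) # => ["Invoice", "For: Ada"]
```

This only proves the right strings were requested, not that they landed in the right place, which is why it covers the easy part and nothing more.
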
>>>>>
>>>>> _______________________________________________
>>>>> Chat mailing list
>>>>> Chat at lists.lrug.org
>>>>> Archives: http://lists.lrug.org/pipermail/chat-lrug.org
>>>>> Manage your subscription: http://lists.lrug.org/options.cgi/chat-lrug.org
>>>>> List info: http://lists.lrug.org/listinfo.cgi/chat-lrug.org
>>>>>


-- 
Jay Caines-Gooby
http://jay.gooby.org
jay at gooby.org
+44 (0)7956 182625
twitter, skype & aim: jaygooby
gtalk: jaygooby at gmail.com